Sample records for calculation method sequence

  1. A New Method for Setting Calculation Sequence of Directional Relay Protection in Multi-Loop Networks

    NASA Astrophysics Data System (ADS)

    Haijun, Xiong; Qi, Zhang

    2016-08-01

    The workload of relay protection setting calculation in multi-loop networks can be reduced effectively by optimizing the setting calculation sequence. A new method for setting calculation sequences of directional distance relay protection in multi-loop networks, based on the minimum broken nodes cost vector (MBNCV), was proposed to solve problems experienced with current methods. Existing methods based on the minimum breakpoint set (MBPS) break more edges when untying the loops in the dependency relationships of relays, potentially leading to larger iterative calculation workloads in setting calculations. A model-driven approach based on behavior trees (BT) was presented to improve adaptability to similar problems. After extending the BT model with real-time system characteristics, a timed BT was derived and the dependency relationships in multi-loop networks were then modeled. The model was translated into communicating sequential processes (CSP) models, and an optimized setting calculation sequence for multi-loop networks was finally computed by tools. A 5-node multi-loop network was used as an example to demonstrate the effectiveness of the modeling and calculation method. Several examples were then calculated, with results indicating that the method effectively reduces the number of forcibly broken edges for protection setting calculation in multi-loop networks.

  2. MRI T2 Mapping of the Knee Articular Cartilage Using Different Acquisition Sequences and Calculation Methods at 1.5 Tesla.

    PubMed

    Mars, Mokhtar; Bouaziz, Mouna; Tbini, Zeineb; Ladeb, Fethi; Gharbi, Souha

    2018-06-12

    This study aims to determine how Magnetic Resonance Imaging (MRI) acquisition techniques and calculation methods affect T2 values of knee cartilage at 1.5 Tesla and to identify sequences that can be used for high-resolution T2 mapping in short scanning times. This study was performed on a phantom and twenty-nine patients who underwent MRI of the knee joint at 1.5 Tesla. The protocol includes T2 mapping sequences based on Single Echo Spin Echo (SESE), Multi-Echo Spin Echo (MESE), Fast Spin Echo (FSE) and Turbo Gradient Spin Echo (TGSE). The T2 relaxation times were quantified and evaluated using three calculation methods (MapIt, Syngo Offline and monoexponential fit). Signal to Noise Ratios (SNR) were measured in all sequences. All statistical analyses were performed using the t-test. The average T2 values in the phantom were 41.7 ± 13.8 ms for SESE, 43.2 ± 14.4 ms for MESE, 42.4 ± 14.1 ms for FSE and 44 ± 14.5 ms for TGSE. In the patient study, the mean differences were 6.5 ± 8.2 ms, 7.8 ± 7.6 ms and 8.4 ± 14.2 ms for MESE, FSE and TGSE compared to SESE, respectively; these differences were not statistically significant (p > 0.05). The comparison between the three calculation methods showed no significant difference (p > 0.05). The t-test showed no significant difference between SNR values for all sequences. T2 values depend not only on the sequence type but also on the calculation method. None of the sequences revealed significant differences compared to the SESE reference sequence. TGSE, with its short scanning time, can be used for high-resolution T2 mapping. © 2018 The Author(s). Published by S. Karger AG, Basel.
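
    As context for the "monoexponential fit" calculation method named above, the sketch below fits S(TE) = S0·exp(-TE/T2) to a series of echo-time/signal pairs; the echo times and intensities are illustrative placeholders, not data from the study.

```python
# Minimal sketch: monoexponential T2 fit, S(TE) = S0 * exp(-TE / T2).
# Echo times and signal intensities below are illustrative placeholders.
import numpy as np
from scipy.optimize import curve_fit

def mono_exp(te, s0, t2):
    return s0 * np.exp(-te / t2)

te = np.array([13.8, 27.6, 41.4, 55.2, 69.0, 82.8])           # echo times (ms)
signal = np.array([480.0, 350.0, 255.0, 186.0, 135.0, 98.0])  # measured intensities

popt, _ = curve_fit(mono_exp, te, signal, p0=(signal[0], 40.0))
s0_fit, t2_fit = popt
print(f"T2 = {t2_fit:.1f} ms")
```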

  3. Rényi continuous entropy of DNA sequences.

    PubMed

    Vinga, Susana; Almeida, Jonas S

    2004-12-07

    Entropy measures of DNA sequences estimate their randomness or, inversely, their repeatability. L-block Shannon discrete entropy accounts for the empirical distribution of all length-L words and has convergence problems for finite sequences. A new entropy measure that extends Shannon's formalism is proposed. Rényi's quadratic entropy, calculated with the Parzen window density estimation method applied to CGR/USM continuous maps of DNA sequences, constitutes a novel technique to evaluate sequence global randomness without some of the drawbacks of the former method. The asymptotic behaviour of this new measure was analytically deduced, and entropies were calculated for several synthetic and experimental biological sequences. The results obtained were compared with the distributions of the null model of randomness obtained by simulation. The biological sequences showed different p-values according to the kernel resolution of Parzen's method, which might indicate an unknown level of organization in their patterns. This new technique can be very useful in the study of DNA sequence complexity and provides additional tools for DNA entropy estimation. The main MATLAB applications developed and additional material are available from the authors' webpage. Specialized functions can be obtained from the authors.
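
    The closed form behind this approach is worth seeing: with Gaussian Parzen windows of width sigma, the quadratic Rényi entropy reduces to the negative log of a pairwise kernel sum (the information potential). The sketch below assumes 2-D points standing in for CGR map coordinates; it illustrates the formula and is not the authors' MATLAB code.

```python
# Minimal sketch: Renyi quadratic entropy of 2-D points via Parzen window density
# estimation with Gaussian kernels. The convolution of two Gaussians of variance
# sigma^2 has variance 2*sigma^2, giving a closed form for the information
# potential V, with H2 = -log(V). Points here are random stand-ins.
import numpy as np

def renyi2_entropy(points, sigma):
    n, d = points.shape
    diff = points[:, None, :] - points[None, :, :]            # pairwise differences
    sq = np.sum(diff**2, axis=-1)                             # squared distances
    norm = (4.0 * np.pi * sigma**2) ** (d / 2.0)              # normalizer, var 2*sigma^2
    v = np.exp(-sq / (4.0 * sigma**2)).sum() / (n**2 * norm)  # information potential
    return -np.log(v)

rng = np.random.default_rng(0)
pts = rng.random((500, 2))   # stand-in for CGR coordinates of a DNA sequence
print(renyi2_entropy(pts, sigma=0.05))
```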

  4. Volume calculation of CT lung lesions based on Halton low-discrepancy sequences

    NASA Astrophysics Data System (ADS)

    Li, Shusheng; Wang, Liansheng; Li, Shuo

    2017-03-01

    Volume calculation from Computed Tomography (CT) lung lesion data is a significant parameter for clinical diagnosis. The volume is widely used to assess the severity of lung nodules and track their progression; however, the accuracy and efficiency achieved in previous studies are not sufficient for clinical use. It remains a challenging task due to the lesions' tight attachment to the lung wall, inhomogeneous background noise and large variations in size and shape. In this paper, we employ Halton low-discrepancy sequences to calculate the volume of lung lesions. The proposed method directly computes the volume without the procedure of three-dimensional (3D) model reconstruction and surface triangulation, which significantly improves the efficiency and reduces the complexity. The main steps of the proposed method are: (1) generate a certain number of random points in each slice using Halton low-discrepancy sequences and calculate the lesion area of each slice through the proportion; (2) obtain the volume by integrating the areas in the sagittal direction. In order to evaluate our proposed method, experiments were conducted on data sets with lung lesions of different sizes. With the uniform distribution of random points, our proposed method achieves more accurate results compared with other methods, which demonstrates its robustness and accuracy for the volume calculation of CT lung lesions. In addition, our proposed method is easy to follow and can be extensively applied to other applications, e.g., volume calculation of liver tumors, atrial wall aneurysms, etc.
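
    A minimal sketch of step (1), assuming a circular mask as a stand-in for a segmented lesion: points from a 2-D Halton sequence (bases 2 and 3) are tested against the mask, and the inside proportion scales the bounding-box area; slice areas are then summed along the stacking direction.

```python
# Minimal sketch of the quasi-Monte Carlo idea: estimate a lesion's area in one
# slice as (fraction of Halton points inside the lesion mask) * bounding-box area.
import numpy as np

def halton(n, base):
    """First n terms of the van der Corput sequence in the given base."""
    seq = np.zeros(n)
    for i in range(n):
        f, k, x = 1.0, i + 1, 0.0
        while k > 0:
            f /= base
            x += f * (k % base)
            k //= base
        seq[i] = x
    return seq

def slice_area(inside_fn, n_pts, box_w, box_h):
    xs, ys = halton(n_pts, 2) * box_w, halton(n_pts, 3) * box_h
    frac = np.mean(inside_fn(xs, ys))   # proportion of points inside the lesion
    return frac * box_w * box_h

inside = lambda x, y: (x - 5)**2 + (y - 5)**2 <= 4.0   # circle of radius 2
print(slice_area(inside, 4096, 10.0, 10.0))            # approaches pi * 2^2 ~ 12.57
# Volume: sum the slice areas times the slice spacing, e.g. volume = sum(areas) * dz
```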

  5. The Use of a Software-Assisted Method to Estimate Fetal Weight at and Near Term Using Magnetic Resonance Imaging.

    PubMed

    Kadji, Caroline; De Groof, Maxime; Camus, Margaux F; De Angelis, Riccardo; Fellas, Stéphanie; Klass, Magdalena; Cecotti, Vera; Dütemeyer, Vivien; Barakat, Elie; Cannie, Mieke M; Jani, Jacques C

    2017-01-01

    The aim of this study was to apply a semi-automated calculation method of fetal body volume and, thus, of magnetic resonance-estimated fetal weight (MR-EFW) prior to planned delivery and to evaluate whether the technique of measurement could be simplified while remaining accurate. MR-EFW was calculated using a semi-automated method at 38.6 weeks of gestation in 36 patients and compared to the picture archiving and communication system (PACS). Per patient, 8 sequences were acquired with a slice thickness of 4-8 mm and an intersection gap of 0, 4, 8, 12, 16, or 20 mm. The median absolute relative errors for MR-EFW and the time of planimetric measurements were calculated for all 8 sequences and for each method (assisted vs. PACS), and the difference between the methods was calculated. The median delivery weight was 3,280 g. The overall median relative error for all 288 MR-EFW calculations was 2.4% using the semi-automated method and 2.2% for the PACS method. Measurements did not differ between the 8 sequences using the assisted method (p = 0.313) or the PACS (p = 0.118), while the time of planimetric measurement decreased significantly with a larger gap (p < 0.001) and in the assisted method compared to the PACS method (p < 0.01). Our simplified MR-EFW measurement showed a dramatic decrease in time of planimetric measurement without a decrease in the accuracy of weight estimates. © 2017 S. Karger AG, Basel.

  6. Reference voltage calculation method based on zero-sequence component optimisation for a regional compensation DVR

    NASA Astrophysics Data System (ADS)

    Jian, Le; Cao, Wang; Jintao, Yang; Yinge, Wang

    2018-04-01

    This paper describes the design of a dynamic voltage restorer (DVR) that can simultaneously protect several sensitive loads from voltage sags in a region of an MV distribution network. A novel reference voltage calculation method based on zero-sequence voltage optimisation is proposed for this DVR to optimise cost-effectiveness in the compensation of voltage sags with different characteristics in an ungrounded neutral system. Based on a detailed analysis of the characteristics of voltage sags caused by different types of faults, and of the effect of the transformer wiring mode on these characteristics, the optimisation target of the reference voltage calculation is presented together with several constraints. The reference voltages under all types of voltage sags are calculated by optimising the zero-sequence component, which minimises the swell in the phase-to-ground voltage after compensation and improves the symmetry of the DVR output voltages, thereby effectively increasing the compensation ability. The validity and effectiveness of the proposed method are verified by simulation and experimental results.
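
    For readers unfamiliar with the zero-sequence component being optimised here, the sketch below applies the standard Fortescue decomposition to three phase-voltage phasors; the sag values are illustrative background arithmetic, not the paper's optimisation.

```python
# Minimal sketch: decompose three-phase voltage phasors into zero-, positive-,
# and negative-sequence components (Fortescue transform).
import numpy as np

a = np.exp(2j * np.pi / 3)                # 120-degree rotation operator
A_inv = np.array([[1, 1,    1   ],
                  [1, a,    a**2],
                  [1, a**2, a   ]]) / 3.0

def sequence_components(va, vb, vc):
    v0, v1, v2 = A_inv @ np.array([va, vb, vc])
    return v0, v1, v2                     # zero, positive, negative sequence

# Example: phase-A sag to 0.5 pu, phases B and C healthy (illustrative values)
va, vb, vc = 0.5, 1.0 * a**2, 1.0 * a
v0, v1, v2 = sequence_components(va, vb, vc)
print(abs(v0), abs(v1), abs(v2))
```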

  7. Statistical theory for protein combinatorial libraries. Packing interactions, backbone flexibility, and the sequence variability of a main-chain structure.

    PubMed

    Kono, H; Saven, J G

    2001-02-23

    Combinatorial experiments provide new ways to probe the determinants of protein folding and to identify novel folding amino acid sequences. These types of experiments, however, are complicated both by enormous conformational complexity and by large numbers of possible sequences. Therefore, a quantitative computational theory would be helpful in designing and interpreting these types of experiments. Here, we present and apply a statistically based, computational approach for identifying the properties of sequences compatible with a given main-chain structure. Protein side-chain conformations are included in an atom-based fashion. Calculations are performed for a variety of similar backbone structures to identify sequence properties that are robust with respect to minor changes in main-chain structure. Rather than specific sequences, the method yields the likelihood of each of the amino acids at preselected positions in a given protein structure. The theory may be used to quantify the characteristics of sequence space for a chosen structure without explicitly tabulating sequences. To account for hydrophobic effects, we introduce an environmental energy that is consistent with other simple hydrophobicity scales and show that it is effective for side-chain modeling. We apply the method to calculate the identity probabilities of selected positions of the immunoglobulin light chain-binding domain of protein L, for which many variant folding sequences are available. The calculations compare favorably with the experimentally observed identity probabilities.

  8. GUTSS: An Alignment-Free Sequence Comparison Method for Use in Human Intestinal Microbiome and Fecal Microbiota Transplantation Analysis.

    PubMed

    Brittnacher, Mitchell J; Heltshe, Sonya L; Hayden, Hillary S; Radey, Matthew C; Weiss, Eli J; Damman, Christopher J; Zisman, Timothy L; Suskind, David L; Miller, Samuel I

    2016-01-01

    Comparative analysis of gut microbiomes in clinical studies of human diseases typically relies on the identification and quantification of species or genes. In addition to exploring specific functional characteristics of the microbiome and the potential significance of species diversity or expansion, microbiome similarity is also calculated to study change in response to therapies directed at altering the microbiome. Established ecological measures of similarity can be constructed from species abundances; however, methods for calculating these commonly used ecological measures of similarity directly from whole genome shotgun (WGS) metagenomic sequence are lacking. We present an alignment-free method for calculating the similarity of WGS metagenomic sequences that is analogous to the Bray-Curtis index for species, implemented by the General Utility for Testing Sequence Similarity (GUTSS) software application. This method was applied to intestinal microbiomes of healthy young children to measure developmental changes toward an adult microbiome during the first 3 years of life. We also calculate the similarity of donor and recipient microbiomes to measure establishment, or engraftment, of donor microbiota in fecal microbiota transplantation (FMT) studies focused on mild to moderate Crohn's disease. We show how a relative index of similarity to donor can be calculated as a measure of change in a patient's microbiome toward that of the donor in response to FMT. Because the clinical efficacy of the transplant procedure cannot be fully evaluated without analysis methods to quantify actual FMT engraftment, we developed a method for detecting change in the gut microbiome that is independent of species identification and database bias, sensitive to changes in relative abundance of the microbial constituents, and can be formulated as an index for correlating engraftment success with clinical measures of disease. More generally, this method may be applied to clinical evaluation of human microbiomes and provide potential diagnostic determination of individuals who may be candidates for specific therapies directed at alteration of the microbiome.
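
    The ecological index that GUTSS emulates has a simple closed form; the sketch below computes Bray-Curtis similarity from two toy abundance profiles (GUTSS itself works alignment-free on raw reads rather than on species counts).

```python
# Minimal sketch: Bray-Curtis similarity between two abundance profiles.
import numpy as np

def bray_curtis_similarity(u, v):
    u, v = np.asarray(u, float), np.asarray(v, float)
    return 2.0 * np.minimum(u, v).sum() / (u.sum() + v.sum())

donor     = [120, 30, 0, 50]   # species abundances, hypothetical
recipient = [100, 10, 40, 50]
print(bray_curtis_similarity(donor, recipient))  # 1.0 = identical composition
```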

  9. Subgrouping Automata: automatic sequence subgrouping using phylogenetic tree-based optimum subgrouping algorithm.

    PubMed

    Seo, Joo-Hyun; Park, Jihyang; Kim, Eun-Mi; Kim, Juhan; Joo, Keehyoung; Lee, Jooyoung; Kim, Byung-Gee

    2014-02-01

    Sequence subgrouping for a given sequence set can enable various informative tasks such as the functional discrimination of sequence subsets and the functional inference of unknown sequences. Because the identity threshold for sequence subgrouping may vary with the given sequence set, it is highly desirable to construct a robust subgrouping algorithm that automatically identifies an optimal identity threshold and generates subgroups for a given sequence set. To this end, an automatic sequence subgrouping method, named 'Subgrouping Automata' (SA), was constructed. First, a tree analysis module analyzes the structure of the phylogenetic tree and enumerates all possible subgroups at each node. A sequence similarity analysis module then calculates the average sequence similarity for all subgroups at each node. A representative sequence generation module finds a representative sequence for each subgroup using profile analysis and self-scoring. For all nodes, average sequence similarities are calculated, and 'Subgrouping Automata' searches for the node showing the statistically maximal increase in sequence similarity using Student's t-value. The node showing the maximum t-value, which gives the most significant difference in average sequence similarity between two adjacent nodes, is determined as the optimum subgrouping node in the phylogenetic tree. Further analysis showed that the optimum subgrouping node from SA prevents under-subgrouping and over-subgrouping. Copyright © 2013. Published by Elsevier Ltd.

  10. Structural system reliability calculation using a probabilistic fault tree analysis method

    NASA Technical Reports Server (NTRS)

    Torng, T. Y.; Wu, Y.-T.; Millwater, H. R.

    1992-01-01

    The development of a new probabilistic fault tree analysis (PFTA) method for calculating structural system reliability is summarized. The proposed PFTA procedure includes: developing a fault tree to represent the complex structural system, constructing an approximation function for each bottom event, determining a dominant sampling sequence for all bottom events, and calculating the system reliability using an adaptive importance sampling method. PFTA is suitable for complicated structural problems that require computationally intensive calculations. A computer program has been developed to implement the PFTA method.

  11. Object detection and tracking system

    DOEpatents

    Ma, Tian J.

    2017-05-30

    Methods and apparatuses for analyzing a sequence of images for an object are disclosed herein. In a general embodiment, the method identifies a region of interest in the sequence of images, within which the object is likely to move. The method divides the region of interest in the sequence of images into sections and calculates signal-to-noise ratios for each section. A signal-to-noise ratio for a section is calculated using that section in the image, the corresponding section in the prior image, and the corresponding section in the subsequent image. The signal-to-noise ratios are computed for potential velocities of the object in the section. The method then selects a velocity for the object in the section by choosing the potential velocity with the highest signal-to-noise ratio.

  12. Sequence-specific bias correction for RNA-seq data using recurrent neural networks.

    PubMed

    Zhang, Yao-Zhong; Yamaguchi, Rui; Imoto, Seiya; Miyano, Satoru

    2017-01-25

    The recent success of deep learning techniques in machine learning and artificial intelligence has stimulated a great deal of interest among bioinformaticians, who now wish to bring the power of deep learning to bear on a host of bioinformatical problems. Deep learning is ideally suited for biological problems that require automatic or hierarchical feature representation for biological data when prior knowledge is limited. In this work, we address the sequence-specific bias correction problem for RNA-seq data using Recurrent Neural Networks (RNNs) to model nucleotide sequences without pre-determining sequence structures. The sequence-specific bias of a read is then calculated based on the sequence probabilities estimated by the RNNs and used in the estimation of gene abundance. We explore the application of two popular RNN recurrent units for this task and demonstrate that RNN-based approaches provide a flexible way to model nucleotide sequences without knowledge of predetermined sequence structures. Our experiments show that training an RNN-based nucleotide sequence model is efficient and that RNN-based bias correction methods compare well with the state-of-the-art sequence-specific bias correction method on the commonly used MAQC-III data set. RNNs provide an alternative and flexible way to calculate sequence-specific bias without explicitly pre-determining sequence structures.
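
    A minimal sketch of the idea, not the paper's exact architecture: an RNN over nucleotides whose next-base predictions yield a log-probability for a read, the quantity from which a sequence-specific bias weight can be derived. The model below is untrained; in practice it would be fit to sequenced reads.

```python
# Minimal sketch: a GRU over nucleotides that scores log P(read | model).
import torch
import torch.nn as nn

NUC = {"A": 0, "C": 1, "G": 2, "T": 3}

class NucRNN(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.emb = nn.Embedding(4, 8)
        self.rnn = nn.GRU(8, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 4)

    def log_prob(self, seq):
        idx = torch.tensor([[NUC[c] for c in seq]])
        h, _ = self.rnn(self.emb(idx[:, :-1]))        # predict each next base
        logp = torch.log_softmax(self.out(h), dim=-1)
        targets = idx[:, 1:]
        return logp.gather(-1, targets.unsqueeze(-1)).sum().item()

model = NucRNN()
print(model.log_prob("ACGTTGCA"))   # log-probability, conditioned on the first base
```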

  13. Rational design of DNA sequences for nanotechnology, microarrays and molecular computers using Eulerian graphs.

    PubMed

    Pancoska, Petr; Moravek, Zdenek; Moll, Ute M

    2004-01-01

    Nucleic acids are molecules of choice for both established and emerging nanoscale technologies. These technologies benefit from large functional densities of 'DNA processing elements' that can be readily manufactured. To achieve the desired functionality, polynucleotide sequences are currently designed by a process that involves tedious and laborious filtering of potential candidates against a series of requirements and parameters. Here, we present a complete novel methodology for the rapid rational design of large sets of DNA sequences. This method allows for the direct implementation of very complex and detailed requirements for the generated sequences, thus avoiding 'brute force' filtering. At the same time, these sequences have narrow distributions of melting temperatures. The molecular part of the design process can be done without computer assistance, using an efficient 'human engineering' approach by drawing a single blueprint graph that represents all generated sequences. Moreover, the method eliminates the necessity for extensive thermodynamic calculations. Melting temperature can be calculated only once (or not at all). In addition, the isostability of the sequences is independent of the selection of a particular set of thermodynamic parameters. Applications are presented for DNA sequence designs for microarrays, universal microarray zip sequences and electron transfer experiments.

  14. Quantiprot - a Python package for quantitative analysis of protein sequences.

    PubMed

    Konopka, Bogumił M; Marciniak, Marta; Dyrka, Witold

    2017-07-17

    The field of protein sequence analysis is dominated by tools rooted in substitution matrices and alignments. A complementary approach is provided by methods of quantitative characterization. A major advantage of this approach is that quantitative properties define a multidimensional solution space in which sequences can be related to each other and differences can be meaningfully interpreted. Quantiprot is a software package in Python which provides a simple and consistent interface to multiple methods for the quantitative characterization of protein sequences. The package can be used to calculate dozens of characteristics directly from sequences or using physico-chemical properties of amino acids. Besides basic measures, Quantiprot performs quantitative analysis of recurrence and determinism in the sequence, calculates the distribution of n-grams and computes the Zipf's law coefficient. We propose three main fields of application for the Quantiprot package. First, quantitative characteristics can be used in alignment-free similarity searches and in clustering of large and/or divergent sequence sets. Second, a feature space defined by quantitative properties can be used in comparative studies of protein families and organisms. Third, the feature space can be used for evaluating generative models, where a large number of sequences generated by a model can be compared to actually observed sequences.
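
    As a flavor of such quantitative characterization (illustrative only; this is not the Quantiprot API), the sketch below maps a protein sequence onto the Kyte-Doolittle hydropathy scale and reduces it to a single feature-space coordinate.

```python
# Illustrative only (not the Quantiprot API): mapping a protein sequence onto a
# physico-chemical property scale and reducing it to a quantitative feature.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def mean_hydropathy(seq):
    vals = [KYTE_DOOLITTLE[aa] for aa in seq]
    return sum(vals) / len(vals)

print(mean_hydropathy("MKTAYIAKQR"))  # one point in a quantitative feature space
```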

  15. Improving RNA-Seq expression estimation by modeling isoform- and exon-specific read sequencing rate.

    PubMed

    Liu, Xuejun; Shi, Xinxin; Chen, Chunlin; Zhang, Li

    2015-10-16

    The high-throughput sequencing technology RNA-Seq has been widely used in recent years to quantify gene and isoform expression in transcriptome studies. Accurate expression measurement from the millions or billions of short reads generated is hampered by two main difficulties. One is the ambiguous mapping of reads to the reference transcriptome caused by alternative splicing, which increases the uncertainty in estimating isoform expression. The other is the non-uniformity of read distribution along the reference transcriptome due to positional, sequencing, mappability and other undiscovered sources of bias. This violates the uniform read-distribution assumption of many expression calculation approaches, such as the direct RPKM calculation and Poisson-based models. Many methods have been proposed to address these difficulties. Some approaches employ latent variable models to discover the underlying pattern of read sequencing. However, most of these methods correct bias based on surrounding sequence content and share the bias models across all genes. They therefore cannot estimate gene- and isoform-specific biases as revealed by recent studies. We propose a latent variable model, NLDMseq, to estimate gene and isoform expression. Our method adopts latent variables to model the unknown isoforms from which reads originate and the underlying percentage of multiple spliced variants. The isoform- and exon-specific read sequencing biases are modeled to account for the non-uniformity of read distribution and are identified by utilizing the replicate information of multiple lanes of a single library run. We employ simulated and real data to verify the performance of our method in terms of accuracy in the calculation of gene and isoform expression. Results show that NLDMseq obtains competitive estimates of gene and isoform expression compared to popular alternatives. Finally, the proposed method is applied to the detection of differential expression (DE) to show its usefulness in downstream analysis. The proposed NLDMseq method accurately estimates gene and isoform expression from RNA-Seq data by modeling the isoform- and exon-specific read sequencing biases. It makes use of a latent variable model to discover the hidden pattern of read sequencing. We have shown that it works well on both simulated and real datasets and has competitive performance compared to popular methods. The method has been implemented as freely available software which can be found at https://github.com/PUGEA/NLDMseq.
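
    The "direct RPKM calculation" mentioned above is the simplest of these approaches and rests on the uniform-read assumption that NLDMseq relaxes:

```python
# Minimal sketch: the direct RPKM calculation, which assumes uniformly
# distributed reads along the transcript.
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    # reads per kilobase of transcript per million mapped reads
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)

print(rpkm(read_count=500, gene_length_bp=2000, total_mapped_reads=25_000_000))
# 500 reads / (2 kb * 25 M mapped reads) = 10.0
```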

  16. Structural optimization with approximate sensitivities

    NASA Technical Reports Server (NTRS)

    Patnaik, S. N.; Hopkins, D. A.; Coroneos, R.

    1994-01-01

    Computational efficiency in structural optimization can be enhanced if the intensive computations associated with the calculation of the sensitivities, that is, the gradients of the behavior constraints, are reduced. An approximation to the gradients of the behavior constraints that can be generated with a small amount of numerical calculation is proposed. Structural optimization with these approximate sensitivities produced the correct optimum solution. The approximate gradients performed well for different nonlinear programming methods, such as the sequential unconstrained minimization technique, the method of feasible directions, sequential quadratic programming, and sequential linear programming. Structural optimization with approximate gradients can reduce by one-third the CPU time that would otherwise be required to solve the problem with explicit closed-form gradients. The proposed gradient approximation shows potential to reduce the intensive computation that has traditionally been associated with structural optimization.
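
    As a generic illustration of approximate sensitivities (a forward-difference stand-in, not the specific approximation proposed in the paper), the sketch below estimates the gradient of a behavior constraint with one extra function evaluation per design variable.

```python
# Illustrative sketch: a cheap forward-difference approximation to the gradient
# of a behavior constraint g(x), usable inside a nonlinear programming loop.
import numpy as np

def approx_gradient(g, x, h=1e-6):
    x = np.asarray(x, float)
    g0, grad = g(x), np.zeros_like(x)
    for i in range(x.size):
        xp = x.copy()
        xp[i] += h
        grad[i] = (g(xp) - g0) / h   # one extra evaluation per design variable
    return grad

g = lambda x: x[0]**2 + 3*x[1] - 1.0   # stand-in behavior constraint
print(approx_gradient(g, [2.0, 0.5]))  # ~[4, 3]
```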

  17. A new method to cluster genomes based on cumulative Fourier power spectrum.

    PubMed

    Dong, Rui; Zhu, Ziyue; Yin, Changchuan; He, Rong L; Yau, Stephen S-T

    2018-06-20

    Analyzing phylogenetic relationships using mathematical methods has always been important in bioinformatics, and quantitative research can interpret raw biological data in a precise way. Multiple Sequence Alignment (MSA) is frequently used to analyze biological evolution but is very time-consuming: when the scale of the data is large, alignment methods cannot finish the calculation in a reasonable time. We therefore present a new method that uses moments of the cumulative Fourier power spectrum to cluster DNA sequences. Each sequence is translated into a vector in Euclidean space, and distances between the vectors reflect the relationships between the sequences. The mapping between the spectra and the moment vector is one-to-one, which means that no information in the power spectra is lost during the calculation. We cluster and classify several datasets, including Influenza A, primates, and human rhinovirus (HRV) datasets, to build up the phylogenetic trees. Results show that the proposed cumulative Fourier power spectrum is much faster and more accurate than MSA and than the alignment-free k-mer method. The research provides new insights into the study of phylogeny, evolution, and efficient DNA comparison algorithms for large genomes. The computer programs of the cumulative Fourier power spectrum are available at GitHub (https://github.com/YaulabTsinghua/cumulative-Fourier-power-spectrum). Copyright © 2018. Published by Elsevier B.V.
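
    A minimal sketch of the featurization, following the general idea rather than the authors' exact normalization: per-nucleotide indicator sequences are transformed with an FFT, the power spectrum is accumulated, and a few moments give a fixed-length vector whose Euclidean distances drive the clustering.

```python
# Minimal sketch: moments of the cumulative Fourier power spectrum of a DNA
# sequence, built from binary indicator sequences per nucleotide. The moment
# definitions and normalization here are illustrative, not the paper's code.
import numpy as np

def cumulative_power_moments(seq, n_moments=3):
    feats = []
    for base in "ACGT":
        u = np.array([1.0 if c == base else 0.0 for c in seq])
        ps = np.abs(np.fft.fft(u))**2     # power spectrum
        cs = np.cumsum(ps[1:])            # cumulative spectrum, DC term dropped
        if cs[-1] > 0:
            cs = cs / cs[-1]              # normalize to [0, 1]
        feats += [np.mean(cs**k) for k in range(1, n_moments + 1)]
    return np.array(feats)

v1 = cumulative_power_moments("ACGTACGTTTGCA")
v2 = cumulative_power_moments("ACGTACGATTGCA")
print(np.linalg.norm(v1 - v2))            # Euclidean distance between sequences
```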

  18. MPN estimation of qPCR target sequence recoveries from whole cell calibrator samples.

    PubMed

    Sivaganesan, Mano; Siefring, Shawn; Varma, Manju; Haugland, Richard A

    2011-12-01

    DNA extracts from enumerated target organism cells (calibrator samples) have been used for estimating Enterococcus cell equivalent densities in surface waters by a comparative cycle threshold (Ct) qPCR analysis method. To compare surface water Enterococcus density estimates from different studies by this approach, either a consistent source of calibrator cells must be used or the estimates must account for any differences in target sequence recoveries from different sources of calibrator cells. In this report we describe two methods for estimating target sequence recoveries from whole cell calibrator samples based on qPCR analyses of their serially diluted DNA extracts and most probable number (MPN) calculation. The first method employed a traditional MPN calculation approach. The second method employed a Bayesian hierarchical statistical modeling approach and a Monte Carlo Markov Chain (MCMC) simulation method to account for the uncertainty in these estimates associated with different individual samples of the cell preparations, different dilutions of the DNA extracts and different qPCR analytical runs. The two methods were applied to estimate mean target sequence recoveries per cell from two different lots of a commercially available source of enumerated Enterococcus cell preparations. The mean target sequence recovery estimates (and standard errors) per cell from Lot A and B cell preparations by the Bayesian method were 22.73 (3.4) and 11.76 (2.4), respectively, when the data were adjusted for potential false positive results. Means were similar for the traditional MPN approach which cannot comparably assess uncertainty in the estimates. Cell numbers and estimates of recoverable target sequences in calibrator samples prepared from the two cell sources were also used to estimate cell equivalent and target sequence quantities recovered from surface water samples in a comparative Ct method. Our results illustrate the utility of the Bayesian method in accounting for uncertainty, the high degree of precision attainable by the MPN approach and the need to account for the differences in target sequence recoveries from different calibrator sample cell sources when they are used in the comparative Ct method. Published by Elsevier B.V.
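
    For the simpler of the two approaches, the traditional MPN point estimate has a standard maximum-likelihood form; the sketch below solves the score equation for a dilution series (the tube counts are illustrative, and the Bayesian MCMC machinery of the second method is not shown).

```python
# Minimal sketch: a most probable number (MPN) estimate by maximum likelihood.
# For dilution i with n_i replicates of volume v_i and g_i positive results, the
# score equation sum_i [g_i*v_i*exp(-lam*v_i)/(1-exp(-lam*v_i)) - (n_i-g_i)*v_i] = 0
# is solved for the concentration lam.
import numpy as np
from scipy.optimize import brentq

def mpn(volumes, n_tubes, n_positive):
    v, n, g = map(np.asarray, (volumes, n_tubes, n_positive))
    def score(lam):
        e = np.exp(-lam * v)
        return np.sum(g * v * e / (1.0 - e) - (n - g) * v)
    return brentq(score, 1e-9, 1e6)

# 10-fold dilution series, 5 replicates each: 5, 3, 1 positive reactions
print(mpn(volumes=[1.0, 0.1, 0.01], n_tubes=[5, 5, 5], n_positive=[5, 3, 1]))
```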

  19. Integrating multiple genomic data to predict disease-causing nonsynonymous single nucleotide variants in exome sequencing studies.

    PubMed

    Wu, Jiaxin; Li, Yanda; Jiang, Rui

    2014-03-01

    Exome sequencing has been widely used in detecting pathogenic nonsynonymous single nucleotide variants (SNVs) for human inherited diseases. However, traditional statistical genetics methods are ineffective in analyzing exome sequencing data, owing to the large number of sequenced variants, the presence of a non-negligible fraction of pathogenic rare variants or de novo mutations, and the limited size of affected and normal populations. Indeed, the prevalent application of exome sequencing calls for an effective computational method for identifying causative nonsynonymous SNVs from a large number of sequenced variants. Here, we propose a bioinformatics approach called SPRING (Snv PRioritization via the INtegration of Genomic data) for identifying pathogenic nonsynonymous SNVs for a given query disease. Based on six functional effect scores calculated by existing methods (SIFT, PolyPhen2, LRT, MutationTaster, GERP and PhyloP) and five association scores derived from a variety of genomic data sources (gene ontology, protein-protein interactions, protein sequences, protein domain annotations and gene pathway annotations), SPRING calculates the statistical significance that an SNV is causative for a query disease and hence provides a means of prioritizing candidate SNVs. With a series of comprehensive validation experiments, we demonstrate that SPRING is valid for diseases whose genetic bases are either partly known or completely unknown and effective for diseases with a variety of inheritance styles. In applications of our method to real exome sequencing data sets, we show the capability of SPRING in detecting causative de novo mutations for autism, epileptic encephalopathies and intellectual disability. We further provide an online service, the standalone software and genome-wide predictions of causative SNVs for 5,080 diseases at http://bioinfo.au.tsinghua.edu.cn/spring.

  20. Malware analysis using visualized image matrices.

    PubMed

    Han, KyoungSoo; Kang, BooJoong; Im, Eul Gyu

    2014-01-01

    This paper proposes a novel malware visual analysis method that contains not only a visualization method to convert binary files into images, but also a similarity calculation method between these images. The proposed method generates RGB-colored pixels on image matrices using the opcode sequences extracted from malware samples and calculates the similarities between the image matrices. In particular, our proposed methods are applicable to packed malware samples by applying them to the execution traces extracted through dynamic analysis. When the images are generated, we can reduce the overheads by extracting the opcode sequences only from the blocks that include instructions related to staple behaviors such as functions and application programming interface (API) calls. In addition, we propose a technique that generates a representative image for each malware family in order to reduce the number of comparisons for the classification of unknown samples; the colored pixel information in the image matrices is used to calculate the similarities between the images. Our experimental results show that the image matrices of malware can effectively be used to classify malware families both statically and dynamically, with accuracies of 0.9896 and 0.9732, respectively.
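
    A heavily simplified sketch of the visualization idea (the hashing and coloring details differ from the paper): opcode 3-grams are hashed to pixel coordinates and RGB increments, so similar opcode sequences produce similar image matrices that can then be compared pixel-wise.

```python
# Illustrative sketch: hash opcode 3-grams to pixel coordinates and RGB values.
import hashlib
import numpy as np

def opcode_image(opcodes, size=64):
    img = np.zeros((size, size, 3), dtype=np.uint32)
    for i in range(len(opcodes) - 2):
        gram = ",".join(opcodes[i:i + 3]).encode()
        h = hashlib.md5(gram).digest()
        x, y = h[0] % size, h[1] % size
        img[x, y] += np.frombuffer(h[2:5], dtype=np.uint8)  # accumulate RGB
    return np.clip(img, 0, 255).astype(np.uint8)

trace = ["push", "mov", "call", "mov", "add", "ret"]        # toy opcode sequence
print(opcode_image(trace).sum())
```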

  1. Ancestral sequence reconstruction in primate mitochondrial DNA: compositional bias and effect on functional inference.

    PubMed

    Krishnan, Neeraja M; Seligmann, Hervé; Stewart, Caro-Beth; De Koning, A P Jason; Pollock, David D

    2004-10-01

    Reconstruction of ancestral DNA and amino acid sequences is an important means of inferring information about past evolutionary events. Such reconstructions suggest changes in molecular function and evolutionary processes over the course of evolution and are used to infer adaptation and convergence. Maximum likelihood (ML) is generally thought to provide relatively accurate reconstructed sequences compared to parsimony, but both methods lead to the inference of multiple directional changes in nucleotide frequencies in primate mitochondrial DNA (mtDNA). To better understand this surprising result, as well as to better understand how parsimony and ML differ, we constructed a series of computationally simple "conditional pathway" methods that differed in the number of substitutions allowed per site along each branch, and we also evaluated the entire Bayesian posterior frequency distribution of reconstructed ancestral states. We analyzed primate mitochondrial cytochrome b (Cyt-b) and cytochrome oxidase subunit I (COI) genes and found that ML reconstructs ancestral frequencies that are often more different from tip sequences than are parsimony reconstructions. In contrast, frequency reconstructions based on the posterior ensemble more closely resemble extant nucleotide frequencies. Simulations indicate that these differences in ancestral sequence inference are probably due to deterministic bias caused by high uncertainty in the optimization-based ancestral reconstruction methods (parsimony, ML, Bayesian maximum a posteriori). In contrast, ancestral nucleotide frequencies based on an average of the Bayesian set of credible ancestral sequences are much less biased. The methods involving simpler conditional pathway calculations have slightly reduced likelihood values compared to full likelihood calculations, but they can provide fairly unbiased nucleotide reconstructions and may be useful in more complex phylogenetic analyses than considered here due to their speed and flexibility. To determine whether biased reconstructions using optimization methods might affect inferences of functional properties, ancestral primate mitochondrial tRNA sequences were inferred and helix-forming propensities for conserved pairs were evaluated in silico. For ambiguously reconstructed nucleotides at sites with high base composition variability, ancestral tRNA sequences from Bayesian analyses were more compatible with canonical base pairing than were those inferred by other methods. Thus, nucleotide bias in reconstructed sequences apparently can lead to serious bias and inaccuracies in functional predictions.

  2. A meta-heuristic method for solving scheduling problem: crow search algorithm

    NASA Astrophysics Data System (ADS)

    Adhi, Antono; Santosa, Budi; Siswanto, Nurhadi

    2018-04-01

    Scheduling is one of the most important processes in industry, in both manufacturing and services. Scheduling is the process of assigning resources to perform operations on tasks; resources can be machines, people, tasks, jobs or operations. The selection of the optimum sequence of jobs from a permutation is an essential issue in scheduling research: the optimum sequence is the optimum solution of the scheduling problem. The scheduling problem becomes NP-hard when the number of jobs in the sequence exceeds what an exact algorithm can process in reasonable time. In order to obtain optimum results, a method is needed that is capable of solving complex scheduling problems in an acceptable time. Meta-heuristics are methods usually used to solve scheduling problems. The recently published Crow Search Algorithm (CSA), an evolutionary meta-heuristic based on the flocking behavior of crows, is adopted in this research to solve the scheduling problem. The calculation results of CSA for solving the scheduling problem are compared with other algorithms. From the comparison, it is found that CSA has better performance in terms of optimum solution and calculation time than the other algorithms.
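
    A minimal sketch of CSA in its standard continuous form (awareness probability ap, flight length fl); applying it to job sequences as in the paper additionally requires a permutation encoding on top of this.

```python
# Minimal sketch: Crow Search Algorithm for continuous minimization. Each crow
# keeps a memory (its best position); a crow follows a random other crow's
# memory unless that crow is "aware", in which case it relocates randomly.
import numpy as np

def csa(f, dim, n_crows=20, iters=200, ap=0.1, fl=2.0, lo=-5.0, hi=5.0, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(lo, hi, (n_crows, dim))                  # positions
    mem, fmem = x.copy(), np.array([f(xi) for xi in x])      # best position per crow
    for _ in range(iters):
        for i in range(n_crows):
            j = rng.integers(n_crows)                        # crow i follows crow j
            if rng.random() >= ap:                           # j unaware: chase memory
                xn = x[i] + rng.random() * fl * (mem[j] - x[i])
            else:                                            # j aware: random move
                xn = rng.uniform(lo, hi, dim)
            x[i] = np.clip(xn, lo, hi)
            fx = f(x[i])
            if fx < fmem[i]:                                 # update memory
                mem[i], fmem[i] = x[i], fx
    return mem[np.argmin(fmem)], fmem.min()

best, val = csa(lambda v: np.sum(v**2), dim=5)               # toy objective
print(val)
```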

  3. How to Calculate Renyi Entropy from Heart Rate Variability, and Why it Matters for Detecting Cardiac Autonomic Neuropathy.

    PubMed

    Cornforth, David J; Tarvainen, Mika P; Jelinek, Herbert F

    2014-01-01

    Cardiac autonomic neuropathy (CAN) is a disease that involves nerve damage leading to abnormal control of heart rate. An open question is to what extent this condition is detectable from heart rate variability (HRV), which provides information only on successive intervals between heart beats, yet is non-invasive and easy to obtain from a three-lead ECG recording. A variety of measures may be extracted from HRV, including time domain, frequency domain, and more complex non-linear measures. Among the latter, Renyi entropy has been proposed as a suitable measure that can be used to discriminate CAN from controls. However, all entropy methods require estimation of probabilities, and there are a number of ways in which this estimation can be made. In this work, we calculate Renyi entropy using several variations of the histogram method and a density method based on sequences of RR intervals. In all, we calculate Renyi entropy using nine methods and compare their effectiveness in separating the different classes of participants. We found that the histogram method using single RR intervals yields an entropy measure that is either incapable of discriminating CAN from controls or provides little information that could not be gained from the SD of the RR intervals. In contrast, probabilities calculated using a density method based on sequences of RR intervals yield an entropy measure that provides good separation between groups of participants and provides information not available from the SD. The main contribution of this work is to show that different approaches to calculating probability may affect the success of detecting disease. Our results bring new clarity to the methods used to calculate the Renyi entropy in general and, in particular, to the successful detection of CAN.
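
    The entropy formula itself is compact; the sketch below computes H_alpha = log(sum p_i^alpha)/(1 - alpha) from histogram probabilities of length-2 RR-interval sequences, a simplified stand-in for the paper's density method. The RR values are illustrative.

```python
# Minimal sketch: Renyi entropy from histogram probabilities of RR sequences.
import numpy as np

def renyi_entropy(probs, alpha):
    p = probs[probs > 0]
    if alpha == 1.0:                     # Shannon limit
        return -np.sum(p * np.log(p))
    return np.log(np.sum(p**alpha)) / (1.0 - alpha)

rr = np.array([0.80, 0.82, 0.79, 0.85, 0.81, 0.83, 0.78, 0.84])  # seconds
pairs = np.column_stack([rr[:-1], rr[1:]])                       # length-2 sequences
hist, _ = np.histogramdd(pairs, bins=4)
p = (hist / hist.sum()).ravel()
for a in (0.5, 2.0, 5.0):
    print(a, renyi_entropy(p, a))
```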

  4. How to Calculate Renyi Entropy from Heart Rate Variability, and Why it Matters for Detecting Cardiac Autonomic Neuropathy

    PubMed Central

    Cornforth, David J.; Tarvainen, Mika P.; Jelinek, Herbert F.

    2014-01-01

    Cardiac autonomic neuropathy (CAN) is a disease that involves nerve damage leading to abnormal control of heart rate. An open question is to what extent this condition is detectable from heart rate variability (HRV), which provides information only on successive intervals between heart beats, yet is non-invasive and easy to obtain from a three-lead ECG recording. A variety of measures may be extracted from HRV, including time domain, frequency domain, and more complex non-linear measures. Among the latter, Renyi entropy has been proposed as a suitable measure that can be used to discriminate CAN from controls. However, all entropy methods require estimation of probabilities, and there are a number of ways in which this estimation can be made. In this work, we calculate Renyi entropy using several variations of the histogram method and a density method based on sequences of RR intervals. In all, we calculate Renyi entropy using nine methods and compare their effectiveness in separating the different classes of participants. We found that the histogram method using single RR intervals yields an entropy measure that is either incapable of discriminating CAN from controls or provides little information that could not be gained from the SD of the RR intervals. In contrast, probabilities calculated using a density method based on sequences of RR intervals yield an entropy measure that provides good separation between groups of participants and provides information not available from the SD. The main contribution of this work is to show that different approaches to calculating probability may affect the success of detecting disease. Our results bring new clarity to the methods used to calculate the Renyi entropy in general and, in particular, to the successful detection of CAN. PMID:25250311

  5. Detection of Mycobacterium tuberculosis in extrapulmonary biopsy samples using PCR targeting IS6110, rpoB, and nested-rpoB PCR Cloning

    PubMed Central

    Meghdadi, Hossein; Khosravi, Azar D.; Ghadiri, Ata A.; Sina, Amir H.; Alami, Ameneh

    2015-01-01

    The present study aimed to examine the diagnostic utility of polymerase chain reaction (PCR) and nested PCR techniques for the detection of Mycobacterium tuberculosis (MTB) DNA in samples from patients with extrapulmonary tuberculosis (EPTB). In total, 80 formalin-fixed, paraffin-embedded (FFPE) samples, comprising 70 samples with a definite diagnosis of EPTB and 10 samples known to be non-EPTB on the basis of histopathology examination, were included in the study. PCR amplification targeting IS6110 and the rpoB gene, and nested PCR targeting the rpoB gene, were performed on the DNA extracted from the 80 FFPE samples. The strongly positive samples were directly sequenced. For negative samples and those with a weak band in nested-rpoB PCR, TA cloning was performed by cloning the products into a plasmid vector with subsequent sequencing. The 95% confidence intervals (CI) for the estimates of sensitivity and specificity were calculated for each method. Fourteen (20%), 34 (48.6%), and 60 (85.7%) of the 70 positive samples confirmed by histopathology were positive by rpoB-PCR, IS6110-PCR, and nested-rpoB PCR, respectively. By performing TA cloning on samples that yielded weak (n = 8) or negative (n = 10) results in the PCR methods, we were able to improve their quality for later sequencing. All samples with a weak band, and 7 out of 10 negative samples, showed strongly positive results after cloning. Nested-rpoB PCR cloning thus revealed positivity in 67 out of 70 confirmed samples (95.7%). The sensitivity of this combination of methods was calculated as 95.7% in comparison with histopathology examination. The CIs for the sensitivity of the PCR methods were calculated as 11.39–31.27% for rpoB-PCR, 36.44–60.83% for IS6110-PCR, 75.29–92.93% for nested-rpoB PCR, and 87.98–99.11% for nested-rpoB PCR cloning. The 10 true EPTB-negative samples by histopathology were negative by all tested methods, including cloning, and were used to calculate the specificity of the applied methods. The CI for the 100% specificity of each PCR method was calculated as 69.15–100%. Our results indicate that nested-rpoB PCR combined with TA cloning and sequencing is a preferred method for the detection of MTB DNA in EPTB samples, with high sensitivity and specificity that confirm the histopathology results. PMID:26191059
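
    The intervals reported here are consistent with the exact (Clopper-Pearson) form; for instance, 10/10 negatives give the reported 69.15-100% specificity interval. A short sketch:

```python
# Minimal sketch: exact (Clopper-Pearson) 95% confidence interval for a
# proportion k successes out of n trials.
from scipy.stats import beta

def clopper_pearson(k, n, conf=0.95):
    a = 1.0 - conf
    lo = 0.0 if k == 0 else beta.ppf(a / 2, k, n - k + 1)
    hi = 1.0 if k == n else beta.ppf(1 - a / 2, k + 1, n - k)
    return lo, hi

print(clopper_pearson(67, 70))   # sensitivity of nested-rpoB PCR cloning
print(clopper_pearson(10, 10))   # 100% specificity on 10 negatives: (~0.6915, 1.0)
```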

  6. Detection of Mycobacterium tuberculosis in extrapulmonary biopsy samples using PCR targeting IS6110, rpoB, and nested-rpoB PCR Cloning.

    PubMed

    Meghdadi, Hossein; Khosravi, Azar D; Ghadiri, Ata A; Sina, Amir H; Alami, Ameneh

    2015-01-01

    The present study aimed to examine the diagnostic utility of polymerase chain reaction (PCR) and nested PCR techniques for the detection of Mycobacterium tuberculosis (MTB) DNA in samples from patients with extrapulmonary tuberculosis (EPTB). In total, 80 formalin-fixed, paraffin-embedded (FFPE) samples, comprising 70 samples with a definite diagnosis of EPTB and 10 samples known to be non-EPTB on the basis of histopathology examination, were included in the study. PCR amplification targeting IS6110 and the rpoB gene, and nested PCR targeting the rpoB gene, were performed on the DNA extracted from the 80 FFPE samples. The strongly positive samples were directly sequenced. For negative samples and those with a weak band in nested-rpoB PCR, TA cloning was performed by cloning the products into a plasmid vector with subsequent sequencing. The 95% confidence intervals (CI) for the estimates of sensitivity and specificity were calculated for each method. Fourteen (20%), 34 (48.6%), and 60 (85.7%) of the 70 positive samples confirmed by histopathology were positive by rpoB-PCR, IS6110-PCR, and nested-rpoB PCR, respectively. By performing TA cloning on samples that yielded weak (n = 8) or negative (n = 10) results in the PCR methods, we were able to improve their quality for later sequencing. All samples with a weak band, and 7 out of 10 negative samples, showed strongly positive results after cloning. Nested-rpoB PCR cloning thus revealed positivity in 67 out of 70 confirmed samples (95.7%). The sensitivity of this combination of methods was calculated as 95.7% in comparison with histopathology examination. The CIs for the sensitivity of the PCR methods were calculated as 11.39-31.27% for rpoB-PCR, 36.44-60.83% for IS6110-PCR, 75.29-92.93% for nested-rpoB PCR, and 87.98-99.11% for nested-rpoB PCR cloning. The 10 true EPTB-negative samples by histopathology were negative by all tested methods, including cloning, and were used to calculate the specificity of the applied methods. The CI for the 100% specificity of each PCR method was calculated as 69.15-100%. Our results indicate that nested-rpoB PCR combined with TA cloning and sequencing is a preferred method for the detection of MTB DNA in EPTB samples, with high sensitivity and specificity that confirm the histopathology results.

  7. Using phylogenetically-informed annotation (PIA) to search for light-interacting genes in transcriptomes from non-model organisms.

    PubMed

    Speiser, Daniel I; Pankey, M Sabrina; Zaharoff, Alexander K; Battelle, Barbara A; Bracken-Grissom, Heather D; Breinholt, Jesse W; Bybee, Seth M; Cronin, Thomas W; Garm, Anders; Lindgren, Annie R; Patel, Nipam H; Porter, Megan L; Protas, Meredith E; Rivera, Ajna S; Serb, Jeanne M; Zigler, Kirk S; Crandall, Keith A; Oakley, Todd H

    2014-11-19

    Tools for high throughput sequencing and de novo assembly make the analysis of transcriptomes (i.e. the suite of genes expressed in a tissue) feasible for almost any organism. Yet a challenge for biologists is that it can be difficult to assign identities to gene sequences, especially from non-model organisms. Phylogenetic analyses are one useful method for assigning identities to these sequences, but such methods tend to be time-consuming because of the need to re-calculate trees for every gene of interest and each time a new data set is analyzed. In response, we employed existing tools for phylogenetic analysis to produce a computationally efficient, tree-based approach for annotating transcriptomes or new genomes that we term Phylogenetically-Informed Annotation (PIA), which places uncharacterized genes into pre-calculated phylogenies of gene families. We generated maximum likelihood trees for 109 genes from a Light Interaction Toolkit (LIT), a collection of genes that underlie the function or development of light-interacting structures in metazoans. To do so, we searched protein sequences predicted from 29 fully-sequenced genomes and built trees using tools for phylogenetic analysis in the Osiris package of Galaxy (an open-source workflow management system). Next, to rapidly annotate transcriptomes from organisms that lack sequenced genomes, we repurposed a maximum likelihood-based Evolutionary Placement Algorithm (implemented in RAxML) to place sequences of potential LIT genes on to our pre-calculated gene trees. Finally, we implemented PIA in Galaxy and used it to search for LIT genes in 28 newly-sequenced transcriptomes from the light-interacting tissues of a range of cephalopod mollusks, arthropods, and cubozoan cnidarians. Our new trees for LIT genes are available on the Bitbucket public repository ( http://bitbucket.org/osiris_phylogenetics/pia/ ) and we demonstrate PIA on a publicly-accessible web server ( http://galaxy-dev.cnsi.ucsb.edu/pia/ ). Our new trees for LIT genes will be a valuable resource for researchers studying the evolution of eyes or other light-interacting structures. We also introduce PIA, a high throughput method for using phylogenetic relationships to identify LIT genes in transcriptomes from non-model organisms. With simple modifications, our methods may be used to search for different sets of genes or to annotate data sets from taxa outside of Metazoa.

  8. On the relationship between residue structural environment and sequence conservation in proteins.

    PubMed

    Liu, Jen-Wei; Lin, Jau-Ji; Cheng, Chih-Wen; Lin, Yu-Feng; Hwang, Jenn-Kang; Huang, Tsun-Tsao

    2017-09-01

    Residues that are crucial to protein function or structure are usually evolutionarily conserved. To identify the important residues in a protein, sequence conservation is estimated, and current methods rely upon the unbiased collection of homologous sequences. Surprisingly, our previous studies have shown that sequence conservation is closely correlated with the weighted contact number (WCN), a measure of the packing density of a residue's structural environment, calculated based only on the Cα positions of a protein structure. Moreover, studies have shown that sequence conservation is correlated with environment-related structural properties calculated from different protein substructures, such as a protein's all atoms, backbone atoms, side-chain atoms, or side-chain centroids. To determine whether the Cα atomic positions alone are adequate to capture the relationship between residue environment and sequence conservation, we compared Cα atoms with the other substructures in their contributions to sequence conservation. Our results show that Cα positions are substantially equivalent to the other substructures in calculations of various measures of residue environment. As a result, the overlapping contributions between Cα atoms and the other substructures are high, yielding similar structure-conservation relationships. Taking the WCN as an example, the average overlapping contribution to sequence conservation is 87% between the Cα and all-atom substructures. These results indicate that the Cα atoms of a protein structure alone can reflect sequence conservation at the residue level. © 2017 Wiley Periodicals, Inc.
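
    The WCN itself is a one-line formula, WCN_i = sum over j != i of 1/r_ij^2; a minimal sketch from Cα coordinates (random stand-ins for a real chain):

```python
# Minimal sketch: weighted contact number (WCN) per residue from Calpha
# coordinates; higher WCN marks a more densely packed structural environment.
import numpy as np

def wcn(coords):
    d2 = np.sum((coords[:, None, :] - coords[None, :, :])**2, axis=-1)
    np.fill_diagonal(d2, np.inf)         # exclude self-contacts
    return np.sum(1.0 / d2, axis=1)

ca = np.random.default_rng(1).normal(scale=10.0, size=(50, 3))  # fake Calpha coords
print(wcn(ca)[:5])
```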

  9. Is multiple-sequence alignment required for accurate inference of phylogeny?

    PubMed

    Höhl, Michael; Ragan, Mark A

    2007-04-01

    The process of inferring phylogenetic trees from molecular sequences almost always starts with a multiple alignment of these sequences but can also be based on methods that do not involve multiple sequence alignment. Very little is known about the accuracy with which such alignment-free methods recover the correct phylogeny or about the potential for increasing their accuracy. We conducted a large-scale comparison of ten alignment-free methods, among them one new approach that does not calculate distances and a faster variant of our pattern-based approach; all distance-based alignment-free methods are freely available from http://www.bioinformatics.org.au (as Python package decaf+py). We show that most methods exhibit a higher overall reconstruction accuracy in the presence of high among-site rate variation. Under all conditions that we considered, variants of the pattern-based approach were significantly better than the other alignment-free methods. The new pattern-based variant achieved a speed-up of an order of magnitude in the distance calculation step, accompanied by a small loss of tree reconstruction accuracy. A method of Bayesian inference from k-mers did not improve on classical alignment-free (and distance-based) methods but may still offer other advantages due to its Bayesian nature. We found the optimal word length k of word-based methods to be stable across various data sets, and we provide parameter ranges for two different alphabets. The influence of these alphabets was analyzed to reveal a trade-off in reconstruction accuracy between long and short branches. We have mapped the phylogenetic accuracy for many alignment-free methods, among them several recently introduced ones, and increased our understanding of their behavior in response to biologically important parameters. In all experiments, the pattern-based approach emerged as superior, at the expense of higher resource consumption. Nonetheless, no alignment-free method that we examined recovers the correct phylogeny as accurately as does an approach based on maximum-likelihood distance estimates of multiply aligned sequences.
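
    As a concrete example of the word-based family compared here, the sketch below computes a simple k-mer distance: the Euclidean distance between normalized k-mer frequency vectors of two sequences.

```python
# Minimal sketch: a word-based (k-mer) alignment-free distance between sequences.
from itertools import product
import numpy as np

def kmer_profile(seq, k=3):
    words = ["".join(w) for w in product("ACGT", repeat=k)]
    counts = np.array([sum(1 for i in range(len(seq) - k + 1)
                           if seq[i:i + k] == w) for w in words], float)
    return counts / counts.sum()          # normalized k-mer frequencies

d = np.linalg.norm(kmer_profile("ACGTACGTGGTT") - kmer_profile("ACGTTCGTGGAT"))
print(d)
```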

  10. A Method of Time-Intensity Curve Calculation for Vascular Perfusion of Uterine Fibroids Based on Subtraction Imaging with Motion Correction

    NASA Astrophysics Data System (ADS)

    Zhu, Xinjian; Wu, Ruoyu; Li, Tao; Zhao, Dawei; Shan, Xin; Wang, Puling; Peng, Song; Li, Faqi; Wu, Baoming

    2016-12-01

    The time-intensity curve (TIC) from a contrast-enhanced ultrasound (CEUS) image sequence of uterine fibroids provides important parameter information for the qualitative and quantitative evaluation of the efficacy of treatments such as high-intensity focused ultrasound surgery. However, respiration and other physiological movements inevitably affect the CEUS imaging process, and this reduces the accuracy of TIC calculation. In this study, a method of TIC calculation for vascular perfusion of uterine fibroids based on subtraction imaging with motion correction is proposed. First, the fibroid CEUS video was decoded into frame images based on the recording frame rate. Next, the Brox optical flow algorithm was used to estimate the displacement field and correct the motion between frames using a warping technique. Then, subtraction imaging was performed to extract the positional distribution of vascular perfusion (PDOVP). Finally, the average gray level of all pixels in the PDOVP of each image was determined and taken as the TIC of the CEUS image sequence. Both the correlation coefficient and the mutual information of the results with the proposed method were larger than those obtained with the original method, and PDOVP extraction improved significantly after motion correction. The variance reduction rates were all positive, indicating that the fluctuations of the TIC became less pronounced and the calculation accuracy improved after motion correction. The proposed method can effectively overcome the influence of motion, mainly caused by respiration, and allows precise calculation of the TIC.
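
    A condensed sketch of the pipeline, with Farneback optical flow standing in for the Brox algorithm (Brox flow is not part of core OpenCV): each frame is warped back to the reference, a baseline is subtracted, and the mean residual gives one TIC sample per frame.

```python
# Illustrative sketch: motion-corrected subtraction imaging and TIC extraction.
# `frames` is a list of grayscale uint8 images from a decoded CEUS video.
import cv2
import numpy as np

def tic_from_frames(frames):
    ref = frames[0].astype(np.float32)
    h, w = ref.shape
    gx, gy = np.meshgrid(np.arange(w, dtype=np.float32),
                         np.arange(h, dtype=np.float32))
    tic = []
    for frame in frames:
        flow = cv2.calcOpticalFlowFarneback(frames[0], frame, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        # pull each pixel back along the flow to align `frame` with the reference
        aligned = cv2.remap(frame.astype(np.float32),
                            gx + flow[..., 0], gy + flow[..., 1],
                            cv2.INTER_LINEAR)
        diff = np.clip(aligned - ref, 0, None)   # subtraction image: perfusion only
        tic.append(float(diff.mean()))           # one TIC sample per frame
    return tic

# toy demo: two flat frames with a uniform intensity increase
frames = [np.full((64, 64), 40, np.uint8), np.full((64, 64), 52, np.uint8)]
print(tic_from_frames(frames))
```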

  11. Malware Analysis Using Visualized Image Matrices

    PubMed Central

    Im, Eul Gyu

    2014-01-01

    This paper proposes a novel malware visual analysis method that contains not only a visualization method to convert binary files into images, but also a similarity calculation method between these images. The proposed method generates RGB-colored pixels on image matrices using the opcode sequences extracted from malware samples and calculates the similarities between the image matrices. In particular, our proposed methods are applicable to packed malware samples by applying them to the execution traces extracted through dynamic analysis. When the images are generated, we can reduce the overheads by extracting the opcode sequences only from the blocks that include instructions related to staple behaviors such as functions and application programming interface (API) calls. In addition, we propose a technique that generates a representative image for each malware family in order to reduce the number of comparisons for the classification of unknown samples; the colored pixel information in the image matrices is used to calculate the similarities between the images. Our experimental results show that the image matrices of malware can effectively be used to classify malware families both statically and dynamically, with accuracies of 0.9896 and 0.9732, respectively. PMID:25133202

  12. Enabling multiplexed testing of pooled donor cells through whole-genome sequencing.

    PubMed

    Chan, Yingleong; Chan, Ying Kai; Goodman, Daniel B; Guo, Xiaoge; Chavez, Alejandro; Lim, Elaine T; Church, George M

    2018-04-19

    We describe a method that enables the multiplex screening of a pool of many different donor cell lines. Our method accurately predicts each donor proportion from the pool without requiring the use of unique DNA barcodes as markers of donor identity. Instead, we take advantage of common single nucleotide polymorphisms, whole-genome sequencing, and an algorithm to calculate the proportions from the sequencing data. By testing using simulated and real data, we showed that our method robustly predicts the individual proportions from a mixed pool of numerous donors, thus enabling the multiplexed testing of diverse donor cells en masse. More information is available at https://pgpresearch.med.harvard.edu/poolseq/.
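
    A minimal sketch of the deconvolution idea, assuming known donor genotypes and observed pooled allele frequencies; non-negative least squares is used here as a plausible stand-in for the paper's algorithm.

    ```python
    # Recover donor proportions from pooled sequencing, assuming a genotype
    # matrix G (SNPs x donors) of alt-allele dosages in {0, 0.5, 1} and a
    # vector f of observed pooled alt-allele frequencies.
    import numpy as np
    from scipy.optimize import nnls

    rng = np.random.default_rng(0)
    n_snps, n_donors = 5000, 8

    G = rng.choice([0.0, 0.5, 1.0], size=(n_snps, n_donors))   # known genotypes
    true_p = rng.dirichlet(np.ones(n_donors))                  # hidden proportions
    f = G @ true_p + rng.normal(0, 0.01, n_snps)               # noisy pooled freqs

    p_hat, _ = nnls(G, f)        # constrained fit: proportions cannot be negative
    p_hat /= p_hat.sum()         # renormalize so the proportions sum to one

    print(np.round(true_p, 3))
    print(np.round(p_hat, 3))
    ```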

  13. Research in Computational Astrobiology

    NASA Technical Reports Server (NTRS)

    Chaban, Galina; Colombano, Silvano; Scargle, Jeff; New, Michael H.; Pohorille, Andrew; Wilson, Michael A.

    2003-01-01

    We report on several projects in the field of computational astrobiology, which is devoted to advancing our understanding of the origin, evolution and distribution of life in the Universe using theoretical and computational tools. Research projects included modifying existing computer simulation codes to use efficient, multiple time step algorithms, statistical methods for analysis of astrophysical data via optimal partitioning methods, electronic structure calculations on water-nucleic acid complexes, incorporation of structural information into genomic sequence analysis methods, and calculations of shock-induced formation of polycyclic aromatic hydrocarbon compounds.

  14. Quantifying the Relationships among Drug Classes

    PubMed Central

    Hert, Jérôme; Keiser, Michael J.; Irwin, John J.; Oprea, Tudor I.; Shoichet, Brian K.

    2009-01-01

    The similarity of drug targets is typically measured using sequence or structural information. Here, we consider chemo-centric approaches that measure target similarity on the basis of their ligands, asking how chemoinformatics similarities differ from those derived bioinformatically, how stable the ligand networks are to changes in chemoinformatics metrics, and which network is the most reliable for prediction of pharmacology. We calculated the similarities between hundreds of drug targets and their ligands and mapped the relationship between them in a formal network. Bioinformatics networks were based on the BLAST similarity between sequences, while chemoinformatics networks were based on the ligand-set similarities calculated with either the Similarity Ensemble Approach (SEA) or a method derived from Bayesian statistics. By multiple criteria, bioinformatics and chemoinformatics networks differed substantially, and only occasionally did a high sequence similarity correspond to a high ligand-set similarity. In contrast, the chemoinformatics networks were stable to the method used to calculate the ligand-set similarities and to the chemical representation of the ligands. Also, by network-theory criteria, the chemoinformatics networks were more natural and more organized than their bioinformatics counterparts: ligand-based networks were found to be small-world and broad-scale. PMID:18335977

  15. Detection of Fiber Layer-Up Lamination Order of CFRP Composite Using Thermal-Wave Radar Imaging

    NASA Astrophysics Data System (ADS)

    Wang, Fei; Liu, Junyan; Liu, Yang; Wang, Yang; Gong, Jinlong

    2016-09-01

    In this paper, thermal-wave radar imaging (TWRI) is used as a nondestructive inspection method to evaluate carbon-fiber-reinforced-polymer (CFRP) composites. An inverse methodology that combines TWRI with a numerical optimization technique is proposed to determine the fiber layer-up lamination sequences of anisotropic CFRP composites. A 7-layer CFRP laminate [0°/45°/90°/0°]_s is heated by a chirp-modulated Gaussian laser beam, and the finite element method (FEM) is employed to calculate the temperature field of the CFRP laminate. The phase based on lock-in correlation between the reference chirp signal and the thermal-wave signal is used to obtain the phase image of TWRI, and the least squares method is applied to construct a cost function that minimizes the square of the difference between the phase from TWRI inspection and that from numerical calculation. A hybrid algorithm that combines simulated annealing with the Nelder-Mead simplex search method is employed to solve the reconstructed cost function and find the global optimal solution for the layer-up sequences of the CFRP composite. The result shows the feasibility of estimating the fiber layer-up lamination sequences of CFRP composites under optimal discrete and constraint conditions.
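
    The lock-in correlation step can be sketched as follows; the chirp parameters, delay, and noise level are illustrative assumptions rather than values from the paper.

    ```python
    # Lock-in correlation phase against a chirp reference r(t), using the
    # Hilbert transform for the quadrature channel.
    import numpy as np
    from scipy.signal import chirp, hilbert

    fs, T = 1000.0, 10.0                       # sample rate (Hz), duration (s)
    t = np.arange(0, T, 1 / fs)
    ref = chirp(t, f0=0.1, f1=1.0, t1=T)       # chirp-modulated reference

    # Synthetic "thermal-wave" signal: attenuated, phase-shifted chirp plus noise.
    sig = 0.3 * chirp(t, f0=0.1, f1=1.0, t1=T, phi=-40.0) \
          + 0.05 * np.random.randn(t.size)

    ref_q = np.imag(hilbert(ref))              # quadrature (90-degree) reference
    I = np.mean(sig * ref)                     # in-phase correlation
    Q = np.mean(sig * ref_q)                   # quadrature correlation
    phase = np.arctan2(Q, I)                   # lock-in phase for the TWRI image

    print(f"lock-in phase: {np.degrees(phase):.1f} deg")
    ```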

  16. Optimum quantum receiver for detecting weak signals in PAM communication systems

    NASA Astrophysics Data System (ADS)

    Sharma, Navneet; Rawat, Tarun Kumar; Parthasarathy, Harish; Gautam, Kumar

    2017-09-01

    This paper deals with the modeling of an optimum quantum receiver for pulse amplitude modulator (PAM) communication systems. The information bearing sequence {I_k}_{k=0}^{N-1} is estimated using the maximum likelihood (ML) method. The ML method is based on quantum mechanical measurements of an observable X in the Hilbert space of the quantum system at discrete times, when the Hamiltonian of the system is perturbed by an operator obtained by modulating a potential V with a PAM signal derived from the information bearing sequence {I_k}_{k=0}^{N-1}. The measurement process at each time instant causes collapse of the system state to an observable eigenstate. All probabilities of getting different outcomes from an observable are calculated using the perturbed evolution operator combined with the collapse postulate. For given probability densities, calculation of the mean square error evaluates the performance of the receiver. Finally, we present an example involving estimating an information bearing sequence that modulates a quantum electromagnetic field incident on a quantum harmonic oscillator.

  17. Accounting for uncertainty in DNA sequencing data.

    PubMed

    O'Rawe, Jason A; Ferson, Scott; Lyon, Gholson J

    2015-02-01

    Science is defined in part by an honest exposition of the uncertainties that arise in measurements and propagate through calculations and inferences, so that the reliabilities of its conclusions are made apparent. The recent rapid development of high-throughput DNA sequencing technologies has dramatically increased the number of measurements made at the biochemical and molecular level. These data come from many different DNA-sequencing technologies, each with its own platform-specific errors and biases, which vary widely. Several statistical studies have tried to measure error rates for basic determinations, but there are no general schemes to project these uncertainties so as to assess the surety of the conclusions drawn about genetic, epigenetic, and more general biological questions. We review here the state of uncertainty quantification in DNA sequencing applications, describe sources of error, and propose methods that can be used to account for these errors and to propagate them and their uncertainties through subsequent calculations. Copyright © 2014 Elsevier Ltd. All rights reserved.

  18. Mapping Base Modifications in DNA by Transverse-Current Sequencing

    NASA Astrophysics Data System (ADS)

    Alvarez, Jose R.; Skachkov, Dmitry; Massey, Steven E.; Kalitsov, Alan; Velev, Julian P.

    2018-02-01

    Sequencing DNA modifications and lesions, such as methylation of cytosine and oxidation of guanine, is even more important and challenging than sequencing the genome itself. The traditional methods for detecting DNA modifications are either insensitive to these modifications or require additional processing steps to identify a particular type of modification. Transverse-current sequencing in nanopores can potentially identify the canonical bases and base modifications in the same run. In this work, we demonstrate that the most common DNA epigenetic modifications and lesions can be detected with any predefined accuracy based on their tunneling current signature. Our results are based on simulations of the nanopore tunneling current through DNA molecules, calculated using nonequilibrium electron-transport methodology within an effective multiorbital model derived from first-principles calculations, followed by a base-calling algorithm accounting for neighbor current-current correlations. This methodology can be integrated with existing experimental techniques to improve base-calling fidelity.

  19. Use of the melting curve assay as a means for high-throughput quantification of Illumina sequencing libraries.

    PubMed

    Shinozuka, Hiroshi; Forster, John W

    2016-01-01

    Background. Multiplexed sequencing is commonly performed on massively parallel short-read sequencing platforms such as Illumina, and the efficiency of library normalisation can affect the quality of the output dataset. Although several library normalisation approaches have been established, none are ideal for highly multiplexed sequencing due to issues of cost and/or processing time. Methods. An inexpensive and high-throughput library quantification method has been developed, based on an adaptation of the melting curve assay. Sequencing libraries were subjected to the assay using the Bio-Rad Laboratories CFX Connect™ Real-Time PCR Detection System. The library quantity was calculated through summation of the reduction in relative fluorescence units between 86 and 95 °C. Results. PCR-enriched sequencing libraries are suitable for this quantification without pre-purification of DNA. Short DNA molecules, which ideally should be eliminated from the library for subsequent processing, were differentiated from the target DNA in a mixture on the basis of differences in melting temperature. Quantification results for long sequences targeted using the melting curve assay were correlated with those from existing methods (R² > 0.77), and with those observed from MiSeq sequencing (R² = 0.82). Discussion. The results of multiplexed sequencing suggested that the normalisation performance of the described method is equivalent to that of another recently reported high-throughput bead-based method, BeNUS. However, costs for the melting curve assay are considerably lower and processing times shorter than those of other existing methods, suggesting greater suitability for highly multiplexed sequencing applications.
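
    A minimal sketch of the quantification rule as described, assuming temperature and relative fluorescence unit (RFU) arrays exported from the instrument; the toy melt curve is for illustration only.

    ```python
    # Library quantity as the summed decrease in RFU between 86 and 95 deg C.
    import numpy as np

    def library_quantity(temps, rfu, t_lo=86.0, t_hi=95.0):
        """Sum of per-step RFU reductions within the [t_lo, t_hi] window."""
        temps, rfu = np.asarray(temps), np.asarray(rfu)
        mask = (temps >= t_lo) & (temps <= t_hi)
        drops = -np.diff(rfu[mask])          # positive where fluorescence falls
        return drops[drops > 0].sum()        # accumulate only the reductions

    temps = np.arange(80.0, 96.0, 0.5)
    rfu = 1000.0 / (1.0 + np.exp(temps - 89.0))   # toy melt curve
    print(round(library_quantity(temps, rfu), 1))
    ```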

  20. Improving the time efficiency of the Fourier synthesis method for slice selection in magnetic resonance imaging.

    PubMed

    Tahayori, B; Khaneja, N; Johnston, L A; Farrell, P M; Mareels, I M Y

    2016-01-01

    The design of slice selective pulses for magnetic resonance imaging can be cast as an optimal control problem. The Fourier synthesis method is an existing approach to solving these optimal control problems. In this method the gradient field as well as the excitation field are switched rapidly and their amplitudes are calculated from a Fourier series expansion. Here, we provide a novel insight into the Fourier synthesis method by representing the Bloch equation in spherical coordinates. Based on the spherical Bloch equation, we propose an alternative sequence of pulses that can be used for slice selection and that is more time efficient than the original method. Simulation results demonstrate that while the performance of both methods is approximately the same, the time required for the proposed sequence of pulses is half that of the original sequence. Furthermore, the slice selectivity of both sequences of pulses changes with radio frequency field inhomogeneities in a similar way. We also introduce a measure, referred to as gradient complexity, to compare the performance of both sequences of pulses. This measure indicates that, for a desired level of uniformity in the excited slice, the gradient complexity of the proposed sequence of pulses is less than that of the original sequence. Copyright © 2015 Associazione Italiana di Fisica Medica. Published by Elsevier Ltd. All rights reserved.

  1. Historical feature pattern extraction based network attack situation sensing algorithm.

    PubMed

    Zeng, Yong; Liu, Dacheng; Lei, Zhou

    2014-01-01

    A situation sequence contains a series of complicated, multivariate random trends that are sudden and uncertain, and whose principles are difficult for traditional algorithms to recognize and describe. To address this, estimating the parameters of very long situation sequences is essential but difficult, so this paper proposes a situation prediction method based on historical feature pattern extraction (HFPE). First, the HFPE algorithm seeks similar indications in the recorded historical situation sequence and weighs the link intensity between each occurred indication and its subsequent effect. It then calculates the probability that a given effect reappears according to the current indication and makes a prediction after weighting. Meanwhile, the HFPE method includes an evolution algorithm to derive the prediction deviation from the viewpoints of pattern and accuracy. This algorithm can continuously improve the adaptability of HFPE through gradual fine-tuning. The method preserves the rules in the sequence as far as possible, needs no data preprocessing, and can continuously track and adapt to variation in the situation sequence.

  3. Dynamic programming algorithms for biological sequence comparison.

    PubMed

    Pearson, W R; Miller, W

    1992-01-01

    Efficient dynamic programming algorithms are available for a broad class of protein and DNA sequence comparison problems. These algorithms require computer time proportional to the product of the lengths of the two sequences being compared [O(N²)] but require memory space proportional only to the sum of these lengths [O(N)]. Although the requirement for O(N²) time limits use of the algorithms to the largest computers when searching protein and DNA sequence databases, many other applications of these algorithms, such as calculation of distances for evolutionary trees and comparison of a new sequence to a library of sequence profiles, are well within the capabilities of desktop computers. In particular, the results of library searches with rapid searching programs, such as FASTA or BLAST, should be confirmed by performing a rigorous optimal alignment. Whereas rapid methods do not overlook significant sequence similarities, FASTA limits the number of gaps that can be inserted into an alignment, so that a rigorous alignment may extend the alignment substantially in some cases. BLAST does not allow gaps in the local regions that it reports; a calculation that allows gaps is very likely to extend the alignment substantially. Although a Monte Carlo evaluation of the statistical significance of a similarity score with a rigorous algorithm is much slower than the heuristic approach used by the RDF2 program, the dynamic programming approach should take less than 1 hr on a 386-based PC or desktop Unix workstation. For descriptive purposes, we have limited our discussion to methods for calculating similarity scores and distances that use gap penalties of the form g = rk. Nevertheless, programs for the more general case (g = q + rk) are readily available. Versions of these programs that run either on Unix workstations, IBM-PC class computers, or the Macintosh can be obtained from either of the authors.
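
    A compact sketch of such a dynamic programming comparison with the linear gap penalty g = rk, using O(N) memory by keeping only two score rows; the scoring values are illustrative.

    ```python
    # Global-alignment score with linear gap penalty g = rk (match +5,
    # mismatch -4, r = 4 are illustrative choices).
    def global_score(a, b, match=5, mismatch=-4, r=4):
        prev = [-r * j for j in range(len(b) + 1)]   # row for the empty prefix of a
        for i, ca in enumerate(a, start=1):
            curr = [-r * i]
            for j, cb in enumerate(b, start=1):
                sub = prev[j - 1] + (match if ca == cb else mismatch)
                curr.append(max(sub,                 # substitution
                                prev[j] - r,         # gap in b
                                curr[j - 1] - r))    # gap in a
            prev = curr
        return prev[-1]

    print(global_score("GATTACA", "GCATGCU"))
    ```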

  4. Image based automatic water meter reader

    NASA Astrophysics Data System (ADS)

    Jawas, N.; Indrianto

    2018-01-01

    A water meter measures water consumption by utilizing water flow and shows the result on a mechanical digit counter. In everyday practice, an operator manually checks the digit counter periodically and logs the number shown by the water meter to track water consumption. This manual operation is time consuming and prone to human error. Therefore, in this paper we propose an automatic water meter digit reader that works from a digital image. The digit sequence is detected by utilizing contour information from the water meter's front panel. An OCR method is then used to recognize each digit character. Digit sequence detection is an important part of the overall process and determines the success of the whole system. The results are promising, especially for sequence detection.
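
    A hedged sketch of such a pipeline is shown below; pytesseract is used as a generic OCR stand-in (the paper's own OCR method is not specified here), and the size filter and input image are assumptions.

    ```python
    # Locate digit-like contours on a meter panel image, then OCR each digit.
    import cv2
    import pytesseract

    def read_meter(path):
        img = cv2.imread(path)
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        _, binary = cv2.threshold(gray, 0, 255,
                                  cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
        contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        # Keep roughly digit-sized boxes and order them left to right.
        boxes = [cv2.boundingRect(c) for c in contours]
        boxes = sorted((b for b in boxes if b[3] > 0.3 * gray.shape[0]),
                       key=lambda b: b[0])
        digits = []
        for x, y, w, h in boxes:
            roi = gray[y:y + h, x:x + w]
            text = pytesseract.image_to_string(
                roi, config="--psm 10 -c tessedit_char_whitelist=0123456789")
            digits.append(text.strip())
        return "".join(digits)

    # print(read_meter("meter_panel.jpg"))  # hypothetical cropped panel image
    ```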

  5. Computer program for calculating supersonic flow about circular, elliptic, and bielliptic cones by the method of lines

    NASA Technical Reports Server (NTRS)

    Klunker, E. B.; South, J. C., Jr.; Davis, R. M.

    1972-01-01

    A user's manual for a computer program which calculates the supersonic flow about circular, elliptic, and bielliptic cones at incidence and elliptic cones at yaw by the method of lines is presented. The program is automated to compute a case from a known or easily calculated solution by changing the parameters through a sequence of steps. It provides information including the shock shape, flow field, isentropic surface properties, entropy layer, and force coefficients. A description of the program operation, sample computations, and a FORTRAN 4 listing are presented.

  6. A Bayesian taxonomic classification method for 16S rRNA gene sequences with improved species-level accuracy.

    PubMed

    Gao, Xiang; Lin, Huaiying; Revanna, Kashi; Dong, Qunfeng

    2017-05-10

    Species-level classification for 16S rRNA gene sequences remains a serious challenge for microbiome researchers, because existing taxonomic classification tools for 16S rRNA gene sequences either do not provide species-level classification, or their classification results are unreliable. The unreliable results are due to the limitations in the existing methods which either lack solid probabilistic-based criteria to evaluate the confidence of their taxonomic assignments, or use nucleotide k-mer frequency as the proxy for sequence similarity measurement. We have developed a method that shows significantly improved species-level classification results over existing methods. Our method calculates true sequence similarity between query sequences and database hits using pairwise sequence alignment. Taxonomic classifications are assigned from the species to the phylum levels based on the lowest common ancestors of multiple database hits for each query sequence, and further classification reliabilities are evaluated by bootstrap confidence scores. The novelty of our method is that the contribution of each database hit to the taxonomic assignment of the query sequence is weighted by a Bayesian posterior probability based upon the degree of sequence similarity of the database hit to the query sequence. Our method does not need any training datasets specific for different taxonomic groups. Instead only a reference database is required for aligning to the query sequences, making our method easily applicable for different regions of the 16S rRNA gene or other phylogenetic marker genes. Reliable species-level classification for 16S rRNA or other phylogenetic marker genes is critical for microbiome research. Our software shows significantly higher classification accuracy than the existing tools and we provide probabilistic-based confidence scores to evaluate the reliability of our taxonomic classification assignments based on multiple database matches to query sequences. Despite its higher computational costs, our method is still suitable for analyzing large-scale microbiome datasets for practical purposes. Furthermore, our method can be applied for taxonomic classification of any phylogenetic marker gene sequences. Our software, called BLCA, is freely available at https://github.com/qunfengdong/BLCA .

  7. Laboratory Sequence in Computational Methods for Introductory Chemistry

    NASA Astrophysics Data System (ADS)

    Cody, Jason A.; Wiser, Dawn C.

    2003-07-01

    A four-exercise laboratory sequence for introductory chemistry integrating hands-on, student-centered experience with computer modeling has been designed and implemented. The progression builds from exploration of molecular shapes to intermolecular forces and the impact of those forces on chemical separations made with gas chromatography and distillation. The sequence ends with an exploration of molecular orbitals. The students use the computers as a tool; they build the molecules, submit the calculations, and interpret the results. Because of the construction of the sequence and its placement spanning the semester break, good laboratory notebook practices are reinforced and the continuity of course content and methods between semesters is emphasized. The inclusion of these techniques in the first year of chemistry has had a positive impact on student perceptions and student learning.

  8. Vibration-Rotation Bands of HF and DF

    DTIC Science & Technology

    1977-09-23

    [Front-matter table listing] Tables 12a-12d: Comparison of observed and calculated line positions of HF for the Δv = 1, 2, 3, and 4 sequences.

  9. Measuring Sister Chromatid Cohesion Protein Genome Occupancy in Drosophila melanogaster by ChIP-seq.

    PubMed

    Dorsett, Dale; Misulovin, Ziva

    2017-01-01

    This chapter presents methods to conduct and analyze genome-wide chromatin immunoprecipitation of the cohesin complex and the Nipped-B cohesin loading factor in Drosophila cells using high-throughput DNA sequencing (ChIP-seq). Procedures for isolation of chromatin, immunoprecipitation, and construction of sequencing libraries for the Ion Torrent Proton high throughput sequencer are detailed, and computational methods to calculate occupancy as input-normalized fold-enrichment are described. The results obtained by ChIP-seq are compared to those obtained by ChIP-chip (genomic ChIP using tiling microarrays), and the effects of sequencing depth on the accuracy are analyzed. ChIP-seq provides similar sensitivity and reproducibility as ChIP-chip, and identifies the same broad regions of occupancy. The locations of enrichment peaks, however, can differ between ChIP-chip and ChIP-seq, and low sequencing depth can splinter broad regions of occupancy into distinct peaks.
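
    The input-normalized fold-enrichment computation can be sketched as follows, assuming per-bin read counts for the immunoprecipitated (IP) and input samples over the same genomic bins; the pseudocount is an illustrative choice.

    ```python
    # Occupancy as input-normalized fold enrichment over genomic bins.
    import numpy as np

    def fold_enrichment(ip_counts, input_counts, pseudo=1.0):
        ip = np.asarray(ip_counts, float) + pseudo     # avoid division by zero
        inp = np.asarray(input_counts, float) + pseudo
        ip_rate = ip / ip.sum()          # depth-normalize the IP sample
        inp_rate = inp / inp.sum()       # depth-normalize the input sample
        return ip_rate / inp_rate        # fold enrichment over input

    ip = [120, 5, 300, 12]
    inp = [60, 50, 70, 55]
    print(np.round(fold_enrichment(ip, inp), 2))
    ```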

  10. GAMSOR: Gamma Source Preparation and DIF3D Flux Solution

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Smith, M. A.; Lee, C. H.; Hill, R. N.

    2017-06-28

    Nuclear reactors that rely upon the fission reaction have two modes of thermal energy deposition in the reactor system: neutron absorption and gamma absorption. The gamma rays are typically generated by neutron capture reactions or during the fission process, which means the primary driver of energy production is the neutron interaction. In conventional reactor physics methods, the gamma heating component is ignored, such that the gamma absorption is forced to occur at the gamma emission site. For experimental reactor systems like EBR-II and FFTF, the placement of structural pins and assemblies internal to the core leads to problems with power heating predictions, because there is no fission power source internal to the assembly to dictate a spatial distribution of the power. As part of the EBR-II support work in the 1980s, the GAMSOR code was developed to assist analysts in calculating the gamma heating. The GAMSOR code is a modified version of DIF3D and actually functions within a sequence of DIF3D calculations. The gamma flux in a conventional fission reactor system does not perturb the neutron flux, and thus the gamma flux calculation can be cast as a fixed source problem given a solution to the steady state neutron flux equation. This leads to a sequence of DIF3D calculations, called the GAMSOR sequence, which involves solving the neutron flux, then the gamma flux, and then combining the results in a summary edit. In this manuscript, we describe the GAMSOR code and detail how it is put together and functions. We also discuss how to set up the GAMSOR sequence and the input for each DIF3D calculation in the GAMSOR sequence.

  11. Sequence comparison alignment-free approach based on suffix tree and L-words frequency.

    PubMed

    Soares, Inês; Goios, Ana; Amorim, António

    2012-01-01

    The vast majority of methods available for sequence comparison rely on a first sequence alignment step, which requires a number of assumptions on evolutionary history and is sometimes very difficult or impossible to perform due to the abundance of gaps (insertions/deletions). In such cases, an alternative alignment-free method would prove valuable. Our method starts with the computation of a generalized suffix tree of all sequences, which is completed in linear time. Using this tree, the frequency of all possible words of a preset length L (L-words) in each sequence is rapidly calculated. Based on the L-word frequency profile of each sequence, a pairwise standard Euclidean distance is then computed, producing a symmetric genetic distance matrix, which can be used to generate a neighbor joining dendrogram or a multidimensional scaling graph. We present an improvement to word-counting alignment-free approaches for sequence comparison, by determining a single optimal word length and combining suffix tree structures with the word counting tasks. Our approach is, thus, a fast and simple application that proved to be efficient and powerful when applied to mitochondrial genomes. The algorithm was implemented in the Python language and is freely available on the web.
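
    A minimal sketch of the L-word profile distance follows; plain dictionary counting replaces the paper's generalized suffix tree for clarity, and the example sequences are toy stand-ins.

    ```python
    # L-word frequency profiles and their Euclidean distance.
    from collections import Counter
    from itertools import product
    import math

    def l_word_profile(seq, L=3, alphabet="ACGT"):
        counts = Counter(seq[i:i + L] for i in range(len(seq) - L + 1))
        total = max(sum(counts.values()), 1)
        return [counts[''.join(w)] / total for w in product(alphabet, repeat=L)]

    def euclidean(p, q):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

    s1, s2 = "ACGTACGTACGGT", "ACGTTTGTACGAT"
    print(round(euclidean(l_word_profile(s1), l_word_profile(s2)), 4))
    ```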

  12. Efficient dynamical correction of the transition state theory rate estimate for a flat energy barrier.

    PubMed

    Mökkönen, Harri; Ala-Nissila, Tapio; Jónsson, Hannes

    2016-09-07

    The recrossing correction to the transition state theory estimate of a thermal rate can be difficult to calculate when the energy barrier is flat. This problem arises, for example, in polymer escape if the polymer is long enough to stretch between the initial and final state energy wells while the polymer beads undergo diffusive motion back and forth over the barrier. We present an efficient method for evaluating the correction factor by constructing a sequence of hyperplanes starting at the transition state and calculating the probability that the system advances from one hyperplane to another towards the product. This is analogous to what is done in forward flux sampling except that there the hyperplane sequence starts at the initial state. The method is applied to the escape of polymers with up to 64 beads from a potential well. For high temperature, the results are compared with direct Langevin dynamics simulations as well as forward flux sampling and excellent agreement between the three rate estimates is found. The use of a sequence of hyperplanes in the evaluation of the recrossing correction speeds up the calculation by an order of magnitude as compared with the traditional approach. As the temperature is lowered, the direct Langevin dynamics simulations as well as the forward flux simulations become computationally too demanding, while the harmonic transition state theory estimate corrected for recrossings can be calculated without significant increase in the computational effort.

  13. An online supervised learning method based on gradient descent for spiking neurons.

    PubMed

    Xu, Yan; Yang, Jing; Zhong, Shuiming

    2017-09-01

    The purpose of supervised learning with temporal encoding for spiking neurons is to make the neurons emit a specific spike train encoded by the precise firing times of spikes. Gradient-descent-based (GDB) learning methods are widely used and well verified in current research. Although the existing GDB multi-spike learning (or spike sequence learning) methods have good performance, they work in an offline manner and still have some limitations. This paper proposes an online GDB spike sequence learning method for spiking neurons that is based on the online adjustment mechanism of real biological neuron synapses. The method constructs an error function and calculates the adjustment of synaptic weights as soon as the neuron emits a spike during its running process. We analyze and synthesize the desired and actual output spikes to select appropriate input spikes for the weight adjustment calculation. The experimental results show that our method clearly improves learning performance compared with the offline learning manner and has a certain advantage in learning accuracy compared with other learning methods. This stronger learning ability means the method has a large pattern storage capacity. Copyright © 2017 Elsevier Ltd. All rights reserved.

  14. Design of nucleic acid sequences for DNA computing based on a thermodynamic approach

    PubMed Central

    Tanaka, Fumiaki; Kameda, Atsushi; Yamamoto, Masahito; Ohuchi, Azuma

    2005-01-01

    We have developed an algorithm for designing multiple sequences of nucleic acids that have a uniform melting temperature between each sequence and its complement and that do not hybridize non-specifically with each other, based on the minimum free energy (ΔGmin). Sequences that satisfy these constraints can be utilized in computations, various engineering applications such as microarrays, and nano-fabrications. Our algorithm is a random generate-and-test algorithm: it generates a candidate sequence randomly and tests whether the sequence satisfies the constraints. The novelty of our algorithm is that the filtering method uses a greedy search to calculate ΔGmin. This effectively excludes inappropriate sequences before ΔGmin is calculated, thereby reducing computation time drastically compared with an algorithm without the filtering. Experimental results in silico showed the superiority of the greedy search over the traditional approach based on the Hamming distance. In addition, experimental results in vitro demonstrated that the experimental free energy (ΔGexp) of 126 sequences correlated better with ΔGmin (|R| = 0.90) than with the Hamming distance (|R| = 0.80). These results validate the rationality of a thermodynamic approach. We implemented our algorithm in a graphic user interface-based program written in Java. PMID:15701762

  15. Review of road user costs and methods.

    DOT National Transportation Integrated Search

    2013-07-01

    The South Dakota Department of Transportation (SDDOT) uses road user costs (RUC) to calculate incentive or disincentive compensation for contractors, quantify project-specific liquidated damages, select the ideal sequencing of a project, and forecast...

  16. A segmentation method for lung nodule image sequences based on superpixels and density-based spatial clustering of applications with noise

    PubMed Central

    Zhang, Wei; Zhang, Xiaolong; Qiang, Yan; Tian, Qi; Tang, Xiaoxian

    2017-01-01

    The fast and accurate segmentation of lung nodule image sequences is the basis of subsequent processing and diagnostic analyses. However, previously reported nodule segmentation algorithms cannot entirely segment cavitary nodules, and their segmentation of juxta-vascular nodules is inaccurate and inefficient. To solve these problems, we propose a new method for the segmentation of lung nodule image sequences based on superpixels and density-based spatial clustering of applications with noise (DBSCAN). First, our method uses three-dimensional computed tomography image features of the average intensity projection combined with multi-scale dot enhancement for preprocessing. Hexagonal clustering and morphologically optimized sequential linear iterative clustering (HMSLIC) for sequence image oversegmentation is then proposed to obtain superpixel blocks. An adaptive weight coefficient is then constructed to calculate the distance required between superpixels to achieve precise lung nodule positioning and to obtain the subsequent clustering starting block. Moreover, by fitting the distances and detecting the change in slope, an accurate clustering threshold is obtained. Thereafter, a fast DBSCAN superpixel sequence clustering algorithm, optimized by the strategy of clustering only the lung nodules and by the adaptive threshold, is used to obtain the lung nodule mask sequences. Finally, the lung nodule image sequences are obtained. The experimental results show that our method rapidly, completely and accurately segments various types of lung nodule image sequences. PMID:28880916

  17. Rapid 3D Reconstruction for Image Sequence Acquired from UAV Camera

    PubMed Central

    Qu, Yufu; Huang, Jianyu; Zhang, Xuan

    2018-01-01

    In order to reconstruct three-dimensional (3D) structures from an image sequence captured by an unmanned aerial vehicle (UAV) camera and improve the processing speed, we propose a rapid 3D reconstruction method based on an image queue, considering the continuity and relevance of UAV camera images. The proposed approach first compresses the feature points of each image into three principal component points by using the principal component analysis method. In order to select the key images suitable for 3D reconstruction, the principal component points are used to estimate the interrelationships between images. Second, these key images are inserted into a fixed-length image queue. The positions and orientations of the images are calculated, and the 3D coordinates of the feature points are estimated using weighted bundle adjustment. With this structural information, the depth maps of these images can be calculated. Next, we update the image queue by deleting some of the old images and inserting some new images into the queue, and a structural calculation of all the images can be performed by repeating the previous steps. Finally, a dense 3D point cloud can be obtained using the depth-map fusion method. The experimental results indicate that when the texture of the images is complex and the number of images exceeds 100, the proposed method can improve the calculation speed by more than a factor of four with almost no loss of precision. Furthermore, as the number of images increases, the improvement in the calculation speed will become more noticeable. PMID:29342908
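
    One plausible reading of the "three principal component points" step is sketched below; the min/mean/max picks along the first principal axis and the random feature coordinates are illustrative assumptions, not the paper's exact construction.

    ```python
    # Compress each image's feature points into three summary points and
    # compare images through them.
    import numpy as np
    from sklearn.decomposition import PCA

    def principal_points(feature_xy, k=3):
        """Project 2D feature points onto the first principal axis and keep
        three summary points (min, mean, max along that axis)."""
        pca = PCA(n_components=1)
        t = pca.fit_transform(feature_xy).ravel()        # 1D score per feature
        picks = [t.argmin(), np.abs(t - t.mean()).argmin(), t.argmax()]
        return feature_xy[picks]

    rng = np.random.default_rng(1)
    img_a = rng.normal(size=(500, 2))
    img_b = img_a + rng.normal(scale=0.05, size=(500, 2))   # overlapping view

    # A small displacement between summary points suggests strongly overlapping
    # images, so img_b need not be inserted into the reconstruction queue.
    shift = np.linalg.norm(principal_points(img_a) - principal_points(img_b),
                           axis=1)
    print(np.round(shift, 3))
    ```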

  19. Classification of G-protein coupled receptors based on a rich generation of convolutional neural network, N-gram transformation and multiple sequence alignments.

    PubMed

    Li, Man; Ling, Cheng; Xu, Qi; Gao, Jingyang

    2018-02-01

    Sequence classification is crucial in predicting the function of newly discovered sequences. In recent years, the prediction of increasingly large-scale and diverse sequences has relied heavily on machine-learning algorithms. To improve prediction accuracy, these algorithms must confront the key challenge of extracting valuable features. In this work, we propose a feature-enhanced protein classification approach that combines multiple sequence alignment algorithms, an N-gram probabilistic language model and deep learning techniques. The essence behind the proposed method is that if each group of sequences can be represented by one feature sequence composed of homologous sites, there should be less loss when the sequence is rebuilt whenever a more relevant sequence is added to the group. On the basis of this consideration, the prediction becomes whether a query sequence belonging to a group of sequences can be transferred to calculate the probability that the new feature sequence evolves from the original one. The proposed work focuses on the hierarchical classification of G-protein Coupled Receptors (GPCRs), which begins by extracting the feature sequences from the multiple sequence alignment results of the GPCR sub-subfamilies. The N-gram model is then applied to construct the input vectors. Finally, these vectors are imported into a convolutional neural network to make a prediction. The experimental results show that the proposed method provides significant performance improvements. The classification error rate of the proposed method is reduced by at least 4.67% (family level I) and 5.75% (family level II) in comparison with the current state-of-the-art methods. The implementation program of the proposed work is freely available at: https://github.com/alanFchina/CNN .

  20. A discriminative method for protein remote homology detection and fold recognition combining Top-n-grams and latent semantic analysis.

    PubMed

    Liu, Bin; Wang, Xiaolong; Lin, Lei; Dong, Qiwen; Wang, Xuan

    2008-12-01

    Protein remote homology detection and fold recognition are central problems in bioinformatics. Currently, discriminative methods based on support vector machines (SVM) are the most effective and accurate methods for solving these problems. A key step to improve the performance of the SVM-based methods is to find a suitable representation of protein sequences. In this paper, a novel building block of proteins called Top-n-grams is presented, which contains the evolutionary information extracted from the protein sequence frequency profiles. The protein sequence frequency profiles are calculated from the multiple sequence alignments output by PSI-BLAST and converted into Top-n-grams. The protein sequences are transformed into fixed-dimension feature vectors by the occurrence times of each Top-n-gram. The training vectors are evaluated by SVM to train classifiers which are then used to classify the test protein sequences. We demonstrate that the prediction performance of remote homology detection and fold recognition can be improved by combining Top-n-grams and latent semantic analysis (LSA), which is an efficient feature extraction technique from natural language processing. When tested on superfamily and fold benchmarks, the method combining Top-n-grams and LSA gives significantly better results compared to related methods. The method based on Top-n-grams significantly outperforms the methods based on many other building blocks including N-grams, patterns, motifs and binary profiles. Therefore, Top-n-gram is a good building block of the protein sequences and can be widely used in many tasks of computational biology, such as sequence alignment, the prediction of domain boundaries, the designation of knowledge-based potentials and the prediction of protein binding sites.

  1. Mapping DNA methylation by transverse current sequencing: Reduction of noise from neighboring nucleotides

    NASA Astrophysics Data System (ADS)

    Alvarez, Jose; Massey, Steven; Kalitsov, Alan; Velev, Julian

    Nanopore sequencing via transverse current has emerged as a competitive candidate for mapping DNA methylation without the need for bisulfite treatment, fluorescent tags, or PCR amplification. By eliminating the error-producing amplification step, long read lengths become feasible, which greatly simplifies the assembly process and reduces the time and cost inherent in current technologies. However, due to the large error rates of nanopore sequencing, single-base resolution has not been reached. A very important source of noise is the intrinsic structural noise in the electric signature of the nucleotide arising from the influence of neighboring nucleotides. In this work we perform calculations of the tunneling current through DNA molecules in nanopores using the non-equilibrium electron transport method within an effective multi-orbital tight-binding model derived from first-principles calculations. We develop a base-calling algorithm accounting for the correlations of the current through neighboring bases, which in principle can reduce the error rate below any desired precision. Using this method we show that we can clearly distinguish DNA methylation and other base modifications based on the reading of the tunneling current.
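
    A hedged sketch of neighbor-aware base calling follows: each measured current is modeled as a Gaussian whose mean depends on a base pair, and the most likely base string is recovered with a Viterbi-style dynamic program. All current signatures are synthetic placeholders, not computed transport results.

    ```python
    # Decode a base sequence from pairwise current readings with a Viterbi
    # dynamic program over neighboring-base states.
    import numpy as np

    BASES = ["A", "C", "G", "T", "mC"]          # includes methylated cytosine
    rng = np.random.default_rng(2)
    MEAN = {(a, b): rng.uniform(0.5, 2.0) for a in BASES for b in BASES}
    SIGMA = 0.08

    def viterbi_call(currents):
        score = {b: 0.0 for b in BASES}         # log-score for the first base
        back = []
        for c in currents:                      # one reading per neighboring pair
            nxt, ptr = {}, {}
            for b2 in BASES:
                # Gaussian log-likelihood of current c for each (b1, b2) pair
                cand = {b1: score[b1] - (c - MEAN[(b1, b2)]) ** 2
                        / (2 * SIGMA ** 2) for b1 in BASES}
                best = max(cand, key=cand.get)
                nxt[b2], ptr[b2] = cand[best], best
            score = nxt
            back.append(ptr)
        last = max(score, key=score.get)
        called = [last]
        for ptr in reversed(back):              # trace the best path backwards
            called.append(ptr[called[-1]])
        called.reverse()
        return called

    truth = ["A", "mC", "G", "T", "A", "C"]
    currents = [rng.normal(MEAN[(a, b)], SIGMA) for a, b in zip(truth, truth[1:])]
    print(viterbi_call(currents), "vs", truth)
    ```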

  2. Calculation of vitrinite reflectance from thermal histories: A comparison of some methods

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Morrow, D.W.; Issler, D.R.

    1993-04-01

    Vitrinite reflectance values (%Ro) calculated from commonly used methods are compared with respect to time invariant temperatures and constant heating rates. Two monofunctional methods, one involving a time-temperature index to vitrinite reflectance correlation (TTI-%Ro) to depth correlation, yield vitrinite reflectance values that are similar to those calculated by recently published Arrhenius-based methods, such as EASY%Ro. The approximate agreement between these methods supports the perception that the EASY%Ro algorithm is the most accurate method for the prediction of vitrinite reflectances throughout the range of organic maturity normally encountered. However, calibration of these methods against vitrinite reflectance data from two basin sequences with well-documented geologic histories indicates that, although the EASY%Ro method has wide applicability, it slightly overestimates vitrinite reflectances in strata of low to medium maturity up to a %Ro value of 0.9%. The two monofunctional methods may be more accurate for prediction of vitrinite reflectances in similar sequences of low maturity. An older, but previously widely accepted TTI-%Ro correlation consistently overestimates vitrinite reflectances with respect to other methods. Underestimation of paleogeothermal gradients in the original calibration of time-temperature history to vitrinite reflectance may have introduced a systematic bias to the TTI-%Ro correlation used in this method. Also, incorporation of TAI (thermal alteration index) data and its conversion to %Ro-equivalent values may have introduced inaccuracies. 36 refs., 7 figs.

  3. Identification of genomic indels and structural variations using split reads

    PubMed Central

    2011-01-01

    Background Recent studies have demonstrated the genetic significance of insertions, deletions, and other more complex structural variants (SVs) in the human population. With the development of the next-generation sequencing technologies, high-throughput surveys of SVs on the whole-genome level have become possible. Here we present split-read identification, calibrated (SRiC), a sequence-based method for SV detection. Results We start by mapping each read to the reference genome in standard fashion using gapped alignment. Then, to identify SVs, we score each of the many initial mappings with an assessment strategy designed to take into account both sequencing and alignment errors (e.g., scoring events gapped in the center of a read more highly). All current SV calling methods have multilevel biases in their identifications due to both experimental and computational limitations (e.g. calling more deletions than insertions). A key aspect of our approach is that we calibrate all our calls against synthetic data sets generated from simulations of high-throughput sequencing (with realistic error models). This allows us to calculate sensitivity and the positive predictive value under different parameter-value scenarios and for different classes of events (e.g. long deletions vs. short insertions). We run our calculations on representative data from the 1000 Genomes Project. Coupling the observed numbers of events on chromosome 1 with the calibrations gleaned from the simulations (for different length events) allows us to construct a relatively unbiased estimate for the total number of SVs in the human genome across a wide range of length scales. We estimate in particular that an individual genome contains ~670,000 indels/SVs. Conclusions Compared with the existing read-depth and read-pair approaches for SV identification, our method can pinpoint the exact breakpoints of SV events, reveal the actual sequence content of insertions, and cover the whole size spectrum for deletions. Moreover, with the advent of the third-generation sequencing technologies that produce longer reads, we expect our method to be even more useful. PMID:21787423

  4. Using expected sequence features to improve basecalling accuracy of amplicon pyrosequencing data.

    PubMed

    Rask, Thomas S; Petersen, Bent; Chen, Donald S; Day, Karen P; Pedersen, Anders Gorm

    2016-04-22

    Amplicon pyrosequencing targets a known genetic region and thus inherently produces reads highly anticipated to have certain features, such as conserved nucleotide sequence, and in the case of protein coding DNA, an open reading frame. Pyrosequencing errors, consisting mainly of nucleotide insertions and deletions, are on the other hand likely to disrupt open reading frames. Such an inverse relationship between errors and expectation based on prior knowledge can be used advantageously to guide the process known as basecalling, i.e. the inference of nucleotide sequence from raw sequencing data. The new basecalling method described here, named Multipass, implements a probabilistic framework for working with the raw flowgrams obtained by pyrosequencing. For each sequence variant Multipass calculates the likelihood and nucleotide sequence of several most likely sequences given the flowgram data. This probabilistic approach enables integration of basecalling into a larger model where other parameters can be incorporated, such as the likelihood for observing a full-length open reading frame at the targeted region. We apply the method to 454 amplicon pyrosequencing data obtained from a malaria virulence gene family, where Multipass generates 20 % more error-free sequences than current state of the art methods, and provides sequence characteristics that allow generation of a set of high confidence error-free sequences. This novel method can be used to increase accuracy of existing and future amplicon sequencing data, particularly where extensive prior knowledge is available about the obtained sequences, for example in analysis of the immunoglobulin VDJ region where Multipass can be combined with a model for the known recombining germline genes. Multipass is available for Roche 454 data at http://www.cbs.dtu.dk/services/MultiPass-1.0 , and the concept can potentially be implemented for other sequencing technologies as well.

  5. MutaBind estimates and interprets the effects of sequence variants on protein-protein interactions.

    PubMed

    Li, Minghui; Simonetti, Franco L; Goncearenco, Alexander; Panchenko, Anna R

    2016-07-08

    Proteins engage in highly selective interactions with their macromolecular partners. Sequence variants that alter protein binding affinity may cause significant perturbations or complete abolishment of function, potentially leading to diseases. There exists a persistent need to develop a mechanistic understanding of impacts of variants on proteins. To address this need we introduce a new computational method MutaBind to evaluate the effects of sequence variants and disease mutations on protein interactions and calculate the quantitative changes in binding affinity. The MutaBind method uses molecular mechanics force fields, statistical potentials and fast side-chain optimization algorithms. The MutaBind server maps mutations on a structural protein complex, calculates the associated changes in binding affinity, determines the deleterious effect of a mutation, estimates the confidence of this prediction and produces a mutant structural model for download. MutaBind can be applied to a large number of problems, including determination of potential driver mutations in cancer and other diseases, elucidation of the effects of sequence variants on protein fitness in evolution and protein design. MutaBind is available at http://www.ncbi.nlm.nih.gov/projects/mutabind/. Published by Oxford University Press on behalf of Nucleic Acids Research 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.

  6. Form drag in rivers due to small-scale natural topographic features: 2. Irregular sequences

    USGS Publications Warehouse

    Kean, J.W.; Smith, J.D.

    2006-01-01

    The size, shape, and spacing of small-scale topographic features found on the boundaries of natural streams, rivers, and floodplains can be quite variable. Consequently, a procedure for determining the form drag on irregular sequences of different-sized topographic features is essential for calculating near-boundary flows and sediment transport. A method for carrying out such calculations is developed in this paper. This method builds on the work of Kean and Smith (2006), which describes the flow field for the simpler case of a regular sequence of identical topographic features. Both approaches model topographic features as two-dimensional elements with Gaussian-shaped cross sections defined in terms of three parameters. Field measurements of bank topography are used to show that (1) the magnitude of these shape parameters can vary greatly between adjacent topographic features and (2) the variability of these shape parameters follows a lognormal distribution. Simulations using an irregular set of topographic roughness elements show that the drag on an individual element is primarily controlled by the size and shape of the feature immediately upstream and that the spatial average of the boundary shear stress over a large set of randomly ordered elements is relatively insensitive to the sequence of the elements. In addition, a method to transform the topography of irregular surfaces into an equivalently rough surface of regularly spaced, identical topographic elements also is given. The methods described in this paper can be used to improve predictions of flow resistance in rivers as well as quantify bank roughness.

  7. Iterative pass optimization of sequence data

    NASA Technical Reports Server (NTRS)

    Wheeler, Ward C.

    2003-01-01

    The problem of determining the minimum-cost hypothetical ancestral sequences for a given cladogram is known to be NP-complete. This "tree alignment" problem has motivated the considerable effort placed in multiple sequence alignment procedures. Wheeler in 1996 proposed a heuristic method, direct optimization, to calculate cladogram costs without the intervention of multiple sequence alignment. This method, though more efficient in time and more effective in cladogram length than many alignment-based procedures, greedily optimizes nodes based on descendent information only. In their proposal of an exact multiple alignment solution, Sankoff et al. in 1976 described a heuristic procedure--the iterative improvement method--to create alignments at internal nodes by solving a series of median problems. The combination of a three-sequence direct optimization with iterative improvement and a branch-length-based cladogram cost procedure, provides an algorithm that frequently results in superior (i.e., lower) cladogram costs. This iterative pass optimization is both computation and memory intensive, but economies can be made to reduce this burden. An example in arthropod systematics is discussed. © 2003 The Willi Hennig Society. Published by Elsevier Science (USA). All rights reserved.

  8. The scattering of electromagnetic pulses by a slit in a conducting screen

    NASA Technical Reports Server (NTRS)

    Ackerknecht, W. E., III; Chen, C.-L.

    1975-01-01

    A direct method for calculating the impulse response of a slit in a conducting screen is presented which is derived specifically for the analysis of transient scattering by two-dimensional objects illuminated by a plane incident wave. The impulse response is obtained by assuming that the total response is composed of two sequences of diffracted waves. The solution is determined for the first two waves in one sequence by using Green's functions and the equivalence principle, for additional waves in the sequence by iteration, and for the other sequence by a transformation of coordinates. The cases of E-polarization and H-polarization are considered.

  9. PCV: An Alignment Free Method for Finding Homologous Nucleotide Sequences and its Application in Phylogenetic Study.

    PubMed

    Kumar, Rajnish; Mishra, Bharat Kumar; Lahiri, Tapobrata; Kumar, Gautam; Kumar, Nilesh; Gupta, Rahul; Pal, Manoj Kumar

    2017-06-01

    Online retrieval of homologous nucleotide sequences through existing alignment techniques is a common practice against a given database of sequences. The salient point of these techniques is their dependence on local alignment techniques and scoring matrices, whose reliability is limited by computational complexity and accuracy. In this direction, this work offers a novel way of numerically representing genes that can further help in dividing the data space into smaller partitions, helping the formation of a search tree. In this context, this paper introduces a 36-dimensional Periodicity Count Value (PCV), which is representative of a particular nucleotide sequence and is created through adaptation of the stochastic model concept of Kolekar et al. (American Institute of Physics 1298:307-312, 2010. doi: 10.1063/1.3516320 ). The PCV construct uses information on the physicochemical properties of nucleotides and their positional distribution pattern within a gene. It is observed that the PCV representation of genes reduces the computational cost of calculating distances between pairs of genes while remaining consistent with existing methods. The validity of the PCV-based method was further tested through its use in molecular phylogeny constructions, in comparison with existing sequence alignment methods.

  10. The Impact of Normalization Methods on RNA-Seq Data Analysis

    PubMed Central

    Zyprych-Walczak, J.; Szabelska, A.; Handschuh, L.; Górczak, K.; Klamecka, K.; Figlerowicz, M.; Siatkowski, I.

    2015-01-01

    High-throughput sequencing technologies, such as the Illumina HiSeq, are powerful new tools for investigating a wide range of biological and medical problems. The massive and complex data sets produced by the sequencers create a need for the development of statistical and computational methods that can tackle the analysis and management of the data. Data normalization is one of the most crucial steps of data processing, and this process must be carefully considered as it has a profound effect on the results of the analysis. In this work, we focus on a comprehensive comparison of five normalization methods related to sequencing depth, widely used for transcriptome sequencing (RNA-seq) data, and their impact on the results of gene expression analysis. Based on this study, we suggest a universal workflow that can be applied for the selection of the optimal normalization procedure for any particular data set. The described workflow includes calculation of the bias and variance values for the control genes, sensitivity and specificity of the methods, and classification errors, as well as generation of the diagnostic plots. Combining the above information facilitates the selection of the most appropriate normalization method for the studied data sets and determines which methods can be used interchangeably. PMID:26176014
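
    A common instance of sequencing-depth normalization is the DESeq-style median-of-ratios size factor. The sketch below is illustrative only: this estimator stands in for the family of depth-related methods the paper compares, and all data are toy values.

        import numpy as np

        def median_of_ratios_size_factors(counts):
            """DESeq-style size factors: for each sample, the median ratio of
            its counts to the per-gene geometric mean across samples."""
            counts = np.asarray(counts, dtype=float)      # genes x samples
            with np.errstate(divide='ignore'):
                log_counts = np.log(counts)
            log_geo_mean = log_counts.mean(axis=1)        # per-gene geometric mean
            usable = np.isfinite(log_geo_mean)            # drop genes with zero counts
            ratios = log_counts[usable] - log_geo_mean[usable, None]
            return np.exp(np.median(ratios, axis=0))

        counts = [[100, 200, 150],
                  [ 50, 100,  80],
                  [ 10,  20,  15],
                  [  0,   5,   2]]                        # toy genes x samples
        sf = median_of_ratios_size_factors(counts)
        print(sf)                             # divide each sample column by sf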

  11. Calculation of Cardiac Kinetic Energy Index from PET images.

    PubMed

    Sims, John; Oliveira, Marco Antônio; Meneghetti, José Claudio; Gutierrez, Marco Antônio

    2015-01-01

    Cardiac function can be assessed from displacement measurements in imaging modalities from nuclear medicine. Using positron emission tomography (PET) image sequences with Rubidium-82, we propose and estimate the total Kinetic Energy Index (KEf) obtained from the velocity field, which was calculated using 3D optical flow (OF) methods applied over the temporal image sequence. However, it was found that the brightness of the image varied unexpectedly between frames, violating the constant brightness assumption of the OF method and causing large errors in estimating the velocity field. Therefore total brightness was equalized across image frames and the adjusted configuration tested with rest perfusion images acquired from individuals with normal (n=30) and low (n=33) cardiac function. For these images KEf was calculated as 0.5731±0.0899 and 0.3812±0.1146 for individuals with normal and low cardiac function respectively. The ability of KEf to properly classify patients into the two groups was tested with a ROC analysis, with area under the curve estimated as 0.906. To our knowledge this is the first time that KEf has been applied to PET images.
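
    A minimal sketch of the two computational steps described: equalizing total brightness across frames (to restore the optical-flow constant-brightness assumption) and summing kinetic energy over the velocity field. Normalizing KEf by the voxel count is an assumption made for illustration; the velocity fields themselves would come from a 3D optical flow routine not shown here.

        import numpy as np

        def equalize_brightness(frames):
            """Scale each frame so its total brightness matches the first frame,
            restoring the constant-brightness assumption of optical flow."""
            ref = frames[0].sum()
            return np.array([f * (ref / f.sum()) for f in frames])

        def kinetic_energy_index(velocity_fields):
            """Sum 0.5*|v|^2 over all voxels and frames; dividing by the voxel
            count is an illustrative normalization choice (assumption)."""
            ke = sum(0.5 * np.sum(v ** 2) for v in velocity_fields)
            n = sum(v[..., 0].size for v in velocity_fields)
            return ke / n

        rng = np.random.default_rng(0)
        frames = equalize_brightness(rng.random((8, 16, 16, 16)) + 0.5)  # toy PET frames
        v = rng.normal(0, 0.1, (7, 16, 16, 16, 3))        # toy 3D velocity fields
        print(kinetic_energy_index(v))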

  12. Implementation of hierarchical clustering using k-mer sparse matrix to analyze MERS-CoV genetic relationship

    NASA Astrophysics Data System (ADS)

    Bustamam, A.; Ulul, E. D.; Hura, H. F. A.; Siswantining, T.

    2017-07-01

    Hierarchical clustering is one of the effective methods for creating a phylogenetic tree based on the distance matrix between DNA (deoxyribonucleic acid) sequences. One of the well-known methods to calculate the distance matrix is the k-mer method. Generally, the k-mer method is more efficient than some other distance-matrix calculation techniques. The steps of the k-mer method start with creating a k-mer sparse matrix, followed by creating k-mer singular value vectors. The last step is computing the distances among the vectors. In this paper, we analyze the sequences of MERS-CoV (Middle East Respiratory Syndrome - Coronavirus) DNA by implementing hierarchical clustering using a k-mer sparse matrix in order to perform the phylogenetic analysis. Our results show that the ancestor of MERS-CoV comes from Egypt. Moreover, we found that a MERS-CoV infection that occurs in one country may not necessarily come from the same country of origin. This suggests that the process of MERS-CoV mutation might not only be influenced by geographical factors.
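
    A minimal sketch of the distance and clustering steps, assuming normalized k-mer count vectors and average-linkage clustering; the singular-value step described in the paper is omitted here, and all sequences are toy data.

        from itertools import product
        import numpy as np
        from scipy.cluster.hierarchy import linkage
        from scipy.spatial.distance import pdist

        def kmer_vector(seq, k=3):
            """Normalized count vector over all 4**k possible DNA k-mers."""
            index = {''.join(p): i for i, p in enumerate(product('ACGT', repeat=k))}
            v = np.zeros(len(index))
            for i in range(len(seq) - k + 1):
                kmer = seq[i:i + k]
                if kmer in index:          # skip k-mers with ambiguous bases
                    v[index[kmer]] += 1
            return v / max(v.sum(), 1)

        seqs = ['ATGCGTACGTTAGC', 'ATGCGTACGTTGGC', 'TTGCATACGATAGC']  # toy
        X = np.array([kmer_vector(s) for s in seqs])
        Z = linkage(pdist(X), method='average')   # feeds a dendrogram/tree
        print(Z)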

  13. Calculating electronic correlation effects from densities of transitions

    NASA Astrophysics Data System (ADS)

    Haydock, Roger

    Adding a localized electron to a system of interacting electrons induces a density of transitions described by the time-independent Heisenberg equation. Sequences of these transitions generate interacting states whose total energy is the sum of the energies of the constituent transitions. A calculation of magnetic moments for itinerant electrons with Ising interactions illustrates this method. Supported by the H. V. Snyder Gift to the University of Oregon.

  14. Noise and drift analysis of non-equally spaced timing data

    NASA Technical Reports Server (NTRS)

    Vernotte, F.; Zalamansky, G.; Lantz, E.

    1994-01-01

    Generally, it is possible to obtain equally spaced timing data from oscillators. The measurement of the drifts and noises affecting oscillators is then performed by using a variance (Allan variance, modified Allan variance, or time variance) or a system of several variances (multivariance method). However, in some cases, several samples, or even several sets of samples, are missing. In the case of millisecond pulsar timing data, for instance, observations are quite irregularly spaced in time. Nevertheless, since some observations are very close together (one minute) and since the timing data sequence is very long (more than ten years), information on both short-term and long-term stability is available. Unfortunately, a direct variance analysis is not possible without interpolating missing data. Different interpolation algorithms (linear interpolation, cubic spline) are used to calculate variances in order to verify that they neither lose information nor add erroneous information. A comparison of the results of the different algorithms is given. Finally, the multivariance method was adapted to the measurement sequence of the millisecond pulsar timing data: the responses of each variance of the system are calculated for each type of noise and drift, with the same missing samples as in the pulsar timing sequence. An estimation of precision, dynamics, and separability of this method is given.
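
    A sketch of the interpolation-then-variance idea for irregular timing data: resample the residuals onto a regular grid (linear interpolation here; the paper also tests a cubic spline) and compute an Allan variance on the result. The single-tau estimator below is the textbook form, not the paper's multivariance system, and the data are synthetic.

        import numpy as np

        def allan_variance(x, tau0):
            """Allan variance at tau = tau0 for equally spaced time residuals x."""
            y = np.diff(x) / tau0                 # fractional frequency estimates
            return 0.5 * np.mean(np.diff(y) ** 2)

        def allan_from_irregular(t, x, tau0):
            """Resample irregularly spaced residuals, then compute the variance."""
            t_reg = np.arange(t[0], t[-1], tau0)
            x_reg = np.interp(t_reg, t, x)        # linear interpolation
            return allan_variance(x_reg, tau0)

        rng = np.random.default_rng(0)
        t = np.sort(rng.uniform(0, 1000, 300))    # irregular observation epochs
        x = rng.normal(0, 1e-6, t.size)           # toy white-noise residuals
        print(allan_from_irregular(t, x, tau0=5.0))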

  15. Bacterial community comparisons by taxonomy-supervised analysis independent of sequence alignment and clustering

    PubMed Central

    Sul, Woo Jun; Cole, James R.; Jesus, Ederson da C.; Wang, Qiong; Farris, Ryan J.; Fish, Jordan A.; Tiedje, James M.

    2011-01-01

    High-throughput sequencing of 16S rRNA genes has increased our understanding of microbial community structure, but now even higher-throughput methods at the Illumina scale allow the creation of much larger datasets with more samples and orders-of-magnitude more sequences that swamp current analytic methods. We developed a method capable of handling these larger datasets on the basis of assignment of sequences into an existing taxonomy using a supervised learning approach (taxonomy-supervised analysis). We compared this method with a commonly used clustering approach based on sequence similarity (taxonomy-unsupervised analysis). We sampled 211 different bacterial communities from various habitats and obtained ∼1.3 million 16S rRNA sequences spanning the V4 hypervariable region by pyrosequencing. Both methodologies gave similar ecological conclusions in that β-diversity measures calculated by using these two types of matrices were significantly correlated to each other, as were the ordination configurations and hierarchical clustering dendrograms. In addition, our taxonomy-supervised analyses were also highly correlated with phylogenetic methods, such as UniFrac. The taxonomy-supervised analysis has the advantages that it is not limited by the exhaustive computation required for the alignment and clustering necessary for the taxonomy-unsupervised analysis, is more tolerant of sequencing errors, and allows comparisons when sequences are from different regions of the 16S rRNA gene. With the tremendous expansion in 16S rRNA data acquisition underway, the taxonomy-supervised approach offers the potential to provide more rapid and extensive community comparisons across habitats and samples. PMID:21873204

  16. Comparison of predicted binders in Rhipicephalus (Boophilus) microplus intestine protein variants Bm86 Campo Grande strain, Bm86 and Bm95.

    PubMed

    Andreotti, Renato; Pedroso, Marisela S; Caetano, Alexandre R; Martins, Natália F

    2008-01-01

    This paper reports the sequence analysis of the Bm86 Campo Grande strain, comparing it with the Bm86 and Bm95 antigens from the preparations TickGardPLUS and Gavac, respectively. The PCR product was cloned into pMOSBlue and sequenced. The secondary structure prediction tool PSIPRED was used to calculate the alpha helix and beta strand contents of the predicted polypeptide. The hydrophobicity profile was calculated using the algorithm of the Hopp and Woods method, in addition to identification of potential MHC class-I binding regions in the antigens. Pair-wise alignment revealed that the similarity between Bm86 Campo Grande strain and Bm86 is 0.2% higher than that between Bm86 Campo Grande strain and Bm95 antigens. The identities were 96.5% and 96.3%, respectively. Major suggestive differences in hydrophobicity were predicted among the sequences in two specific regions.
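
    The hydrophobicity step can be reproduced with a sliding-window average over the Hopp and Woods (1981) hydrophilicity scale; peaks suggest surface-exposed, potentially antigenic regions. The window length of 6 is the scale authors' usual default, and the sequence below is a toy example, not the Bm86 sequence.

        import numpy as np

        # Hopp & Woods (1981) hydrophilicity scale
        HOPP_WOODS = {
            'A': -0.5, 'R': 3.0, 'N': 0.2, 'D': 3.0, 'C': -1.0,
            'Q': 0.2, 'E': 3.0, 'G': 0.0, 'H': -0.5, 'I': -1.8,
            'L': -1.8, 'K': 3.0, 'M': -1.3, 'F': -2.5, 'P': 0.0,
            'S': 0.3, 'T': -0.4, 'W': -3.4, 'Y': -2.3, 'V': -1.5,
        }

        def hydrophilicity_profile(seq, window=6):
            """Sliding-window average of Hopp-Woods values along a sequence."""
            vals = np.array([HOPP_WOODS[a] for a in seq])
            return np.convolve(vals, np.ones(window) / window, mode='valid')

        print(hydrophilicity_profile('MKRDEESTTWVLAG'))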

  17. Hybrid preconditioning for iterative diagonalization of ill-conditioned generalized eigenvalue problems in electronic structure calculations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cai, Yunfeng, E-mail: yfcai@math.pku.edu.cn; Department of Computer Science, University of California, Davis 95616; Bai, Zhaojun, E-mail: bai@cs.ucdavis.edu

    2013-12-15

    The iterative diagonalization of a sequence of large ill-conditioned generalized eigenvalue problems is a computational bottleneck in quantum mechanical methods employing a nonorthogonal basis for ab initio electronic structure calculations. We propose a hybrid preconditioning scheme to effectively combine global and locally accelerated preconditioners for rapid iterative diagonalization of such eigenvalue problems. In partition-of-unity finite-element (PUFE) pseudopotential density-functional calculations, employing a nonorthogonal basis, we show that the hybrid preconditioned block steepest descent method is a cost-effective eigensolver, outperforming current state-of-the-art global preconditioning schemes, and comparably efficient for the ill-conditioned generalized eigenvalue problems produced by PUFE as the locally optimal block preconditioned conjugate-gradient method for the well-conditioned standard eigenvalue problems produced by planewave methods.

  18. Progesterone and testosterone studies by neutron scattering and nuclear magnetic resonance methods and quantum chemistry calculations

    NASA Astrophysics Data System (ADS)

    Szyczewski, A.; Hołderna-Natkaniec, K.; Natkaniec, I.

    2004-05-01

    Inelastic incoherent neutron scattering spectra of progesterone and testosterone measured at 20 and 290 K were compared with the IR spectra measured at 290 K. The phonon density of states spectra display well-resolved peaks of low-frequency internal vibration modes up to 1200 cm⁻¹. The quantum chemistry calculations were performed by the semiempirical PM3 method and by the density functional theory method with different basis sets for the isolated molecules, as well as for the dimer system of testosterone. The proposed assignment of internal vibrations of normal modes enabled us to draw conclusions about the sequence of the onset of the torsional motions of the CH3 groups. These conclusions were correlated with the results of proton molecular dynamics studies performed by the NMR method. The GAUSSIAN program was used for the calculations.

  19. Optimum stacking sequence design of laminated composite circular plates with curvilinear fibres by a layer-wise optimization method

    NASA Astrophysics Data System (ADS)

    Guenanou, A.; Houmat, A.

    2018-05-01

    The optimum stacking sequence design for the maximum fundamental frequency of symmetrically laminated composite circular plates with curvilinear fibres is investigated for the first time using a layer-wise optimization method. The design variables are two fibre orientation angles per layer. The fibre paths are constructed using the method of shifted paths. The first-order shear deformation plate theory and a curved square p-element are used to calculate the objective function. The blending function method is used to model accurately the geometry of the circular plate. The equations of motion are derived using Lagrange's method. The numerical results are validated by means of a convergence test and comparison with published values for symmetrically laminated composite circular plates with rectilinear fibres. The material parameters, boundary conditions, number of layers and thickness are shown to influence the optimum solutions to different extents. The results should serve as a benchmark for optimum stacking sequences of symmetrically laminated composite circular plates with curvilinear fibres.

  20. High-Density Signal Interface Electromagnetic Radiation Prediction for Electromagnetic Compatibility Evaluation.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Halligan, Matthew

    Radiated power calculation approaches for practical scenarios of incomplete high-density interface characterization information and incomplete incident power information are presented. The suggested approaches build upon a method that characterizes power losses through the definition of power loss constant matrices. Potential radiated power estimates include using total power loss information, partial radiated power loss information, worst case analysis, and statistical bounding analysis. A method is also proposed to calculate radiated power when incident power information is not fully known for non-periodic signals at the interface. Incident data signals are modeled from a two-state Markov chain where bit state probabilities are derived. The total spectrum for windowed signals is postulated as the superposition of spectra from individual pulses in a data sequence. Statistical bounding methods are proposed as a basis for the radiated power calculation due to the statistical calculation complexity to find a radiated power probability density function.

  1. Thermal calculations pertaining to experiments in the Yucca Mountain Exploratory Shaft

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Montan, D.N.

    1986-03-01

    A series of thermal calculations have been presented that appear to satisfy the needs for design of the Yucca Mountain Exploratory Shaft Tests. The accuracy of the modeling and calculational techniques employed probably exceeds the accuracy of the thermal properties used. The rather close agreement between simple analytical methods (the PLUS Family) and much more complex methods (TRUMP) suggests that the PLUS Family might be appropriate during final design to model, in a single calculation, the entire test array and sequence. Before doing further calculations it is recommended that all available thermal property information be critically evaluated to determine "best" values to be used for conductivity and saturation. Another possibility is to design one or more of the test sequences to approximately duplicate the early phase of Heater Test 1. In that experiment an unplanned power outage for about two days that occurred a week into the experiment gave extremely useful data from which to determine the conductivity and diffusivity. In any case we urge that adequate, properly calibrated instrumentation with data output available on a quasi-real time basis be installed. This would allow us to take advantage of significant power changes (planned or not) and also help "steer" the tests to desired temperatures. Finally, it should be kept in mind that the calculations presented here are strictly thermal. No hydrothermal effects due to liquid and vapor pressures have been considered.

  2. S-matrix calculations of energy levels of sodiumlike ions

    DOE PAGES

    Sapirstein, J.; Cheng, K. T.

    2015-06-24

    A recent S-matrix-based QED calculation of energy levels of the lithium isoelectronic sequence is extended to the general case of a valence electron outside an arbitrary filled core. Emphasis is placed on modifications of the lithiumlike formulas required because more than one core state is present, and an unusual feature of the two-photon exchange contribution involving autoionizing states is discussed. Here, the method is illustrated with a calculation of the energy levels of sodiumlike ions, with results for 3s 1/2, 3p 1/2, and 3p 3/2 energies tabulated for the range Z = 30–100. Comparison with experiment and other calculations is given, and prospects for extension of the method to ions with more complex electronic structure are discussed.

  3. Structure-preserving interpolation of temporal and spatial image sequences using an optical flow-based method.

    PubMed

    Ehrhardt, J; Säring, D; Handels, H

    2007-01-01

    Modern tomographic imaging devices enable the acquisition of spatial and temporal image sequences. But the spatial and temporal resolution of such devices is limited, and therefore image interpolation techniques are needed to represent images at a desired level of discretization. This paper presents a method for structure-preserving interpolation between neighboring slices in temporal or spatial image sequences. In a first step, the spatiotemporal velocity field between image slices is determined using an optical flow-based registration method in order to establish spatial correspondence between adjacent slices. An iterative algorithm is applied using the spatial and temporal image derivatives and a spatiotemporal smoothing step. Afterwards, the calculated velocity field is used to generate an interpolated image at the desired time by averaging intensities between corresponding points. Three quantitative measures are defined to evaluate the performance of the interpolation method. The behavior and capability of the algorithm is demonstrated by synthetic images. A population of 17 temporal and spatial image sequences is utilized to compare the optical flow-based interpolation method to linear and shape-based interpolation. The quantitative results show that the optical flow-based method outperforms linear and shape-based interpolation with statistical significance. The interpolation method presented is able to generate image sequences with the appropriate spatial or temporal resolution needed for image comparison, analysis or visualization tasks. Quantitative and qualitative measures extracted from synthetic phantoms and medical image data show that the new method has clear advantages over linear and shape-based interpolation.

  4. Discovering frequently recurring movement sequences in team-sport athlete spatiotemporal data.

    PubMed

    Sweeting, Alice J; Aughey, Robert J; Cormack, Stuart J; Morgan, Stuart

    2017-12-01

    Athlete external load is typically analysed from predetermined movement thresholds. The combination of movement sequences and differences in these movements between playing positions is also currently unknown. This study developed a method to discover the frequently recurring movement sequences across playing position during matches. The external load of 12 international female netball athletes was collected by a local positioning system during four national-level matches. Velocity, acceleration and angular velocity were calculated from positional (X, Y) data, clustered via one-dimensional k-means and assigned a unique alphabetic label. Combinations of velocity, acceleration and angular velocity movement were compared using the Levenshtein distance and similarities computed by the longest common substring problem. The contribution of each movement sequence, according to playing position and relative to the wider data set, was then calculated via the Minkowski distance. A total of 10 frequently recurring combinations of movement were discovered, regardless of playing position. Only the wing attack, goal attack and goal defence playing positions are closely related. We developed a technique to discover the movement sequences, according to playing position, performed by elite netballers. This methodology can be extended to discover the frequently recurring movements within other team sports and across levels of competition.
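
    A sketch of the symbolization and comparison steps under stated assumptions: a single k-means model is fitted on pooled values so that labels are comparable across sequences, and a plain Levenshtein distance compares the resulting strings. The paper's longest-common-substring and Minkowski computations are not shown, and all values are toy data.

        import numpy as np
        from sklearn.cluster import KMeans

        def make_symbolizer(pooled_values, k=3, seed=0):
            """Fit k-means on pooled 1-D values (e.g., velocity) and return a
            function mapping a sequence of values to an alphabetic string."""
            km = KMeans(n_clusters=k, n_init=10, random_state=seed)
            km.fit(np.asarray(pooled_values, float).reshape(-1, 1))
            return lambda vals: ''.join(
                chr(ord('a') + int(c))
                for c in km.predict(np.asarray(vals, float).reshape(-1, 1)))

        def levenshtein(s, t):
            """Edit distance between two movement-symbol strings."""
            d = list(range(len(t) + 1))
            for i, cs in enumerate(s, 1):
                prev, d[0] = d[0], i
                for j, ct in enumerate(t, 1):
                    prev, d[j] = d[j], min(d[j] + 1, d[j - 1] + 1,
                                           prev + (cs != ct))
            return d[-1]

        v1 = [0.1, 0.2, 2.5, 2.7, 0.1, 5.0]       # toy velocity traces
        v2 = [0.1, 2.6, 2.6, 0.2, 5.1, 5.2]
        sym = make_symbolizer(v1 + v2, k=3)
        print(sym(v1), sym(v2), levenshtein(sym(v1), sym(v2)))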

  5. Semiempirical Theories of the Affinities of Negative Atomic Ions

    NASA Technical Reports Server (NTRS)

    Edie, John W.

    1961-01-01

    The determination of the electron affinities of negative atomic ions by means of direct experimental investigation is limited. To supplement the meager experimental results, several semiempirical theories have been advanced. One commonly used technique involves extrapolating the electron affinities along the isoelectronic sequences. The most recent of these extrapolations is studied by extending the method to include one more member of the isoelectronic sequence. When the results show that this extension does not increase the accuracy of the calculations, several possible explanations for this situation are explored. A different approach to the problem is suggested by the regularities appearing in the electron affinities. Noting that the regular linear pattern that exists for the ionization potentials of the p electrons as a function of Z repeats itself for different degrees of ionization q, the slopes and intercepts of these curves are extrapolated to the case of the negative ion. The method is placed on a theoretical basis by calculating the Slater parameters as functions of q and n, the number of equivalent p-electrons. These functions are no more than quadratic in q and n. The electron affinities are calculated by extending the linear relations that exist for the neutral atoms and positive ions to the negative ions. The extrapolated slopes are apparently correct, but the intercepts must be slightly altered to agree with experiment. For this purpose one or two experimental affinities (depending on the extrapolation method) are used in each of the two short periods. The two extrapolation methods used are: (A) an isoelectronic sequence extrapolation of the linear pattern as such; (B) the same extrapolation of a linearization of this pattern (configuration centers) combined with an extrapolation of the other terms of the ground configurations. The latter method is preferable, since it requires only one experimental point for each period. The results agree within experimental error with all data, except with the most recent value of C, which lies 10% lower.

  6. MRI image plane nonuniformity in evaluation of ferrous sulphate dosimeter gel (FeGel) by means of T1-relaxation time.

    PubMed

    Magnusson, P; Bäck, S A; Olsson, L E

    1999-11-01

    MR image nonuniformity can vary significantly with the spin-echo pulse sequence repetition time. When MR images with different nonuniformity shapes are used in a T1 calculation, the resulting T1 image becomes nonuniform. As shown in this work, the uniformity TR-dependence of the spin-echo pulse sequence is a critical property for T1 measurements in general and for ferrous sulfate dosimeter gel (FeGel) applications in particular. The purpose was to study the characteristics of the MR image plane nonuniformity in FeGel evaluation. This included studies of the possibility of decreasing nonuniformities by selecting uniformity-optimized repetition times, studies of the transmitted and received RF fields, and studies of the effectiveness of the correction methods background subtraction and quotient correction. A pronounced MR image nonuniformity variation with repetition and T1 relaxation time was observed, and was found to originate from nonuniform RF transmission in combination with the inherent differences in T1 relaxation for different repetition times. Neither the T1 calculation itself, the uniformity-optimized repetition times, nor any of the correction methods studied could sufficiently correct the nonuniformities observed in the T1 images. The nonuniformities were found to vary considerably less with inversion time for the inversion-recovery pulse sequence than with repetition time for the spin-echo pulse sequence, resulting in considerably lower T1 image nonuniformity levels.

  7. Droplet digital PCR technology promises new applications and research areas.

    PubMed

    Manoj, P

    2016-01-01

    Digital Polymerase Chain Reaction (dPCR) is used to quantify nucleic acids; its applications are in the detection and precise quantification of low-level pathogens, rare genetic sequences, copy number variants, and rare mutations, and in relative gene expression. Here the PCR is performed in a large number of reaction chambers or partitions, and the reaction is carried out in each partition individually. This separation allows a more reliable collection and sensitive measurement of nucleic acid. Results are calculated by counting the partitions containing amplified target sequence (positive droplets) and the partitions in which there is no amplification (negative droplets). The mean number of target sequences is calculated by the Poisson algorithm; the Poisson correction compensates for the presence of more than one copy of the target gene in any droplet. The method provides information with accuracy and precision, is highly reproducible, and is less susceptible to inhibitors than qPCR. It has been demonstrated in studying variations in gene sequences, such as copy number variants and point mutations, in distinguishing differences between the expression of nearly identical alleles, and in the assessment of clinically relevant genetic variations, and it is routinely used for clonal amplification of samples for NGS methods. dPCR enables more reliable prediction of tumor status and patient prognosis through absolute quantitation using reference normalizations. Rare mitochondrial DNA deletions associated with a range of diseases and disorders, as well as aging, can be accurately detected with droplet digital PCR.
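
    The Poisson step is simple enough to show directly: if a fraction p of droplets is negative, the mean number of copies per droplet is λ = −ln(p), which corrects for droplets holding more than one copy. The droplet counts and the 0.85 nL droplet volume below are illustrative assumptions, not values from the text.

        import math

        def dpcr_concentration(n_positive, n_total, droplet_volume_nl=0.85):
            """Copies per microliter from droplet counts via Poisson correction."""
            p_negative = (n_total - n_positive) / n_total
            lam = -math.log(p_negative)                  # mean copies per droplet
            return lam, lam / (droplet_volume_nl * 1e-3)  # nL -> uL

        lam, conc = dpcr_concentration(n_positive=4200, n_total=20000)
        print(f"lambda = {lam:.4f} copies/droplet, {conc:.0f} copies/uL")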

  8. Human action classification using procrustes shape theory

    NASA Astrophysics Data System (ADS)

    Cho, Wanhyun; Kim, Sangkyoon; Park, Soonyoung; Lee, Myungeun

    2015-02-01

    In this paper, we propose a new method that classifies human actions using Procrustes shape theory. First, we extract a pre-shape configuration vector of landmarks from each frame of an image sequence representing an arbitrary human action, and then derive the Procrustes fit vector for the pre-shape configuration vector. Second, we extract a set of pre-shape vectors from training samples stored in a database, and compute a Procrustes mean shape vector for these pre-shape vectors. Third, we extract a sequence of pre-shape vectors from the input video, and project this sequence onto the tangent space with respect to the pole, taken as the sequence of mean shape vectors corresponding to a target video. We then calculate the Procrustes distance between the sequences of projected pre-shape vectors on the tangent space and the mean shape vectors. Finally, we classify the input video into the human action class with the minimum Procrustes distance. We assess the performance of the proposed method using one public dataset, namely the Weizmann human action dataset. Experimental results reveal that the proposed method performs very well on this dataset.
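
    A sketch of the shape machinery involved: centering and unit-scaling gives the pre-shape, and an orthogonal alignment via SVD removes rotation before measuring distance. This is the partial Procrustes distance on raw configurations; the paper's tangent-space projection and landmark extraction are not shown.

        import numpy as np

        def preshape(landmarks):
            """Remove location and size from a (k, 2) landmark configuration."""
            z = landmarks - landmarks.mean(axis=0)
            return z / np.linalg.norm(z)

        def procrustes_distance(x, y):
            """Partial Procrustes distance: align by the optimal rotation
            (orthogonal Procrustes via SVD), then take the residual norm."""
            a, b = preshape(x), preshape(y)
            u, _, vt = np.linalg.svd(a.T @ b)
            return np.linalg.norm(a @ (u @ vt) - b)

        square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], float)
        moved = square @ np.array([[0, -1], [1, 0]]) + 5.0   # rotate + shift
        print(procrustes_distance(square, moved))            # ~0: same shape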

  9. Exact method for numerically analyzing a model of local denaturation in superhelically stressed DNA

    NASA Astrophysics Data System (ADS)

    Fye, Richard M.; Benham, Craig J.

    1999-03-01

    Local denaturation, the separation at specific sites of the two strands comprising the DNA double helix, is one of the most fundamental processes in biology, required to allow the base sequence to be read both in DNA transcription and in replication. In living organisms this process can be mediated by enzymes which regulate the amount of superhelical stress imposed on the DNA. We present a numerically exact technique for analyzing a model of denaturation in superhelically stressed DNA. This approach is capable of predicting the locations and extents of transition in circular superhelical DNA molecules of kilobase lengths and specified base pair sequences. It can also be used for closed loops of DNA which are typically found in vivo to be kilobases long. The analytic method consists of an integration over the DNA twist degrees of freedom followed by the introduction of auxiliary variables to decouple the remaining degrees of freedom, which allows the use of the transfer matrix method. The algorithm implementing our technique requires O(N²) operations and O(N) memory to analyze a DNA domain containing N base pairs. However, to analyze kilobase length DNA molecules it must be implemented in high precision floating point arithmetic. An accelerated algorithm is constructed by imposing an upper bound M on the number of base pairs that can simultaneously denature in a state. This accelerated algorithm requires O(MN) operations, and has an analytically bounded error. Sample calculations show that it achieves high accuracy (greater than 15 decimal digits) with relatively small values of M (M < 0.05N) for kilobase length molecules under physiologically relevant conditions. Calculations are performed on the superhelical pBR322 DNA sequence to test the accuracy of the method. With no free parameters in the model, the locations and extents of local denaturation predicted by this analysis are in quantitatively precise agreement with in vitro experimental measurements. Calculations performed on the fructose-1,6-bisphosphatase gene sequence from yeast show that this approach can also accurately treat in vivo denaturation.

  10. A motif detection and classification method for peptide sequences using genetic programming.

    PubMed

    Tomita, Yasuyuki; Kato, Ryuji; Okochi, Mina; Honda, Hiroyuki

    2008-08-01

    An exploration of common rules (property motifs) in amino acid sequences is required for the design of novel sequences and the elucidation of the interactions between molecules controlled by the structural or physical environment. In the present study, we developed a new method to search for property motifs that are common in peptide sequence data. Our method has the following two characteristics: (i) the automatic determination of the position and length of common property motifs by calculating the physicochemical similarity of amino acids, and (ii) the quick and effective exploration of motif candidates that discriminate positives from negatives through the introduction of genetic programming (GP). Our method was evaluated with two types of model data sets. First, property motifs were searched in artificially derived peptide data containing intentionally buried property motifs. As a result, the expected property motifs were correctly extracted by our algorithm. Second, peptide data that interact with MHC class II molecules were analyzed as a model of biologically active peptides with buried motifs of various lengths. Using the rule found by our method, twice as many MHC class II binding peptides were identified as with the existing scoring matrix method. In conclusion, our GP-based motif searching approach enabled us to obtain knowledge of functional aspects of the peptides without any prior knowledge.

  11. High precision in protein contact prediction using fully convolutional neural networks and minimal sequence features.

    PubMed

    Jones, David T; Kandathil, Shaun M

    2018-04-26

    In addition to substitution frequency data from protein sequence alignments, many state-of-the-art methods for contact prediction rely on additional sources of information, or features, of protein sequences in order to predict residue-residue contacts, such as solvent accessibility, predicted secondary structure, and scores from other contact prediction methods. It is unclear how much of this information is needed to achieve state-of-the-art results. Here, we show that using deep neural network models, simple alignment statistics contain sufficient information to achieve state-of-the-art precision. Our prediction method, DeepCov, uses fully convolutional neural networks operating on amino-acid pair frequency or covariance data derived directly from sequence alignments, without using global statistical methods such as sparse inverse covariance or pseudolikelihood estimation. Comparisons against CCMpred and MetaPSICOV2 show that using pairwise covariance data calculated from raw alignments as input allows us to match or exceed the performance of both of these methods. Almost all of the achieved precision is obtained when considering relatively local windows (around 15 residues) around any member of a given residue pairing; larger window sizes have comparable performance. Assessment on a set of shallow sequence alignments (fewer than 160 effective sequences) indicates that the new method is substantially more precise than CCMpred and MetaPSICOV2 in this regime, suggesting that improved precision is attainable on smaller sequence families. Overall, the performance of DeepCov is competitive with the state of the art, and our results demonstrate that global models, which employ features from all parts of the input alignment when predicting individual contacts, are not strictly needed in order to attain precise contact predictions. DeepCov is freely available at https://github.com/psipred/DeepCov. d.t.jones@ucl.ac.uk.
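
    The input featurization is straightforward to sketch: one-hot encode the alignment, then form pair frequencies and covariances per position pair. The 21-letter alphabet (20 amino acids plus gap) and the toy alignment are assumptions; the convolutional network that consumes these features is not shown.

        import numpy as np

        ALPHABET = 'ACDEFGHIKLMNPQRSTVWY-'       # 20 amino acids + gap
        AA = {a: i for i, a in enumerate(ALPHABET)}

        def pair_covariance(alignment):
            """cov(i,a; j,b) = f_ij(a,b) - f_i(a) f_j(b) for every position
            pair (i, j); returns an (L, L, 21, 21) feature array."""
            msa = np.array([[AA.get(a, AA['-']) for a in s] for s in alignment])
            n = len(msa)
            onehot = np.eye(len(ALPHABET))[msa]   # (n, L, 21)
            f_i = onehot.mean(axis=0)             # single-site frequencies
            f_ij = np.einsum('nia,njb->ijab', onehot, onehot) / n
            return f_ij - np.einsum('ia,jb->ijab', f_i, f_i)

        msa = ['MKV-A', 'MRVLA', 'MKVLA', 'MRI-A']   # toy alignment
        print(pair_covariance(msa).shape)            # (5, 5, 21, 21)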

  12. COME: a robust coding potential calculation tool for lncRNA identification and characterization based on multiple features.

    PubMed

    Hu, Long; Xu, Zhiyu; Hu, Boqin; Lu, Zhi John

    2017-01-09

    Recent genomic studies suggest that novel long non-coding RNAs (lncRNAs) are specifically expressed and far outnumber annotated lncRNA sequences. To identify and characterize novel lncRNAs in RNA sequencing data from new samples, we have developed COME, a coding potential calculation tool based on multiple features. It integrates multiple sequence-derived and experiment-based features using a decompose-compose method, which makes it more accurate and robust than other well-known tools. We also showed that COME was able to substantially improve the consistency of prediction results from other coding potential calculators. Moreover, COME annotates and characterizes each predicted lncRNA transcript with multiple lines of supporting evidence, which are not provided by other tools. Remarkably, we found that one subgroup of lncRNAs classified by such supporting features (i.e. conserved local RNA secondary structure) was highly enriched in a well-validated database (lncRNAdb). We further found that the conserved structural domains on lncRNAs had a better chance than other RNA regions to interact with RNA binding proteins, based on recent eCLIP-seq data in human, indicating their potential regulatory roles. Overall, we present COME as an accurate, robust and multiple-feature supported method for the identification and characterization of novel lncRNAs. The software implementation is available at https://github.com/lulab/COME. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  13. Determination of fetal DNA fraction from the plasma of pregnant women using sequence read counts.

    PubMed

    Kim, Sung K; Hannum, Gregory; Geis, Jennifer; Tynan, John; Hogg, Grant; Zhao, Chen; Jensen, Taylor J; Mazloom, Amin R; Oeth, Paul; Ehrich, Mathias; van den Boom, Dirk; Deciu, Cosmin

    2015-08-01

    This study introduces a novel method, referred to as SeqFF, for estimating the fetal DNA fraction in the plasma of pregnant women and to infer the underlying mechanism that allows for such statistical modeling. Autosomal regional read counts from whole-genome massively parallel single-end sequencing of circulating cell-free DNA (ccfDNA) from the plasma of 25 312 pregnant women were used to train a multivariate model. The pretrained model was then applied to 505 pregnant samples to assess the performance of SeqFF against known methodologies for fetal DNA fraction calculations. Pearson's correlation between chromosome Y and SeqFF for pregnancies with male fetuses from two independent cohorts ranged from 0.932 to 0.938. Comparison between a single-nucleotide polymorphism-based approach and SeqFF yielded a Pearson's correlation of 0.921. Paired-end sequencing suggests that shorter ccfDNA, that is, less than 150 bp in length, is nonuniformly distributed across the genome. Regions exhibiting an increased proportion of short ccfDNA, which are more likely of fetal origin, tend to provide more information in the SeqFF calculations. SeqFF is a robust and direct method to determine fetal DNA fraction. Furthermore, the method is applicable to both male and female pregnancies and can greatly improve the accuracy of noninvasive prenatal testing for fetal copy number variation. © 2015 John Wiley & Sons, Ltd.
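
    The shape of the computation can be sketched with a generic regularized regression: train on normalized autosomal bin counts against chrY-derived fetal fractions from male pregnancies, then apply the pretrained model to any new sample. Plain ridge regression stands in here for the paper's multivariate model, and every value below is synthetic placeholder data.

        import numpy as np
        from sklearn.linear_model import Ridge

        rng = np.random.default_rng(1)
        X = rng.poisson(100, size=(500, 1000)).astype(float)   # bin counts
        X /= X.sum(axis=1, keepdims=True)       # within-sample normalization
        y = rng.uniform(0.02, 0.25, size=500)   # chrY-based truth (placeholder)

        model = Ridge(alpha=1.0).fit(X, y)      # stand-in multivariate model

        x_new = rng.poisson(100, size=(1, 1000)).astype(float)
        x_new /= x_new.sum()
        print(model.predict(x_new))             # estimated fetal fraction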

  14. Photoionization of the beryllium isoelectronic sequence: Relativistic and nonrelativistic R-matrix calculations

    NASA Astrophysics Data System (ADS)

    Chu, Wei-Chun

    The photoionization of the beryllium-like isoelectronic series has been studied. The bound state wave functions of the target ions were built with the CIV3 program. The relativistic Breit-Pauli R-matrix method was used to calculate the cross sections in the photon energy range between the ionization threshold and the 1s2 4f7/2 threshold for each ion. For the total cross sections of Be, B+, C2+, N3+, and O4+, our results match experiment well. The comparison between the present work and other theoretical works is also discussed. We show the comparison with our LS results, as it indicates the importance of relativistic effects in different ions. In the analysis, the resonances converging to 1s2 2lj and 1s2 3lj were identified and characterized with quantum defects, energies and widths using the eigenphase sum methodology. We summarize the general appearance of resonances along the resonance series and along the isoelectronic sequence. Partial cross sections are also reported systematically along the sequence. All calculations were performed on the NERSC system. INDEX WORDS: Photoionization, R-matrix, Cross section, Beryllium-like ion, Resonance

  15. Spectrum Orbit Utilization Program documentation: SOUP5 version 3.8 user's manual, volume 1, chapters 1 through 5

    NASA Technical Reports Server (NTRS)

    Davidson, J.; Ottey, H. R.; Sawitz, P.; Zusman, F. S.

    1985-01-01

    The underlying engineering and mathematical models as well as the computational methods used by the Spectrum Orbit Utilization Program 5 (SOUP5) analysis programs are described. Included are the algorithms used to calculate the technical parameters, and references to the technical literature. The organization, capabilities, processing sequences, and processing and data options of the SOUP5 system are described. The details of the geometric calculations are given. Also discussed are the various antenna gain algorithms; rain attenuation and depolarization calculations; calculations of transmitter power and received power flux density; channelization options, interference categories, and protection ratio calculation; generation of aggregate interference and margins; equivalent gain calculations; and how to enter a protection ratio template.

  16. Methodological reporting of randomized trials in five leading Chinese nursing journals.

    PubMed

    Shi, Chunhu; Tian, Jinhui; Ren, Dan; Wei, Hongli; Zhang, Lihuan; Wang, Quan; Yang, Kehu

    2014-01-01

    Randomized controlled trials (RCTs) are not always well reported, especially in terms of their methodological descriptions. This study aimed to investigate the adherence of methodological reporting complying with CONSORT and explore associated trial level variables in the Chinese nursing care field. In June 2012, we identified RCTs published in five leading Chinese nursing journals and included trials with details of randomized methods. The quality of methodological reporting was measured through the methods section of the CONSORT checklist and the overall CONSORT methodological items score was calculated and expressed as a percentage. Meanwhile, we hypothesized that some general and methodological characteristics were associated with reporting quality and conducted a regression with these data to explore the correlation. The descriptive and regression statistics were calculated via SPSS 13.0. In total, 680 RCTs were included. The overall CONSORT methodological items score was 6.34 ± 0.97 (Mean ± SD). No RCT reported descriptions and changes in "trial design," changes in "outcomes" and "implementation," or descriptions of the similarity of interventions for "blinding." Poor reporting was found in detailing the "settings of participants" (13.1%), "type of randomization sequence generation" (1.8%), calculation methods of "sample size" (0.4%), explanation of any interim analyses and stopping guidelines for "sample size" (0.3%), "allocation concealment mechanism" (0.3%), additional analyses in "statistical methods" (2.1%), and targeted subjects and methods of "blinding" (5.9%). More than 50% of trials described randomization sequence generation, the eligibility criteria of "participants," "interventions," and definitions of the "outcomes" and "statistical methods." The regression analysis found that publication year and ITT analysis were weakly associated with CONSORT score. The completeness of methodological reporting of RCTs in the Chinese nursing care field is poor, especially with regard to the reporting of trial design, changes in outcomes, sample size calculation, allocation concealment, blinding, and statistical methods.

  17. Development of a Rapid Identification Method for a Variety of Antibody Candidates Using High-throughput Sequencing.

    PubMed

    Ito, Yuji

    2017-01-01

    As an alternative to hybridoma technology, the antibody phage library system can also be used for antibody selection. This method enables the isolation of antigen-specific binders through an in vitro selection process known as biopanning. While it has several advantages, such as an avoidance of animal immunization, the phage cloning and screening steps of biopanning are time-consuming and problematic. Here, we introduce a novel biopanning method combined with high-throughput sequencing (HTS) using a next-generation sequencer (NGS) to save time and effort in antibody selection, and to increase the diversity of acquired antibody sequences. Biopannings against a target antigen were performed using a human single chain Fv (scFv) antibody phage library. VH genes in pooled phages at each round of biopanning were analyzed by HTS on a NGS. The obtained data were trimmed, merged, and translated into amino acid sequences. The frequencies (%) of the respective VH sequences at each biopanning step were calculated, and the amplification factor (change of frequency through biopanning) was obtained to estimate the potential for antigen binding. A phylogenetic tree was drawn using the top 50 VH sequences with high amplification factors. Representative VH sequences forming the cluster were then picked up and used to reconstruct scFv genes harboring these VHs. Their derived scFv-Fc fusion proteins showed clear antigen binding activity. These results indicate that a combination of biopanning and HTS enables the rapid and comprehensive identification of specific binders from antibody phage libraries.
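
    The frequency and amplification-factor bookkeeping reduces to counting: the share of each VH sequence within a round, and the change of that share across rounds. The pseudo-frequency for sequences absent from the first round is an assumption added so the ratio stays defined, and the identifiers are toy values.

        from collections import Counter

        def frequencies(round_seqs):
            """Frequency (%) of each VH sequence within one panning round."""
            counts = Counter(round_seqs)
            total = sum(counts.values())
            return {s: 100.0 * c / total for s, c in counts.items()}

        def amplification_factors(first_round, last_round):
            """Change of frequency through biopanning; strongly amplified
            sequences are candidate antigen binders."""
            f0, f1 = frequencies(first_round), frequencies(last_round)
            floor = min(f0.values())            # pseudo-frequency (assumption)
            return {s: f1[s] / f0.get(s, floor) for s in f1}

        r1 = ['QVQ1', 'QVQ2', 'QVQ3', 'QVQ2']   # toy VH identifiers, round 1
        r4 = ['QVQ2'] * 6 + ['QVQ1']            # round 4 after enrichment
        print(amplification_factors(r1, r4))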

  18. JDet: interactive calculation and visualization of function-related conservation patterns in multiple sequence alignments and structures.

    PubMed

    Muth, Thilo; García-Martín, Juan A; Rausell, Antonio; Juan, David; Valencia, Alfonso; Pazos, Florencio

    2012-02-15

    We have implemented in a single package all the features required for extracting, visualizing and manipulating fully conserved positions, as well as those with a family-dependent conservation pattern, in multiple sequence alignments. The program allows the user, among other things, to run different methods for extracting these positions, combine the results and visualize them in protein 3D structures and sequence spaces. JDet is a multiplatform application written in Java. It is freely available, including the source code, at http://csbg.cnb.csic.es/JDet. The package includes two of our recently developed programs for detecting functional positions in protein alignments (Xdet and S3Det), and support for other methods can be added as plug-ins. A help file and a guided tutorial for JDet are also available.

  19. FW-CADIS Method for Global and Semi-Global Variance Reduction of Monte Carlo Radiation Transport Calculations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wagner, John C; Peplow, Douglas E.; Mosher, Scott W

    2014-01-01

    This paper presents a new hybrid (Monte Carlo/deterministic) method for increasing the efficiency of Monte Carlo calculations of distributions, such as flux or dose rate distributions (e.g., mesh tallies), as well as responses at multiple localized detectors and spectra. This method, referred to as Forward-Weighted CADIS (FW-CADIS), is an extension of the Consistent Adjoint Driven Importance Sampling (CADIS) method, which has been used for more than a decade to very effectively improve the efficiency of Monte Carlo calculations of localized quantities, e.g., flux, dose, or reaction rate at a specific location. The basis of this method is the development of an importance function that represents the importance of particles to the objective of uniform Monte Carlo particle density in the desired tally regions. Implementation of this method utilizes the results from a forward deterministic calculation to develop a forward-weighted source for a deterministic adjoint calculation. The resulting adjoint function is then used to generate consistent space- and energy-dependent source biasing parameters and weight windows that are used in a forward Monte Carlo calculation to obtain more uniform statistical uncertainties in the desired tally regions. The FW-CADIS method has been implemented and demonstrated within the MAVRIC sequence of SCALE and the ADVANTG/MCNP framework. Application of the method to representative, real-world problems, including calculation of dose rate and energy dependent flux throughout the problem space, dose rates in specific areas, and energy spectra at multiple detectors, is presented and discussed. Results of the FW-CADIS method and other recently developed global variance reduction approaches are also compared, and the FW-CADIS method outperformed the other methods in all cases considered.

  20. Neural network and SVM classifiers accurately predict lipid binding proteins, irrespective of sequence homology.

    PubMed

    Bakhtiarizadeh, Mohammad Reza; Moradi-Shahrbabak, Mohammad; Ebrahimi, Mansour; Ebrahimie, Esmaeil

    2014-09-07

    Due to the central roles of lipid binding proteins (LBPs) in many biological processes, sequence-based identification of LBPs is of great interest. The major challenge is that LBPs are diverse in sequence, structure, and function, which results in the low accuracy of sequence homology based methods. Therefore, there is a need for developing alternative functional prediction methods irrespective of sequence similarity. To identify LBPs from non-LBPs, the performances of a support vector machine (SVM) and a neural network were compared in this study. Comprehensive protein features and various techniques were employed to create the datasets. Five-fold cross-validation (CV) and independent evaluation (IE) tests were used to assess the validity of the two methods. The results indicated that SVM outperforms the neural network. SVM achieved 89.28% (CV) and 89.55% (IE) overall accuracy in the identification of LBPs from non-LBPs and 92.06% (CV) and 92.90% (IE) accuracy on average for the classification of different LBP classes. Increasing the number and the range of extracted protein features as well as optimization of the SVM parameters significantly increased the efficiency of LBP class prediction in comparison to the only previous report in this field. Altogether, the results showed that the SVM algorithm can be run on broad, computationally calculated protein features and offers a promising tool for the detection of LBP classes. The proposed approach has the potential to integrate and improve the common sequence alignment based methods. Copyright © 2014 Elsevier Ltd. All rights reserved.

  1. Bayesian selection of Markov models for symbol sequences: application to microsaccadic eye movements.

    PubMed

    Bettenbühl, Mario; Rusconi, Marco; Engbert, Ralf; Holschneider, Matthias

    2012-01-01

    Complex biological dynamics often generate sequences of discrete events which can be described as a Markov process. The order of the underlying Markovian stochastic process is fundamental for characterizing statistical dependencies within sequences. As an example for this class of biological systems, we investigate the Markov order of sequences of microsaccadic eye movements from human observers. We calculate the integrated likelihood of a given sequence for various orders of the Markov process and use this in a Bayesian framework for statistical inference on the Markov order. Our analysis shows that data from most participants are best explained by a first-order Markov process. This is compatible with recent findings of a statistical coupling of subsequent microsaccade orientations. Our method might prove to be useful for a broad class of biological systems.
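
    With a Dirichlet prior on each context's transition distribution, the integrated likelihood used for this kind of inference has a closed form, so comparing it across candidate orders takes a few lines of code. The symmetric Dirichlet(1) prior and the toy orientation sequence are assumptions; the paper's exact prior is not restated here.

        import math
        from collections import defaultdict

        def log_marginal_likelihood(seq, order, alphabet, alpha=1.0):
            """log p(seq | Markov order), integrating transition probabilities
            against symmetric Dirichlet(alpha) priors (closed form). The first
            `order` symbols are conditioned on, not modeled."""
            m = len(alphabet)
            counts = defaultdict(lambda: defaultdict(int))
            for i in range(order, len(seq)):
                counts[seq[i - order:i]][seq[i]] += 1
            ll = 0.0
            for nxt in counts.values():
                n = sum(nxt.values())
                ll += math.lgamma(m * alpha) - math.lgamma(m * alpha + n)
                for s in alphabet:
                    ll += math.lgamma(alpha + nxt.get(s, 0)) - math.lgamma(alpha)
            return ll

        seq = 'LRLRLLRLRLRRLRLRLLRLRL'   # toy microsaccade orientation sequence
        for k in range(3):
            print(k, log_marginal_likelihood(seq, k, 'LR'))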

  2. Two-dimensional PCA-based human gait identification

    NASA Astrophysics Data System (ADS)

    Chen, Jinyan; Wu, Rongteng

    2012-11-01

    Automatically recognizing people through visual surveillance is necessary for public security. Human gait-based identification focuses on recognizing a person automatically from his or her walking video using computer vision and image processing approaches. As a potential biometric measure, human gait identification has attracted more and more researchers. Current human gait identification methods can be divided into two categories: model-based methods and motion-based methods. In this paper, a human gait identification method based on two-dimensional principal component analysis (2DPCA) and temporal-space analysis is proposed. Using background estimation and image subtraction, we obtain a binary image sequence from the surveillance video. By comparing the difference between two adjacent images in the gait image sequence, we obtain a sequence of binary difference images. Each binary difference image indicates the body's moving mode while a person walks. We extract the temporal-space features from the difference image sequence in the following steps. Projecting one difference image onto the Y axis or X axis yields two vectors. Projecting every difference image in the sequence onto the Y axis or X axis yields two matrices, which characterize the style of one walk. Then 2DPCA is used to transform these two matrices into two vectors while keeping maximum separability. Finally, the similarity of two human gait image sequences is calculated as the Euclidean distance between the two vectors. The performance of our method is illustrated using the CASIA Gait Database.
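
    A sketch of the 2DPCA step under stated assumptions: the image scatter matrix is built directly from the 2-D feature matrices (no flattening), its top eigenvectors form the projection, and gait similarity is the Euclidean distance between projected features. The toy matrices stand in for the axis-projection matrices described above.

        import numpy as np

        def two_dpca(matrices, n_components=5):
            """2DPCA: eigenvectors of the image scatter matrix give the
            projection axes; each sample is projected without flattening."""
            A = np.asarray(matrices, dtype=float)     # (n_samples, h, w)
            mean = A.mean(axis=0)
            G = sum((a - mean).T @ (a - mean) for a in A) / len(A)
            _, eigvecs = np.linalg.eigh(G)            # ascending eigenvalues
            W = eigvecs[:, ::-1][:, :n_components]    # top eigenvectors
            return np.array([a @ W for a in A])

        rng = np.random.default_rng(0)
        mats = rng.random((8, 32, 20))                # toy projection matrices
        feats = two_dpca(mats)
        print(np.linalg.norm(feats[0] - feats[1]))    # gait dissimilarity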

  3. Calculation of levels, transition rates, and lifetimes for the arsenic isoelectronic sequence Sn XVIII-Ba XXIV, W XLII

    NASA Astrophysics Data System (ADS)

    Wang, K.; Chen, Z. B.; Chen, C. Y.; Yan, J.; Dang, W.; Zhao, X. H.; Yang, X.

    2017-09-01

    Multi-configuration Dirac-Fock (MCDF) calculations of energy levels, wavelengths, oscillator strengths, lifetimes, and electric dipole (E1), magnetic dipole (M1), electric quadrupole (E2), and magnetic quadrupole (M2) transition rates are reported for the arsenic isoelectronic sequence Sn XVIII-Ba XXIV, W XLII. Results are presented for the 86 levels of the 4s2 4p3, 4s 4p4, 4p5, 4s2 4p2 4d, and 4s 4p3 4d configurations in each ion. The relativistic atomic structure package GRASP2K is adopted for the calculations, in which the contributions from the correlations within the n ≤ 7 complexes, the Breit interaction (BI) and quantum electrodynamics (QED) effects are taken into account. The many-body perturbation theory (MBPT) method is also employed as an independent calculation for comparison purposes, taking W XLII as an example. Calculated results are compared with data from other calculations and the observed values from the Atomic Spectra Database (ASD) of the National Institute of Standards and Technology (NIST). Good agreement is obtained; e.g., the accuracy of our energy levels is assessed to be better than 0.6%. These accurate theoretical data should be useful for diagnostics of hot plasmas in fusion devices.

  4. Multi-Observation Continuous Density Hidden Markov Models for Anomaly Detection in Full Motion Video

    DTIC Science & Technology

    2012-06-01

    [Only list-of-figures fragments survive in this record: a method for measuring angular movement versus the average direction of movement; a method for calculating the Angular Deviation, Θ; and an HMM produced by k-means learning. The features defined include Angular Deviation (a random variable, the difference in heading, in degrees, from the overall direction of movement over the sequence) and Speed.]

  5. Use of Life Course Work–Family Profiles to Predict Mortality Risk Among US Women

    PubMed Central

    Guevara, Ivan Mejía; Glymour, M. Maria; Berkman, Lisa F.

    2015-01-01

    Objectives. We examined relationships between US women’s exposure to midlife work–family demands and subsequent mortality risk. Methods. We used data from women born 1935 to 1956 in the Health and Retirement Study to calculate employment, marital, and parenthood statuses for each age between 16 and 50 years. We used sequence analysis to identify 7 prototypical work–family trajectories. We calculated age-standardized mortality rates and hazard ratios (HRs) for mortality associated with work–family sequences, with adjustment for covariates and potentially explanatory later-life factors. Results. Married women staying home with children briefly before reentering the workforce had the lowest mortality rates. In comparison, after adjustment for age, race/ethnicity, and education, HRs for mortality were 2.14 (95% confidence interval [CI] = 1.58, 2.90) among single nonworking mothers, 1.48 (95% CI = 1.06, 1.98) among single working mothers, and 1.36 (95% CI = 1.02, 1.80) among married nonworking mothers. Adjustment for later-life behavioral and economic factors partially attenuated risks. Conclusions. Sequence analysis is a promising exposure assessment tool for life course research. This method permitted identification of certain lifetime work–family profiles associated with mortality risk before age 75 years. PMID:25713976

  6. Prediction of glutathionylation sites in proteins using minimal sequence information and their experimental validation.

    PubMed

    Pal, Debojyoti; Sharma, Deepak; Kumar, Mukesh; Sandur, Santosh K

    2016-09-01

    S-glutathionylation of proteins plays an important role in various biological processes and is known to be a protective modification during oxidative stress. Since experimental detection of S-glutathionylation is labor intensive and time consuming, a bioinformatics-based approach is a viable alternative. Here, we present a model to predict glutathionylation sites from pentapeptide sequences. It is based upon the differential association of amino acids with glutathionylated and non-glutathionylated cysteines in a database of experimentally verified sequences. These data were used to calculate position-dependent F-scores, which measure how a particular amino acid at a particular position may affect the likelihood of a glutathionylation event. A glutathionylation score (G-score), indicating the propensity of a sequence to undergo glutathionylation, was calculated using the position-dependent F-scores for each amino acid. Cut-off values were used for prediction. Our model returned an accuracy of 58% with a Matthews correlation coefficient (MCC) value of 0.165. On an independent dataset, our model outperformed the currently available model, in spite of needing much less sequence information. Pentapeptide motifs with high abundance among glutathionylated proteins were identified. A list of potential glutathionylation hotspot sequences was obtained by assigning G-scores, and subsequent Protein-BLAST analysis revealed a total of 254 putative glutathionylatable proteins, a number of which were already known to be glutathionylated. Our model predicted glutathionylation sites in 93.93% of experimentally verified glutathionylated proteins. The outcome of this study may assist in discovering novel glutathionylation sites and finding candidate proteins for glutathionylation.
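
    A sketch of the scoring idea as described: position-dependent scores measure how strongly each amino acid at each position flanking the cysteine separates glutathionylated from non-glutathionylated examples, and a candidate pentapeptide's G-score combines them. The frequency-difference statistic and the summation below are assumptions for illustration, not the paper's exact F-score.

        from collections import defaultdict

        def position_f_scores(pos_seqs, neg_seqs):
            """Per-(position, amino acid) contrast between glutathionylated and
            non-glutathionylated pentapeptides. A frequency difference stands in
            for the paper's F-score statistic (assumption)."""
            def freq(seqs):
                f = defaultdict(float)
                for s in seqs:
                    for pos, aa in enumerate(s):
                        f[(pos, aa)] += 1.0 / len(seqs)
                return f
            fp, fn = freq(pos_seqs), freq(neg_seqs)
            return {k: fp.get(k, 0.0) - fn.get(k, 0.0) for k in set(fp) | set(fn)}

        def g_score(pentapeptide, f_scores):
            """Propensity of an X-X-C-X-X pentapeptide to be glutathionylated:
            here, the sum of its position-dependent scores (illustrative)."""
            return sum(f_scores.get((i, aa), 0.0)
                       for i, aa in enumerate(pentapeptide))

        pos = ['AKCDE', 'GKCDE', 'AKCDQ']   # toy glutathionylated examples
        neg = ['PLCFW', 'PLCYW', 'MLCFV']   # toy negatives
        fs = position_f_scores(pos, neg)
        print(g_score('AKCDE', fs), g_score('PLCFW', fs))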

  7. Overview of Next-generation Sequencing Platforms Used in Published Draft Plant Genomes in Light of Genotypization of Immortelle Plant (Helichrysium Arenarium)

    PubMed Central

    Hodzic, Jasin; Gurbeta, Lejla; Omanovic-Miklicanin, Enisa; Badnjevic, Almir

    2017-01-01

    Introduction: Major advancements in DNA sequencing methods introduced in the first decade of the new millennium initiated a rapid expansion of sequencing studies, which yielded a tremendous amount of DNA sequence data, including whole sequenced genomes of various species, including plants. A set of novel sequencing platforms, often collectively named “next-generation sequencing” (NGS), completely transformed the life sciences by allowing extensive throughput while greatly reducing the necessary time, labor and cost of any sequencing endeavor. Purpose: The purpose of this paper is to present an overview of the NGS platforms used to produce the current compendium of published draft plant genomes, namely Roche/454, ABI/SOLiD, and Solexa/Illumina, and to determine the most frequently used platform for whole genome sequencing of plants in light of the genotypization of the immortelle plant. Materials and methods: 45 papers were selected (presenting 47 plant genome draft sequences), and the sequencing techniques and NGS platforms (Roche/454, ABI/SOLiD and Illumina/Solexa) used in the selected papers were determined. Subsequently, the frequency of usage of each platform or combination of platforms was calculated. Results: Illumina/Solexa platforms were used either as the sole sequencing tool (40.42% of published genomes) or in combination with other platforms (an additional 48.94% of published genomes), followed by Roche/454 platforms, which were used in combination with the traditional Sanger sequencing method (10.64%) and never as a sole tool. ABI/SOLiD was only used in combination with Illumina/Solexa and Roche/454, in 4.25% of publications. Conclusions: Illumina/Solexa platforms are by far the most preferred by researchers, most probably due to the most affordable sequencing costs. Taking into consideration the current economic situation in the Balkans region, Illumina/Solexa is the best (if not the only) platform choice if the sequencing of the immortelle plant (Helichrysium arenarium) is to be performed by researchers in this region. PMID:28974852

  8. Increased fMRI Sensitivity at Equal Data Burden Using Averaged Shifted Echo Acquisition

    PubMed Central

    Witt, Suzanne T.; Warntjes, Marcel; Engström, Maria

    2016-01-01

    There is growing evidence as to the benefits of collecting BOLD fMRI data with increased sampling rates. However, many of the newly developed acquisition techniques for collecting BOLD data with ultra-short TRs require hardware, software, and non-standard analytic pipelines that may not be accessible to all researchers. We propose to incorporate the method of shifted echo into a standard multi-slice, gradient echo EPI sequence to achieve a higher sampling rate, with a TR of <1 s, at acceptable spatial resolution. We further propose to incorporate temporal averaging of consecutively acquired EPI volumes to both ameliorate the reduced temporal signal-to-noise inherent in ultra-fast EPI sequences and reduce the data burden. BOLD data were collected from 11 healthy subjects performing a simple, event-related visual-motor task with four different EPI sequences: (1) reference EPI sequence with TR = 1440 ms, (2) shifted echo EPI sequence with TR = 700 ms, (3) shifted echo EPI sequence with every two consecutively acquired EPI volumes averaged and effective TR = 1400 ms, and (4) shifted echo EPI sequence with every four consecutively acquired EPI volumes averaged and effective TR = 2800 ms. Both temporally averaged sequences exhibited increased temporal signal-to-noise over the shifted echo EPI sequence. The shifted echo sequence with every two EPI volumes averaged also had significantly increased BOLD signal change compared with the other three sequences, while the shifted echo sequence with every four EPI volumes averaged had significantly decreased BOLD signal change compared with the other three sequences. The results indicated that incorporating the method of shifted echo into a standard multi-slice EPI sequence is a viable method for achieving an increased sampling rate for collecting event-related BOLD data. Further, averaging every two consecutively acquired EPI volumes significantly increased the measured BOLD signal change and the subsequently calculated activation map statistics. PMID:27932947
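
    The temporal-averaging step is easy to illustrate. The NumPy sketch below, run on simulated white-noise data rather than the study's BOLD series, averages every n consecutively acquired volumes and compares voxelwise temporal SNR; for pure white noise the gain approaches the ideal sqrt(n).

        import numpy as np

        # Simulated 4D series (time, x, y, z): white noise around a mean of 100.
        rng = np.random.default_rng(0)
        vols = rng.normal(100.0, 5.0, size=(200, 16, 16, 10))

        def temporal_average(data, n):
            """Average every n consecutively acquired volumes (drop remainder)."""
            t = (data.shape[0] // n) * n
            return data[:t].reshape(t // n, n, *data.shape[1:]).mean(axis=1)

        def tsnr(data):
            """Voxelwise temporal SNR: temporal mean over temporal std."""
            return data.mean(axis=0) / data.std(axis=0)

        print(tsnr(vols).mean())                       # original sampling
        print(tsnr(temporal_average(vols, 2)).mean())  # ~sqrt(2) higher here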

  9. Stark widths regularities within spectral series of sodium isoelectronic sequence

    NASA Astrophysics Data System (ADS)

    Trklja, Nora; Tapalaga, Irinel; Dojčinović, Ivan P.; Purić, Jagoš

    2018-02-01

    Stark widths within spectral series of the sodium isoelectronic sequence have been studied. This is a unique approach that includes both neutrals and ions. Two levels of the problem are considered: if the required atomic parameters are known, Stark widths can be calculated by one of the known methods (in the present paper the modified semiempirical formula has been used), but if parameters are lacking, regularities enable the determination of Stark broadening data. In the framework of the regularity research, the dependence of Stark broadening on environmental conditions and certain atomic parameters has been investigated. The aim of this work is to give a simple model, with a minimum of required parameters, which can be used to calculate Stark broadening data for any chosen transition within sodium-like emitters. The obtained relations were used to predict Stark widths for transitions that have not yet been measured or calculated. This system enables fast data processing using the proposed theoretical model, and it provides quality control and verification of the obtained results.

  10. Monte-Carlo Method Application for Precising Meteor Velocity from TV Observations

    NASA Astrophysics Data System (ADS)

    Kozak, P.

    2014-12-01

    The Monte-Carlo method (the method of statistical trials) as applied to the processing of meteor observations was developed in the author's Ph.D. thesis in 2005 and first used in his works in 2008. The idea of the method is that if we generate random values of the input data - the equatorial coordinates of the meteor head in a sequence of TV frames - in accordance with their statistical distributions, we can plot the probability density distributions for all of its kinematical parameters and obtain their mean values and dispersions. This opens the theoretical possibility of refining the most important parameter - the geocentric velocity of a meteor - which has the greatest influence on the precision of the calculated heliocentric orbit elements. In the classical approach the velocity vector is calculated in two stages: first, its direction is obtained as the cross product of the pole vectors of the meteor trajectory great circles determined from the two observing stations; then the absolute value of the velocity is calculated independently from each station, and one of the two values is selected, for various reasons, as the final parameter. In the present method we propose to obtain the statistical distribution of the velocity magnitude as the intersection of the two distributions obtained from the different stations. We expect this approach to substantially increase the precision of the meteor velocity calculation and to remove subjective inaccuracies.
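
    A toy version of the proposed "intersection of distributions" step might look like the following NumPy sketch, in which simulated speed distributions from two stations are multiplied on a common grid and renormalized; all numbers are invented for illustration.

        import numpy as np

        # Each station contributes a trial distribution of the meteor speed;
        # their product ("intersection") gives the combined estimate.
        rng = np.random.default_rng(1)
        v1 = rng.normal(35.2, 0.8, 100_000)  # km/s, station 1 trials (invented)
        v2 = rng.normal(34.7, 0.5, 100_000)  # km/s, station 2 trials (invented)

        edges = np.linspace(30.0, 40.0, 501)
        centers = 0.5 * (edges[:-1] + edges[1:])
        p1, _ = np.histogram(v1, bins=edges, density=True)
        p2, _ = np.histogram(v2, bins=edges, density=True)

        post = p1 * p2
        post /= np.trapz(post, centers)
        mean = np.trapz(centers * post, centers)
        sd = np.sqrt(np.trapz((centers - mean) ** 2 * post, centers))
        print(f"combined speed: {mean:.2f} +/- {sd:.2f} km/s")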

  11. [Molecular identification of astragali radix and its adulterants by ITS sequences].

    PubMed

    Cui, Zhan-Hu; Li, Yue; Yuan, Qing-Jun; Zhou, Li-She; Li, Min-Hui

    2012-12-01

    To explore a new method for the identification of Astragali Radix from its adulterants by using the ITS sequence. Thirteen samples of different Astragali Radix materials and 6 samples of the adulterants, the roots of Hedysarum polybotrys, Medicago sativa and Althaea rosea, were collected. The ITS sequence was amplified by PCR and sequenced unidirectionally. The interspecific K-2-P distances between Astragali Radix and its adulterants were calculated, and NJ and UPGMA trees were constructed with MEGA 4. ITS sequences were obtained from all 19 samples: Astragali Radix 646-650 bp, H. polybotrys 664 bp, M. sativa 659 bp, and A. rosea 728 bp; all were registered in GenBank. Phylogenetic trees reconstructed by NJ and UPGMA analysis of the ITS nucleotide sequences can effectively distinguish Astragali Radix from its adulterants. The ITS sequence can be used to identify Astragali Radix from its adulterants successfully and is an efficient molecular marker for the authentication of Astragali Radix and its adulterants.
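
    For reference, the Kimura two-parameter (K-2-P) distance used in such studies can be computed directly from an aligned pair of sequences; a minimal Python sketch with toy sequences, not the study's data:

        import math

        PURINES, PYRIMIDINES = {"A", "G"}, {"C", "T"}

        def k2p_distance(s1, s2):
            """Kimura two-parameter distance between aligned sequences:
            d = -0.5 * ln((1 - 2P - Q) * sqrt(1 - 2Q)), with P and Q the
            transition and transversion proportions."""
            pairs = [(a, b) for a, b in zip(s1.upper(), s2.upper())
                     if a in "ACGT" and b in "ACGT"]
            n = len(pairs)
            ts = sum(1 for a, b in pairs if a != b and
                     ({a, b} <= PURINES or {a, b} <= PYRIMIDINES))
            tv = sum(1 for a, b in pairs if a != b) - ts
            p, q = ts / n, tv / n
            return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

        print(k2p_distance("ACGTACGTAC", "ACGTATGTGC"))  # toy aligned pair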

  12. Three-dimensional T1rho-weighted MRI at 1.5 Tesla.

    PubMed

    Borthakur, Arijitt; Wheaton, Andrew; Charagundla, Sridhar R; Shapiro, Erik M; Regatte, Ravinder R; Akella, Sarma V S; Kneeland, J Bruce; Reddy, Ravinder

    2003-06-01

    To design and implement a magnetic resonance imaging (MRI) pulse sequence capable of performing three-dimensional T(1rho)-weighted MRI on a 1.5-T clinical scanner, and determine the optimal sequence parameters, both theoretically and experimentally, so that the energy deposition by the radiofrequency pulses in the sequence, measured as the specific absorption rate (SAR), does not exceed safety guidelines for imaging human subjects. A three-pulse cluster was pre-encoded to a three-dimensional gradient-echo imaging sequence to create a three-dimensional, T(1rho)-weighted MRI pulse sequence. Imaging experiments were performed on a GE clinical scanner with a custom-built knee coil. We validated the performance of this sequence by imaging articular cartilage of a bovine patella and comparing T(1rho) values measured by this sequence to those obtained with a previously tested two-dimensional imaging sequence. Using a previously developed model for SAR calculation, the imaging parameters were adjusted such that the energy deposition by the radiofrequency pulses in the sequence did not exceed safety guidelines for imaging human subjects. The actual temperature increase due to the sequence was measured in a phantom by an MRI-based temperature mapping technique. Following these experiments, the performance of this sequence was demonstrated in vivo by obtaining T(1rho)-weighted images of the knee joint of a healthy individual. Calculated T(1rho) of articular cartilage in the specimen was similar for both the three-dimensional and two-dimensional methods (84 +/- 2 msec and 80 +/- 3 msec, respectively). The temperature increase in the phantom resulting from the sequence was 0.015 degrees C, which is well below the established safety guidelines. Images of the human knee joint in vivo demonstrate a clear delineation of cartilage from surrounding tissues. We developed and implemented a three-dimensional T(1rho)-weighted pulse sequence on a 1.5-T clinical scanner. Copyright 2003 Wiley-Liss, Inc.
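
    Although the paper concerns the pulse sequence rather than the fitting, T(1rho) values like those quoted are typically obtained from a monoexponential fit of signal versus spin-lock time, S(TSL) = S0 exp(-TSL/T1rho); a minimal sketch on simulated data:

        import numpy as np
        from scipy.optimize import curve_fit

        # Simulated spin-lock times (ms) and signals for tissue with T1rho ~ 84 ms.
        tsl = np.array([2.0, 10.0, 20.0, 30.0, 40.0])
        signal = 1000.0 * np.exp(-tsl / 84.0)

        def model(t, s0, t1rho):
            return s0 * np.exp(-t / t1rho)

        (s0, t1rho), _ = curve_fit(model, tsl, signal, p0=(signal[0], 50.0))
        print(f"T1rho = {t1rho:.1f} ms")  # recovers ~84 ms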

  13. The reliability of humerothoracic angles during arm elevation depends on the representation of rotations.

    PubMed

    López-Pascual, Juan; Cáceres, Magda Liliana; De Rosario, Helios; Page, Álvaro

    2016-02-08

    The reliability of joint rotation measurements is an issue of major interest, especially in clinical applications. The effect of instrumental errors and soft tissue artifacts on the variability of human motion measures is well known, but the influence of the representation of joint motion has not yet been studied. The aim of the study was to compare the within-subject reliability of three rotation formalisms for the calculation of the shoulder elevation joint angles. Five repetitions of humeral elevation in the scapular plane by 27 healthy subjects were recorded using a stereophotogrammetry system. The humerothoracic joint angles were calculated using the YX'Y" and XZ'Y" Euler angle sequences and the attitude vector. A within-subject repeatability study was performed for the three representations. ICC, SEM and CV were the indices used to estimate the error in the calculation of the angle amplitudes and the angular waveforms with each method. Excellent results were obtained in all representations for the main angle (elevation), but there were remarkable differences for axial rotation and plane of elevation. The YX'Y" sequence generally had the poorest reliability in the secondary angles. The XZ'Y" sequence proved to be the most reliable representation of axial rotation, whereas the attitude vector had the highest reliability in the plane of elevation. These results highlight the importance of selecting the method used to describe the joint motion when within-subject reliability is an important issue in the experiment. This may be of particular importance when the secondary angles of motion are being studied. Copyright © 2016 Elsevier Ltd. All rights reserved.
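
    The three representations compared in the study are straightforward to reproduce for a single rotation with SciPy, where uppercase axis strings denote intrinsic (body-fixed) rotation sequences; the rotation below is an arbitrary example, not subject data:

        import numpy as np
        from scipy.spatial.transform import Rotation

        # An arbitrary humerothoracic rotation decomposed three ways.
        r = Rotation.from_euler("YXY", [40.0, -70.0, 15.0], degrees=True)

        yxy = r.as_euler("YXY", degrees=True)  # plane of elevation, elevation, axial rotation
        xzy = r.as_euler("XZY", degrees=True)  # alternative Cardan sequence
        att = np.degrees(r.as_rotvec())        # attitude vector (axis times angle)

        print(yxy, xzy, att)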

  14. Shaping up the protein folding funnel by local interaction: lesson from a structure prediction study.

    PubMed

    Chikenji, George; Fujitsuka, Yoshimi; Takada, Shoji

    2006-02-28

    Predicting protein tertiary structure by folding-like simulations is one of the most stringent tests of how much we understand the principle of protein folding. Currently, the most successful method for folding-based structure prediction is the fragment assembly (FA) method. Here, we address why the FA method is so successful and its lesson for the folding problem. To do so, using the FA method, we designed a structure prediction test of "chimera proteins." In the chimera proteins, local structural preference is specific to the target sequences, whereas nonlocal interactions are only sequence-independent compaction forces. We find that these chimera proteins can find the native folds of the intact sequences with high probability indicating dominant roles of the local interactions. We further explore roles of local structural preference by exact calculation of the HP lattice model of proteins. From these results, we suggest principles of protein folding: For small proteins, compact structures that are fully compatible with local structural preference are few, one of which is the native fold. These local biases shape up the funnel-like energy landscape.

  15. Shaping up the protein folding funnel by local interaction: Lesson from a structure prediction study

    PubMed Central

    Chikenji, George; Fujitsuka, Yoshimi; Takada, Shoji

    2006-01-01

    Predicting protein tertiary structure by folding-like simulations is one of the most stringent tests of how much we understand the principle of protein folding. Currently, the most successful method for folding-based structure prediction is the fragment assembly (FA) method. Here, we address why the FA method is so successful and its lesson for the folding problem. To do so, using the FA method, we designed a structure prediction test of “chimera proteins.” In the chimera proteins, local structural preference is specific to the target sequences, whereas nonlocal interactions are only sequence-independent compaction forces. We find that these chimera proteins can find the native folds of the intact sequences with high probability indicating dominant roles of the local interactions. We further explore roles of local structural preference by exact calculation of the HP lattice model of proteins. From these results, we suggest principles of protein folding: For small proteins, compact structures that are fully compatible with local structural preference are few, one of which is the native fold. These local biases shape up the funnel-like energy landscape. PMID:16488978

  16. Detection and interrogation of biomolecules via nanoscale probes: From fundamental physics to DNA sequencing

    NASA Astrophysics Data System (ADS)

    Zwolak, Michael

    2013-03-01

    A rapid and low-cost method to sequence DNA would revolutionize personalized medicine, where genetic information is used to diagnose, treat, and prevent diseases. There is a longstanding interest in nanopores as a platform for rapid interrogation of single DNA molecules. I will discuss a sequencing protocol based on the measurement of transverse electronic currents during the translocation of single-stranded DNA through nanopores. Using molecular dynamics simulations coupled to quantum mechanical calculations of the tunneling current, I will show that the DNA nucleotides are predicted to have distinguishable electronic signatures in experimentally realizable systems. Several recent experiments support our theoretical predictions. In addition to their possible impact in medicine and biology, the above methods offer ideal test beds to study open scientific issues in the relatively unexplored area at the interface between solids, liquids, and biomolecules at the nanometer length scale. http://mike.zwolak.org

  17. pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree

    PubMed Central

    2010-01-01

    Background Likelihood-based phylogenetic inference is generally considered to be the most reliable classification method for unknown sequences. However, traditional likelihood-based phylogenetic methods cannot be applied to large volumes of short reads from next-generation sequencing due to computational complexity issues and lack of phylogenetic signal. "Phylogenetic placement," where a reference tree is fixed and the unknown query sequences are placed onto the tree via a reference alignment, is a way to bring the inferential power offered by likelihood-based approaches to large data sets. Results This paper introduces pplacer, a software package for phylogenetic placement and subsequent visualization. The algorithm can place twenty thousand short reads on a reference tree of one thousand taxa per hour per processor, has essentially linear time and memory complexity in the number of reference taxa, and is easy to run in parallel. Pplacer features calculation of the posterior probability of a placement on an edge, which is a statistically rigorous way of quantifying uncertainty on an edge-by-edge basis. It also can inform the user of the positional uncertainty for query sequences by calculating expected distance between placement locations, which is crucial in the estimation of uncertainty with a well-sampled reference tree. The software provides visualizations using branch thickness and color to represent number of placements and their uncertainty. A simulation study using reads generated from 631 COG alignments shows a high level of accuracy for phylogenetic placement over a wide range of alignment diversity, and the power of edge uncertainty estimates to measure placement confidence. Conclusions Pplacer enables efficient phylogenetic placement and subsequent visualization, making likelihood-based phylogenetics methodology practical for large collections of reads; it is freely available as source code, binaries, and a web service. PMID:21034504

  18. Assessment of proximal pulmonary arterial stiffness using magnetic resonance imaging: effects of technique, age and exercise

    PubMed Central

    Kamalasanan, Anu; Cassidy, Deidre B; Struthers, Allan D; Lipworth, Brian J; Houston, J Graeme

    2016-01-01

    Introduction To compare the reproducibility of pulmonary pulse wave velocity (PWV) techniques, and the effects of age and exercise on these. Methods 10 young healthy volunteers (YHV) and 20 older healthy volunteers (OHV) with no cardiac or lung condition were recruited. High temporal resolution phase contrast sequences were performed through the main pulmonary arteries (MPAs), right pulmonary arteries (RPAs) and left pulmonary arteries (LPAs), while high spatial resolution sequences were obtained through the MPA. YHV underwent 2 MRIs 6 months apart with the sequences repeated during exercise. OHV underwent an MRI scan with on-table repetition. PWV was calculated using the transit time (TT) and flow-area (QA) techniques. 3 methods for calculating QA PWV were compared. Results PWV did not differ between the two age groups (YHV 2.4±0.3 m/s, OHV 2.9±0.2 m/s, p=0.1). A high temporal resolution sequence through the RPA with the QA method accounting for wave reflections yielded consistently better within-scan, interscan, intraobserver and interobserver reproducibility. Exercise did not result in a change in either TT PWV (mean (95% CI) of the differences: −0.42 (−1.2 to 0.4), p=0.24) or QA PWV (mean (95% CI) of the differences: 0.10 (−0.5 to 0.9), p=0.49) despite a significant rise in heart rate (65±2 to 87±3, p<0.0001), blood pressure (113/68 to 130/84, p<0.0001) and cardiac output (5.4±0.4 to 6.7±0.6 L/min, p=0.004). Conclusions QA PWV performed through the RPA using a high temporal resolution sequence accounting for wave reflections yields the most reproducible measurements of pulmonary PWV. PMID:27843548
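
    The flow-area (QA) technique estimates PWV as the slope dQ/dA over the reflection-free part of early systole; a minimal NumPy sketch on invented flow and area samples:

        import numpy as np

        # QA method on simulated early-systolic samples: PWV = dQ/dA.
        rng = np.random.default_rng(2)
        area = np.linspace(450.0, 530.0, 25)                       # lumen area, mm^2
        flow = 2800.0 * (area - area[0]) + rng.normal(0, 50, 25)   # flow, mm^3/s

        slope = np.polyfit(area, flow, 1)[0]   # dQ/dA in mm/s
        print(f"QA PWV ~ {slope / 1000:.2f} m/s")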

  19. Standard operating procedure for calculating genome-to-genome distances based on high-scoring segment pairs.

    PubMed

    Auch, Alexander F; Klenk, Hans-Peter; Göker, Markus

    2010-01-28

    DNA-DNA hybridization (DDH) is a widely applied wet-lab technique to obtain an estimate of the overall similarity between the genomes of two organisms. Basing the species concept for prokaryotes ultimately on DDH was chosen by microbiologists as a pragmatic approach for deciding about the recognition of novel species, and it also allowed a relatively high degree of standardization compared to other areas of taxonomy. However, DDH is tedious and error-prone and, first and foremost, cannot be used to incrementally establish a comparative database. Recent studies have shown that in-silico methods for the comparison of genome sequences can be used to replace DDH. Considering the ongoing rapid technological progress of sequencing methods, genome-based prokaryote taxonomy is coming into reach. However, calculating distances between genomes depends on multiple choices of software and program settings. We here provide an overview of the modifications that can be applied to distance methods based on high-scoring segment pairs (HSPs) or maximally unique matches (MUMs) and that need to be documented. General recommendations on determining HSPs using BLAST or other algorithms are also provided. As a reference implementation, we introduce the GGDC web server (http://ggdc.gbdp.org).

  20. Novel ΔJ = 1 Sequence in 78Ge: Possible Evidence for Triaxiality

    NASA Astrophysics Data System (ADS)

    Forney, A. M.; Walters, W. B.; Chiara, C. J.; Janssens, R. V. F.; Ayangeakaa, A. D.; Sethi, J.; Harker, J.; Alcorta, M.; Carpenter, M. P.; Gürdal, G.; Hoffman, C. R.; Kay, B. P.; Kondev, F. G.; Lauritsen, T.; Lister, C. J.; McCutchan, E. A.; Rogers, A. M.; Seweryniak, D.; Stefanescu, I.; Zhu, S.

    2018-05-01

    A sequence of low-energy levels in 78Ge (Z = 32, N = 46) has been identified with spins and parity of 2+, 3+, 4+, 5+, and 6+. Decays within this band proceed strictly through ΔJ = 1 transitions, unlike similar sequences in neighboring Ge and Se nuclei. Above the 2+ level, members of this sequence do not decay into the ground-state band. Moreover, the energy staggering of this sequence has the phase that would be expected for a γ-rigid structure. The energies and branching ratios of many of the levels are described well by shell-model calculations. However, the calculated reduced transition probabilities for the ΔJ = 2 in-band transitions imply that they should have been observed, in contradiction with the experiment. Within the calculations of Davydov, Filippov, and Rostovsky for rigid-triaxial rotors with γ = 30°, there are sequences of higher-spin levels connected by strong ΔJ = 1 transitions which decay in the same manner as those observed experimentally, yet are calculated at too high an excitation energy.
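
    The energy staggering referred to above is commonly quantified with an index such as S(J) = E(J) - [E(J+1) + E(J-1)]/2, whose phase (sign alternation with J) distinguishes gamma-rigid from gamma-soft behavior; a small sketch with placeholder level energies, not the measured 78Ge values:

        # One common staggering index: S(J) = E(J) - [E(J+1) + E(J-1)] / 2.
        # Level energies below are illustrative placeholders (keV).
        levels = {2: 1186.0, 3: 1539.0, 4: 2044.0, 5: 2453.0, 6: 3026.0}

        def staggering(levels, j):
            return levels[j] - 0.5 * (levels[j + 1] + levels[j - 1])

        for j in (3, 4, 5):
            print(j, round(staggering(levels, j), 1))
            # alternating sign with J is the phase signature discussed above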

  1. Non-invasive fetal sex determination by maternal plasma sequencing and application in X-linked disorder counseling.

    PubMed

    Pan, Xiaoyu; Zhang, Chunlei; Li, Xuchao; Chen, Shengpei; Ge, Huijuan; Zhang, Yanyan; Chen, Fang; Jiang, Hui; Jiang, Fuman; Zhang, Hongyun; Wang, Wei; Zhang, Xiuqing

    2014-12-01

    To develop a fetal sex determination method based on maternal plasma sequencing (MPS), and to assess its performance and potential use in X-linked disorder counseling. 900 cases of MPS data from a previous study were reviewed, in which 100 and 800 cases were used as the training and validation sets, respectively. The percentage of uniquely mapped sequencing reads on the Y chromosome was calculated and used to classify male and female cases. Eight pregnant women who are carriers of Duchenne muscular dystrophy (DMD) mutations were recruited, whose plasma was subjected to multiplex sequencing and fetal sex determination analysis. In the training set, a sensitivity of 96% and a false positive rate of 0% for male case detection were reached with our method. The blinded validation results showed that 421 of 423 male cases and 374 of 377 female cases were successfully identified, revealing a sensitivity and specificity of 99.53% and 99.20% for fetal sex determination, as early as 12 gestational weeks. Fetal sex for all eight DMD genetic counseling cases was correctly identified and confirmed by amniocentesis. Based on MPS, high accuracy of non-invasive fetal sex determination can be achieved. This method can potentially be used for prenatal genetic counseling.
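
    The classification step reduces to thresholding the fraction of uniquely mapped reads on the Y chromosome; a minimal sketch in Python, with a hypothetical cutoff in place of the study's trained value:

        # Thresholding the chrY fraction of uniquely mapped reads.
        def fetal_sex(reads_on_y, uniquely_mapped_total, cutoff=2e-4):
            """cutoff is a hypothetical placeholder, not the trained value."""
            return "male" if reads_on_y / uniquely_mapped_total > cutoff else "female"

        print(fetal_sex(reads_on_y=3200, uniquely_mapped_total=10_000_000))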

  2. Cycle-time determination and process control of sequencing batch membrane bioreactors.

    PubMed

    Krampe, J

    2013-01-01

    In this paper a method to determine the cycle time for sequencing batch membrane bioreactors (SBMBRs) is introduced. One of the advantages of SBMBRs is the simplicity of adapting them to varying wastewater composition. The benefit of this flexibility can only be fully utilised if the cycle times are optimised for the specific inlet load conditions. This requires either proactive and ongoing operator adjustment or active predictive instrument-based control. Determination of the cycle times for conventional sequencing batch reactor (SBR) plants is usually based on experience. Due to the higher mixed liquor suspended solids concentrations in SBMBRs and the limited experience with their application, a new approach to calculate the cycle time had to be developed. Based on results from a semi-technical pilot plant, the paper presents an approach for calculating the cycle time in relation to the influent concentration according to the Activated Sludge Model No. 1 and the German HSG (Hochschulgruppe) Approach. The approach presented in this paper considers the increased solid contents in the reactor and the resultant shortened reaction times. This allows for an exact calculation of the nitrification and denitrification cycles with a tolerance of only a few minutes. Ultimately the same approach can be used for a predictive control strategy and for conventional SBR plants.

  3. Aircraft stress sequence development: A complex engineering process made simple

    NASA Technical Reports Server (NTRS)

    Schrader, K. H.; Butts, D. G.; Sparks, W. A.

    1994-01-01

    Development of stress sequences for critical aircraft structure requires flight-measured usage data, known aircraft loads, and established relationships between aircraft flight loads and structural stresses. The resulting cycle-by-cycle stress sequences are directly usable for crack growth analysis and coupon spectra tests. Often, an expert in loads and spectra development manipulates the usage data into a typical sequence of representative flight conditions for which loads and stresses are calculated. For a fighter/trainer type aircraft, this effort is repeated many times for each of the fatigue critical locations (FCL), resulting in the expenditure of numerous engineering hours. The Aircraft Stress Sequence Computer Program (ACSTRSEQ), developed by Southwest Research Institute under contract to San Antonio Air Logistics Center, presents a unique approach for making complex technical computations simple and easy to use. The program is written in Microsoft Visual Basic for the Microsoft Windows environment.

  4. Sequence analysis of a few species of termites (Order: Isoptera) on the basis of partial characterization of COII gene.

    PubMed

    Sobti, Ranbir Chander; Kumari, Mamtesh; Sharma, Vijay Lakshmi; Sodhi, Monika; Mukesh, Manishi; Shouche, Yogesh

    2009-11-01

    The present study aimed to obtain the nucleotide sequences of part of the mitochondrial COII gene amplified from individuals of five species of termites (Isoptera: Termitidae: Macrotermitinae). Four of them belonged to the genus Odontotermes (O. obesus, O. horni, O. bhagwatii and Odontotermes sp.) and one to Microtermes (M. obesi). Partial COII gene fragments were amplified using specific primers. The sequences so obtained were characterized to calculate the frequency of each nucleotide base, and a high A + T content was observed. The interspecific pairwise sequence divergence among Odontotermes species ranged from 6.5% to 17.1% across the COII fragment. M. obesi sequence divergence ranged from 2.5% with Odontotermes sp. to 19.0% with O. bhagwatii. Phylogenetic trees drawn on the basis of the distance-based neighbour-joining method revealed three main clades clustering all the individuals according to their genera and families.
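
    The composition and divergence quantities reported here are simple to compute from aligned sequences; a minimal Python sketch with made-up fragments standing in for the COII data:

        from collections import Counter

        def base_frequencies(seq):
            """Per-base frequencies of a nucleotide sequence."""
            return {b: c / len(seq) for b, c in Counter(seq).items()}

        def p_distance(s1, s2):
            """Uncorrected pairwise divergence between aligned sequences."""
            return sum(a != b for a, b in zip(s1, s2)) / len(s1)

        s1 = "ATTATGAGCAACTTGAAT"  # made-up fragments, not the COII data
        s2 = "ATTATAAGCGACTTGATT"
        f = base_frequencies(s1)
        print(f"A+T = {f.get('A', 0) + f.get('T', 0):.1%}")
        print(f"divergence = {p_distance(s1, s2):.1%}")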

  5. General methods for determining the linear stability of coronal magnetic fields

    NASA Technical Reports Server (NTRS)

    Craig, I. J. D.; Sneyd, A. D.; Mcclymont, A. N.

    1988-01-01

    A time integration of a linearized plasma equation of motion has been performed to calculate the ideal linear stability of arbitrary three-dimensional magnetic fields. The convergence rates of the explicit and implicit power methods employed are speeded up by using sequences of cyclic shifts. Growth rates are obtained for Gold-Hoyle force-free equilibria, and the corkscrew-kink instability is found to be very weak.
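
    The power method at the heart of this procedure is easy to demonstrate on a small symmetric matrix; the sketch below uses a single constant shift rather than the cyclic shift sequences of the paper, purely for illustration:

        import numpy as np

        # Power iteration for the dominant eigenpair of a linear operator A;
        # the growth rate is estimated from the Rayleigh quotient. A constant
        # shift is shown; the paper accelerates convergence with cyclic shifts.
        def power_method(A, n_iter=500, shift=0.0):
            x = np.random.default_rng(3).normal(size=A.shape[0])
            B = A - shift * np.eye(A.shape[0])
            for _ in range(n_iter):
                x = B @ x
                x /= np.linalg.norm(x)
            return x @ (A @ x), x  # growth-rate estimate, mode shape

        A = np.array([[2.0, 1.0], [1.0, 3.0]])  # stand-in for the discretized operator
        rate, mode = power_method(A)
        print(rate)  # ~3.618, the largest eigenvalue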

  6. General methods for determining the linear stability of coronal magnetic fields

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Craig, I.J.D.; Sneyd, A.D.; McClymont, A.N.

    1988-12-01

    A time integration of a linearized plasma equation of motion has been performed to calculate the ideal linear stability of arbitrary three-dimensional magnetic fields. The convergence rates of the explicit and implicit power methods employed are speeded up by using sequences of cyclic shifts. Growth rates are obtained for Gold-Hoyle force-free equilibria, and the corkscrew-kink instability is found to be very weak. 19 references.

  7. SU-E-T-250: New IMRT Sequencing Strategy: Towards Intra-Fraction Plan Adaptation for the MR-Linac

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kontaxis, C; Bol, G; Lagendijk, J

    2014-06-01

    Purpose: To develop a new sequencer for IMRT planning that during treatment makes the inclusion of external factors possible and by doing so accounts for intra-fraction anatomy changes. Given a real-time imaging modality that will provide the updated patient anatomy during delivery, this sequencer is able to take these changes into account during the calculation of subsequent segments. Methods: Pencil beams are generated for each beam angle of the treatment and a fluence optimization is performed. The pencil beams, together with the patient anatomy and the above optimal fluence form the input of our algorithm. During each iteration the following steps are performed: A fluence optimization is done and each beam's fluence is then split to discrete intensity levels. Deliverable segments are calculated for each one of these. Each segment's area multiplied by its intensity describes its efficiency. The most efficient segment among all beams is then chosen to deliver a part of the calculated fluence and the dose that will be delivered by this segment is calculated. This delivered dose is then subtracted from the remaining dose. This loop is repeated until 90% of the dose has been delivered and a final segment weight optimization is performed to reach full convergence. Results: This algorithm was tested in several prostate cases yielding results that meet all clinical constraints. Quality assurance was performed on Delta4 and film phantoms for one of these prostate cases and received clinical acceptance after passing both gamma analyses with the 3%/3mm criteria. Conclusion: A new sequencing algorithm was developed to facilitate the needs of intensity modulated treatment. The first results on static anatomy confirm that it can calculate clinical plans equivalent to those of the commercially available planning systems. We are now working towards 100% dose convergence which will allow us to handle anatomy deformations. This work is financially supported by Elekta AB, Stockholm, Sweden.
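
    A much-simplified, runnable sketch of the greedy loop on a 1D fluence profile (illustrative only; the actual sequencer works with 2D deliverable segments, a full dose engine, and a final segment weight optimization):

        import numpy as np

        def widest_run(mask):
            """(start, stop) of the widest contiguous True run in a boolean array."""
            best, start = (0, 0), None
            for i, m in enumerate(np.append(mask, False)):
                if m and start is None:
                    start = i
                elif not m and start is not None:
                    if i - start > best[1] - best[0]:
                        best = (start, i)
                    start = None
            return best

        remaining = np.array([0, 2, 3, 3, 2, 1, 0, 1, 2, 0], dtype=float)
        total, levels, segments = remaining.sum(), [3.0, 2.0, 1.0], []
        while remaining.sum() > 0.1 * total:  # stop once ~90% is "delivered"
            # most efficient segment: width (area) times intensity level
            lvl, (a, b) = max(((l, widest_run(remaining >= l)) for l in levels),
                              key=lambda c: (c[1][1] - c[1][0]) * c[0])
            segments.append((lvl, a, b))
            remaining[a:b] -= lvl  # subtract the fluence this segment delivers
        print(segments)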

  8. The new interactive CESAR

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fox, P.B.; Yatabe, M.

    1987-01-01

    In this report the Nuclear Criticality Safety Analytical Methods Resource Center describes a new interactive version of CESAR, a critical experiments storage and retrieval program available on the Nuclear Criticality Information System (NCIS) database at Lawrence Livermore National Laboratory. The original version of CESAR did not include interactive search capabilities. The CESAR database was developed to provide a convenient, readily accessible means of storing and retrieving code input data for the SCALE Criticality Safety Analytical Sequences and the codes comprising those sequences. The database includes data for both cross section preparation and criticality safety calculations. 3 refs., 1 tab.

  9. New interactive CESAR

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fox, P.B.; Yatabe, M.

    1987-01-01

    The Nuclear Criticality Safety Analytical Methods Resource Center announces the availability of a new interactive version of CESAR, a critical experiments storage and retrieval program available on the Nuclear Criticality Information System (NCIS) data base at Lawrence Livermore National Laboratory. The original version of CESAR did not include interactive search capabilities. The CESAR data base was developed to provide a convenient, readily accessible means of storing and retrieving code input data for the SCALE criticality safety analytical sequences and the codes comprising those sequences. The data base includes data for both cross-section preparation and criticality safety calculations.

  10. Improved prediction of MHC class I and class II epitopes using a novel Gibbs sampling approach.

    PubMed

    Nielsen, Morten; Lundegaard, Claus; Worning, Peder; Hvid, Christina Sylvester; Lamberth, Kasper; Buus, Søren; Brunak, Søren; Lund, Ole

    2004-06-12

    Prediction of which peptides will bind a specific major histocompatibility complex (MHC) constitutes an important step in identifying potential T-cell epitopes suitable as vaccine candidates. MHC class II binding peptides have a broad length distribution, complicating such predictions. Thus, identifying the correct alignment is a crucial part of identifying the core of an MHC class II binding motif. In this context, we describe a novel Gibbs motif sampler method ideally suited for recognizing such weak sequence motifs. The method is based on the Gibbs sampling method and incorporates novel features optimized for the task of recognizing the binding motif of MHC classes I and II. The method locates the binding motif in a set of sequences and characterizes the motif in terms of a weight-matrix. Subsequently, the weight-matrix can be applied to effectively identify potential MHC binding peptides and to guide the process of rational vaccine design. We apply the motif sampler method to the complex problem of MHC class II binding. The input to the method is amino acid peptide sequences extracted from the public databases SYFPEITHI and MHCPEP and known to bind to the MHC class II complex HLA-DR4(B1*0401). Prior identification of information-rich (anchor) positions in the binding motif is shown to improve the predictive performance of the Gibbs sampler. Similarly, a consensus solution obtained from an ensemble average over suboptimal solutions is shown to outperform the use of a single optimal solution. In a large-scale benchmark calculation, the performance is quantified using relative operating characteristic (ROC) curves, and we make a detailed comparison of the performance with that of both the TEPITOPE method and a weight-matrix derived using the conventional alignment algorithm of ClustalW. The calculation demonstrates that the predictive performance of the Gibbs sampler is higher than that of ClustalW and in most cases also higher than that of the TEPITOPE method.
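
    For readers unfamiliar with the core algorithm, a stripped-down Gibbs motif sampler for a fixed-length core might look like the sketch below; the peptides are toy data, and the anchor weighting and ensemble averaging described in the paper are omitted:

        import numpy as np

        # Toy Gibbs motif sampler: hold one sequence out, build a weight matrix
        # from the motif windows of the others, then resample the held-out
        # sequence's motif start from the window scores.
        rng = np.random.default_rng(4)
        ALPHA = "ACDEFGHIKLMNPQRSTVWY"
        IDX = {a: i for i, a in enumerate(ALPHA)}

        def sample_motif(seqs, w=9, n_iter=500, pseudo=1.0):
            pos = [rng.integers(0, len(s) - w + 1) for s in seqs]
            for _ in range(n_iter):
                i = rng.integers(len(seqs))  # hold one sequence out
                counts = np.full((w, 20), pseudo)
                for j, s in enumerate(seqs):
                    if j != i:
                        for k in range(w):
                            counts[k, IDX[s[pos[j] + k]]] += 1
                pwm = np.log(counts / counts.sum(axis=1, keepdims=True))
                s = seqs[i]
                scores = np.array([sum(pwm[k, IDX[s[p + k]]] for k in range(w))
                                   for p in range(len(s) - w + 1)])
                probs = np.exp(scores - scores.max())
                pos[i] = rng.choice(len(probs), p=probs / probs.sum())
            return pos  # sampled motif start positions

        peptides = ["GYKVLVLNPSVAAT", "AAKFVAAWTLKAAA", "SLYNTVATLYCVHQ"]
        print(sample_motif(peptides))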

  11. GAMSOR: Gamma Source Preparation and DIF3D Flux Solution

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Smith, M. A.; Lee, C. H.; Hill, R. N.

    2016-12-15

    Nuclear reactors that rely upon the fission reaction have two modes of thermal energy deposition in the reactor system: neutron absorption and gamma absorption. The gamma rays are typically generated by neutron absorption reactions or during the fission process which means the primary driver of energy production is of course the neutron interaction. In conventional reactor physics methods, the gamma heating component is ignored such that the gamma absorption is forced to occur at the gamma emission site. For experimental reactor systems like EBR-II and FFTF, the placement of structural pins and assemblies internal to the core leads to problems with power heating predictions because there is no fission power source internal to the assembly to dictate a spatial distribution of the power. As part of the EBR-II support work in the 1980s, the GAMSOR code was developed to assist analysts in calculating the gamma heating. The GAMSOR code is a modified version of DIF3D and actually functions within a sequence of DIF3D calculations. The gamma flux in a conventional fission reactor system does not perturb the neutron flux and thus the gamma flux calculation can be cast as a fixed source problem given a solution to the steady state neutron flux equation. This leads to a sequence of DIF3D calculations, called the GAMSOR sequence, which involves solving the neutron flux, then the gamma flux, then combining the results to do a summary edit. In this manuscript, we go over the GAMSOR code and detail how it is put together and functions. We also discuss how to setup the GAMSOR sequence and input for each DIF3D calculation in the GAMSOR sequence. With the GAMSOR capability, users can take any valid steady state DIF3D calculation and compute the power distribution due to neutron and gamma heating. The MC2-3 code is the preferable companion code to use for generating neutron and gamma cross section data, but the GAMSOR code can accept cross section data from other sources. To further this aspect, an additional utility code was created which demonstrates how to merge the neutron and gamma cross section data together to carry out a simultaneous solve of the two systems.

  12. Automated two-point dixon screening for the evaluation of hepatic steatosis and siderosis: comparison with R2-relaxometry and chemical shift-based sequences.

    PubMed

    Henninger, B; Zoller, H; Rauch, S; Schocke, M; Kannengiesser, S; Zhong, X; Reiter, G; Jaschke, W; Kremser, C

    2015-05-01

    To evaluate the automated two-point Dixon screening sequence for the detection and estimated quantification of hepatic iron and fat compared with standard sequences as a reference. One hundred and two patients with suspected diffuse liver disease were included in this prospective study. The following MRI protocol was used: 3D-T1-weighted opposed- and in-phase gradient echo with two-point Dixon reconstruction and dual-ratio signal discrimination algorithm ("screening" sequence); fat-saturated, multi-gradient-echo sequence with 12 echoes; gradient-echo T1 FLASH opposed- and in-phase. Bland-Altman plots were generated and correlation coefficients were calculated to compare the sequences. The screening sequence diagnosed fat in 33, iron in 35 and a combination of both in 4 patients. Correlation between R2* values of the screening sequence and the standard relaxometry was excellent (r = 0.988). A slightly lower correlation (r = 0.978) was found between the fat fraction of the screening sequence and the standard sequence. Bland-Altman revealed systematically lower R2* values obtained from the screening sequence and higher fat fraction values obtained with the standard sequence with a rather high variability in agreement. The screening sequence is a promising method with fast diagnosis of the predominant liver disease. It is capable of estimating the amount of hepatic fat and iron comparable to standard methods. • MRI plays a major role in the clarification of diffuse liver disease. • The screening sequence was introduced for the assessment of diffuse liver disease. • It is a fast and automated algorithm for the evaluation of hepatic iron and fat. • It is capable of estimating the amount of hepatic fat and iron.
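
    The two-point Dixon reconstruction underlying such sequences is compact: water and fat images follow from the in-phase (IP) and opposed-phase (OP) images, giving a per-voxel fat-fraction estimate. A minimal sketch with invented magnitude values; the dual-ratio signal discrimination that resolves which species dominates is omitted:

        import numpy as np

        # Two-point Dixon recombination on a toy 2x2 "image" (invented values).
        ip = np.array([[220.0, 180.0], [200.0, 210.0]])  # in-phase signal
        op = np.array([[200.0, 60.0], [40.0, 190.0]])    # opposed-phase signal

        water = (ip + op) / 2.0
        fat = (ip - op) / 2.0
        fat_fraction = 100.0 * fat / (water + fat)       # percent, per voxel
        print(fat_fraction)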

  13. DUK - A Fast and Efficient Kmer Based Sequence Matching Tool

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Li, Mingkun; Copeland, Alex; Han, James

    2011-03-21

    A new tool, DUK, was developed to perform the matching task. Matching determines whether a query sequence partially or totally matches given reference sequences. Matching is similar to alignment, and indeed many traditional analysis tasks like contaminant removal use alignment tools. But for matching there is no need to know which bases of a query sequence match which positions of a reference sequence; it is only necessary to know whether a match exists. This subtle difference can make the matching task much faster than alignment. DUK is accurate, versatile, fast, and has efficient memory usage. It uses a kmer hashing method to index the reference sequences and a Poisson model to calculate p-values. DUK is carefully implemented in C++ in an object-oriented design, and the resulting classes can also be used to develop other tools quickly. DUK has been widely used at JGI for a wide range of applications such as contaminant removal, organelle genome separation, and assembly refinement. Many real applications and simulated datasets demonstrate its power.
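
    The matching idea, kmer hashing plus a Poisson significance model, can be sketched in a few lines of Python; this is illustrative only, and DUK's exact statistics may differ:

        import math

        # Kmer-set matching with a one-sided Poisson tail test.
        def kmers(seq, k):
            return {seq[i:i + k] for i in range(len(seq) - k + 1)}

        def match_pvalue(query, ref_kmers, k=16):
            n = len(query) - k + 1
            hits = sum(1 for i in range(n) if query[i:i + k] in ref_kmers)
            lam = n * 0.25 ** k  # expected chance hits for a random query
            # P(X >= hits) under Poisson(lam)
            p = 1.0 - sum(math.exp(-lam) * lam ** x / math.factorial(x)
                          for x in range(hits))
            return hits, p

        ref = kmers("ACGT" * 50, 16)  # toy reference
        print(match_pvalue("TTACGTACGTACGTACGTAA", ref))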

  14. Nuclear magnetic resonance signal dynamics of liquids in the presence of distant dipolar fields, revisited

    PubMed Central

    Barros, Wilson; Gochberg, Daniel F.; Gore, John C.

    2009-01-01

    The description of the nuclear magnetic resonance magnetization dynamics in the presence of long-range dipolar interactions, which is based upon approximate solutions of Bloch–Torrey equations including the effect of a distant dipolar field, has been revisited. New experiments show that approximate analytic solutions have a broader regime of validity as well as dependencies on pulse-sequence parameters that seem to have been overlooked. In order to explain these experimental results, we developed a new method consisting of calculating the magnetization via an iterative formalism where both diffusion and distant dipolar field contributions are treated as integral operators incorporated into the Bloch–Torrey equations. The solution can be organized as a perturbative series, whereby access to higher order terms allows one to set better boundaries on validity regimes for analytic first-order approximations. Finally, the method legitimizes the use of simple analytic first-order approximations under less demanding experimental conditions, it predicts new pulse-sequence parameter dependencies for the range of validity, and clarifies weak points in previous calculations. PMID:19425789

  15. A graph-based semantic similarity measure for the gene ontology.

    PubMed

    Alvarez, Marco A; Yan, Changhui

    2011-12-01

    Existing methods for calculating semantic similarities between pairs of Gene Ontology (GO) terms and gene products often rely on external databases like Gene Ontology Annotation (GOA) that annotate gene products using the GO terms. This dependency leads to some limitations in real applications. Here, we present a semantic similarity algorithm (SSA) that relies exclusively on the GO. When calculating the semantic similarity between a pair of input GO terms, SSA takes into account the shortest path between them, the depth of their nearest common ancestor, and a novel similarity score calculated between the definitions of the involved GO terms. In our work, we use SSA to calculate semantic similarities between pairs of proteins by combining pairwise semantic similarities between the GO terms that annotate the involved proteins. The reliability of SSA was evaluated by comparing the resulting semantic similarities between proteins with the functional similarities between proteins derived from expert annotations or sequence similarity. Comparisons with existing state-of-the-art methods showed that SSA is highly competitive with the other methods. SSA provides a reliable measure for semantic similarity independent of external databases of functional-annotation observations.

  16. Methodological Reporting of Randomized Trials in Five Leading Chinese Nursing Journals

    PubMed Central

    Shi, Chunhu; Tian, Jinhui; Ren, Dan; Wei, Hongli; Zhang, Lihuan; Wang, Quan; Yang, Kehu

    2014-01-01

    Background Randomized controlled trials (RCTs) are not always well reported, especially in terms of their methodological descriptions. This study aimed to investigate the adherence of methodological reporting to CONSORT and to explore associated trial-level variables in the Chinese nursing care field. Methods In June 2012, we identified RCTs published in five leading Chinese nursing journals and included trials with details of randomized methods. The quality of methodological reporting was measured against the methods section of the CONSORT checklist, and the overall CONSORT methodological items score was calculated and expressed as a percentage. Meanwhile, we hypothesized that some general and methodological characteristics were associated with reporting quality and conducted a regression with these data to explore the correlation. The descriptive and regression statistics were calculated via SPSS 13.0. Results In total, 680 RCTs were included. The overall CONSORT methodological items score was 6.34±0.97 (mean ± SD). No RCT reported descriptions of and changes in “trial design,” changes in “outcomes” and “implementation,” or descriptions of the similarity of interventions for “blinding.” Poor reporting was found in detailing the “settings of participants” (13.1%), “type of randomization sequence generation” (1.8%), calculation methods of “sample size” (0.4%), explanation of any interim analyses and stopping guidelines for “sample size” (0.3%), “allocation concealment mechanism” (0.3%), additional analyses in “statistical methods” (2.1%), and targeted subjects and methods of “blinding” (5.9%). More than 50% of trials described randomization sequence generation, the eligibility criteria of “participants,” “interventions,” and definitions of the “outcomes” and “statistical methods.” The regression analysis found that publication year and ITT analysis were weakly associated with CONSORT score. Conclusions The completeness of methodological reporting of RCTs in the Chinese nursing care field is poor, especially with regard to the reporting of trial design, changes in outcomes, sample size calculation, allocation concealment, blinding, and statistical methods. PMID:25415382

  17. Linear scaling computation of the Fock matrix. II. Rigorous bounds on exchange integrals and incremental Fock build

    NASA Astrophysics Data System (ADS)

    Schwegler, Eric; Challacombe, Matt; Head-Gordon, Martin

    1997-06-01

    A new linear scaling method for computation of the Cartesian Gaussian-based Hartree-Fock exchange matrix is described, which employs a method numerically equivalent to standard direct SCF, and which does not enforce locality of the density matrix. With a previously described method for computing the Coulomb matrix [J. Chem. Phys. 106, 5526 (1997)], linear scaling incremental Fock builds are demonstrated for the first time. Microhartree accuracy and linear scaling are achieved for restricted Hartree-Fock calculations on sequences of water clusters and polyglycine α-helices with the 3-21G and 6-31G basis sets. Eightfold speedups are found relative to our previous method. For systems with a small ionization potential, such as graphitic sheets, the method naturally reverts to the expected quadratic behavior. Also, benchmark 3-21G calculations attaining microhartree accuracy are reported for the P53 tetramerization monomer involving 698 atoms and 3836 basis functions.

  18. Study on multiple-hops performance of MOOC sequences-based optical labels for OPS networks

    NASA Astrophysics Data System (ADS)

    Zhang, Chongfu; Qiu, Kun; Ma, Chunli

    2009-11-01

    In this paper, we apply a new analysis method, under the assumption of statistically independent multiple optical orthogonal codes, to derive the probability function of MOOCS-OPS networks, discuss the performance characteristics for a variety of parameters, and compare characteristics of systems employing optical labels based on a single optical orthogonal code or on multiple optical orthogonal code sequences. The performance of the system is also calculated, and our results verify that the method is effective. Additionally, it is found that the performance of MOOCS-OPS networks is degraded compared with optical packet switching based on a single optical orthogonal code label (SOOC-OPS); however, MOOCS-OPS networks can greatly enlarge the scalability of optical packet switching networks.

  19. The Barnes-Evans color-surface brightness relation: A preliminary theoretical interpretation

    NASA Technical Reports Server (NTRS)

    Shipman, H. L.

    1980-01-01

    Model atmosphere calculations are used to assess whether an empirically derived relation between V-R and surface brightness is independent of a variety of stellar parameters, including surface gravity. This relationship is used in a variety of applications, including the determination of the distances of Cepheid variables using a method based on the Baade-Wesselink method. It is concluded that the use of a main sequence relation between V-R color and surface brightness in determining radii of giant stars is subject to systematic errors that are smaller than 10% in the determination of a radius or distance for temperatures cooler than 12,000 K. The error in white dwarf radii determined from a main sequence color-surface brightness relation is roughly 10%.

  20. Ariadne: a database search engine for identification and chemical analysis of RNA using tandem mass spectrometry data.

    PubMed

    Nakayama, Hiroshi; Akiyama, Misaki; Taoka, Masato; Yamauchi, Yoshio; Nobe, Yuko; Ishikawa, Hideaki; Takahashi, Nobuhiro; Isobe, Toshiaki

    2009-04-01

    We present here a method to correlate tandem mass spectra of sample RNA nucleolytic fragments with an RNA nucleotide sequence in a DNA/RNA sequence database, thereby allowing tandem mass spectrometry (MS/MS)-based identification of RNA in biological samples. Ariadne, a unique web-based database search engine, identifies RNA by two probability-based evaluation steps of MS/MS data. In the first step, the software evaluates the matches between the masses of product ions generated by MS/MS of an RNase digest of sample RNA and those calculated from a candidate nucleotide sequence in a DNA/RNA sequence database, which then predicts the nucleotide sequences of these RNase fragments. In the second step, the candidate sequences are mapped for all RNA entries in the database, and each entry is scored for a function of occurrences of the candidate sequences to identify a particular RNA. Ariadne can also predict post-transcriptional modifications of RNA, such as methylation of nucleotide bases and/or ribose, by estimating mass shifts from the theoretical mass values. The method was validated with MS/MS data of RNase T1 digests of in vitro transcripts. It was applied successfully to identify an unknown RNA component in a tRNA mixture and to analyze post-transcriptional modification in yeast tRNA(Phe-1).

  1. Discrimination of germline V genes at different sequencing lengths and mutational burdens: A new tool for identifying and evaluating the reliability of V gene assignment.

    PubMed

    Zhang, Bochao; Meng, Wenzhao; Prak, Eline T Luning; Hershberg, Uri

    2015-12-01

    Immune repertoires are collections of lymphocytes that express diverse antigen receptor gene rearrangements consisting of Variable (V), (Diversity (D) in the case of heavy chains) and Joining (J) gene segments. Clonally related cells typically share the same germline gene segments and have highly similar junctional sequences within their third complementarity determining regions. Identifying clonal relatedness of sequences is a key step in the analysis of immune repertoires. The V gene is the most important for clone identification because it has the longest sequence and the greatest number of sequence variants. However, accurate identification of a clone's germline V gene source is challenging because there is a high degree of similarity between different germline V genes. This difficulty is compounded in antibodies, which can undergo somatic hypermutation. Furthermore, high-throughput sequencing experiments often generate partial sequences and have significant error rates. To address these issues, we describe a novel method to estimate which germline V genes (or alleles) cannot be discriminated under different conditions (read lengths, sequencing errors or somatic hypermutation frequencies). Starting with any set of germline V genes, this method measures their similarity using different sequencing lengths and calculates their likelihood of unambiguous assignment under different levels of mutation. Hence, one can identify, under different experimental and biological conditions, the germline V genes (or alleles) that cannot be uniquely identified and bundle them together into groups of specific V genes with highly similar sequences. Copyright © 2015 Elsevier B.V. All rights reserved.

  2. Novel ΔJ = 1 Sequence in 78Ge: Possible Evidence for Triaxiality

    DOE PAGES

    Forney, A. M.; Walters, W. B.; Chiara, C. J.; ...

    2018-05-22

    Here, a sequence of low-energy levels in 78Ge (Z = 32, N = 46) has been identified with spins and parity of 2+, 3+, 4+, 5+, and 6+. Decays within this band proceed strictly through ΔJ = 1 transitions, unlike similar sequences in neighboring Ge and Se nuclei. Above the 2+ level, members of this sequence do not decay into the ground-state band. Moreover, the energy staggering of this sequence has the phase that would be expected for a γ-rigid structure. The energies and branching ratios of many of the levels are described well by shell-model calculations. However, the calculated reduced transition probabilities for the ΔJ = 2 in-band transitions imply that they should have been observed, in contradiction with the experiment. Within the calculations of Davydov, Filippov, and Rostovsky for rigid-triaxial rotors with γ = 30°, there are sequences of higher-spin levels connected by strong ΔJ = 1 transitions which decay in the same manner as those observed experimentally, yet are calculated at too high an excitation energy.

  3. A novel ΔJ = 1 sequence in 78Ge: possible evidence for triaxiality

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Forney, A. M.; Walters, W. B.; Chiara, C. J.

    2018-02-20

    A sequence of low-energy levels in 78Ge (Z = 32, N = 46) has been identified with spins and parity of 2+, 3+, 4+, 5+, and 6+. Decays within this band proceed strictly through ΔJ = 1 transitions, unlike similar sequences in neighboring Ge and Se nuclei. Above the 2+ level, members of this sequence do not decay into the ground-state band. Moreover, the energy staggering of this sequence has the phase that would be expected for a γ-rigid structure. The energies and branching ratios of many of the levels are described well by shell-model calculations. However, the calculated reduced transition probabilities for the ΔJ = 2 in-band transitions imply that they should have been observed, in contradiction with the experiment. Lastly, within the calculations of Davydov, Filippov, and Rostovsky for rigid-triaxial rotors with γ = 30°, there are sequences of higher-spin levels connected by strong ΔJ = 1 transitions which decay in the same manner as those observed experimentally, yet are calculated at too high an excitation energy.

  4. Using the auxiliary camera for system calibration of 3D measurement by digital speckle

    NASA Astrophysics Data System (ADS)

    Xue, Junpeng; Su, Xianyu; Zhang, Qican

    2014-06-01

    3D shape measurement by digital-speckle temporal-sequence correlation has drawn considerable attention because of its advantages; however, the measurement mainly recovers the depth (z) coordinate, while the horizontal physical coordinates (x, y) are usually expressed only as image pixel coordinates. In this paper, a new approach for system calibration is proposed. With an auxiliary camera we set up a temporary binocular vision system, which is used to calibrate the horizontal coordinates (in mm) while the temporal-sequence reference speckle sets are calibrated. First, the binocular vision system is calibrated using the traditional method. Then digital speckles are projected onto the reference plane, which is moved by equal distances in the depth direction, and temporal-sequence speckle images are acquired with the camera as reference sets. When the reference plane is in the first and final positions, a crossed fringe pattern is projected onto the plane. The control points' pixel coordinates are extracted from these images by Fourier analysis, and their physical coordinates are calculated by the binocular vision system. The physical coordinates corresponding to each image pixel are then calculated by an interpolation algorithm. Finally, the x and y corresponding to an arbitrary depth value z are obtained from a geometric formula. Experiments show that the method can quickly and flexibly measure the 3D shape of an object as a point cloud.

  5. Prediction of Ras-effector interactions using position energy matrices.

    PubMed

    Kiel, Christina; Serrano, Luis

    2007-09-01

    One of the more challenging problems in biology is to determine the cellular protein interaction network. Progress has been made in predicting protein-protein interactions based on structural information, assuming that structurally similar proteins interact in similar ways. In a previous publication, we determined a genome-wide Ras-effector interaction network based on homology models, with a high accuracy of predicting binding and non-binding domains. However, for prediction on a genome-wide scale, homology modelling is a time-consuming process. Therefore, we here developed a faster method using position energy matrices: based on different Ras-effector X-ray template structures, all amino acids in the effector binding domain are sequentially mutated to all other amino acid residues and the effect on binding energy is calculated. These pre-calculated matrices can then be used to score any Ras or effector sequence for binding. Using position energy matrices, the sequences of putative Ras-binding domains can be scanned quickly to calculate an energy sum value. By calibrating energy sum values against quantitative experimental binding data, thresholds can be defined and non-binding domains can thus be excluded quickly. Sequences whose energy sum values are above this threshold are considered potential binding domains and can be further analysed using homology modelling. This prediction method could be applied to other protein families sharing conserved interaction types, in order to determine large-scale cellular protein interaction networks quickly. Thus, it could have an important impact on future in silico structural genomics approaches, in particular with regard to increasing structural proteomics efforts aiming to determine all possible domain folds and interaction types. All matrices are deposited in the ADAN database (http://adan-embl.ibmc.umh.es/). Supplementary data are available at Bioinformatics online.
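
    A minimal sketch of the scoring step described above, assuming a matrix of pre-calculated per-position contributions; the values, threshold, and sign convention are illustrative placeholders, not taken from ADAN. The abstract's rule, energy sum above the threshold indicates a potential binding domain, is followed as stated.

      # Score candidate effector sequences against a position energy matrix.
      PEM = {  # position -> {residue: pre-calculated score contribution}
          0: {"A": 0.0, "K": -1.2, "E": 2.1},
          1: {"A": 0.3, "K": 0.9, "E": -0.8},
          2: {"A": 0.0, "K": -0.4, "E": 0.1},
      }

      def energy_sum(sequence, pem, default=0.0):
          """Sum the per-position contributions over a candidate sequence."""
          return sum(pem[i].get(aa, default) for i, aa in enumerate(sequence))

      THRESHOLD = 1.0  # would be calibrated on quantitative binding data
      for candidate in ("EKA", "KEA", "AAA"):
          e = energy_sum(candidate, PEM)
          print(candidate, round(e, 2),
                "potential binding domain" if e > THRESHOLD else "excluded")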

  6. Optimal control design of turbo spin‐echo sequences with applications to parallel‐transmit systems

    PubMed Central

    Hoogduin, Hans; Hajnal, Joseph V.; van den Berg, Cornelis A. T.; Luijten, Peter R.; Malik, Shaihan J.

    2016-01-01

    Purpose The design of turbo spin‐echo sequences is modeled as a dynamic optimization problem which includes the case of inhomogeneous transmit radiofrequency fields. This problem is efficiently solved by optimal control techniques making it possible to design patient‐specific sequences online. Theory and Methods The extended phase graph formalism is employed to model the signal evolution. The design problem is cast as an optimal control problem and an efficient numerical procedure for its solution is given. The numerical and experimental tests address standard multiecho sequences and pTx configurations. Results Standard, analytically derived flip angle trains are recovered by the numerical optimal control approach. New sequences are designed where constraints on radiofrequency total and peak power are included. In the case of parallel transmit application, the method is able to calculate the optimal echo train for two‐dimensional and three‐dimensional turbo spin echo sequences in the order of 10 s with a single central processing unit (CPU) implementation. The image contrast is maintained through the whole field of view despite inhomogeneities of the radiofrequency fields. Conclusion The optimal control design sheds new light on the sequence design process and makes it possible to design sequences in an online, patient‐specific fashion. Magn Reson Med 77:361–373, 2017. © 2016 The Authors Magnetic Resonance in Medicine published by Wiley Periodicals, Inc. on behalf of International Society for Magnetic Resonance in Medicine PMID:26800383

  7. [MRI of focal liver lesions using a 1.5 T turbo-spin-echo technique compared with a spin-echo technique].

    PubMed

    Steiner, S; Vogl, T J; Fischer, P; Steger, W; Neuhaus, P; Keck, H

    1995-08-01

    The aim of our study was to evaluate a T2-weighted turbo-spin-echo (TSE) sequence in comparison with a T2-weighted spin-echo (SE) sequence for imaging focal liver lesions. In our study, 35 patients with suspected focal liver lesions were examined. The standardised imaging protocol included a conventional T2-weighted SE sequence (TR/TE = 2000/90/45, acquisition time = 10:20 min) as well as a T2-weighted TSE sequence (TR/TE = 4700/90, acquisition time = 6:33 min). Calculation of the S/N and C/N ratios as a basis for quantitative evaluation was done using standard methods. A diagnostic score was implemented to enable qualitative assessment. In 7% of cases (n = 2), the TSE sequence enabled the detection of further liver lesions less than 1 cm in diameter. Comparing anatomical details, the TSE sequence was superior. The S/N and C/N ratios of anatomic and pathologic structures were higher for the TSE sequence than for the SE sequence. Our results indicate that the T2-weighted turbo-spin-echo sequence is well suited to imaging focal liver lesions and leads to a reduction in imaging time.

  8. SU-C-17A-07: The Development of An MR Accelerator-Enabled Planning-To-Delivery Technique for Stereotactic Palliative Radiotherapy Treatment of Spinal Metastases

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hoogcarspel, S J; Kontaxis, C; Velden, J M van der

    2014-06-01

    Purpose: To develop an MR accelerator-enabled online planning-to-delivery technique for stereotactic palliative radiotherapy treatment of spinal metastases. The technical challenges include: automated stereotactic treatment planning, online MR-based dose calculation, and MR guidance during treatment. Methods: Using the CT data of 20 patients previously treated at our institution, a class solution for automated treatment planning for spinal bone metastases was created. For accurate dose simulation right before treatment, we fused geometrically correct online MR data with pretreatment CT data of the target volume (TV). For target tracking during treatment, a dynamic T2-weighted TSE MR sequence was developed. An in-house developed GPU-based IMRT optimization and dose calculation algorithm was used for fast treatment planning and simulation. An automatically generated treatment plan developed with this treatment planning system was irradiated on a clinical 6 MV linear accelerator and evaluated using a Delta4 dosimeter. Results: The automated treatment planning method yielded clinically viable plans for all patients. The MR-CT fusion based dose calculation accuracy was within 2% as compared to calculations performed with the original CT data. The dynamic T2-weighted TSE MR sequence was able to provide an update of the anatomical location of the TV every 10 seconds. Dose calculation and optimization of the automatically generated treatment plans using only one GPU took on average 8 minutes. The Delta4 measurement of the irradiated plan agreed with the dose calculation with a 3%/3mm gamma pass rate of 86.4%. Conclusions: The development of an MR accelerator-enabled planning-to-delivery technique for stereotactic palliative radiotherapy treatment of spinal metastases was presented. Future work will involve developing an intrafraction motion adaptation strategy, MR-only dose calculation, radiotherapy quality assurance in a magnetic field, and streamlining the entire treatment process on an MR accelerator.

  9. Have the temperature time series a structural change after 1998?

    NASA Astrophysics Data System (ADS)

    Werner, Rolf; Valev, Dimitare; Danov, Dimitar

    2012-07-01

    The global and hemispheric temperature time series from GISS and HadCRUT3 were analysed for structural changes. We postulate that the fitted temperature function remains continuous across segment boundaries. Slopes are calculated for a sequence of segments delimited by time thresholds, using a standard method, restricted linear regression with dummy variables. The calculations and tests were performed for different numbers of thresholds, and the thresholds are searched continuously within specified time intervals. The F-statistic is used to determine the time points of the structural changes.
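
    A small sketch of this segmented-regression idea on synthetic data: continuity at the threshold is enforced with a hinge (dummy-interaction) term, the threshold is scanned continuously, and an approximate F-statistic compares the broken fit against a single trend. This is an illustration of the approach, not the authors' code.

      # Continuous piecewise-linear fit with one threshold; numpy only.
      import numpy as np

      rng = np.random.default_rng(0)
      t = np.arange(1950.0, 2012.0)
      y = 0.005 * (t - 1950) + 0.02 * np.clip(t - 1998, 0, None) \
          + rng.normal(0, 0.05, t.size)        # synthetic break near 1998

      def segmented_rss(tau):
          # continuity at tau is enforced by the hinge regressor (t - tau)_+
          X = np.column_stack([np.ones_like(t), t, np.clip(t - tau, 0, None)])
          beta, *_ = np.linalg.lstsq(X, y, rcond=None)
          r = y - X @ beta
          return float(r @ r)

      taus = np.linspace(1980, 2005, 251)       # continuous threshold search
      rss1, tau_best = min((segmented_rss(tau), tau) for tau in taus)

      X0 = np.column_stack([np.ones_like(t), t])
      b0, *_ = np.linalg.lstsq(X0, y, rcond=None)
      rss0 = float(((y - X0 @ b0) ** 2).sum())
      # one extra slope parameter; the threshold search makes this approximate
      F = (rss0 - rss1) / (rss1 / (t.size - 3))
      print(f"best threshold ~ {tau_best:.1f}, F = {F:.1f}")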

  10. Quantitative comparison between a multiecho sequence and a single-echo sequence for susceptibility-weighted phase imaging.

    PubMed

    Gilbert, Guillaume; Savard, Geneviève; Bard, Céline; Beaudoin, Gilles

    2012-06-01

    The aim of this study was to investigate the benefits arising from the use of a multiecho sequence for susceptibility-weighted phase imaging using a quantitative comparison with a standard single-echo acquisition. Four healthy adult volunteers were imaged on a clinical 3-T system using a protocol comprising two different three-dimensional susceptibility-weighted gradient-echo sequences: a standard single-echo sequence and a multiecho sequence. Both sequences were repeated twice in order to evaluate the local noise contribution by a subtraction of the two acquisitions. For the multiecho sequence, the phase information from each echo was independently unwrapped, and the background field contribution was removed using either homodyne filtering or the projection onto dipole fields method. The phase information from all echoes was then combined using a weighted linear regression. R2 maps were also calculated from the multiecho acquisitions. The noise standard deviation in the reconstructed phase images was evaluated for six manually segmented regions of interest (frontal white matter, posterior white matter, globus pallidus, putamen, caudate nucleus and lateral ventricle). The use of the multiecho sequence for susceptibility-weighted phase imaging led to a reduction of the noise standard deviation for all subjects and all regions of interest investigated in comparison to the reference single-echo acquisition. On average, the noise reduction ranged from 18.4% for the globus pallidus to 47.9% for the lateral ventricle. In addition, the amount of noise reduction was found to be strongly inversely correlated to the estimated R2 value (R=-0.92). In conclusion, the use of a multiecho sequence is an effective way to decrease the noise contribution in susceptibility-weighted phase images, while preserving both contrast and acquisition time. The proposed approach additionally permits the calculation of R2 maps. Copyright © 2012 Elsevier Inc. All rights reserved.
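
    A single-voxel sketch of the echo-combination step, assuming phase grows linearly with echo time and weighting each echo by its squared magnitude (the paper's exact weighting scheme may differ):

      # Weighted LS fit of phase(TE) = phi0 + 2*pi*df*TE, plus R2* estimation.
      import numpy as np

      TE = np.array([5e-3, 10e-3, 15e-3, 20e-3])          # echo times (s)
      mag = np.exp(-TE / 25e-3)                            # T2* = 25 ms decay
      phase = 2 * np.pi * 12.0 * TE \
          + np.random.default_rng(1).normal(0, 0.02, TE.size)

      w = mag ** 2                                         # assumed weights
      X = np.column_stack([np.ones_like(TE), TE])
      beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * phase))
      print("off-resonance (Hz):", beta[1] / (2 * np.pi))  # ~12 Hz

      # R2* map value from the magnitude decay of the same echoes
      r2star = -np.polyfit(TE, np.log(mag), 1)[0]
      print("R2* (1/s):", r2star)                          # ~40 for T2* = 25 ms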

  11. SU-F-J-112: Clinical Feasibility Test of An RF Pulse-Based MRI Method for the Quantitative Fat-Water Segmentation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yee, S; Wloch, J; Pirkola, M

    Purpose: Quantitative fat-water segmentation is important not only because of the clinical utility of fat-suppressed MRI images in better detecting lesions of clinical significance (in the midst of bright fat signal) but also because of the possible physical need, in which CT-like images based on the materials' photon attenuation properties may have to be generated from MR images; particularly, as in the case of an MR-only radiation oncology environment, to obtain radiation dose calculations, or, as in the case of a hybrid PET/MR modality, to obtain an attenuation correction map for quantitative PET reconstruction. The majority of such quantitative fat-water segmentations have been performed by utilizing the Dixon method and its variations, which must enforce proper settings (often predefined) of echo time (TE) in the pulse sequences. Therefore, such methods have been unable to be directly combined with ultrashort TE (UTE) sequences that, taking advantage of very low TE values (~tens of microseconds), might be beneficial for directly detecting bones. Recently, an RF pulse-based method (http://dx.doi.org/10.1016/j.mri.2015.11.006), termed the PROD pulse method, was introduced as a method of quantitative fat-water segmentation that does not depend on predefined TE settings. Here, the clinical feasibility of this method is verified in brain tumor patients by combining the PROD pulse with several sequences. Methods: In a clinical 3T MRI, the PROD pulse was combined with turbo spin echo (e.g. TR=1500, TE=16 or 60, ETL=15) or turbo field echo (e.g. TR=5.6, TE=2.8, ETL=12) sequences without specifying TE values. Results: The fat-water segmentation was possible without having to set specific TE values. Conclusion: The PROD pulse method is clinically feasible. Although not yet combined with UTE sequences in our laboratory, the method is potentially compatible with UTE sequences and thus might be useful to directly segment fat, water, bone and air.

  12. Hybrid composite laminates reinforced with Kevlar/carbon/glass woven fabrics for ballistic impact testing.

    PubMed

    Randjbaran, Elias; Zahari, Rizal; Jalil, Nawal Aswan Abdul; Majid, Dayang Laila Abang Abdul

    2014-01-01

    The current study reports a facile method to investigate the effect of the stacking sequence of hybrid composite laminates on ballistic energy absorption by running ballistic tests under high-velocity impact conditions; the residual velocity and absorbed energy were calculated accordingly. The specimens were fabricated from Kevlar, carbon, and glass woven fabrics and resin, and were experimentally investigated under impact conditions. All specimens possessed equal mass, shape, and density; nevertheless, the layers were ordered in different stacking sequences. After running the ballistic test under the same conditions, the final velocities of the cylindrical AISI 4340 steel pellet showed how much energy was absorbed by each sample. The energy absorption of each sample through the ballistic impact was calculated; accordingly, suitable ballistic-impact-resistant materials could be identified by the test. These results can be studied further to characterise the material properties of the different layer stacking sequences.
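
    The energy bookkeeping described here reduces to the projectile's kinetic-energy loss; a tiny sketch with illustrative (not measured) masses and velocities:

      # Absorbed energy = kinetic energy lost by the pellet across the target.
      def absorbed_energy(mass_kg, v_in, v_out):
          """E = m/2 * (v_in^2 - v_out^2), in joules for SI inputs."""
          return 0.5 * mass_kg * (v_in**2 - v_out**2)

      pellet_mass = 0.0065  # kg, hypothetical steel pellet mass
      for stacking, v_residual in [("Kevlar/carbon/glass", 240.0),
                                   ("glass/carbon/Kevlar", 265.0)]:
          e = absorbed_energy(pellet_mass, v_in=400.0, v_out=v_residual)
          print(f"{stacking}: {e:.1f} J absorbed")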

  13. SlideSort: all pairs similarity search for short reads

    PubMed Central

    Shimizu, Kana; Tsuda, Koji

    2011-01-01

    Motivation: Recent progress in DNA sequencing technologies calls for fast and accurate algorithms that can evaluate sequence similarity for a huge amount of short reads. Searching similar pairs from a string pool is a fundamental process of de novo genome assembly, genome-wide alignment and other important analyses. Results: In this study, we designed and implemented an exact algorithm SlideSort that finds all similar pairs from a string pool in terms of edit distance. Using an efficient pattern growth algorithm, SlideSort discovers chains of common k-mers to narrow down the search. Compared to existing methods based on single k-mers, our method is more effective in reducing the number of edit distance calculations. In comparison to backtracking methods such as BWA, our method is much faster in finding remote matches, scaling easily to tens of millions of sequences. Our software has an additional function of single link clustering, which is useful in summarizing short reads for further processing. Availability: Executable binary files and C++ libraries are available at http://www.cbrc.jp/~shimizu/slidesort/ for Linux and Windows. Contact: slidesort@m.aist.go.jp; shimizu-kana@aist.go.jp Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21148542
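
    A toy illustration of the underlying idea, shared k-mers shortlist candidate pairs before exact edit-distance checks; SlideSort's chain-growing algorithm is far more efficient than this brute-force bucket version.

      # Candidate pairs via shared k-mers, verified with exact edit distance.
      from collections import defaultdict
      from itertools import combinations

      def edit_distance(a, b):
          """Standard Levenshtein DP with a rolling row."""
          d = list(range(len(b) + 1))
          for i, ca in enumerate(a, 1):
              prev, d[0] = d[0], i
              for j, cb in enumerate(b, 1):
                  prev, d[j] = d[j], min(d[j] + 1, d[j - 1] + 1,
                                         prev + (ca != cb))
          return d[-1]

      def similar_pairs(reads, k=8, max_dist=2):
          buckets = defaultdict(set)
          for idx, r in enumerate(reads):
              for p in range(len(r) - k + 1):
                  buckets[r[p:p + k]].add(idx)
          candidates = {pair for ids in buckets.values() if len(ids) > 1
                        for pair in combinations(sorted(ids), 2)}
          return [(i, j) for i, j in candidates
                  if edit_distance(reads[i], reads[j]) <= max_dist]

      reads = ["ACGTACGTACGTAA", "ACGTACGTACGTTA", "TTTTGGGGCCCCAA"]
      print(similar_pairs(reads))  # [(0, 1)]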

  14. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Moore, B; Yin, F; Cai, J

    Purpose: To determine the variation in tumor contrast between different MRI sequences and between patients for the purpose of MRI-based treatment planning. Methods: Multiple MRI scans of 11 patients with cancer(s) in the liver were included in this IRB-approved study. Imaging sequences consisted of T1W MRI, contrast-enhanced (CE) T1W MRI, T2W MRI, and T2*/T1W MRI. MRI images were acquired on a 1.5T GE Signa scanner with a four-channel torso coil. We calculated the tumor-to-tissue contrast-to-noise ratio (CNR) for each MR sequence by contouring the tumor and a region of interest (ROI) in a homogeneous region of the liver using the Eclipse treatment planning software. CNR was calculated as (I_Tum - I_ROI)/SD_ROI, where I_Tum and I_ROI are the mean values of the tumor and the ROI, respectively, and SD_ROI is the standard deviation of the ROI. The same tumor and ROI structures were used in all measurements for the different MR sequences. Inter-patient and inter-sequence coefficients of variation (CV) were determined. In addition, the mean and standard deviation of CNR were calculated and compared between the different MR sequences. Results: Our preliminary results showed large inter-patient CV (range: 37.7% to 88%) and inter-sequence CV (range: 5.3% to 104.9%) of liver tumor CNR, indicating great variations in tumor CNR between MR sequences and between patients. Tumor CNR was found to be largest in CE-T1W (8.5±7.5), followed by T2W (4.2±2.4), T1W (3.4±2.2), and T2*/T1W (1.7±0.6) MR scans. The inter-patient CV of tumor CNR was also the largest in CE-T1W (88%), followed by T2W (64.3%), T1W (56.2%), and T2*/T1W (37.7%) MR scans. Conclusion: Large inter-sequence and inter-patient variations were observed in liver tumor CNR. CE-T1W MR images on average provided the best tumor CNR. Efforts are needed to optimize tumor contrast and its consistency for MRI-based treatment planning of cancer in the liver. This project is supported by NIH grant: 1R21CA165384.
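
    The two statistics used above are straightforward to reproduce; a sketch with illustrative intensity values, not the study's measurements:

      # CNR = (I_Tum - I_ROI) / SD_ROI, and CV (%) = 100 * SD / mean.
      import statistics

      def cnr(i_tum, i_roi, sd_roi):
          return (i_tum - i_roi) / sd_roi

      print(cnr(410.0, 250.0, 20.0))   # e.g. CE-T1W-like voxel statistics
      print(cnr(330.0, 250.0, 19.0))   # e.g. T2W-like voxel statistics

      def cv_percent(values):
          return 100.0 * statistics.stdev(values) / statistics.mean(values)

      # inter-sequence spread of the four mean CNRs quoted in the abstract
      print(round(cv_percent([8.5, 4.2, 3.4, 1.7]), 1))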

  15. A method of measuring three-dimensional scapular attitudes using the optotrak probing system.

    PubMed

    Hébert, L J; Moffet, H; McFadyen, B J; St-Vincent, G

    2000-01-01

    To develop a method to obtain accurate three-dimensional scapular attitudes and to assess their concurrent validity and reliability. In this methodological study, the three-dimensional scapular attitudes were calculated in degrees, using a rotation matrix (cyclic Cardanic sequence), from spatial coordinates obtained by probing three non-collinear landmarks, first on an anatomical model and second on a healthy subject. Although abnormal movement of the scapula is related to shoulder impingement syndrome, it is not clearly understood whether scapular motion impairment is a predisposing factor. The characterization of three-dimensional scapular attitudes in planes and at joint angles for which sub-acromial impingement is more likely to occur is not known. The Optotrak probing system was used. An anatomical model of the scapula was built, which allowed us to impose scapular attitudes of known direction and magnitude. A local coordinate reference system was defined with three non-collinear anatomical landmarks to assess the accuracy and concurrent validity of the probing method against fixed markers. Axial rotation angles were calculated from a rotation matrix using a cyclic Cardanic sequence of rotations. The same three non-collinear body landmarks were digitized on one healthy subject, and the three-dimensional scapular attitudes obtained were compared between sessions in order to assess reliability. The measure of three-dimensional scapular attitudes calculated from Optotrak probing data was accurate, with means of the differences between imposed and calculated rotation angles ranging from 1.5 to 4.2 degrees. The greatest variations were observed around the third axis of the Cardanic sequence, associated with posterior-anterior transverse rotations. The mean difference between the Optotrak probing method and fixed markers was 1.73 degrees, showing good concurrent validity. Differences between the two methods were generally very low for one- and two-direction displacements, and the largest discrepancies were observed for imposed displacements combining movement about all three axes. The between-session variation of three-dimensional scapular attitudes was less than 10% for most of the arm positions adopted by a healthy subject, suggesting good reliability. The Optotrak probing system used with a standardized protocol leads to accurate, valid and reliable measures of scapular attitudes. Although abnormal range of motion of the scapula is often related to shoulder pathologies, reliable outcome measures to quantify three-dimensional scapular motion on subjects are not available. It is important to establish a standardized protocol to characterize three-dimensional scapular motion using a method whose accuracy and validity are known. The method used in the present study provides such a protocol and will now make it possible to verify to what extent scapular motion impairment is linked to the development of specific shoulder pathologies.
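
    A sketch of the core computation, building a local frame from three digitized non-collinear landmarks and decomposing the relative rotation into Cardan angles; the X-Y-Z order below is one plausible convention, not necessarily the study's exact cyclic sequence.

      # Local frame from landmarks, then Cardan angles of the relative rotation.
      import numpy as np

      def frame(p1, p2, p3):
          """Orthonormal local axes (columns) from three digitized points."""
          x = p2 - p1
          x = x / np.linalg.norm(x)
          z = np.cross(x, p3 - p1)
          z = z / np.linalg.norm(z)
          y = np.cross(z, x)
          return np.column_stack([x, y, z])

      def cardan_xyz(R):
          """Angles (deg) for the decomposition R = Rx(a) @ Ry(b) @ Rz(c)."""
          b = np.arcsin(np.clip(R[0, 2], -1.0, 1.0))
          c = np.arctan2(-R[0, 1], R[0, 0])
          a = np.arctan2(-R[1, 2], R[2, 2])
          return np.degrees([a, b, c])

      R_ref = frame(np.zeros(3), np.array([1.0, 0, 0]), np.array([0.0, 1, 0]))
      R_cur = frame(np.zeros(3), np.array([1.0, 0.1, 0]), np.array([0.0, 1, 0.1]))
      print(cardan_xyz(R_ref.T @ R_cur))  # attitude of current vs reference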

  16. Calculating stage duration statistics in multistage diseases.

    PubMed

    Komarova, Natalia L; Thalhauser, Craig J

    2011-01-01

    Many human diseases are characterized by multiple stages of progression. While the typical sequence of disease progression can be identified, there may be large individual variations among patients. Identifying mean stage durations and their variations is critical for statistical hypothesis testing needed to determine if treatment is having a significant effect on the progression, or if a new therapy is showing a delay of progression through a multistage disease. In this paper we focus on two methods for extracting stage duration statistics from longitudinal datasets: an extension of the linear regression technique, and a counting algorithm. Both are non-iterative, non-parametric and computationally cheap methods, which makes them invaluable tools for studying the epidemiology of diseases, with a goal of identifying different patterns of progression by using bioinformatics methodologies. Here we show that the regression method performs well for calculating the mean stage durations under a wide variety of assumptions, however, its generalization to variance calculations fails under realistic assumptions about the data collection procedure. On the other hand, the counting method yields reliable estimations for both means and variances of stage durations. Applications to Alzheimer disease progression are discussed.
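
    A schematic reading of the counting approach on toy longitudinal data with yearly visits (the authors' actual estimator handles censoring and data collection effects more carefully):

      # Counting estimator: visits spent in a stage approximate its duration.
      from collections import defaultdict
      import statistics

      patients = [              # disease stage recorded at each yearly visit
          [1, 1, 2, 2, 2, 3],
          [1, 2, 2, 3, 3, 3],
          [1, 1, 1, 2, 3, 3],
      ]

      durations = defaultdict(list)
      for visits in patients:
          for stage in set(visits):
              if stage != visits[-1]:          # final stage is right-censored
                  durations[stage].append(visits.count(stage))

      for stage in sorted(durations):
          obs = durations[stage]
          var = statistics.variance(obs) if len(obs) > 1 else 0.0
          print(f"stage {stage}: mean = {statistics.mean(obs):.2f} y, "
                f"variance = {var:.2f}")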

  17. Production Task Queue Optimization Based on Multi-Attribute Evaluation for Complex Product Assembly Workshop.

    PubMed

    Li, Lian-Hui; Mo, Rong

    2015-01-01

    The production task queue has great significance for manufacturing resource allocation and scheduling decisions. Manual, qualitative queue optimization performs poorly and is difficult to apply. A production task queue optimization method based on multi-attribute evaluation is therefore proposed. According to the task attributes, a hierarchical multi-attribute model is established and indicator quantization methods are given. To calculate the objective indicator weights, criteria importance through intercriteria correlation (CRITIC) is selected from three common methods. To calculate the subjective indicator weights, a BP neural network is used to determine the judges' importance degrees, and a trapezoid fuzzy scale-rough AHP taking these importance degrees into account is then put forward. The balanced weight, which integrates the objective and subjective weights, is calculated with a multi-weight contribution balance model. The technique for order preference by similarity to an ideal solution (TOPSIS), improved by replacing Euclidean distance with relative entropy distance, is used to sequence the tasks and optimize the queue by the weighted indicator values. A case study illustrates the method's correctness and feasibility.
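
    A compact sketch of the TOPSIS variant described above, with a relative-entropy (KL-style) divergence replacing Euclidean distance; the decision matrix and weights are illustrative, not from the case study.

      # TOPSIS ranking with relative-entropy distances to ideal solutions.
      import numpy as np

      X = np.array([[0.7, 0.5, 0.9],     # task A: three benefit indicators
                    [0.4, 0.8, 0.6],     # task B
                    [0.9, 0.3, 0.5]])    # task C
      w = np.array([0.5, 0.2, 0.3])      # balanced indicator weights

      V = w * (X / X.sum(axis=0))        # weighted, column-normalized matrix
      ideal, anti = V.max(axis=0), V.min(axis=0)

      def rel_entropy(p, q, eps=1e-12):
          """Generalized KL divergence summed over indicators."""
          p, q = p + eps, q + eps
          return float(np.sum(p * np.log(p / q) + q - p))

      d_plus = np.array([rel_entropy(ideal, v) for v in V])
      d_minus = np.array([rel_entropy(anti, v) for v in V])
      closeness = d_minus / (d_plus + d_minus)
      print("queue order:", np.argsort(-closeness))  # best task first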

  18. [Study of beta-turns in globular proteins].

    PubMed

    Amirova, S R; Milchevskiĭ, Iu V; Filatov, I V; Esipova, N G; Tumanian, V G

    2005-01-01

    The formation of beta-turns in globular proteins has been studied by the method of molecular mechanics. The statistical method of discriminant analysis was applied to calculated energy components and to the sequences of oligopeptide segments, after which type I beta-turns were predicted. The accuracy of true-positive prediction is 65%. The components of conformational energy that considerably affect beta-turn formation were delineated: torsional energy, hydrogen-bond energy, and van der Waals energy.

  1. Implied alignment: a synapomorphy-based multiple-sequence alignment method and its use in cladogram search

    NASA Technical Reports Server (NTRS)

    Wheeler, Ward C.

    2003-01-01

    A method to align sequence data based on parsimonious synapomorphy schemes generated by direct optimization (DO; earlier termed optimization alignment) is proposed. DO directly diagnoses sequence data on cladograms without an intervening multiple-alignment step, thereby creating topology-specific, dynamic homology statements. Hence, no multiple alignment is required to generate cladograms. Unlike general and globally optimal multiple-alignment procedures, the method described here, implied alignment (IA), takes these dynamic homologies and traces them back through a single cladogram, linking the unaligned sequence positions in the terminal taxa via DO transformation series. These "lines of correspondence" link ancestor-descendant states and, when displayed as linearly arrayed columns without hypothetical ancestors, are largely indistinguishable from standard multiple alignment. Since this method is based on synapomorphy, the treatment of certain classes of insertion-deletion (indel) events may be different from that of other alignment procedures. As with all alignment methods, results are dependent on parameter assumptions such as indel cost and transversion:transition ratios. Such an IA could be used as a basis for phylogenetic search, but this would be questionable, since the homologies derived from the implied alignment depend on its natal cladogram, and any variance between DO and IA + search is due to the heuristic approach. The utility of this procedure in heuristic cladogram searches using DO and the improvement of heuristic cladogram cost calculations are discussed. © 2003 The Willi Hennig Society. Published by Elsevier Science (USA). All rights reserved.

  2. Analysis of HIV Using a High Resolution Melting (HRM) Diversity Assay: Automation of HRM Data Analysis Enhances the Utility of the Assay for Analysis of HIV Incidence

    PubMed Central

    Cousins, Matthew M.; Swan, David; Magaret, Craig A.; Hoover, Donald R.; Eshleman, Susan H.

    2012-01-01

    Background HIV diversity may be a useful biomarker for discriminating between recent and non-recent HIV infection. The high resolution melting (HRM) diversity assay was developed to quantify HIV diversity in viral populations without sequencing. In this assay, HIV diversity is expressed as a single numeric HRM score that represents the width of a melting peak. HRM scores are highly associated with diversity measures obtained with next generation sequencing. In this report, a software package, the HRM Diversity Assay Analysis Tool (DivMelt), was developed to automate calculation of HRM scores from melting curve data. Methods DivMelt uses computational algorithms to calculate HRM scores by identifying the start (T1) and end (T2) melting temperatures for a DNA sample and subtracting them (T2–T1 = HRM score). DivMelt contains many user-supplied analysis parameters to allow analyses to be tailored to different contexts. DivMelt analysis options were optimized to discriminate between recent and non-recent HIV infection and to maximize HRM score reproducibility. HRM scores calculated using DivMelt were compared to HRM scores obtained using a manual method that is based on visual inspection of DNA melting curves. Results HRM scores generated with DivMelt agreed with manually generated HRM scores obtained from the same DNA melting data. Optimal parameters for discriminating between recent and non-recent HIV infection were identified. DivMelt provided greater discrimination between recent and non-recent HIV infection than the manual method. Conclusion DivMelt provides a rapid, accurate method of determining HRM scores from melting curve data, facilitating use of the HRM diversity assay for large-scale studies. PMID:23240016
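
    A schematic version of the score computation, with user-supplied thresholds standing in for DivMelt's analysis parameters and a synthetic melting curve:

      # HRM score = T2 - T1, the width of the melting transition.
      import numpy as np

      temps = np.linspace(75, 95, 201)                  # temperature axis, C
      fluor = 1 / (1 + np.exp((temps - 85) / 0.8))      # synthetic melt curve

      def hrm_score(temps, fluor, lo=0.05, hi=0.95):
          f = (fluor - fluor.min()) / (fluor.max() - fluor.min())
          t1 = temps[np.argmax(f < hi)]   # start of melting (T1)
          t2 = temps[np.argmax(f < lo)]   # end of melting (T2)
          return t2 - t1

      print(f"HRM score: {hrm_score(temps, fluor):.2f} C")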

  3. Determination of dipole coupling constants using heteronuclear multiple quantum NMR

    NASA Astrophysics Data System (ADS)

    Weitekamp, D. P.; Garbow, J. R.; Pines, A.

    1982-09-01

    The problem of extracting dipole couplings from a system of N spins I = 1/2 and one spin S by NMR techniques is analyzed. The resolution attainable using a variety of single quantum methods is reviewed. The theory of heteronuclear multiple quantum (HMQ) NMR is developed, with particular emphasis being placed on the superior resolution available in HMQ spectra. Several novel pulse sequences are introduced, including a two-step method for the excitation of HMQ coherence. Experiments on partially oriented [1-13C] benzene demonstrate the excitation of the necessary HMQ coherence and illustrate the calculation of relative line intensities. Spectra of high order HMQ coherence under several different effective Hamiltonians achievable by multiple pulse sequences are discussed. A new effective Hamiltonian, scalar heteronuclear recoupled interactions by multiple pulse (SHRIMP), achieved by the simultaneous irradiation of both spin species with the same multiple pulse sequence, is introduced. Experiments are described which allow heteronuclear couplings to be correlated with an S-spin spreading parameter in spectra free of inhomogeneous broadening.

  4. Novel multiplex qualitative detection using universal primer-multiplex-PCR combined with pyrosequencing.

    PubMed

    Shang, Ying; Xu, Wentao; Wang, Yong; Xu, Yuancong; Huang, Kunlun

    2017-12-15

    This study describes a novel multiplex qualitative detection method using pyrosequencing. Based on the principle of universal-primer multiplex PCR, only one sequencing primer is employed to realize the detection of multiple targets. Samples containing three genetically modified (GM) crops in different proportions were used to validate the method. The dNTP dispensing order was designed based on the product sequences. Only 12 rounds (ATCTGATCGACT) of dNTP addition, and often as few as three rounds (CAT) under ideal conditions, were required to detect the GM events qualitatively, and sensitivity was as low as 1% of a mixture. However, when considering a mixture, calculating signal values allowed the proportion of each GM event to be estimated. Based on these results, we conclude that the novel method not only realizes qualitative detection but also allows semi-quantitative detection of individual events. Copyright © 2017. Published by Elsevier Ltd.

  5. Large-Scale Biomonitoring of Remote and Threatened Ecosystems via High-Throughput Sequencing

    PubMed Central

    Gibson, Joel F.; Shokralla, Shadi; Curry, Colin; Baird, Donald J.; Monk, Wendy A.; King, Ian; Hajibabaei, Mehrdad

    2015-01-01

    Biodiversity metrics are critical for assessment and monitoring of ecosystems threatened by anthropogenic stressors. Existing sorting and identification methods are too expensive and labour-intensive to be scaled up to meet management needs. Alternately, a high-throughput DNA sequencing approach could be used to determine biodiversity metrics from bulk environmental samples collected as part of a large-scale biomonitoring program. Here we show that both morphological and DNA sequence-based analyses are suitable for recovery of individual taxonomic richness, estimation of proportional abundance, and calculation of biodiversity metrics using a set of 24 benthic samples collected in the Peace-Athabasca Delta region of Canada. The high-throughput sequencing approach was able to recover all metrics with a higher degree of taxonomic resolution than morphological analysis. The reduced cost and increased capacity of DNA sequence-based approaches will finally allow environmental monitoring programs to operate at the geographical and temporal scale required by industrial and regulatory end-users. PMID:26488407

  6. The August 2011 Virginia and Colorado Earthquake Sequences: Does Stress Drop Depend on Strain Rate?

    NASA Astrophysics Data System (ADS)

    Abercrombie, R. E.; Viegas, G.

    2011-12-01

    Our preliminary analysis of the August 2011 Virginia earthquake sequence finds the earthquakes to have high stress drops, similar to those of recent earthquakes in the NE USA, while those of the August 2011 Trinidad, Colorado, earthquakes are moderate, in between values typical of interplate events (California) and those of the east coast. These earthquakes provide an unprecedented opportunity to study such source differences in detail, and hence improve our estimates of seismic hazard. Previously, the lack of well-recorded earthquakes in the eastern USA severely limited our resolution of the source processes and hence the expected ground accelerations. Our preliminary findings are consistent with the idea that earthquake faults strengthen during longer recurrence times and that intraplate faults fail at higher stress (and produce higher ground accelerations) than their interplate counterparts. We use the empirical Green's function (EGF) method to calculate source parameters for the Virginia mainshock and three larger aftershocks, and for the Trinidad mainshock and two larger foreshocks, using IRIS-available stations. We select time windows around the direct P and S waves at the closest stations and calculate spectral ratios and source time functions using the multi-taper spectral approach (e.g., Viegas et al., JGR 2010). Our preliminary results show that the Virginia sequence has high stress drops (~100-200 MPa, using the Madariaga (1976) model), and the Colorado sequence has moderate stress drops (~20 MPa). These numbers are consistent with previous work in the regions, for example the Au Sable Forks (2002) earthquake and the 2010 Germantown (MD) earthquake. We also calculate the radiated seismic energy and find the energy/moment ratio to be high for the Virginia earthquakes and moderate for the Colorado sequence. We observe no evidence of a breakdown in constant stress drop scaling in this limited number of earthquakes. We extend our analysis to a larger number of earthquakes and stations. We calculate uncertainties in all our measurements, and also consider carefully the effects of variation in available bandwidth in order to improve our constraints on the source parameters.
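
    For context, spectral-ratio studies of this kind typically convert a corner frequency into a source radius and stress drop through the standard circular-crack relations (in LaTeX notation; k ≈ 0.21 for S-wave corner frequencies in the Madariaga (1976) model, with β the shear-wave speed):

      r = \frac{k\,\beta}{f_c}, \qquad \Delta\sigma = \frac{7}{16}\,\frac{M_0}{r^{3}}

    With these relations, a higher corner frequency at fixed seismic moment implies a smaller source radius and hence a larger stress drop.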

  7. PyEvolve: a toolkit for statistical modelling of molecular evolution.

    PubMed

    Butterfield, Andrew; Vedagiri, Vivek; Lang, Edward; Lawrence, Cath; Wakefield, Matthew J; Isaev, Alexander; Huttley, Gavin A

    2004-01-05

    Examining the distribution of variation has proven an extremely profitable technique in the effort to identify sequences of biological significance. Most approaches in the field, however, evaluate only the conserved portions of sequences, ignoring the biological significance of sequence differences. A suite of sophisticated likelihood-based statistical models from the field of molecular evolution provides the basis for extracting the information from the full distribution of sequence variation. The number of different problems to which phylogeny-based maximum likelihood calculations can be applied is extensive. Available software packages that can perform likelihood calculations suffer from a lack of flexibility and scalability, or employ error-prone approaches to model parameterisation. Here we describe the implementation of PyEvolve, a toolkit for the application of existing, and development of new, statistical methods for molecular evolution. We present the object architecture and design schema of PyEvolve, which includes an adaptable multi-level parallelisation schema. The approach for defining new methods is illustrated by implementing a novel dinucleotide model of substitution that includes a parameter for mutation of methylated CpGs, which required 8 lines of standard Python code to define. Benchmarking was performed using either a dinucleotide or codon substitution model applied to an alignment of BRCA1 sequences from 20 mammals, or a 10 species subset. Up to five-fold parallel performance gains over serial were recorded. Compared to leading alternative software, PyEvolve exhibited significantly better real-world performance for parameter-rich models with a large data set, reducing the time required for optimisation from approximately 10 days to approximately 6 hours. PyEvolve provides flexible functionality that can be used either for statistical modelling of molecular evolution, or the development of new methods in the field. The toolkit can be used interactively or by writing and executing scripts. The toolkit uses efficient processes for specifying the parameterisation of statistical models, and implements numerous optimisations that make highly parameter-rich likelihood functions solvable within hours on multi-CPU hardware. PyEvolve can be readily adapted in response to changing computational demands and hardware configurations to maximise performance. PyEvolve is released under the GPL and can be downloaded from http://cbis.anu.edu.au/software.

  8. TaxI: a software tool for DNA barcoding using distance methods

    PubMed Central

    Steinke, Dirk; Vences, Miguel; Salzburger, Walter; Meyer, Axel

    2005-01-01

    DNA barcoding is a promising approach to the diagnosis of biological diversity in which DNA sequences serve as the primary key for information retrieval. Most existing software for evolutionary analysis of DNA sequences was designed for phylogenetic analyses and, hence, those algorithms do not offer appropriate solutions for the rapid but precise analyses needed for DNA barcoding, and are also unable to process the often large comparative datasets. We developed a flexible software tool for DNA taxonomy, named TaxI. This program calculates sequence divergences between a query sequence (the taxon to be barcoded) and each sequence of a dataset of reference sequences defined by the user. Because the analysis is based on separate pairwise alignments, this software is also able to work with sequences characterized by multiple insertions and deletions that are difficult to align in large sequence sets (i.e. thousands of sequences) by multiple-alignment algorithms because of computational restrictions. Here, we demonstrate the utility of this approach with two datasets of fish larvae and juveniles from Lake Constance and juvenile land snails under different models of sequence evolution. Sets of ribosomal 16S rRNA sequences, characterized by multiple indels, performed as well as or better than cox1 sequence sets in assigning sequences to species, demonstrating the suitability of rRNA genes for DNA barcoding. PMID:16214755

  9. Genome Survey Sequencing of Luffa Cylindrica L. and Microsatellite High Resolution Melting (SSR-HRM) Analysis for Genetic Relationship of Luffa Genotypes.

    PubMed

    An, Jianyu; Yin, Mengqi; Zhang, Qin; Gong, Dongting; Jia, Xiaowen; Guan, Yajing; Hu, Jin

    2017-09-11

    Luffa cylindrica (L.) Roem. is an economically important vegetable crop in China. However, genomic information on this species has been lacking. In this study, for the first time, a genome survey of L. cylindrica was carried out using next-generation sequencing (NGS) technology. In total, 43.40 Gb of sequence data, about 54.94× coverage of the estimated 789.97 Mb genome size, were obtained from HiSeq 2500 sequencing, and the guanine plus cytosine (GC) content was calculated to be 37.90%. The heterozygosity of the genome sequences was only 0.24%. In total, 1,913,731 contigs (>200 bp) with an N50 length of 525 bp and 1,410,117 scaffolds (>200 bp) with a total length of 885.01 Mb were obtained. From the initially assembled L. cylindrica genome, 431,234 microsatellites (SSRs) (≥5 repeats) were identified. The motif types of the SSR repeats comprised 62.88% di-nucleotide, 31.03% tri-nucleotide, 4.59% tetra-nucleotide, 0.96% penta-nucleotide and 0.54% hexa-nucleotide repeats. Eighty genomic SSR markers were developed, and 51/80 primers could be used in both "Zheda 23" and "Zheda 83". Nineteen SSRs were used to investigate the genetic diversity among 32 accessions through SSR-HRM analysis. The unweighted pair group method (UPGMA) dendrogram was built from the SSR-HRM raw data. SSR-HRM can be effectively used for genotype relationship analysis of Luffa species.
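
    Two of the calculations mentioned above, GC content and a microsatellite scan, can be sketched in a few lines (a naive scan that reports overlapping hits, not the pipeline actually used in the study):

      # GC content plus a regex-based SSR scan for 2-6 bp motifs repeated >= 5x.
      import re

      def gc_content(seq):
          seq = seq.upper()
          return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

      def find_ssrs(seq, min_repeats=5):
          """Yield (motif, repeat_count, start); overlapping hits included."""
          pattern = re.compile(r"(?=(([ACGT]{2,6}?)\2{%d,}))" % (min_repeats - 1))
          for m in pattern.finditer(seq.upper()):
              yield m.group(2), len(m.group(1)) // len(m.group(2)), m.start()

      contig = "TTACACACACACACGGATCTGATGATGATGATGATGCC"
      print(f"GC = {gc_content(contig):.1f}%")
      print(list(find_ssrs(contig)))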

  10. Automated use of mutagenesis data in structure prediction.

    PubMed

    Nanda, Vikas; DeGrado, William F

    2005-05-15

    In the absence of experimental structural determination, numerous methods are available to indirectly predict or probe the structure of a target molecule. Genetic modification of a protein sequence is a powerful tool for identifying key residues involved in binding reactions or protein stability. Mutagenesis data is usually incorporated into the modeling process either through manual inspection of model compatibility with empirical data, or through the generation of geometric constraints linking sensitive residues to a binding interface. We present an approach derived from statistical studies of lattice models for introducing mutation information directly into the fitness score. The approach takes into account the phenotype of mutation (neutral or disruptive) and calculates the energy for a given structure over an ensemble of sequences. The structure prediction procedure searches for the optimal conformation where neutral sequences either have no impact or improve stability and disruptive sequences reduce stability relative to wild type. We examine three types of sequence ensembles: information from saturation mutagenesis, scanning mutagenesis, and homologous proteins. Incorporating multiple sequences into a statistical ensemble serves to energetically separate the native state and misfolded structures. As a result, the prediction of structure with a poor force field is sufficiently enhanced by mutational information to improve accuracy. Furthermore, by separating misfolded conformations from the target score, the ensemble energy serves to speed up conformational search algorithms such as Monte Carlo-based methods. Copyright 2005 Wiley-Liss, Inc.

  11. QQ-SNV: single nucleotide variant detection at low frequency by comparing the quality quantiles.

    PubMed

    Van der Borght, Koen; Thys, Kim; Wetzels, Yves; Clement, Lieven; Verbist, Bie; Reumers, Joke; van Vlijmen, Herman; Aerssens, Jeroen

    2015-11-10

    Next generation sequencing enables studying heterogeneous populations of viral infections. When the sequencing is done at high coverage depth ("deep sequencing"), low frequency variants can be detected. Here we present QQ-SNV (http://sourceforge.net/projects/qqsnv), a logistic regression classifier model developed for the Illumina sequencing platforms that uses the quantiles of the quality scores, to distinguish true single nucleotide variants from sequencing errors based on the estimated SNV probability. To train the model, we created a dataset of an in silico mixture of five HIV-1 plasmids. Testing of our method in comparison to the existing methods LoFreq, ShoRAH, and V-Phaser 2 was performed on two HIV and four HCV plasmid mixture datasets and one influenza H1N1 clinical dataset. For default application of QQ-SNV, variants were called using a SNV probability cutoff of 0.5 (QQ-SNV(D)). To improve the sensitivity we used a SNV probability cutoff of 0.0001 (QQ-SNV(HS)). To also increase specificity, SNVs called were overruled when their frequency was below the 80th percentile calculated on the distribution of error frequencies (QQ-SNV(HS-P80)). When comparing QQ-SNV versus the other methods on the plasmid mixture test sets, QQ-SNV(D) performed similarly to the existing approaches. QQ-SNV(HS) was more sensitive on all test sets but with more false positives. QQ-SNV(HS-P80) was found to be the most accurate method over all test sets by balancing sensitivity and specificity. When applied to a paired-end HCV sequencing study, with lowest spiked-in true frequency of 0.5%, QQ-SNV(HS-P80) revealed a sensitivity of 100% (vs. 40-60% for the existing methods) and a specificity of 100% (vs. 98.0-99.7% for the existing methods). In addition, QQ-SNV required the least overall computation time to process the test sets. Finally, when testing on a clinical sample, four putative true variants with frequency below 0.5% were consistently detected by QQ-SNV(HS-P80) from different generations of Illumina sequencers. We developed and successfully evaluated a novel method, called QQ-SNV, for highly efficient single nucleotide variant calling on Illumina deep sequencing virology data.
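
    A conceptual sketch of the classifier, featurizing each candidate variant by quantiles of its supporting base-quality scores and fitting a logistic regression; the training labels and quality distributions here are synthetic, not the HIV plasmid data.

      # Quality-quantile features + logistic regression for SNV calling.
      import numpy as np
      from sklearn.linear_model import LogisticRegression

      rng = np.random.default_rng(7)

      def quality_quantiles(quals):
          return np.quantile(quals, [0.1, 0.25, 0.5, 0.75, 0.9])

      # True SNVs: high supporting qualities; errors: low-quality tail
      true_snvs = [quality_quantiles(rng.normal(37, 2, 50)) for _ in range(200)]
      errors = [quality_quantiles(rng.normal(28, 6, 50)) for _ in range(200)]
      X = np.vstack([true_snvs, errors])
      y = np.array([1] * 200 + [0] * 200)

      clf = LogisticRegression().fit(X, y)
      candidate = quality_quantiles(rng.normal(36, 3, 50)).reshape(1, -1)
      p = clf.predict_proba(candidate)[0, 1]
      print(f"SNV probability = {p:.3f} ->", "call" if p >= 0.5 else "reject")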

  12. An efficient method for the prediction of deleterious multiple-point mutations in the secondary structure of RNAs using suboptimal folding solutions

    PubMed Central

    Churkin, Alexander; Barash, Danny

    2008-01-01

    Background: RNAmute is an interactive Java application which, given an RNA sequence, calculates the secondary structure of all single-point mutations and organizes them into categories according to their similarity to the predicted structure of the wild type. The secondary structure predictions are performed using the Vienna RNA package. A more efficient implementation of RNAmute is needed, however, to extend from the case of single-point mutations to the general case of multiple-point mutations, which may often be desired for computational predictions alongside mutagenesis experiments. But analyzing multiple-point mutations, a process that requires traversing all possible mutations, becomes highly expensive since the running time is O(n^m) for a sequence of length n with m-point mutations. Using Vienna's RNAsubopt, we present a method that selects only those mutations, based on stability considerations, which are likely to be conformationally rearranging. The approach is best examined using the dot plot representation of RNA secondary structure. Results: Using RNAsubopt, the suboptimal solutions for a given wild-type sequence are calculated once. Then, specific mutations are selected that are most likely to cause a conformational rearrangement. For an RNA sequence of about 100 nt and 3-point mutations (n = 100, m = 3), for example, the proposed method reduces the running time from several hours or even days to several minutes, thus enabling the practical application of RNAmute to the analysis of multiple-point mutations. Conclusion: A highly efficient addition to RNAmute that is as user friendly as the original application but that facilitates the practical analysis of multiple-point mutations is presented. Such an extension can now be exploited prior to site-directed mutagenesis experiments by virologists, for example, who investigate the change of function in an RNA virus via mutations that disrupt important motifs in its secondary structure. A complete explanation of the application, called MultiRNAmute, is available at [1]. PMID:18445289

  13. Expanding the 2011 Prague, OK Event Catalog: Detections, Relocations, and Stress Drop Estimates

    NASA Astrophysics Data System (ADS)

    Clerc, F.; Cochran, E. S.; Dougherty, S. L.; Keranen, K. M.; Harrington, R. M.

    2016-12-01

    The Mw 5.6 earthquake occurring on 6 Nov. 2011, near Prague, OK, is thought to have been triggered by a Mw 4.8 foreshock, which was likely induced by fluid injection into local wastewater disposal wells [Keranen et al., 2013; Sumy et al., 2014]. Previous stress drop estimates for the sequence have suggested values lower than those for most Central and Eastern U.S. tectonic events of similar magnitudes [Hough, 2014; Sun & Hartzell, 2014; Sumy & Neighbors et al., 2016]. Better stress drop estimates allow more realistic assessment of seismic hazard and more effective regulation of wastewater injection. More reliable estimates of source properties may help to differentiate induced events from natural ones. Using data from local and regional networks, we perform event detections, relocations, and stress drop calculations of the Prague aftershock sequence. We use the Match & Locate method, a variation on the matched-filter method which detects events of lower magnitudes by stacking cross-correlograms from different stations [Zhang & Wen, 2013; 2015], in order to create a more complete catalog from 6 Nov to 31 Dec 2011. We then relocate the detected events using the HypoDD double-difference algorithm. Using our enhanced catalog and relocations, we examine the seismicity distribution for evidence of migration and investigate implications for triggering mechanisms. To account for path and site effects, we calculate stress drops using the Empirical Green's Function (EGF) spectral ratio method, beginning with 2730 previously relocated events. We determine whether there is a correlation between the stress drop magnitudes and the spatial and temporal distribution of events, including depth, position relative to existing faults, and proximity to injection wells. Finally, we consider the range of stress drop values and scaling with respect to event magnitudes within the context of previously published work for the Prague sequence as well as other induced and natural sequences.

  14. Enhanced sampling simulations of DNA step parameters.

    PubMed

    Karolak, Aleksandra; van der Vaart, Arjan

    2014-12-15

    A novel approach for the selection of step parameters as reaction coordinates in enhanced sampling simulations of DNA is presented. The method uses three atoms per base and does not require coordinate overlays or idealized base pairs. This allowed for a highly efficient implementation of the calculation of all step parameters and their Cartesian derivatives in molecular dynamics simulations. Good correlation between the calculated and actual twist, roll, tilt, shift, and slide parameters is obtained, while the correlation with rise is modest. The method is illustrated by its application to the methylated and unmethylated 5'-CATGTGACGTCACATG-3' double stranded DNA sequence. One-dimensional umbrella simulations indicate that the flexibility of the central CG step is only marginally affected by methylation. © 2014 Wiley Periodicals, Inc.

  15. Computational protein design: validation and possible relevance as a tool for homology searching and fold recognition.

    PubMed

    Schmidt Am Busch, Marcel; Sedano, Audrey; Simonson, Thomas

    2010-05-05

    Protein fold recognition usually relies on a statistical model of each fold; each model is constructed from an ensemble of natural sequences belonging to that fold. A complementary strategy may be to employ sequence ensembles produced by computational protein design. Designed sequences can be more diverse than natural sequences, possibly avoiding some limitations of experimental databases. We explore this strategy for four SCOP families: small Kunitz-type inhibitors (SKIs), interleukin-8 chemokines, PDZ domains, and large caspase catalytic subunits, represented by 43 structures. An automated procedure is used to redesign the 43 proteins. We use the experimental backbones as fixed templates in the folded state and a molecular mechanics model to compute the interaction energies between sidechain and backbone groups. Calculations are done with the Proteins@Home volunteer computing platform. A heuristic algorithm is used to scan the sequence and conformational space, yielding 200,000-300,000 sequences per backbone template. The results confirm and generalize our earlier study of SH2 and SH3 domains. The designed sequences resemble moderately distant, natural homologues of the initial templates; e.g., the SUPERFAMILY profile hidden Markov model library recognizes 85% of the low-energy sequences as native-like. Conversely, position-specific scoring matrices derived from the sequences can be used to detect natural homologues within the SwissProt database: 60% of known PDZ domains are detected, and around 90% of known SKIs and chemokines. Energy components and inter-residue correlations are analyzed, and ways to improve the method are discussed. For some families, designed sequences can be a useful complement to experimental ones for homologue searching. However, improved tools are needed to extract more information from the designed profiles before the method can be of general use.

  16. Connection method of separated luminal regions of intestine from CT volumes

    NASA Astrophysics Data System (ADS)

    Oda, Masahiro; Kitasaka, Takayuki; Furukawa, Kazuhiro; Watanabe, Osamu; Ando, Takafumi; Hirooka, Yoshiki; Goto, Hidemi; Mori, Kensaku

    2015-03-01

    This paper proposes a method for connecting separated luminal regions of the intestine for Crohn's disease diagnosis. Crohn's disease is an inflammatory disease of the digestive tract. Capsule or conventional endoscopy is performed for its diagnosis; however, parts of the intestines may not be observed endoscopically if intestinal stenosis occurs, because endoscopes cannot pass through the stenosed parts. CT image-based diagnosis has been developed as an alternative for Crohn's disease, since it enables physicians to observe the entire intestines even if stenosed parts exist. CAD systems for Crohn's disease using CT volumes have recently been developed. Such CAD systems need to reconstruct the separated luminal regions of the intestines before analyzing them. We propose a method for connecting separated luminal regions of the intestines segmented from CT volumes. The luminal regions are segmented from a CT volume, and their centerlines are calculated using a thinning process. We enumerate all possible sequences of the centerline segments. In this work, we newly introduce a condition on the distance between the connected end points of the centerline segments. This condition eliminates unnatural connections of the centerline segments and also reduces processing time. After generating a sequence list of the centerline segments, the correct sequence is obtained using an evaluation function, and we connect the luminal regions based on that sequence. Our experiments using four CT volumes showed that our method connected 6.5 out of 8.0 centerline segments per case, with processing times reduced compared with the previous method.
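    A minimal sketch of the segment-ordering idea: enumerate candidate orderings and discard those violating an end-point distance condition. The function name and the 30 mm threshold are hypothetical, and the study's evaluation function and possible segment reversals are not modeled here.

    ```python
    from itertools import permutations
    import numpy as np

    def plausible_sequences(segments, max_gap=30.0):
        """Enumerate orderings of centerline segments, keeping only those in
        which the gap between the end of one segment and the start of the next
        stays below max_gap (an assumed threshold, in mm). Each segment is an
        (N, 3) array of ordered centerline points."""
        keep = []
        for order in permutations(range(len(segments))):
            if all(np.linalg.norm(segments[a][-1] - segments[b][0]) <= max_gap
                   for a, b in zip(order, order[1:])):
                keep.append(order)
        return keep  # candidates would then be ranked with an evaluation function
    ```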

  17. Easy-to-use phylogenetic analysis system for hepatitis B virus infection.

    PubMed

    Sugiyama, Masaya; Inui, Ayano; Shin-I, Tadasu; Komatsu, Haruki; Mukaide, Motokazu; Masaki, Naohiko; Murata, Kazumoto; Ito, Kiyoaki; Nakanishi, Makoto; Fujisawa, Tomoo; Mizokami, Masashi

    2011-10-01

    Molecular phylogenetic analysis has been broadly applied in clinical and virological studies; however, the appropriate settings and application of calculation parameters are difficult for non-specialists in molecular genetics. In the present study, a phylogenetic analysis tool was developed for easy determination of genotypes and transmission routes. A total of 23 patients from 10 families infected with hepatitis B virus (HBV), in whom intrafamilial transmission was suspected, were enrolled. The extracted HBV DNA was amplified and sequenced in a region of the S gene. Software to automatically classify query sequences was constructed and installed on the Hepatitis Virus Database (HVDB), and reference sequences covering the major genotypes A to H were retrieved from HVDB. Multiple alignments using CLUSTAL W were performed before the genetic distance matrix was calculated with the six-parameter method, and the phylogenetic tree was output by the neighbor-joining method. A browser-based user interface was also developed for intuitive control. This system was named the easy-to-use phylogenetic analysis system (E-PAS). Twenty-three sera from the 10 families were analyzed to evaluate E-PAS. The queries obtained from nine families were genotype C and were located in one cluster per family. However, one patient was classified into a cluster different from that of her family, suggesting that E-PAS detected a sample distinct from her family on the transmission route. Because the only requisite material is sequence data, E-PAS could be extended to determine HBV genotypes as well as transmission routes. © 2011 The Japan Society of Hepatology.
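    The tree-building step can be illustrated with Biopython's neighbor-joining constructor. This is not the E-PAS/HVDB implementation, and the distance values and sequence names below are placeholders only.

    ```python
    from Bio import Phylo
    from Bio.Phylo.TreeConstruction import DistanceMatrix, DistanceTreeConstructor

    # Toy lower-triangular distance matrix for three patient S-gene sequences
    # and one genotype reference; the values are placeholders, not study data.
    dm = DistanceMatrix(
        names=["patient1", "patient2", "patient3", "genotypeC_ref"],
        matrix=[[0], [0.02, 0], [0.09, 0.08, 0], [0.05, 0.04, 0.10, 0]],
    )
    tree = DistanceTreeConstructor().nj(dm)  # neighbor-joining
    Phylo.draw_ascii(tree)
    ```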

  18. [Identification of Tibetan medicine "Dida" of Gentianaceae using DNA barcoding].

    PubMed

    Liu, Chuan; Zhang, Yu-Xin; Liu, Yue; Chen, Yi-Long; Fan, Gang; Xiang, Li; Xu, Jiang; Zhang, Yi

    2016-02-01

    The ITS2 barcode was used to identify the Tibetan medicine "Dida" and to secure its quality and safety in medication. ITS2 sequences were amplified from 151 experimental samples covering 13 species of Gentianaceae (Swertia, Halenia, Gentianopsis, Comastoma, and Lomatogonium) collected from the Tibetan Plateau, and the purified PCR products were sequenced. Sequence assembly and consensus sequence generation were performed using CodonCode Aligner V3.7.1. The Kimura 2-parameter (K2P) distances were calculated using MEGA 6.0, and neighbor-joining (NJ) phylogenetic trees were constructed. After alignment, all ITS2 sequences yielded 31 haplotypes of 231 bp, with an average GC content of 61.40%. The NJ tree strongly supported that every species clustered into its own clade with a high identification success rate, except that Swertia bifolia and Swertia wolfangiana could not be distinguished from each other based on the sequence divergences. DNA barcoding could be used as a fast and accurate identification method for the Tibetan medicine "Dida" to ensure its safe use. Copyright© by the Chinese Pharmaceutical Association.
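    The K2P distance used here has a closed form. The sketch below is a minimal implementation for two aligned sequences (what MEGA computes internally); the toy sequences are illustrative.

    ```python
    import math

    def k2p_distance(seq1, seq2):
        """Kimura 2-parameter distance between two aligned sequences.
        P and Q are the proportions of sites showing transitions and
        transversions; gaps and ambiguous sites are skipped."""
        transitions = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}
        p = q = n = 0
        for a, b in zip(seq1.upper(), seq2.upper()):
            if a not in "ACGT" or b not in "ACGT":
                continue
            n += 1
            if a != b:
                if (a, b) in transitions:
                    p += 1
                else:
                    q += 1
        P, Q = p / n, q / n
        return -0.5 * math.log((1 - 2 * P - Q) * math.sqrt(1 - 2 * Q))

    print(k2p_distance("ACGTACGTAC", "ACGTGCGTAT"))  # two transitions -> ~0.255
    ```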

  19. Prediction of protein structural classes by recurrence quantification analysis based on chaos game representation.

    PubMed

    Yang, Jian-Yi; Peng, Zhen-Ling; Yu, Zu-Guo; Zhang, Rui-Jie; Anh, Vo; Wang, Desheng

    2009-04-21

    In this paper, we intend to predict protein structural classes (alpha, beta, alpha+beta, or alpha/beta) for low-homology data sets. Two widely used data sets were employed: 1189 (containing 1092 proteins) and 25PDB (containing 1673 proteins), with sequence homology of 40% and 25%, respectively. We propose to decompose the chaos game representation of proteins into two kinds of time series. Then, a novel and powerful nonlinear analysis technique, recurrence quantification analysis (RQA), is applied to analyze these time series. For a given protein sequence, a total of 16 characteristic parameters can be calculated with RQA, which are treated as a feature representation of the protein sequence. Based on such feature representation, the structural class for each protein is predicted with Fisher's linear discriminant algorithm. The jackknife test is used to test and compare our method with other existing methods. The overall accuracies with the step-by-step procedure are 65.8% and 64.2% for the 1189 and 25PDB data sets, respectively. With the widely used one-against-others procedure, we compared our method with five other existing methods; in particular, the overall accuracies of our method are 6.3% and 4.1% higher for the two data sets, respectively. Furthermore, only 16 parameters are used in our method, fewer than in other methods. This suggests that the current method may play a complementary role to the existing methods and is promising for the prediction of protein structural classes.
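    The decomposition step can be sketched as follows. One common chaos game construction places the 20 amino acids on a unit circle and iterates halfway toward each residue's vertex; the x and y coordinates then form the two time series fed to RQA. The exact mapping used in the paper may differ, so treat this as an assumed construction.

    ```python
    import numpy as np

    AA = "ACDEFGHIKLMNPQRSTVWY"
    angles = 2 * np.pi * np.arange(20) / 20
    VERTS = {a: np.array([np.cos(t), np.sin(t)]) for a, t in zip(AA, angles)}

    def cgr_time_series(seq):
        """Chaos game representation: iterate halfway toward each residue's
        vertex, yielding coordinate time series x(t), y(t) for RQA."""
        pt = np.zeros(2)
        xs, ys = [], []
        for a in seq:
            pt = (pt + VERTS[a]) / 2.0
            xs.append(pt[0])
            ys.append(pt[1])
        return np.array(xs), np.array(ys)

    x, y = cgr_time_series("MKTAYIAKQR")
    ```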

  20. Investigation of Post-mortem Tissue Effects Using Long-time Decorrelation Ultrasound

    NASA Astrophysics Data System (ADS)

    Csány, Gergely; Balogh, Lajos; Gyöngy, Miklós

    Decorrelation ultrasound is being increasingly used to investigate long-term biological phenomena. In the current work, ultrasound image sequences of mice that did not survive anesthesia (in a separate investigation) were analyzed, and post-mortem tissue effects were observed via decorrelation calculation. A method was developed to obtain a quantitative parameter characterizing the rate of decorrelation. The results show that ultrasound decorrelation imaging is an effective method for observing post-mortem tissue effects and point to further studies elucidating the mechanism behind these effects.
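    One plausible quantitative summary of decorrelation is an exponential decay rate fitted to frame-to-reference correlations; the sketch below makes that assumption, which may differ from the parameter actually used in the study.

    ```python
    import numpy as np

    def decorrelation_rate(frames, dt):
        """Fit an exponential decay rate to the correlation between the first
        frame and each later frame. frames: (T, H, W) array; dt: frame interval."""
        ref = frames[0].ravel()
        ref = (ref - ref.mean()) / ref.std()
        corr = []
        for f in frames:
            g = f.ravel()
            g = (g - g.mean()) / g.std()
            corr.append(np.mean(ref * g))
        corr = np.clip(np.array(corr), 1e-6, None)
        t = dt * np.arange(len(frames))
        slope, _ = np.polyfit(t, np.log(corr), 1)  # log-linear fit
        return -slope
    ```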

  1. SIMAP--a comprehensive database of pre-calculated protein sequence similarities, domains, annotations and clusters.

    PubMed

    Rattei, Thomas; Tischler, Patrick; Götz, Stefan; Jehl, Marc-André; Hoser, Jonathan; Arnold, Roland; Conesa, Ana; Mewes, Hans-Werner

    2010-01-01

    The prediction of protein function as well as the reconstruction of evolutionary genesis employing sequence comparison at large is still the most powerful tool in sequence analysis. Due to the exponential growth of the number of known protein sequences and the subsequent quadratic growth of the similarity matrix, the computation of the Similarity Matrix of Proteins (SIMAP) becomes a computationally intensive task. The SIMAP database provides a comprehensive and up-to-date pre-calculation of the protein sequence similarity matrix, sequence-based features and sequence clusters. As of September 2009, SIMAP covers 48 million proteins and more than 23 million non-redundant sequences. Novel features of SIMAP include the expansion of the sequence space by including databases such as ENSEMBL as well as the integration of metagenomes based on their consistent processing and annotation. Furthermore, protein function predictions by Blast2GO are pre-calculated for all sequences in SIMAP and the data access and query functions have been improved. SIMAP assists biologists to query the up-to-date sequence space systematically and facilitates large-scale downstream projects in computational biology. Access to SIMAP is freely provided through the web portal for individuals (http://mips.gsf.de/simap/) and for programmatic access through DAS (http://webclu.bio.wzw.tum.de/das/) and Web-Service (http://mips.gsf.de/webservices/services/SimapService2.0?wsdl).

  2. Use of life course work-family profiles to predict mortality risk among US women.

    PubMed

    Sabbath, Erika L; Guevara, Ivan Mejía; Glymour, M Maria; Berkman, Lisa F

    2015-04-01

    We examined relationships between US women's exposure to midlife work-family demands and subsequent mortality risk. We used data from women born 1935 to 1956 in the Health and Retirement Study to calculate employment, marital, and parenthood statuses for each age between 16 and 50 years. We used sequence analysis to identify 7 prototypical work-family trajectories. We calculated age-standardized mortality rates and hazard ratios (HRs) for mortality associated with work-family sequences, with adjustment for covariates and potentially explanatory later-life factors. Married women staying home with children briefly before reentering the workforce had the lowest mortality rates. In comparison, after adjustment for age, race/ethnicity, and education, HRs for mortality were 2.14 (95% confidence interval [CI] = 1.58, 2.90) among single nonworking mothers, 1.48 (95% CI = 1.06, 1.98) among single working mothers, and 1.36 (95% CI = 1.02, 1.80) among married nonworking mothers. Adjustment for later-life behavioral and economic factors partially attenuated risks. Sequence analysis is a promising exposure assessment tool for life course research. This method permitted identification of certain lifetime work-family profiles associated with mortality risk before age 75 years.

  3. Characterizing protein conformations by correlation analysis of coarse-grained contact matrices.

    PubMed

    Lindsay, Richard J; Siess, Jan; Lohry, David P; McGee, Trevor S; Ritchie, Jordan S; Johnson, Quentin R; Shen, Tongye

    2018-01-14

    We have developed a method to capture the essential conformational dynamics of folded biopolymers using statistical analysis of coarse-grained segment-segment contacts. Previously, the residue-residue contact analysis of simulation trajectories was successfully applied to the detection of conformational switching motions in biomolecular complexes. However, the application to large protein systems (larger than 1000 amino acid residues) is challenging using the description of residue contacts. Also, the residue-based method cannot be used to compare proteins with different sequences. To expand the scope of the method, we have tested several coarse-graining schemes that group a collection of consecutive residues into a segment. The definition of these segments may be derived from structural and sequence information, while the interaction strength of the coarse-grained segment-segment contacts is a function of the residue-residue contacts. We then perform covariance calculations on these coarse-grained contact matrices. We monitored how well the principal components of the contact matrices are preserved using various rendering functions. The new method was demonstrated to assist the reduction of the degrees of freedom for describing the conformation space, and it potentially allows for the analysis of a system that is approximately tenfold larger compared with the corresponding residue contact-based method. This method can also render a family of similar proteins into the same conformational space, and thus can be used to compare the structures of proteins with different sequences.
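    A minimal numpy sketch of the pipeline described here, under simplifying assumptions (segment contacts are plain averages of residue contacts, and PCA is done on the upper triangle of each frame's matrix); the actual rendering functions of the paper are not reproduced.

    ```python
    import numpy as np

    def coarse_grain(contacts, labels):
        """Average a per-frame residue contact matrix (R, R) into
        segment-segment contacts; labels gives each residue's segment index."""
        nseg = labels.max() + 1
        cg = np.zeros((nseg, nseg))
        counts = np.zeros((nseg, nseg))
        for i in range(len(labels)):
            for j in range(len(labels)):
                cg[labels[i], labels[j]] += contacts[i, j]
                counts[labels[i], labels[j]] += 1
        return cg / np.maximum(counts, 1)

    def contact_pca(cg_frames):
        """Covariance analysis over frames of coarse-grained contact matrices."""
        nseg = cg_frames[0].shape[0]
        iu = np.triu_indices(nseg, k=1)
        X = np.array([m[iu] for m in cg_frames])  # one row per frame
        evals, evecs = np.linalg.eigh(np.cov(X, rowvar=False))
        return evals[::-1], evecs[:, ::-1]  # descending order
    ```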

  5. Alchemical Free Energy Calculations for Nucleotide Mutations in Protein-DNA Complexes.

    PubMed

    Gapsys, Vytautas; de Groot, Bert L

    2017-12-12

    Nucleotide-sequence-dependent interactions between proteins and DNA are responsible for a wide range of gene regulatory functions. Accurate and generalizable methods to evaluate the strength of protein-DNA binding have long been sought. While numerous computational approaches have been developed, most of them require fitting parameters to experimental data to a certain degree, e.g., machine learning algorithms or knowledge-based statistical potentials. Molecular-dynamics-based free energy calculations offer a robust, system-independent, first-principles-based method to calculate free energy differences upon nucleotide mutation. We present an automated procedure to set up alchemical MD-based calculations to evaluate free energy changes occurring as the result of a nucleotide mutation in DNA. We used these methods to perform a large-scale mutation scan comprising 397 nucleotide mutation cases in 16 protein-DNA complexes. The obtained prediction accuracy reaches 5.6 kJ/mol average unsigned deviation from experiment with a correlation coefficient of 0.57 with respect to the experimentally measured free energies. Overall, the first-principles-based approach performed on par with the molecular modeling approaches Rosetta and FoldX. Subsequently, we utilized the MD-based free energy calculations to construct protein-DNA binding profiles for the zinc finger protein Zif268. The calculation results compare remarkably well with the experimentally determined binding profiles. The software automating the structure and topology setup for alchemical calculations is a part of the pmx package; the utilities have also been made available online at http://pmx.mpibpc.mpg.de/dna_webserver.html.

  6. [Features of binding of proflavine to DNA at different DNA-ligand concentration ratios].

    PubMed

    Berezniak, E G; Gladkovskaia, N A; Khrebtova, A S; Dukhopel'nikov, E V; Zinchenko, A V

    2009-01-01

    The binding of proflavine to calf thymus DNA has been studied using differential scanning calorimetry and spectrophotometry. It was shown that proflavine can interact with DNA through at least three binding modes. At high DNA-ligand concentration ratios (P/D), proflavine intercalates into both GC and AT sites, with a preference for GC-rich sequences. At low P/D ratios, proflavine interacts with DNA by the external binding mode. From the spectrophotometric concentration dependences, the binding parameters of the proflavine-DNA complexes were calculated. Thermodynamic parameters of DNA melting were calculated from the differential scanning calorimetry data.

  7. Quantum Mechanical Calculations of Cytosine, Thiocytosine and Their Radical Ions

    NASA Astrophysics Data System (ADS)

    Singh, Rashmi

    2010-08-01

    RNA and DNA are polymers that share some interesting similarities; for instance, cytosine is one of the common nucleic acid bases. Sulfur is a very reactive element and has been used in chemical warfare agents. Genetic information is encoded in the sequence of the nucleic acid bases. Quantum mechanical calculations of the energies, geometries, charges, and vibrational characteristics of cytosine, thiocytosine, and their corresponding radicals were carried out using the DFT method at the B3LYP/6-311++G** level.
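    The level of theory named here can be reproduced in an open-source code such as PySCF. This sketch only shows the calculation pattern; "cytosine.xyz" is a placeholder geometry file, and this is not the software used in the study.

    ```python
    from pyscf import gto, dft

    # "cytosine.xyz" is a placeholder; recent PySCF versions accept an XYZ
    # file path for `atom`, otherwise paste the geometry string inline.
    mol = gto.M(atom="cytosine.xyz", basis="6-311++g**", charge=0, spin=0)
    mf = dft.RKS(mol)
    mf.xc = "b3lyp"
    e_neutral = mf.kernel()

    # Radical cation: one electron removed gives a doublet -> unrestricted KS.
    cation = gto.M(atom="cytosine.xyz", basis="6-311++g**", charge=1, spin=1)
    umf = dft.UKS(cation)
    umf.xc = "b3lyp"
    e_cation = umf.kernel()

    print("vertical ionization energy (Hartree):", e_cation - e_neutral)
    ```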

  8. A flexible new method for 3D measurement based on multi-view image sequences

    NASA Astrophysics Data System (ADS)

    Cui, Haihua; Zhao, Zhimin; Cheng, Xiaosheng; Guo, Changye; Jia, Huayu

    2016-11-01

    Three-dimensional measurement is the foundation of reverse engineering. This paper develops a new flexible and fast optical measurement method based on multi-view geometry theory. First, feature points are detected and matched with an improved SIFT algorithm. The Hellinger kernel is used to estimate the histogram distance instead of the traditional Euclidean distance, which makes the matching robust to weakly textured images. Then a new three-principle filter for the essential matrix calculation is designed, and the essential matrix is estimated using an improved a contrario RANSAC filter method. A single-view point cloud is constructed accurately from two view images; after this, the overlapping features are used to eliminate the accumulated errors caused by added view images, which improves the precision of the camera positions. Finally, the method is verified in a dental restoration CAD/CAM application; experimental results show that the proposed method is fast, accurate, and flexible for tooth 3D measurement.
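    The Hellinger-kernel matching step is commonly implemented as "RootSIFT": after an L1 normalization and a square root, ordinary Euclidean matching approximates the Hellinger distance. A minimal OpenCV sketch, with placeholder image paths and a standard Lowe ratio test (the paper's exact matching pipeline may differ):

    ```python
    import cv2
    import numpy as np

    def rootsift(des, eps=1e-7):
        """Hellinger kernel trick for SIFT descriptors ("RootSIFT")."""
        des = des / (np.abs(des).sum(axis=1, keepdims=True) + eps)
        return np.sqrt(des)

    img1 = cv2.imread("view1.png", cv2.IMREAD_GRAYSCALE)  # placeholder paths
    img2 = cv2.imread("view2.png", cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)
    matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(rootsift(des1), rootsift(des2), k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]  # ratio test
    ```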

  9. A 3D model retrieval approach based on Bayesian networks lightfield descriptor

    NASA Astrophysics Data System (ADS)

    Xiao, Qinhan; Li, Yanjun

    2009-12-01

    A new 3D model retrieval methodology is proposed by exploiting a novel Bayesian networks lightfield descriptor (BNLD). There are two key novelties in our approach: (1) a BN-based method for building the lightfield descriptor; and (2) a 3D model retrieval scheme based on the proposed BNLD. To overcome the disadvantages of existing 3D model retrieval methods, we explore BNs for building a new lightfield descriptor. First, the 3D model is placed into a lightfield and about 300 binary views are obtained along a sphere; Fourier descriptors and Zernike moment descriptors are then calculated from the binary views, and the shape feature sequence is learned into a BN model using a BN learning algorithm. Second, we propose a new 3D model retrieval method that calculates the Kullback-Leibler divergence (KLD) between BNLDs. Benefiting from the statistical learning, our BNLD is more robust to noise than the existing methods. A comparison between our method and the lightfield descriptor-based approach is conducted to demonstrate the effectiveness of the proposed methodology.
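    For reference, the KLD used for ranking has a one-line discrete form. The sketch below is generic, not the paper's implementation; the symmetrization is a common retrieval convention that the paper may or may not use.

    ```python
    import numpy as np

    def kl_divergence(p, q, eps=1e-12):
        """Kullback-Leibler divergence D(P||Q) between discrete distributions;
        eps guards against zero probabilities."""
        p = np.asarray(p, float) + eps
        q = np.asarray(q, float) + eps
        p /= p.sum()
        q /= q.sum()
        return float(np.sum(p * np.log(p / q)))

    def symmetric_kl(p, q):
        """KLD is asymmetric; retrieval schemes often symmetrize it."""
        return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))
    ```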

  10. Energy hyperspace for stacking interaction in AU/AU dinucleotide step: Dispersion-corrected density functional theory study.

    PubMed

    Mukherjee, Sanchita; Kailasam, Senthilkumar; Bansal, Manju; Bhattacharyya, Dhananjay

    2014-01-01

    Double helical structures of DNA and RNA are mostly determined by base pair stacking interactions, which give them the base sequence-directed features, such as small roll values for the purine-pyrimidine steps. Earlier attempts to characterize stacking interactions were mostly restricted to calculations on fiber diffraction geometries or optimized structures using ab initio calculations, lacking the variation in geometry needed to comment on the rather unusually large roll values observed in the AU/AU base pair step in crystal structures of RNA double helices. We have generated a stacking energy hyperspace by modeling geometries with variations along the important degrees of freedom, roll and slide, which were chosen via statistical analysis as maximally sequence dependent. Corresponding energy contours were constructed by several quantum chemical methods including dispersion corrections. This analysis established the most suitable methods for stacked base pair systems, despite the limitation that the number of atoms in a base pair step places on employing very high levels of theory. All the methods predict a negative roll value and near-zero slide to be most favorable for the purine-pyrimidine steps, in agreement with Calladine's steric clash based rule. Successive base pairs in RNA are always linked by a sugar-phosphate backbone with C3'-endo sugars, and this demands a C1'-C1' distance of about 5.4 Å along the chains. Adding an energy penalty term for deviation of the C1'-C1' distance from the mean value to the recent DFT-D functionals, specifically ωB97X-D, appears to predict a reliable energy contour for the AU/AU step. Such a distance-based penalty also improves the energy contours for the other purine-pyrimidine sequences. © 2013 Wiley Periodicals, Inc. Biopolymers 101: 107-120, 2014.
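    The distance-based penalty is a simple harmonic correction over the (roll, slide) grid. A minimal sketch, assuming a harmonic form and a placeholder force constant (the paper's actual weighting is not given here):

    ```python
    import numpy as np

    def penalized_energy(e_stack, d_c1c1, k=0.5, d0=5.4):
        """Add a harmonic penalty for deviation of the C1'-C1' distance from
        its mean value (about 5.4 angstroms for C3'-endo sugars).
        e_stack and d_c1c1 are arrays over a (roll, slide) grid; the force
        constant k is a placeholder, not a value from the study."""
        return e_stack + k * (d_c1c1 - d0) ** 2
    ```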

  11. Long sequence correlation coprocessor

    NASA Astrophysics Data System (ADS)

    Gage, Douglas W.

    1994-09-01

    A long sequence correlation coprocessor (LSCC) accelerates the bitwise correlation of arbitrarily long digital sequences by calculating in parallel the correlation score for 16, for example, adjacent bit alignments between two binary sequences. The LSCC integrated circuit is incorporated into a computer system with memory storage buffers and a separate general purpose computer processor which serves as its controller. Each of the LSCC's set of sequential counters simultaneously tallies a separate correlation coefficient. During each LSCC clock cycle, computer enable logic associated with each counter compares one bit of a first sequence with one bit of a second sequence to increment the counter if the bits are the same. A shift register assures that the same bit of the first sequence is simultaneously compared to different bits of the second sequence to simultaneously calculate the correlation coefficient by the different counters to represent different alignments of the two sequences.
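    What the LSCC does in hardware can be mimicked in a few lines of software: score several adjacent bit alignments of two binary sequences by counting matching bits. A generic sketch (integers stand in for the shift register and counters):

    ```python
    def correlation_scores(a, b, length, n_align=16):
        """Number of matching bits between sequence a and n_align adjacent
        alignments of sequence b, one score per alignment, mimicking the
        LSCC's parallel counters. a, b: ints holding the bit sequences;
        length: number of bits in a."""
        mask = (1 << length) - 1
        scores = []
        for shift in range(n_align):
            agree = ~(a ^ (b >> shift)) & mask  # 1 where bits match
            scores.append(bin(agree).count("1"))
        return scores

    # b is a shifted copy of a, so the score peaks (8/8) at alignment 2.
    print(correlation_scores(0b10110011, 0b1011001100, 8, n_align=4))
    ```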

  12. Scale-4 Analysis of Pressurized Water Reactor Critical Configurations: Volume 2-Sequoyah Unit 2 Cycle 3

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bowman, S.M.

    1995-01-01

    The requirements of ANSI/ANS 8.1 specify that calculational methods for away-from-reactor criticality safety analyses be validated against experimental measurements. If credit for the negative reactivity of the depleted (or spent) fuel isotopics is desired, it is necessary to benchmark computational methods against spent fuel critical configurations. This report summarizes a portion of the ongoing effort to benchmark away-from-reactor criticality analysis methods using critical configurations from commercial pressurized-water reactors. The analysis methodology selected for all the calculations reported herein is based on the codes and data provided in the SCALE-4 code system. The isotopic densities for the spent fuel assemblies in the critical configurations were calculated using the SAS2H analytical sequence of the SCALE-4 system. The sources of data and the procedures for deriving SAS2H input parameters are described in detail. The SNIKR code module was used to extract the necessary isotopic densities from the SAS2H results and provide the data in the format required by the SCALE criticality analysis modules. The CSASN analytical sequence in SCALE-4 was used to perform resonance processing of the cross sections. The KENO V.a module of SCALE-4 was used to calculate the effective multiplication factor (k_eff) of each case. The SCALE-4 27-group burnup library containing ENDF/B-IV (actinides) and ENDF/B-V (fission products) data was used for all the calculations. This volume of the report documents the SCALE system analysis of three reactor critical configurations for the Sequoyah Unit 2 Cycle 3. This unit and cycle were chosen because of the relevance in spent fuel benchmark applications: (1) the unit had a significantly long downtime of 2.7 years during the middle of cycle (MOC) 3, and (2) the core consisted entirely of burned fuel at the MOC restart. The first benchmark critical calculation was the MOC restart at hot, full-power (HFP) critical conditions. The other two benchmark critical calculations were the beginning-of-cycle (BOC) startup at both hot, zero-power (HZP) and HFP critical conditions. These latter calculations were used to check for consistency in the calculated results for different burnups and downtimes. The k_eff results were in the range of 1.00014 to 1.00259 with a standard deviation of less than 0.001.

  13. Grammatical complexity for two-dimensional maps

    NASA Astrophysics Data System (ADS)

    Hagiwara, Ryouichi; Shudo, Akira

    2004-11-01

    We calculate the grammatical complexity of the symbol sequences generated from the Hénon map and the Lozi map using the recently developed methods to construct the pruning front. When the map is hyperbolic, the language of symbol sequences is regular in the sense of the Chomsky hierarchy and the corresponding grammatical complexity takes finite values. It is found that the complexity exhibits a self-similar structure as a function of the system parameter, and the similarity of the pruning fronts is discussed as an origin of such self-similarity. For non-hyperbolic cases, it is observed that the complexity monotonically increases as we increase the resolution of the pruning front.
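    Generating the underlying symbol sequences is straightforward; a minimal sketch for the Hénon map with the standard parameters follows. Partitioning on the sign of x is a simplification for illustration; the actual generating partition follows the pruning front construction described in the paper.

    ```python
    def henon_symbols(n, a=1.4, b=0.3, burn=1000):
        """Binary symbol sequence from the Henon map, partitioning on the
        sign of x (a simplified stand-in for the generating partition)."""
        x, y = 0.1, 0.1
        symbols = []
        for i in range(burn + n):
            x, y = 1.0 - a * x * x + y, b * x
            if i >= burn:
                symbols.append(1 if x > 0 else 0)
        return symbols

    print("".join(map(str, henon_symbols(40))))
    ```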

  14. Single-Cell-Based Platform for Copy Number Variation Profiling through Digital Counting of Amplified Genomic DNA Fragments.

    PubMed

    Li, Chunmei; Yu, Zhilong; Fu, Yusi; Pang, Yuhong; Huang, Yanyi

    2017-04-26

    We develop a novel single-cell-based platform through digital counting of amplified genomic DNA fragments, named multifraction amplification (mfA), to detect the copy number variations (CNVs) in a single cell. Amplification is required to acquire genomic information from a single cell, while introducing unavoidable bias. Unlike prevalent methods that directly infer CNV profiles from the pattern of sequencing depth, our mfA platform denatures and separates the DNA molecules from a single cell into multiple fractions of a reaction mix before amplification. By examining the sequencing result of each fraction for a specific fragment and applying a segment-merge maximum likelihood algorithm to the calculation of copy number, we digitize the sequencing-depth-based CNV identification and thus provide a method that is less sensitive to the amplification bias. In this paper, we demonstrate an mfA platform through multiple displacement amplification (MDA) chemistry. With the mfA platform, the noise of MDA is reduced, and the resolution of single-cell CNV identification can therefore be improved to 100 kb. We can also determine the genomic regions free of allelic drop-out with the mfA platform, which is impossible for conventional single-cell amplification methods.

  15. Stable isotope, site-specific mass tagging for protein identification

    DOEpatents

    Chen, Xian

    2006-10-24

    Proteolytic peptide mass mapping as measured by mass spectrometry provides an important method for the identification of proteins, which are usually identified by matching the measured and calculated m/z values of the proteolytic peptides. A unique identification is, however, heavily dependent upon the mass accuracy and sequence coverage of the fragment ions generated by peptide ionization. The present invention describes a method for increasing the specificity, accuracy and efficiency of the assignments of particular proteolytic peptides and consequent protein identification, by the incorporation of selected amino acid residue(s) enriched with stable isotope(s) into the protein sequence without the need for ultrahigh instrumental accuracy. Selected amino acid(s) are labeled with ¹³C/¹⁵N/²H and incorporated into proteins in a sequence-specific manner during cell culturing. Each of these labeled amino acids carries a defined mass change encoded in its monoisotopic distribution pattern. Through their characteristic patterns, the peptides with mass tag(s) can then be readily distinguished from other peptides in mass spectra. The present method of identifying unique proteins can also be extended to protein complexes and will significantly increase data search specificity, efficiency and accuracy for protein identifications.

  16. Whale song analyses using bioinformatics sequence analysis approaches

    NASA Astrophysics Data System (ADS)

    Chen, Yian A.; Almeida, Jonas S.; Chou, Lien-Siang

    2005-04-01

    Animal songs are frequently analyzed using discrete hierarchical units, such as units, themes and songs. Because animal songs and bio-sequences may be understood as analogous, bioinformatics analysis tools, namely DNA/protein sequence alignment and alignment-free methods, are proposed to quantify the theme similarities of the songs of false killer whales recorded off northeast Taiwan. The eighteen themes with discrete units that were identified in an earlier study [Y. A. Chen, master's thesis, University of Charleston, 2001] were compared quantitatively using several distance metrics. These metrics included scores calculated using the Smith-Waterman algorithm with the repeated procedure, and the standardized Euclidean distance and angle metrics based on word frequencies. The theme classifications based on different metrics were summarized and compared in dendrograms using cluster analyses. The results agree qualitatively with earlier classifications derived by human observation. These methods further quantify the similarities among themes. These methods could be applied to the analyses of other animal songs on a larger scale. For instance, these techniques could be used to investigate song evolution and cultural transmission by quantifying the dissimilarities of humpback whale songs across different seasons, years, populations, and geographic regions. [Work supported by SC Sea Grant and Ilan County Government, Taiwan.]
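    The alignment-free metrics reduce to word (k-mer) frequency vectors and two distances between them. A minimal sketch with toy unit-labelled themes (the alphabet and sequences are hypothetical):

    ```python
    import numpy as np
    from itertools import product
    from collections import Counter

    def word_freq_vector(units, alphabet, k=2):
        """Frequency vector of length-k 'words' over a theme's unit string."""
        words = ["".join(w) for w in product(alphabet, repeat=k)]
        counts = Counter(units[i:i + k] for i in range(len(units) - k + 1))
        v = np.array([counts[w] for w in words], float)
        return v / max(v.sum(), 1.0)

    t1, t2 = "ABABCAB", "ABCABCA"            # toy unit sequences
    alphabet = sorted(set(t1) | set(t2))
    u, v = word_freq_vector(t1, alphabet), word_freq_vector(t2, alphabet)
    euclidean = float(np.linalg.norm(u - v))
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    angle = float(np.arccos(np.clip(cos, -1.0, 1.0)))
    print(euclidean, angle)
    ```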

  17. Breaking the computational barriers of pairwise genome comparison.

    PubMed

    Torreno, Oscar; Trelles, Oswaldo

    2015-08-11

    Conventional pairwise sequence comparison software algorithms are being used to process much larger datasets than they were originally designed for. This can result in processing bottlenecks that limit software capabilities or prevent full use of the available hardware resources. Overcoming the barriers that limit the efficient computational analysis of large biological sequence datasets by retrofitting existing algorithms or by creating new applications represents a major challenge for the bioinformatics community. We have developed C libraries for pairwise sequence comparison within diverse architectures, ranging from commodity systems to high performance and cloud computing environments. Exhaustive tests were performed using different datasets of closely- and distantly-related sequences that span from small viral genomes to large mammalian chromosomes. The tests demonstrated that our solution is capable of generating high quality results with a linear-time response and controlled memory consumption, being comparable or faster than the current state-of-the-art methods. We have addressed the problem of pairwise and all-versus-all comparison of large sequences in general, greatly increasing the limits on input data size. The approach described here is based on a modular out-of-core strategy that uses secondary storage to avoid reaching memory limits during the identification of High-scoring Segment Pairs (HSPs) between the sequences under comparison. Software engineering concepts were applied to avoid intermediate result re-calculation, to minimise the performance impact of input/output (I/O) operations and to modularise the process, thus enhancing application flexibility and extendibility. Our computationally-efficient approach allows tasks such as the massive comparison of complete genomes, evolutionary event detection, the identification of conserved synteny blocks and inter-genome distance calculations to be performed more effectively.

  18. Bi-exponential T2 analysis of healthy and diseased Achilles tendons: an in vivo preliminary magnetic resonance study and correlation with clinical score.

    PubMed

    Juras, Vladimir; Apprich, Sebastian; Szomolanyi, Pavol; Bieri, Oliver; Deligianni, Xeni; Trattnig, Siegfried

    2013-10-01

    To compare mono- and bi-exponential T2 analysis in healthy and degenerated Achilles tendons using a recently introduced magnetic resonance variable-echo-time sequence (vTE) for T2 mapping. Ten volunteers and ten patients were included in the study. A variable-echo-time sequence was used with 20 echo times. Images were post-processed with both techniques, mono- and bi-exponential [mono-exponential T2 (T2m), short T2 component (T2s) and long T2 component (T2l)]. The number of mono- and bi-exponentially decaying pixels in each region of interest was expressed as a ratio (B/M). Patients were clinically assessed with the Achilles Tendon Rupture Score (ATRS), and these values were correlated with the T2 values. The means for both T2m and T2s were statistically significantly different between patients and volunteers; however, for T2s, the P value was lower. In patients, the Pearson correlation coefficient between ATRS and T2s was -0.816 (P = 0.007). The proposed variable-echo-time sequence can be successfully used as an alternative method to UTE sequences with some added benefits, such as a short imaging time along with relatively high resolution and minimised blurring artefacts, and minimised susceptibility artefacts and chemical shift artefacts. Bi-exponential T2 calculation is superior to mono-exponential in terms of statistical significance for the diagnosis of Achilles tendinopathy. • Magnetic resonance imaging offers new insight into healthy and diseased Achilles tendons • Bi-exponential T2 calculation in Achilles tendons is more beneficial than mono-exponential • A short T2 component correlates strongly with clinical score • Variable echo time sequences successfully used instead of ultrashort echo time sequences.
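    Bi-exponential T2 fitting is a standard nonlinear least-squares problem. A self-contained sketch on synthetic 20-echo data (all numbers illustrative, not patient values):

    ```python
    import numpy as np
    from scipy.optimize import curve_fit

    def biexp(te, a_s, t2s, a_l, t2l):
        """Bi-exponential T2 decay: short plus long component."""
        return a_s * np.exp(-te / t2s) + a_l * np.exp(-te / t2l)

    rng = np.random.default_rng(0)
    te = np.linspace(1.0, 40.0, 20)                   # 20 echo times (ms)
    signal = biexp(te, 0.6, 1.5, 0.4, 25.0) + rng.normal(0, 0.005, te.size)

    p0 = (0.5, 2.0, 0.5, 20.0)                        # guess: [A_s, T2s, A_l, T2l]
    popt, _ = curve_fit(biexp, te, signal, p0=p0, maxfev=10000)
    print("T2s = %.2f ms, T2l = %.2f ms" % (popt[1], popt[3]))
    ```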

  19. Effect of NaCrSi2O6 component on Lindsley's pyroxene thermometer: An evaluation based on strongly metamorphosed LL chondrites

    NASA Astrophysics Data System (ADS)

    Nakamuta, Y.; Urata, K.; Shibata, Y.; Kuwahara, Y.

    2017-03-01

    In Lindsley's thermometry, a revised sequence of calculation of components is proposed for clinopyroxene, in which the kosmochlor component is added. Temperatures obtained for the components calculated by the revised method are about 50 °C lower than those obtained for the components calculated by Lindsley's original method and agree well with temperatures obtained from orthopyroxenes. Ca-partitioning between clino- and orthopyroxenes is then thought to be equilibrated in types 5 to 7 ordinary chondrites. The temperatures for Tuxtuac (LL5), Dhurmsala (LL6), NWA 2092 (LL6/7), and Dho 011 (LL7) are 767-793°, 818-835°, 872-892°, and 917-936°C, respectively, suggesting that chondrites of higher petrographic types show higher equilibrium temperatures of pyroxenes. The regression equations relating temperature to the Wo and Fs contents in the 1-atm temperature-contoured pyroxene quadrilateral of Lindsley (1983) are also determined by the least-squares method. It is possible to reproduce temperatures with an error less than 20 °C (2SE) using the regression equations.
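    The regression step can be illustrated generically. Neither the data nor the functional form below comes from the paper; this only shows a quadratic least-squares fit of temperature against Wo and Fs contents.

    ```python
    import numpy as np

    # Toy calibration data (mol% Wo, mol% Fs, temperature in deg C).
    wo = np.array([4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0])
    fs = np.array([22.0, 20.0, 24.0, 19.0, 25.0, 21.0, 23.0, 18.0])
    T = np.array([950.0, 930.0, 905.0, 890.0, 870.0, 855.0, 835.0, 820.0])

    # Quadratic surface T(Wo, Fs) fitted by least squares.
    A = np.column_stack([np.ones_like(wo), wo, fs, wo**2, fs**2, wo * fs])
    coef, *_ = np.linalg.lstsq(A, T, rcond=None)
    print("max residual (deg C):", np.abs(T - A @ coef).max())
    ```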

  20. A statistical method for assessing peptide identification confidence in accurate mass and time tag proteomics

    PubMed Central

    Stanley, Jeffrey R.; Adkins, Joshua N.; Slysz, Gordon W.; Monroe, Matthew E.; Purvine, Samuel O.; Karpievitch, Yuliya V.; Anderson, Gordon A.; Smith, Richard D.; Dabney, Alan R.

    2011-01-01

    Current algorithms for quantifying peptide identification confidence in the accurate mass and time (AMT) tag approach assume that the AMT tags themselves have been correctly identified. However, there is uncertainty in the identification of AMT tags, as this is based on matching LC-MS/MS fragmentation spectra to peptide sequences. In this paper, we incorporate confidence measures for the AMT tag identifications into the calculation of probabilities for correct matches to an AMT tag database, resulting in a more accurate overall measure of identification confidence for the AMT tag approach. The method is referred to as Statistical Tools for AMT tag Confidence (STAC). STAC additionally provides a Uniqueness Probability (UP) to help distinguish between multiple matches to an AMT tag and a method to calculate an overall false discovery rate (FDR). STAC is freely available for download as both a command line and a Windows graphical application. PMID:21692516

  1. repDNA: a Python package to generate various modes of feature vectors for DNA sequences by incorporating user-defined physicochemical properties and sequence-order effects.

    PubMed

    Liu, Bin; Liu, Fule; Fang, Longyun; Wang, Xiaolong; Chou, Kuo-Chen

    2015-04-15

    In order to develop powerful computational predictors for identifying the biological features or attributes of DNAs, one of the most challenging problems is to find a suitable approach to effectively represent the DNA sequences. To facilitate the studies of DNAs and nucleotides, we developed a Python package called representations of DNAs (repDNA) for generating the widely used features reflecting the physicochemical properties and sequence-order effects of DNAs and nucleotides. There are three feature groups composed of 15 features. The first group calculates three nucleic acid composition features describing the local sequence information by means of kmers; the second group calculates six autocorrelation features describing the level of correlation between two oligonucleotides along a DNA sequence in terms of their specific physicochemical properties; the third group calculates six pseudo nucleotide composition features, which can be used to represent a DNA sequence with a discrete model or vector yet still keep considerable sequence-order information via the physicochemical properties of its constituent oligonucleotides. In addition, these features can be easily calculated based on both the built-in and user-defined properties via using repDNA. The repDNA Python package is freely accessible to the public at http://bioinformatics.hitsz.edu.cn/repDNA/. bliu@insun.hit.edu.cn or kcchou@gordonlifescience.org Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
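    The autocorrelation feature group can be illustrated in a few lines. This is not the repDNA API; the dinucleotide property values are placeholders for the tabulated physicochemical properties (e.g., twist or stacking energy) that repDNA ships with.

    ```python
    import numpy as np

    # Placeholder physicochemical property values for the 16 dinucleotides.
    PROP = {"AA": 0.06, "AC": 1.50, "AG": 0.78, "AT": 1.07,
            "CA": -1.38, "CC": 0.06, "CG": -1.66, "CT": 0.78,
            "GA": -0.08, "GC": -0.08, "GG": 0.06, "GT": 1.50,
            "TA": -1.23, "TC": -0.08, "TG": -1.38, "TT": 0.06}

    def dinuc_autocorrelation(seq, lag):
        """Autocorrelation of a dinucleotide property at a given lag, the
        kind of sequence-order feature the autocorrelation group computes."""
        vals = np.array([PROP[seq[i:i + 2]] for i in range(len(seq) - 1)])
        v = vals - vals.mean()
        return float(np.mean(v[:-lag] * v[lag:]))

    print(dinuc_autocorrelation("ACGTACGTGGCCAATT", 2))
    ```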

  2. Genome Survey Sequencing of Luffa Cylindrica L. and Microsatellite High Resolution Melting (SSR-HRM) Analysis for Genetic Relationship of Luffa Genotypes

    PubMed Central

    An, Jianyu; Yin, Mengqi; Zhang, Qin; Gong, Dongting; Jia, Xiaowen; Guan, Yajing; Hu, Jin

    2017-01-01

    Luffa cylindrica (L.) Roem. is an economically important vegetable crop in China. However, genomic information on this species has been lacking. In this study, for the first time, a genome survey of L. cylindrica was carried out using next-generation sequencing (NGS) technology. In total, 43.40 Gb of sequence data for L. cylindrica, about 54.94× coverage of the estimated genome size of 789.97 Mb, were obtained from HiSeq 2500 sequencing, in which the guanine plus cytosine (GC) content was calculated to be 37.90%. The heterozygosity of the genome sequences was only 0.24%. In total, 1,913,731 contigs (>200 bp) with 525 bp N50 length and 1,410,117 scaffolds (>200 bp) with 885.01 Mb total length were obtained. From the initial assembled L. cylindrica genome, 431,234 microsatellites (SSRs) (≥5 repeats) were identified. The motif types of SSR repeats included 62.88% di-nucleotide, 31.03% tri-nucleotide, 4.59% tetra-nucleotide, 0.96% penta-nucleotide and 0.54% hexa-nucleotide. Eighty genomic SSR markers were developed, and 51/80 primers could be used in both "Zheda 23" and "Zheda 83". Nineteen SSRs were used to investigate the genetic diversity among 32 accessions through SSR-HRM analysis. An unweighted pair group method (UPGMA) dendrogram was built from the SSR-HRM raw data. SSR-HRM could be effectively used for genotype relationship analysis of Luffa species. PMID:28891982
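    SSR mining of the kind described (di- to hexa-nucleotide motifs, at least five repeats) can be sketched with a single regular expression; the function name and thresholds below mirror the survey's motif classes but are otherwise generic.

    ```python
    import re

    def find_ssrs(seq, min_repeats=5, unit_range=(2, 6)):
        """Locate microsatellites: motifs of 2-6 bp tandemly repeated at
        least min_repeats times. Returns (start, unit, n_repeats) tuples."""
        lo, hi = unit_range
        pattern = re.compile(r"(([ACGT]{%d,%d}?)\2{%d,})" % (lo, hi, min_repeats - 1))
        hits = []
        for m in pattern.finditer(seq.upper()):
            unit = m.group(2)
            hits.append((m.start(), unit, len(m.group(1)) // len(unit)))
        return hits

    # One dinucleotide (AC x 6) and one tetranucleotide (ATCG x 5) repeat:
    print(find_ssrs("TTACACACACACACGGGATCGATCGATCGATCGATCGA"))
    ```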

  3. Improvements in Block-Krylov Ritz Vectors and the Boundary Flexibility Method of Component Synthesis

    NASA Technical Reports Server (NTRS)

    Carney, Kelly Scott

    1997-01-01

    A method of dynamic substructuring is presented which utilizes a set of static Ritz vectors as a replacement for normal eigenvectors in component mode synthesis. This set of Ritz vectors is generated in a recurrence relationship, proposed by Wilson, which has the form of a block-Krylov subspace. The initial seed to the recurrence algorithm is based upon the boundary flexibility vectors of the component. Improvements have been made in the formulation of the initial seed to the Krylov sequence, through the use of block-filtering. A method to shift the Krylov sequence to create Ritz vectors that will represent the dynamic behavior of the component at target frequencies, the target frequency being determined by the applied forcing functions, has been developed. A method to terminate the Krylov sequence has also been developed. Various orthonormalization schemes have been developed and evaluated, including the Cholesky/QR method. Several auxiliary theorems and proofs which illustrate issues in component mode synthesis and loss of orthogonality in the Krylov sequence have also been presented. The resulting methodology is applicable to both fixed and free-interface boundary components, and results in a general component model appropriate for any type of dynamic analysis. The accuracy is found to be comparable to that of component synthesis based upon normal modes, using fewer generalized coordinates. In addition, the block-Krylov recurrence algorithm is a series of static solutions and so requires significantly less computation than solving the normal eigenspace problem. The requirement for fewer vectors to form the component, coupled with the lower computational expense of calculating these Ritz vectors, combine to create a method more efficient than traditional component mode synthesis.
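    A simplified sketch of the block-Krylov recurrence with a Cholesky-based M-orthonormalization. The shifting, block filtering, and boundary-flexibility seeding described above are omitted, so this shows only the core series of static solutions.

    ```python
    import numpy as np
    from scipy.linalg import solve

    def block_krylov_ritz(K, M, F, n_blocks):
        """Generate block-Krylov Ritz vectors X_{i+1} = K^-1 M X_i starting
        from the static response to the load block F, M-orthonormalizing
        each block. In practice K would be factorized once, not re-solved."""
        blocks = []
        X = solve(K, F)                          # static seed block
        for _ in range(n_blocks):
            for Y in blocks:                     # M-orthogonalize vs earlier blocks
                X = X - Y @ (Y.T @ (M @ X))
            L = np.linalg.cholesky(X.T @ M @ X)  # M-orthonormalize within block
            X = X @ np.linalg.inv(L.T)
            blocks.append(X)
            X = solve(K, M @ X)                  # Krylov recurrence
        return np.hstack(blocks)
    ```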

  4. An Automated Pipeline for Engineering Many-Enzyme Pathways: Computational Sequence Design, Pathway Expression-Flux Mapping, and Scalable Pathway Optimization.

    PubMed

    Halper, Sean M; Cetnar, Daniel P; Salis, Howard M

    2018-01-01

    Engineering many-enzyme metabolic pathways suffers from the design curse of dimensionality. There are an astronomical number of synonymous DNA sequence choices, though relatively few will express an evolutionary robust, maximally productive pathway without metabolic bottlenecks. To solve this challenge, we have developed an integrated, automated computational-experimental pipeline that identifies a pathway's optimal DNA sequence without high-throughput screening or many cycles of design-build-test. The first step applies our Operon Calculator algorithm to design a host-specific evolutionary robust bacterial operon sequence with maximally tunable enzyme expression levels. The second step applies our RBS Library Calculator algorithm to systematically vary enzyme expression levels with the smallest-sized library. After characterizing a small number of constructed pathway variants, measurements are supplied to our Pathway Map Calculator algorithm, which then parameterizes a kinetic metabolic model that ultimately predicts the pathway's optimal enzyme expression levels and DNA sequences. Altogether, our algorithms provide the ability to efficiently map the pathway's sequence-expression-activity space and predict DNA sequences with desired metabolic fluxes. Here, we provide a step-by-step guide to applying the Pathway Optimization Pipeline on a desired multi-enzyme pathway in a bacterial host.

  5. POTAMOS mass spectrometry calculator: computer aided mass spectrometry to the post-translational modifications of proteins. A focus on histones.

    PubMed

    Vlachopanos, A; Soupsana, E; Politou, A S; Papamokos, G V

    2014-12-01

    Mass spectrometry is a widely used technique for protein identification and it has also become the method of choice in order to detect and characterize the post-translational modifications (PTMs) of proteins. Many software tools have been developed to deal with this complication. In this paper we introduce a new, free and user friendly online software tool, named POTAMOS Mass Spectrometry Calculator, which was developed in the open source application framework Ruby on Rails. It can provide calculated mass spectrometry data in a time saving manner, independently of instrumentation. In this web application we have focused on a well known protein family of histones whose PTMs are believed to play a crucial role in gene regulation, as suggested by the so called "histone code" hypothesis. The PTMs implemented in this software are: methylations of arginines and lysines, acetylations of lysines and phosphorylations of serines and threonines. The application is able to calculate the kind, the number and the combinations of the possible PTMs corresponding to a given peptide sequence and a given mass along with the full set of the unique primary structures produced by the possible distributions along the amino acid sequence. It can also calculate the masses and charges of a fragmented histone variant, which carries predefined modifications already implemented. Additional functionality is provided by the calculation of the masses of fragments produced upon protein cleavage by the proteolytic enzymes that are most widely used in proteomics studies. Copyright © 2014 Elsevier Ltd. All rights reserved.
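    The combinatorial core of such a calculator is small: enumerate how many of each implemented PTM could account for an observed mass shift. The monoisotopic shift values below are standard; the tolerance, per-PTM cap, and function name are illustrative, and this is not the POTAMOS code (which is a Ruby on Rails web application).

    ```python
    from itertools import product

    # Monoisotopic mass shifts (Da) of the implemented PTM classes.
    PTM = {"methyl": 14.01565, "acetyl": 42.01057, "phospho": 79.96633}

    def ptm_combinations(delta_mass, max_each=4, tol=0.01):
        """Enumerate PTM count combinations whose summed mass shift matches
        the observed mass difference within tol Da (sites not assigned)."""
        hits = []
        for counts in product(range(max_each + 1), repeat=len(PTM)):
            shift = sum(n * m for n, m in zip(counts, PTM.values()))
            if abs(shift - delta_mass) <= tol:
                hits.append(dict(zip(PTM.keys(), counts)))
        return hits

    # e.g. an observed +70.05 Da shift -> 2 methylations + 1 acetylation
    print(ptm_combinations(70.05, tol=0.02))
    ```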

  6. Specifics of the methodological approach to the study of nanoparticle impact on human health in the production of non-metallic nanomaterials for construction purposes

    NASA Astrophysics Data System (ADS)

    Ayzenshtadt, A. M.; Frolova, M. A.; Makhova, T. A.; Danilov, V. E.; Gupta, Piyush K.; Verma, Rama S.

    2018-01-01

    Mineral samples of mixed-genesis rocks in a finely dispersed state were obtained and studied, namely sand (Kholmogory district deposit) and basalt (Myandukha deposit, Plesetsk district) from the Arkhangelsk region. The paper provides the chemical composition data used to calculate the specific mass atomization energy of the rocks. The energy parameters of the micro- and nano-systems of the rock samples, free surface energy and surface activity, were calculated. For toxicological evaluation of the materials obtained, next-generation sequencing (NGS) was used to perform a metagenomic analysis, which allowed the species diversity of microorganisms in the samples under study to be determined. It was shown that the sequencing method and metagenomic analysis are applicable and provide good reproducibility for the analysis of the toxicological properties of the selected rock samples. A correlation between the surface activity of the finely dispersed rock systems and the species diversity of microorganisms cultivated on the raw material was observed.

  7. Hybrid Composite Laminates Reinforced with Kevlar/Carbon/Glass Woven Fabrics for Ballistic Impact Testing

    PubMed Central

    Randjbaran, Elias; Zahari, Rizal; Abdul Jalil, Nawal Aswan; Abang Abdul Majid, Dayang Laila

    2014-01-01

    The current study reports a facile method to investigate the effects of the stacking sequence of hybrid composite material layers on ballistic energy absorption by running ballistic tests under high-velocity impact conditions; the velocities and absorbed energies were calculated accordingly. The specimens were fabricated from Kevlar, carbon, and glass woven fabrics and resin and were experimentally investigated under impact conditions. All the specimens possessed equal mass, shape, and density; nevertheless, the layers were ordered in different stacking sequences. After running the ballistic tests under the same conditions, the final velocities of the cylindrical AISI 4340 steel pellet showed how much energy was absorbed by the samples. The energy absorption of each sample during the ballistic impact was calculated; accordingly, materials with proper ballistic impact resistance could be identified by the test. Further work can characterise the material properties of the different layers. PMID:24955400
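    The absorbed energy follows directly from the projectile's velocity drop. A one-line worked example (the mass and velocities are illustrative, not measurements from the study):

    ```python
    def absorbed_energy(mass_kg, v_impact, v_residual):
        """Kinetic energy absorbed by the laminate (J): the drop in the
        projectile's kinetic energy across the target."""
        return 0.5 * mass_kg * (v_impact**2 - v_residual**2)

    # A 9.8 g pellet slowing from 400 m/s to 250 m/s absorbs ~478 J.
    print(absorbed_energy(0.0098, 400.0, 250.0), "J")
    ```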

  8. High-pressure phases of Weyl semimetals NbP, NbAs, TaP, and TaAs

    NASA Astrophysics Data System (ADS)

    Guo, ZhaoPeng; Lu, PengChao; Chen, Tong; Wu, JueFei; Sun, Jian; Xing, DingYu

    2018-03-01

    In this study, we used the crystal structure search method and first-principles calculations to systematically explore the high-pressure phase diagrams of the TaAs family (NbP, NbAs, TaP, and TaAs). Our calculation results show that NbAs and TaAs have similar phase diagrams, the same structural phase transition sequence I41md → P-6m2 → P21/c → Pm-3m, and slightly different transition pressures. The phase transition sequence of NbP and TaP differs somewhat from that of NbAs and TaAs, in which new structures emerge, such as the Cmcm structure in NbP and the Pmmn structure in TaP. Interestingly, we found that in the electronic structure of the high-pressure phase P-6m2-NbAs, there are coexisting Weyl points and triply degenerate points, similar to those found in high-pressure P-6m2-TaAs.

  9. The ConSurf-DB: pre-calculated evolutionary conservation profiles of protein structures.

    PubMed

    Goldenberg, Ofir; Erez, Elana; Nimrod, Guy; Ben-Tal, Nir

    2009-01-01

    ConSurf-DB is a repository for evolutionary conservation analysis of the proteins of known structures in the Protein Data Bank (PDB). Sequence homologues of each of the PDB entries were collected and aligned using standard methods. The evolutionary conservation of each amino acid position in the alignment was calculated using the Rate4Site algorithm, implemented in the ConSurf web server. The algorithm takes into account the phylogenetic relations between the aligned proteins and the stochastic nature of the evolutionary process explicitly. Rate4Site assigns a conservation level for each position in the multiple sequence alignment using an empirical Bayesian inference. Visual inspection of the conservation patterns on the 3D structure often enables the identification of key residues that comprise the functionally important regions of the protein. The repository is updated with the latest PDB entries on a monthly basis and will be rebuilt annually. ConSurf-DB is available online at http://consurfdb.tau.ac.il/

  11. Controlling the Display of Capsule Endoscopy Video for Diagnostic Assistance

    NASA Astrophysics Data System (ADS)

    Vu, Hai; Echigo, Tomio; Sagawa, Ryusuke; Yagi, Keiko; Shiba, Masatsugu; Higuchi, Kazuhide; Arakawa, Tetsuo; Yagi, Yasushi

    Interpretations by physicians of capsule endoscopy image sequences captured over periods of 7-8 hours usually require 45 to 120 minutes of extreme concentration. This paper describes a novel method to reduce diagnostic time by automatically controlling the display frame rate. Unlike existing techniques, this method displays the original images with no skipping of frames. The sequence can be played at a high frame rate in stable regions to save time; then, in regions with rough changes, the speed is decreased to more conveniently ascertain suspicious findings. To realize such a system, cue information about the disparity of consecutive frames, including color similarity and motion displacements, is extracted. A decision tree utilizes these features to classify the states of the image acquisitions. For each classified state, the delay time between frames is calculated by parametric functions. A scheme selecting the optimal parameter set, determined from assessments by physicians, is deployed. Experiments involved clinical evaluations to investigate the effectiveness of this method compared with standard viewing using an existing system. Results from logged-action-based analysis show that, compared with an existing system, the proposed method reduced diagnostic time to around 32.5 minutes per full sequence while the number of abnormalities found was similar. Physicians also needed less effort because of the system's efficient operability. The results of the evaluations should convince physicians that they can safely use this method and obtain reduced diagnostic times.
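    A crude stand-in for the disparity-to-delay mapping: measure frame similarity with a histogram intersection and stretch the display delay as similarity drops. The thresholds and the linear mapping are placeholders for the study's parametric functions tuned from physician assessments.

    ```python
    import numpy as np

    def frame_delay(prev, curr, d_min=0.02, d_max=0.25):
        """Map the disparity between consecutive 8-bit grayscale frames to a
        display delay (s): stable regions play fast, rough changes slow."""
        h1, _ = np.histogram(prev, bins=64, range=(0, 256))
        h2, _ = np.histogram(curr, bins=64, range=(0, 256))
        h1 = h1 / max(h1.sum(), 1)
        h2 = h2 / max(h2.sum(), 1)
        similarity = np.minimum(h1, h2).sum()   # intersection in [0, 1]
        return d_min + (1.0 - similarity) * (d_max - d_min)
    ```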

  12. VarBin, a novel method for classifying true and false positive variants in NGS data

    PubMed Central

    2013-01-01

    Background: Variant discovery for rare genetic diseases using Illumina genome or exome sequencing involves screening of up to millions of variants to find only the one or few causative variant(s). Sequencing or alignment errors create "false positive" variants, which are often retained in the variant screening process. Methods to remove false positive variants often retain many false positive variants. This report presents VarBin, a method to prioritize variants based on a false positive variant likelihood prediction. Methods: VarBin uses the Genome Analysis Toolkit variant calling software to calculate the variant-to-wild type genotype likelihood ratio at each variant change and position divided by read depth. The resulting Phred-scaled, likelihood-ratio by depth (PLRD) was used to segregate variants into 4 Bins with Bin 1 variants most likely true and Bin 4 most likely false positive. PLRD values were calculated for a proband of interest and 41 additional Illumina HiSeq, exome and whole genome samples (proband's family or unrelated samples). At variant sites without apparent sequencing or alignment error, wild type/non-variant calls cluster near -3 PLRD and variant calls typically cluster above 10 PLRD. Sites with systematic variant calling problems (evident by variant quality scores and biases as well as displayed on the iGV viewer) tend to have higher and more variable wild type/non-variant PLRD values. Depending on the separation of a proband's variant PLRD value from the cluster of wild type/non-variant PLRD values for background samples at the same variant change and position, the VarBin method's classification is assigned to each proband variant (Bin 1 to Bin 4). Results: To assess VarBin performance, Sanger sequencing was performed on 98 variants in the proband and background samples. True variants were confirmed in 97% of Bin 1 variants, 30% of Bin 2, and 0% of Bin 3/Bin 4. Conclusions: These data indicate that VarBin correctly classifies the majority of true variants as Bin 1 and Bin 3/4 contained only false positive variants. The "uncertain" Bin 2 contained both true and false positive variants. Future work will further differentiate the variants in Bin 2. PMID:24266885
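    The PLRD statistic itself is a one-liner. The sign convention below assumes GATK-style PL values (Phred-scaled genotype likelihoods, lower = more likely), which is an interpretation of the description above rather than code from the paper.

    ```python
    def plrd(pl_wildtype, pl_variant, depth):
        """Phred-scaled likelihood-ratio by depth: with Phred-scaled genotype
        likelihoods, the ratio becomes a difference in log space; dividing by
        read depth gives a per-read measure of variant support."""
        return (pl_wildtype - pl_variant) / float(depth)

    # e.g. PL(wild type) = 60, PL(variant) = 0 at depth 20 -> PLRD = 3.0
    print(plrd(60, 0, 20))
    ```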

  13. Analysis of HIV using a high resolution melting (HRM) diversity assay: automation of HRM data analysis enhances the utility of the assay for analysis of HIV incidence.

    PubMed

    Cousins, Matthew M; Swan, David; Magaret, Craig A; Hoover, Donald R; Eshleman, Susan H

    2012-01-01

    HIV diversity may be a useful biomarker for discriminating between recent and non-recent HIV infection. The high resolution melting (HRM) diversity assay was developed to quantify HIV diversity in viral populations without sequencing. In this assay, HIV diversity is expressed as a single numeric HRM score that represents the width of a melting peak. HRM scores are highly associated with diversity measures obtained with next generation sequencing. In this report, a software package, the HRM Diversity Assay Analysis Tool (DivMelt), was developed to automate calculation of HRM scores from melting curve data. DivMelt uses computational algorithms to calculate HRM scores by identifying the start (T1) and end (T2) melting temperatures for a DNA sample and subtracting them (T2 - T1 =  HRM score). DivMelt contains many user-supplied analysis parameters to allow analyses to be tailored to different contexts. DivMelt analysis options were optimized to discriminate between recent and non-recent HIV infection and to maximize HRM score reproducibility. HRM scores calculated using DivMelt were compared to HRM scores obtained using a manual method that is based on visual inspection of DNA melting curves. HRM scores generated with DivMelt agreed with manually generated HRM scores obtained from the same DNA melting data. Optimal parameters for discriminating between recent and non-recent HIV infection were identified. DivMelt provided greater discrimination between recent and non-recent HIV infection than the manual method. DivMelt provides a rapid, accurate method of determining HRM scores from melting curve data, facilitating use of the HRM diversity assay for large-scale studies.
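
    The core computation described above (HRM score = T2 - T1 from a melting curve) can be sketched in a few lines of Python. The derivative-threshold rule for locating T1 and T2 is a plausible stand-in for DivMelt's algorithm, and the threshold fraction is an assumed parameter.

      import numpy as np

      def hrm_score(temps, fluorescence, frac=0.05):
          # Melt rate is the negative derivative of fluorescence with respect
          # to temperature; T1/T2 are taken where it first/last exceeds a
          # fraction of its peak (an illustrative criterion, not DivMelt's).
          temps = np.asarray(temps, dtype=float)
          fluor = np.asarray(fluorescence, dtype=float)
          melt_rate = -np.gradient(fluor, temps)
          above = np.where(melt_rate >= frac * melt_rate.max())[0]
          t1, t2 = temps[above[0]], temps[above[-1]]
          return t2 - t1    # width of the melting peak = HRM score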

  14. TU-H-CAMPUS-JeP3-05: Adaptive Determination of Needle Sequence HDR Prostate Brachytherapy with Divergent Needle-By-Needle Delivery

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Borot de Battisti, M; Maenhout, M; Lagendijk, J J W

    Purpose: To develop a new method which adaptively determines the optimal needle insertion sequence for HDR prostate brachytherapy involving divergent needle-by-needle dose delivery by e.g. a robotic device. A needle insertion sequence is calculated at the beginning of the intervention and updated after each needle insertion with feedback on needle positioning errors. Methods: Needle positioning errors and anatomy changes may occur during HDR brachytherapy, which can lead to errors in the delivered dose. A novel strategy was developed to calculate and update the needle sequence and the dose plan after each needle insertion with feedback on needle positioning errors. The dose plan optimization was performed by numerical simulations. The proposed needle sequence determination optimizes the final dose distribution based on the dose coverage impact of each needle. This impact is predicted stochastically by needle insertion simulations. HDR procedures were simulated with varying numbers of needle insertions (4 to 12) using 11 patient MR data-sets with PTV, prostate, urethra, bladder and rectum delineated. Needle positioning errors were modeled by random normally distributed angulation errors (standard deviation of 3 mm at the needle's tip). The final dose parameters were compared in the situations where the needle with the largest vs. the smallest dose coverage impact was selected at each insertion. Results: Over all scenarios, the percentage of clinically acceptable final dose distributions improved when the needle selected had the largest dose coverage impact (91%) compared to the smallest (88%). The differences were larger for few (4 to 6) needle insertions (maximum difference scenario: 79% vs. 60%). The computation time of the needle sequence optimization was below 60 s. Conclusion: A new adaptive needle sequence determination for HDR prostate brachytherapy was developed. Coupled with adaptive planning, the selection of the needle with the largest dose coverage impact increases the chances of reaching the clinical constraints. M. Borot de Battisti is funded by Philips Medical Systems Nederland B.V.; M. Moerland is principal investigator on a contract funded by Philips Medical Systems Nederland B.V.; G. Hautvast and D. Binnekamp are fulltime employees of Philips Medical Systems Nederland B.V.

  15. Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data.

    PubMed

    Li, Peipei; Piao, Yongjun; Shon, Ho Sun; Ryu, Keun Ho

    2015-10-28

    Recently, rapid improvements in technology and decreases in sequencing costs have made RNA-Seq a widely used technique to quantify gene expression levels. Various normalization approaches have been proposed, owing to the importance of normalization in the analysis of RNA-Seq data. A comparison of recently proposed normalization methods is required to generate suitable guidelines for selecting the most appropriate approach for future experiments. In this paper, we compared eight non-abundance normalization methods (RC, UQ, Med, TMM, DESeq, Q, RPKM, and ERPKM) and two abundance estimation normalization methods (RSEM and Sailfish). The experiments were based on real Illumina high-throughput RNA-Seq data of 35- and 76-nucleotide reads produced in the MAQC project, as well as simulated reads. Reads were mapped to the human genome obtained from the UCSC Genome Browser Database. For precise evaluation, we investigated the Spearman correlation between the normalization results from RNA-Seq and MAQC qRT-PCR values for 996 genes. Based on this work, we showed that, of the eight non-abundance estimation normalization methods, RC, UQ, Med, TMM, DESeq, and Q gave similar normalization results for all data sets. For RNA-Seq of 35-nucleotide reads, RPKM showed the highest correlation, but for RNA-Seq of 76-nucleotide reads it showed the lowest correlation among the methods; ERPKM did not improve on RPKM. Between the two abundance estimation normalization methods, for RNA-Seq of 35-nucleotide reads, higher correlation was obtained with Sailfish than with RSEM, both of which were better than not using abundance estimation. However, for RNA-Seq of 76-nucleotide reads, the results achieved by RSEM were similar to those obtained without abundance estimation, and much better than those with Sailfish. Furthermore, we found that adding a poly-A tail increased alignment numbers but did not improve normalization results. Spearman correlation analysis revealed that RC, UQ, Med, TMM, DESeq, and Q did not noticeably improve gene expression normalization, regardless of read length. The abundance estimation methods were more efficient when alignment accuracy was low, with Sailfish plus RPKM giving the best normalization results; when alignment accuracy was high, RC was sufficient for gene expression calculation. We therefore suggest ignoring the poly-A tail during differential gene expression analysis.
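
    Of the methods compared above, RPKM has a simple closed form: reads per kilobase of transcript per million mapped reads. A minimal Python sketch, with hypothetical example arrays, is shown below together with the Spearman-correlation evaluation against qRT-PCR values used in the paper.

      import numpy as np
      from scipy.stats import spearmanr

      def rpkm(read_counts, gene_lengths_bp):
          # RPKM = count / (gene length in kb * total mapped reads in millions)
          counts = np.asarray(read_counts, dtype=float)
          kb = np.asarray(gene_lengths_bp, dtype=float) / 1e3
          millions = counts.sum() / 1e6
          return counts / (kb * millions)

      # Hypothetical counts/lengths/qRT-PCR values for four genes, rank-correlated
      # in the spirit of the paper's evaluation against MAQC qRT-PCR data.
      counts = np.array([120, 3400, 56, 980])
      lengths = np.array([1500, 2300, 800, 4100])
      qrt_pcr = np.array([0.8, 14.2, 0.3, 2.1])
      rho, _ = spearmanr(rpkm(counts, lengths), qrt_pcr)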

  16. mPUMA: a computational approach to microbiota analysis by de novo assembly of operational taxonomic units based on protein-coding barcode sequences.

    PubMed

    Links, Matthew G; Chaban, Bonnie; Hemmingsen, Sean M; Muirhead, Kevin; Hill, Janet E

    2013-08-15

    Formation of operational taxonomic units (OTU) is a common approach to data aggregation in microbial ecology studies based on amplification and sequencing of individual gene targets. The de novo assembly of OTU sequences has recently been demonstrated as an alternative to widely used clustering methods, providing robust information from experimental data alone, without any reliance on an external reference database. Here we introduce mPUMA (microbial Profiling Using Metagenomic Assembly, http://mpuma.sourceforge.net), a software package for identification and analysis of protein-coding barcode sequence data. It was developed originally for Cpn60 universal target sequences (also known as GroEL or Hsp60). Using an unattended process that is independent of external reference sequences, mPUMA forms OTUs by DNA sequence assembly and is capable of tracking OTU abundance. mPUMA processes microbial profiles both in terms of the direct DNA sequence and in terms of the translated amino acid sequence for protein-coding barcodes. By forming OTUs and calculating abundance through an assembly approach, mPUMA is capable of generating inputs for several popular microbiota analysis tools. Using SFF data from sequencing of a synthetic community of Cpn60 sequences derived from the human vaginal microbiome, we demonstrate that mPUMA can faithfully reconstruct all expected OTU sequences and produce compositional profiles consistent with actual community structure. mPUMA enables analysis of microbial communities while empowering the discovery of novel organisms through OTU assembly.

  17. Understanding the mechanisms of protein-DNA interactions

    NASA Astrophysics Data System (ADS)

    Lavery, Richard

    2004-03-01

    Structural, biochemical and thermodynamic data on protein-DNA interactions show that specific recognition cannot be reduced to a simple set of binary interactions between the partners (such as hydrogen bonds, ion pairs or steric contacts). The mechanical properties of the partners also play a role and, in the case of DNA, variations in both conformation and flexibility as a function of base sequence can be a significant factor in guiding a protein to the correct binding site. All-atom molecular modeling offers a means of analyzing the role of different binding mechanisms within protein-DNA complexes of known structure. This however requires estimating the binding strengths for the full range of sequences with which a given protein can interact. Since this number grows exponentially with the length of the binding site it is necessary to find a method to accelerate the calculations. We have achieved this by using a multi-copy approach (ADAPT) which allows us to build a DNA fragment with a variable base sequence. The results obtained with this method correlate well with experimental consensus binding sequences. They enable us to show that indirect recognition mechanisms involving the sequence dependent properties of DNA play a significant role in many complexes. This approach also offers a means of predicting protein binding sites on the basis of binding energies, which is complementary to conventional lexical techniques.

  18. Discrete Ramanujan transform for distinguishing the protein coding regions from other regions.

    PubMed

    Hua, Wei; Wang, Jiasong; Zhao, Jian

    2014-01-01

    Based on the study of the Ramanujan sum and Ramanujan coefficient, this paper introduces the concepts of the discrete Ramanujan transform and spectrum. Using the Voss numerical representation, a symbolic DNA strand is mapped to a numerical DNA sequence, and the discrete Ramanujan spectrum of the numerical DNA sequence is deduced. It is well known that the discrete Fourier power spectrum of a protein coding sequence has an important feature of 3-base periodicity, which is widely used for DNA sequence analysis with the discrete Fourier transform; the analysis is performed by testing the signal-to-noise ratio at frequency N/3 as a criterion, where N is the length of the sequence. The results presented in this paper show that the property of 3-base periodicity is identified as a prominent spike of the discrete Ramanujan spectrum at period 3 for protein coding regions. A signal-to-noise ratio for the discrete Ramanujan spectrum is defined for numerical measurement. Therefore, the discrete Ramanujan spectrum and the signal-to-noise ratio of a DNA sequence can be used for distinguishing protein coding regions from noncoding regions. All the exon and intron sequences in whole chromosomes 1, 2, 3 and 4 of Caenorhabditis elegans have been tested, and the histograms and tables from the computational results illustrate the reliability of our method. In addition, we have shown theoretically that the algorithm for calculating the discrete Ramanujan spectrum has lower computational complexity and higher computational accuracy. The computational experiments show that classifying DNA sequences by the discrete Ramanujan spectrum is a fast and effective method. Copyright © 2014 Elsevier Ltd. All rights reserved.
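
    As a sketch of the quantities involved, the Ramanujan sum c_q(n) sums cos(2*pi*a*n/q) over the integers a coprime to q, and a period-q spectrum component can be formed by projecting a Voss indicator sequence onto c_q. The Python below implements this; the squared-projection normalization is an assumption for illustration and may differ from the paper's exact definition.

      import math
      from math import gcd

      def ramanujan_sum(q, n):
          # c_q(n) = sum of cos(2*pi*a*n/q) over 1 <= a <= q with gcd(a, q) = 1
          return sum(math.cos(2 * math.pi * a * n / q)
                     for a in range(1, q + 1) if gcd(a, q) == 1)

      def voss_indicator(seq, base):
          # Voss representation: 1 where the given base occurs, 0 elsewhere
          return [1 if s == base else 0 for s in seq]

      def ramanujan_spectrum(x, q_max):
          # Squared projection of the signal onto c_q for each period q
          # (illustrative normalization)
          return [sum(xn * ramanujan_sum(q, n) for n, xn in enumerate(x)) ** 2
                  for q in range(1, q_max + 1)]

      # Coding regions should show a prominent spike at period q = 3.
      spectrum = ramanujan_spectrum(voss_indicator("ATGGCCATTGTAATG", "A"), 6)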

  19. [A study on identification of edible bird's nests by DNA barcodes].

    PubMed

    Chen, Yue-Juan; Liu, Wen-Jian; Chen, Dan-Na; Chieng, Sing-Hock; Jiang, Lin

    2017-12-01

    To provide a theoretical basis for the traceability and quality evaluation of edible bird's nests (EBNs), the Cytb sequence was applied to identify the origin of EBNs. A total of 39 experimental samples were collected from Malaysia, Indonesia, Vietnam and Thailand. Genomic DNA was extracted for the PCR reaction, and the amplified products were sequenced. Thirty-six sequences were downloaded from GenBank, covering the edible-nest swiftlet, black-nest swiftlet, Mascarene swiftlet, Pacific swiftlet and Germain's swiftlet. MEGA 7.0 was used to analyze the sequences by calculating intraspecific and interspecific divergences and by constructing NJ and UPGMA phylogenetic trees based on the Kimura 2-parameter model. The results showed that the 39 samples came from three kinds of EBNs. Interspecific divergences were significantly greater than intraspecific ones, and samples could be successfully distinguished by the NJ and UPGMA phylogenetic trees. In conclusion, the Cytb sequence can be used to distinguish the origin of EBNs and is efficient for tracing the origin species of EBNs. Copyright© by the Chinese Pharmaceutical Association.
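
    The Kimura 2-parameter distance used above has a standard closed form, d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transition and transversion differences between two aligned sequences. A minimal Python sketch:

      import math

      TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}

      def k2p_distance(seq1, seq2):
          # P = transition proportion, Q = transversion proportion over
          # ungapped aligned sites; formula as in Kimura (1980).
          pairs = [(a, b) for a, b in zip(seq1, seq2) if a != "-" and b != "-"]
          n = len(pairs)
          p = sum(pair in TRANSITIONS for pair in pairs) / n
          q = sum(a != b and (a, b) not in TRANSITIONS for a, b in pairs) / n
          return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

    Averaging such distances within and between species gives the intraspecific and interspecific divergences compared in the study; NJ and UPGMA tree construction then operates on the resulting distance matrix.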

  20. Comparison of a High-Resolution Melting Assay to Next-Generation Sequencing for Analysis of HIV Diversity

    PubMed Central

    Cousins, Matthew M.; Ou, San-San; Wawer, Maria J.; Munshaw, Supriya; Swan, David; Magaret, Craig A.; Mullis, Caroline E.; Serwadda, David; Porcella, Stephen F.; Gray, Ronald H.; Quinn, Thomas C.; Donnell, Deborah; Eshleman, Susan H.

    2012-01-01

    Next-generation sequencing (NGS) has recently been used for analysis of HIV diversity, but this method is labor-intensive, costly, and requires complex protocols for data analysis. We compared diversity measures obtained using NGS data to those obtained using a diversity assay based on high-resolution melting (HRM) of DNA duplexes. The HRM diversity assay provides a single numeric score that reflects the level of diversity in the region analyzed. HIV gag and env from individuals in Rakai, Uganda, were analyzed in a previous study using NGS (n = 220 samples from 110 individuals). Three sequence-based diversity measures were calculated from the NGS sequence data (percent diversity, percent complexity, and Shannon entropy). The amplicon pools used for NGS were analyzed with the HRM diversity assay. HRM scores were significantly associated with sequence-based measures of HIV diversity for both gag and env (P < 0.001 for all measures). The level of diversity measured by the HRM diversity assay and NGS increased over time in both regions analyzed (P < 0.001 for all measures except for percent complexity in gag), and similar amounts of diversification were observed with both methods (P < 0.001 for all measures except for percent complexity in gag). Diversity measures obtained using the HRM diversity assay were significantly associated with those from NGS, and similar increases in diversity over time were detected by both methods. The HRM diversity assay is faster and less expensive than NGS, facilitating rapid analysis of large studies of HIV diversity and evolution. PMID:22785188
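
    One of the sequence-based diversity measures named above, Shannon entropy, is straightforward to compute from the frequencies of distinct variants among reads covering the same region. The sketch below is a simplified illustration; the study's actual pipeline works on aligned NGS amplicon data.

      import math
      from collections import Counter

      def shannon_entropy(reads):
          # H = -sum p_i * log2(p_i) over the frequencies p_i of distinct
          # sequence variants; higher H = more diverse viral population
          counts = Counter(reads)
          total = sum(counts.values())
          return -sum((c / total) * math.log2(c / total)
                      for c in counts.values())

      entropy = shannon_entropy(["ATGG", "ATGG", "ATGA", "TTGA"])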

  1. IVisTMSA: Interactive Visual Tools for Multiple Sequence Alignments.

    PubMed

    Pervez, Muhammad Tariq; Babar, Masroor Ellahi; Nadeem, Asif; Aslam, Naeem; Naveed, Nasir; Ahmad, Sarfraz; Muhammad, Shah; Qadri, Salman; Shahid, Muhammad; Hussain, Tanveer; Javed, Maryam

    2015-01-01

    IVisTMSA is a software package of seven graphical tools for multiple sequence alignments. MSApad is an editing and analysis tool that can load 409% more data than Jalview, STRAP, CINEMA, and Base-by-Base. MSA comparator allows the user to visualize consistent and inconsistent regions of reference and test alignments of more than 21-MB size in less than 12 seconds; compared to the BAliBASE C program and FastSP, it is 5,200% and more than 40% more efficient, respectively. MSA reconstruction tool provides graphical user interfaces for four popular aligners and allows the user to load several sequence files at a time. FASTA generator converts seven formats of alignments of unlimited size into FASTA format in a few seconds. MSA ID calculator calculates the identity matrix of more than 11,000 sequences with a sequence length of 2,696 base pairs in less than 100 seconds. Tree and Distance Matrix calculation tools generate a phylogenetic tree and a distance matrix, respectively, using neighbor joining, percent identity and the BLOSUM 62 matrix.

  2. Direct Calculation of Protein Fitness Landscapes through Computational Protein Design

    PubMed Central

    Au, Loretta; Green, David F.

    2016-01-01

    Naturally selected amino-acid sequences or experimentally derived ones are often the basis for understanding how protein three-dimensional conformation and function are determined by primary structure. Such sequences for a protein family comprise only a small fraction of all possible variants, however, representing the fitness landscape with limited scope. Explicitly sampling and characterizing alternative, unexplored protein sequences would directly identify fundamental reasons for sequence robustness (or variability), and we demonstrate that computational methods offer an efficient mechanism toward this end, on a large scale. The dead-end elimination and A∗ search algorithms were used here to find all low-energy single mutant variants, and corresponding structures of a G-protein heterotrimer, to measure changes in structural stability and binding interactions to define a protein fitness landscape. We established consistency between these algorithms with known biophysical and evolutionary trends for amino-acid substitutions, and could thus recapitulate known protein side-chain interactions and predict novel ones. PMID:26745411
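
    The dead-end elimination step mentioned above rests on a simple pruning test: a rotamer can be discarded if its best-case energy still exceeds some competitor's worst-case energy. The Python sketch below implements the classic criterion of Desmet et al.; the nested-list energy layout is an assumption for illustration.

      def dee_eliminates(E_self, E_pair, i, r, t):
          # Rotamer r at position i is a dead end relative to rotamer t if
          #   E(i_r) + sum_j min_s E(i_r, j_s) > E(i_t) + sum_j max_s E(i_t, j_s).
          # E_self[i][r]: self energy; E_pair[i][j][r][s]: pairwise energy.
          positions = range(len(E_self))
          best_case = E_self[i][r] + sum(
              min(E_pair[i][j][r][s] for s in range(len(E_self[j])))
              for j in positions if j != i)
          worst_case = E_self[i][t] + sum(
              max(E_pair[i][j][t][s] for s in range(len(E_self[j])))
              for j in positions if j != i)
          return best_case > worst_case

    Rotamers pruned this way can never belong to the global minimum-energy conformation, which is what allows the subsequent A* search to enumerate low-energy single-mutant variants exhaustively.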

  3. Limitations of the method of complex basis functions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Baumel, R.T.; Crocker, M.C.; Nuttall, J.

    1975-08-01

    The method of complex basis functions proposed by Rescigno and Reinhardt is applied to the calculation of the amplitude in a model problem which can be treated analytically. It is found for an important class of potentials, including some of infinite range and also the square well, that the method does not provide a converging sequence of approximations. However, in some cases, approximations of relatively low order might be close to the correct result. The method is also applied to S-wave e-H elastic scattering above the ionization threshold, and spurious "convergence" to the wrong result is found. A procedure which might overcome the difficulties of the method is proposed.

  4. Quantifying utricular stimulation during natural behavior

    PubMed Central

    Rivera, Angela R. V.; Davis, Julian; Grant, Wally; Blob, Richard W.; Peterson, Ellengene; Neiman, Alexander B.; Rowe, Michael

    2012-01-01

    The use of natural stimuli in neurophysiological studies has led to significant insights into the encoding strategies used by sensory neurons. To investigate these encoding strategies in vestibular receptors and neurons, we have developed a method for calculating the stimuli delivered to a vestibular organ, the utricle, during natural (unrestrained) behaviors, using the turtle as our experimental preparation. High-speed digital video sequences are used to calculate the dynamic gravito-inertial (GI) vector acting on the head during behavior. X-ray computed tomography (CT) scans are used to determine the orientation of the otoconial layer (OL) of the utricle within the head, and the calculated GI vectors are then rotated into the plane of the OL. Thus, the method allows us to quantify the spatio-temporal structure of stimuli to the OL during natural behaviors. In the future, these waveforms can be used as stimuli in neurophysiological experiments to understand how natural signals are encoded by vestibular receptors and neurons. We provide one example of the method which shows that turtle feeding behaviors can stimulate the utricle at frequencies higher than those typically used in vestibular studies. This method can be adapted to other species, to other vestibular end organs, and to other methods of quantifying head movements. PMID:22753360

  5. Modeling bias and variation in the stochastic processes of small RNA sequencing

    PubMed Central

    Etheridge, Alton; Sakhanenko, Nikita; Galas, David

    2017-01-01

    The use of RNA-seq as the preferred method for the discovery and validation of small RNA biomarkers has been hindered by high quantitative variability and biased sequence counts. In this paper we develop a statistical model for sequence counts that accounts for ligase bias and stochastic variation. This model implies a linear-quadratic relation between the mean and variance of sequence counts. Using a large number of sequencing datasets, we demonstrate how one can use the generalized additive models for location, scale and shape (GAMLSS) distributional regression framework to calculate and apply empirical correction factors for ligase bias. Bias correction could remove more than 40% of the bias for miRNAs. Empirical bias correction factors appear to be nearly constant over at least one and up to four orders of magnitude of total RNA input and independent of sample composition. Using synthetic mixes of known composition, we show that the GAMLSS approach can analyze differential expression with greater accuracy, higher sensitivity and specificity than six existing algorithms (DESeq2, edgeR, EBSeq, limma, DSS, voom) for the analysis of small RNA-seq data. PMID:28369495

  6. Experimental studies of two-stage centrifugal dust concentrator

    NASA Astrophysics Data System (ADS)

    Vechkanova, M. V.; Fadin, Yu M.; Ovsyannikov, Yu G.

    2018-03-01

    The article presents experimental results for a two-stage centrifugal dust concentrator, describes its design, and shows the development of an engineering calculation method and laboratory investigations. For the experiments, the authors used quartz dust, ceramic dust and slag. Experimental dispersion analysis of the dust particles was carried out by the sedimentation method. To build a mathematical model of the dust collection process, a central composite rotatable design for a four-factor experiment was used. The sequence of experiments was conducted in accordance with a table of random numbers. Conclusions were drawn.

  7. A Web-based interface to calculate phonotactic probability for words and nonwords in English

    PubMed Central

    VITEVITCH, MICHAEL S.; LUCE, PAUL A.

    2008-01-01

    Phonotactic probability refers to the frequency with which phonological segments and sequences of phonological segments occur in words in a given language. We describe one method of estimating phonotactic probabilities based on words in American English. These estimates of phonotactic probability have been used in a number of previous studies and are now being made available to other researchers via a Web-based interface. Instructions for using the interface, as well as details regarding how the measures were derived, are provided in the present article. The Phonotactic Probability Calculator can be accessed at http://www.people.ku.edu/~mvitevit/PhonoProbHome.html. PMID:15641436
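
    In the spirit of the calculator described above, position-specific segment probabilities can be estimated from a frequency-weighted lexicon and summed over a word's segments. The Python sketch below is a simplified version; the published measure's exact weighting (e.g. log frequency) differs in details, and the data structures are assumptions.

      from collections import defaultdict

      def positional_probabilities(lexicon):
          # lexicon maps a phonemic transcription (tuple of segments) to a
          # word frequency; returns P(segment | position), frequency-weighted.
          pos_counts = defaultdict(float)
          pos_totals = defaultdict(float)
          for phonemes, freq in lexicon.items():
              for i, seg in enumerate(phonemes):
                  pos_counts[(i, seg)] += freq
                  pos_totals[i] += freq
          return {k: v / pos_totals[k[0]] for k, v in pos_counts.items()}

      def phonotactic_score(word, probs):
          # Sum of positional segment probabilities for a word or nonword
          return sum(probs.get((i, seg), 0.0) for i, seg in enumerate(word))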

  8. Semiempirical Quantum Chemical Calculations Accelerated on a Hybrid Multicore CPU-GPU Computing Platform.

    PubMed

    Wu, Xin; Koslowski, Axel; Thiel, Walter

    2012-07-10

    In this work, we demonstrate that semiempirical quantum chemical calculations can be accelerated significantly by leveraging the graphics processing unit (GPU) as a coprocessor on a hybrid multicore CPU-GPU computing platform. Semiempirical calculations using the MNDO, AM1, PM3, OM1, OM2, and OM3 model Hamiltonians were systematically profiled for three types of test systems (fullerenes, water clusters, and solvated crambin) to identify the most time-consuming sections of the code. The corresponding routines were ported to the GPU and optimized employing both existing library functions and a GPU kernel that carries out a sequence of noniterative Jacobi transformations during pseudodiagonalization. The overall computation times for single-point energy calculations and geometry optimizations of large molecules were reduced by one order of magnitude for all methods, as compared to runs on a single CPU core.

  9. Numerical Calculation of the Spectrum of the Severe (1%) Lightning Current and Its First Derivative

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brown, C G; Ong, M M; Perkins, M P

    2010-02-12

    Recently, the direct-strike lightning environment for the stockpile-to-target sequence was updated [1]. In [1], the severe (1%) lightning current waveforms for first and subsequent return strokes are defined based on Heidler's waveform. This report presents numerical calculations of the spectra of those 1% lightning current waveforms and their first derivatives. First, the 1% lightning current models are repeated here for convenience. Then, the numerical method for calculating the spectra is presented and tested. The test uses a double-exponential waveform and its first derivative, which we fit to the previous 1% direct-strike lightning environment from [2]. Finally, the resulting spectra are given and compared with those of the double-exponential waveform and its first derivative.
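
    Heidler's waveform, on which the 1% current models are based, has the standard form i(t) = (I0/eta) * (t/tau1)^n / (1 + (t/tau1)^n) * exp(-t/tau2), and its spectrum can be approximated numerically with an FFT. The Python sketch below uses illustrative parameters, not the actual 1% environment values from the report.

      import numpy as np

      def heidler(t, i0, tau1, tau2, n=10, eta=1.0):
          # i(t) = (I0/eta) * (t/tau1)^n / (1 + (t/tau1)^n) * exp(-t/tau2)
          x = (t / tau1) ** n
          return (i0 / eta) * x / (1.0 + x) * np.exp(-t / tau2)

      dt = 1e-9                                   # 1 ns sampling
      t = np.arange(0.0, 500e-6, dt)
      i_t = heidler(t, i0=200e3, tau1=0.5e-6, tau2=100e-6)   # illustrative values
      di_dt = np.gradient(i_t, dt)                # first derivative
      freqs = np.fft.rfftfreq(t.size, dt)
      spectrum_i = np.abs(np.fft.rfft(i_t)) * dt        # approximate |I(f)|
      spectrum_didt = np.abs(np.fft.rfft(di_dt)) * dt   # approximate |dI/dt(f)|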

  10. Pulse sequences for efficient multi-cycle terahertz generation in periodically poled lithium niobate.

    PubMed

    Ravi, Koustuban; Schimpf, Damian N; Kärtner, Franz X

    2016-10-31

    The use of laser pulse sequences to drive the cascaded difference frequency generation of high energy, high peak-power and multi-cycle terahertz pulses in cryogenically cooled (100 K) periodically poled lithium niobate is proposed and studied. Detailed simulations considering the coupled nonlinear interaction of terahertz and optical waves (or pump depletion) show that unprecedented optical-to-terahertz energy conversion efficiencies > 5% and peak electric fields of hundreds of megavolts per meter at terahertz pulse durations of hundreds of picoseconds can be achieved. The proposed methods are shown to circumvent laser-induced damage limitations at Joule-level pumping by 1 µm lasers to enable multi-cycle terahertz sources with pulse energies > 10 millijoules. Various pulse sequence formats are proposed and analyzed. Numerical calculations for periodically poled structures accounting for cascaded difference frequency generation, self-phase modulation, cascaded second harmonic generation and laser-induced damage are introduced. The physics governing terahertz generation using pulse sequences in this high conversion efficiency regime, limitations and practical considerations are discussed. It is shown that varying the poling period along the crystal length and further reducing absorption can lead to even higher energy conversion efficiencies > 10%. In addition to numerical calculations, an analytic formulation valid for arbitrary pulse formats and closed-form expressions for important cases are presented. Parameters optimizing conversion efficiency in the 0.1-1 THz range, the corresponding peak electric fields, crystal lengths and terahertz pulse properties are furnished.

  11. Design of a fast echo matching algorithm to reduce crosstalk with Doppler shifts in ultrasonic ranging

    NASA Astrophysics Data System (ADS)

    Liu, Lei; Guo, Rui; Wu, Jun-an

    2017-02-01

    Crosstalk is a main factor in wrong distance measurements by ultrasonic sensors, and the problem becomes more difficult to deal with under Doppler effects. In this paper, crosstalk reduction with Doppler shifts on small platforms is addressed, and a fast echo matching algorithm (FEMA) is proposed on the basis of chaotic sequences and pulse coding technology, then verified by applying it to match practical echoes. Finally, we discuss how to select both better mapping methods for chaotic sequences and algorithm parameters that yield a higher achievable maximum of cross-correlation peaks. The results indicate the following: logistic mapping is preferred for generating good chaotic sequences, with high autocorrelation even when the length is very limited; FEMA can not only match echoes and calculate distance accurately, with an error mostly below 5%, but also incurs nearly the same computational cost for static or kinematic ranging, much lower than that of direct Doppler compensation (DDC) with the same frequency compensation step; and the sensitivity to threshold selection and the performance of FEMA depend significantly on the achievable maximum of the cross-correlation peaks, with a higher peak preferred, which can be taken as a criterion for algorithm parameter optimization under practical conditions.
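
    The two ingredients named above, a logistic-map chaotic code and cross-correlation matching, can be sketched briefly in Python. This is a simplified stand-in for FEMA, not the published algorithm; parameter values are illustrative.

      import numpy as np

      def logistic_code(length, x0=0.3, r=3.99):
          # Logistic map x <- r*x*(1-x); thresholding at 0.5 yields a +/-1
          # excitation code with a sharp autocorrelation peak.
          x, code = x0, []
          for _ in range(length):
              x = r * x * (1.0 - x)
              code.append(1.0 if x > 0.5 else -1.0)
          return np.array(code)

      def match_echo(received, code):
          # Locate the coded echo at the cross-correlation peak; crosstalk from
          # differently coded sensors correlates poorly and is rejected.
          corr = np.correlate(received, code, mode="valid")
          lag = int(np.argmax(np.abs(corr)))
          return lag, corr[lag] / (np.linalg.norm(code) ** 2)

      # Distance then follows from the echo delay: d = c * (lag * dt) / 2,
      # with c ~ 343 m/s in air and dt the sampling interval.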

  12. Generative technique for dynamic infrared image sequences

    NASA Astrophysics Data System (ADS)

    Zhang, Qian; Cao, Zhiguo; Zhang, Tianxu

    2001-09-01

    This paper discusses techniques for generating dynamic infrared image sequences. Because an infrared sensor differs from a CCD camera in its imaging mechanism, it forms an image by receiving the infrared radiation of the scene (including target and background). Infrared imaging is strongly affected by atmospheric radiation, environmental radiation and the attenuation of radiation during atmospheric transfer. Therefore, the paper first analyzes the influence of these radiation sources on imaging and provides formulas for calculating the radiation, treating passive and active scenes separately. The calculation methods for the passive scene are then given, and the functions of the scene model, the atmospheric transmission model and the material physical attribute databases are explained. Second, based on the infrared imaging model, the design idea, implementation approach and software framework for simulation software generating infrared image sequences on an SGI workstation are introduced. Guided by these ideas, an example of simulated infrared image sequences is presented, using sea and sky as background, a warship as target and an aircraft as the viewpoint. Finally, the simulation is evaluated comprehensively and an improvement scheme is presented.

  13. Quantum mechanical calculations related to ionization and charge transfer in DNA

    NASA Astrophysics Data System (ADS)

    Cauët, E.; Valiev, M.; Weare, J. H.; Liévin, J.

    2012-07-01

    Ionization and charge migration in DNA play crucial roles in mechanisms of DNA damage caused by ionizing radiation, oxidizing agents and photo-irradiation. Therefore, an evaluation of the ionization properties of the DNA bases is central to the full interpretation and understanding of the elementary reactive processes that occur at the molecular level during the initial exposure and afterwards. Ab initio quantum mechanical (QM) methods have been successful in providing highly accurate evaluations of key parameters, such as ionization energies (IE) of DNA bases. Hence, in this study, we performed high-level QM calculations to characterize the molecular energy levels and potential energy surfaces, which shed light on ionization and charge migration between DNA bases. In particular, we examined the IEs of guanine, the most easily oxidized base, isolated and embedded in base clusters, and investigated the mechanism of charge migration over two and three stacked guanines. The IE of guanine in the human telomere sequence has also been evaluated. We report a simple molecular orbital analysis to explain how modifications in the base sequence are expected to change the efficiency of the sequence as a hole trap. Finally, the application of a hybrid approach combining quantum mechanics with molecular mechanics prompts an interesting discussion of how the native aqueous DNA environment affects the IE threshold of nucleobases.

  14. Computer-Aided Design Of Turbine Blades And Vanes

    NASA Technical Reports Server (NTRS)

    Hsu, Wayne Q.

    1988-01-01

    Quasi-three-dimensional method for determining aerothermodynamic configuration of turbine uses computer-interactive analysis and design and computer-interactive graphics. Design procedure executed rapidly so designer easily repeats it to arrive at best performance, size, structural integrity, and engine life. Sequence of events in aerothermodynamic analysis and design starts with engine-balance equations and ends with boundary-layer analysis and viscous-flow calculations. Analysis-and-design procedure interactive and iterative throughout.

  15. Using a local low rank plus sparse reconstruction to accelerate dynamic hyperpolarized 13C imaging using the bSSFP sequence

    NASA Astrophysics Data System (ADS)

    Milshteyn, Eugene; von Morze, Cornelius; Reed, Galen D.; Shang, Hong; Shin, Peter J.; Larson, Peder E. Z.; Vigneron, Daniel B.

    2018-05-01

    Acceleration of dynamic 2D (T2 Mapping) and 3D hyperpolarized 13C MRI acquisitions using the balanced steady-state free precession sequence was achieved with a specialized reconstruction method, based on the combination of low rank plus sparse and local low rank reconstructions. Methods were validated using both retrospectively and prospectively undersampled in vivo data from normal rats and tumor-bearing mice. Four-fold acceleration of 1-2 mm isotropic 3D dynamic acquisitions with 2-5 s temporal resolution and two-fold acceleration of 0.25-1 mm2 2D dynamic acquisitions was achieved. This enabled visualization of the biodistribution of [2-13C]pyruvate, [1-13C]lactate, [13C, 15N2]urea, and HP001 within heart, kidneys, vasculature, and tumor, as well as calculation of high resolution T2 maps.

  16. Investigations of Escherichia coli promoter sequences with artificial neural networks: New signals discovered upstream of the transcriptional startpoint

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pedersen, A.G.; Engelbrecht, J.

    1995-12-31

    In this paper we present a novel method for using the learning ability of a neural network as a measure of information in local regions of input data. Using the method to analyze Escherichia coli promoters, we discover all previously described signals and furthermore find new signals that are regularly spaced along the promoter region. The spacing of all signals corresponds to the helical periodicity of DNA, meaning that the signals are all present on the same face of the DNA helix in the promoter region. This is consistent with a model where the RNA polymerase contacts the promoter on one side of the DNA, and suggests that the regions important for promoter recognition may include more positions on the DNA than usually assumed. We furthermore analyze the E. coli promoters by calculating the Kullback-Leibler distance and by constructing sequence logos.
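
    The Kullback-Leibler analysis mentioned at the end can be sketched as a per-position distance between the observed base distribution in aligned promoters and a background distribution, D(p||q) = sum_b p(b) log2(p(b)/q(b)); high-scoring positions flag candidate signals. The Python below is a minimal illustration with an assumed uniform background.

      import math
      from collections import Counter

      def positional_kl(aligned_seqs, background):
          # Per-position KL distance between observed base frequencies and a
          # background distribution over {A, C, G, T}.
          scores = []
          for pos in range(len(aligned_seqs[0])):
              counts = Counter(seq[pos] for seq in aligned_seqs)
              total = sum(counts.values())
              scores.append(sum((c / total) * math.log2((c / total) / background[b])
                                for b, c in counts.items()))
          return scores

      uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}   # assumed background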

  17. Determination of a Screening Metric for High Diversity DNA Libraries.

    PubMed

    Guido, Nicholas J; Handerson, Steven; Joseph, Elaine M; Leake, Devin; Kung, Li A

    2016-01-01

    The fields of antibody engineering, enzyme optimization and pathway construction rely increasingly on screening complex variant DNA libraries. These highly diverse libraries allow researchers to sample a maximized sequence space and, therefore, more rapidly identify proteins with significantly improved activity. The current state of the art in synthetic biology allows for libraries with billions of variants, pushing the limits of researchers' ability to qualify libraries for screening by measuring the traditional quality metrics of fidelity and diversity of variants. Instead, when screening variant libraries, researchers typically use a generic, and often insufficient, oversampling rate based on a common rule of thumb. We have developed methods to calculate a library-specific oversampling metric, based on fidelity, diversity, and representation of variants, which informs researchers, prior to screening the library, of the amount of oversampling required to ensure that the desired fraction of variant molecules will be sampled (see the sketch below). To derive this oversampling metric, we developed a novel alignment tool to efficiently measure frequency counts of individual nucleotide variant positions using next-generation sequencing data. Next, we apply a method based on "coupon collector" probability theory to construct a curve of upper-bound estimates of the sampling size required for any desired variant coverage. The calculated oversampling metric will guide researchers to maximize their efficiency in using highly variant libraries.
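
    The coupon-collector bound referred to above is easy to state for the idealized uniform case: seeing all n variants costs about n*H_n draws, and seeing a fraction of them costs a partial harmonic sum. The Python sketch below computes these idealized expectations; the paper's metric additionally weights by the measured (non-uniform) variant representation.

      def expected_draws_full(n):
          # Classic coupon collector: E[draws to see all n variants] = n * H_n
          return n * sum(1.0 / k for k in range(1, n + 1))

      def expected_draws_fraction(n, fraction):
          # E[draws to see m = fraction*n distinct variants] = sum of n/(n-k)
          # for k = 0..m-1, assuming a uniform library (an idealization).
          m = int(fraction * n)
          return sum(n / (n - k) for k in range(m))

      # e.g. covering 95% of an ideal uniform library of one million variants
      # requires about 3.0 million sampled clones:
      draws = expected_draws_fraction(10**6, 0.95)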

  18. AMS 4.0: consensus prediction of post-translational modifications in protein sequences.

    PubMed

    Plewczynski, Dariusz; Basu, Subhadip; Saha, Indrajit

    2012-08-01

    We present here the 2011 update of the AutoMotif Service (AMS 4.0), which predicts a wide selection of 88 different types of single amino acid post-translational modifications (PTM) in protein sequences. The selection of experimentally confirmed modifications is acquired from the latest UniProt and Phospho.ELM databases for training. The sequence vicinity of each modified residue is represented using amino acid physico-chemical features encoded using high quality indices (HQI) obtained by automatic clustering of known indices extracted from the AAindex database. For each type of numerical representation, the method builds an ensemble of Multi-Layer Perceptron (MLP) pattern classifiers, each optimising a different objective during training (for example recall, precision or area under the ROC curve (AUC)). The consensus is built using brainstorming technology, which combines multi-objective instances of the machine learning algorithm with data fusion of the different training object representations, in order to boost the overall prediction accuracy of conserved short sequence motifs. The performance of AMS 4.0 is compared with the accuracy of previous versions, which were constructed using single machine learning methods (artificial neural networks, support vector machine). Our software improves the average AUC score of the earlier version by close to 7 % as calculated on the test datasets of all 88 PTM types. Moreover, for the selected most-difficult sequence motif types it is able to improve the prediction performance by almost 32 % when compared with previously used single machine learning methods. Summarising, the brainstorming consensus meta-learning methodology on average boosts the AUC score up to around 89 %, averaged over all 88 PTM types. Detailed results for single machine learning methods and the consensus methodology are also provided, together with a comparison to previously published methods and state-of-the-art software tools. The source code and precompiled binaries of the brainstorming tool are available at http://code.google.com/p/automotifserver/ under Apache 2.0 licensing.

  19. Using distances between Top-n-gram and residue pairs for protein remote homology detection.

    PubMed

    Liu, Bin; Xu, Jinghao; Zou, Quan; Xu, Ruifeng; Wang, Xiaolong; Chen, Qingcai

    2014-01-01

    Protein remote homology detection is one of the central problems in bioinformatics, important for both basic research and practical application. Currently, discriminative methods based on Support Vector Machines (SVMs) achieve the state-of-the-art performance. Exploring feature vectors that incorporate the position information of amino acids or other protein building blocks is a key step to improve the performance of the SVM-based methods. Two new methods for protein remote homology detection were proposed, called SVM-DR and SVM-DT. SVM-DR is a sequence-based method, in which the feature vector representation for a protein is based on the distances between residue pairs. SVM-DT is a profile-based method, which considers the distances between Top-n-gram pairs. A Top-n-gram can be viewed as a profile-based building block of proteins, calculated from the frequency profiles. These two methods are position-dependent approaches incorporating the sequence-order information of protein sequences. Various experiments were conducted on a benchmark dataset containing 54 families and 23 superfamilies. Experimental results showed that these two new methods are very promising. Compared with the position-independent methods, the performance improvement is obvious. Furthermore, the proposed methods can also provide useful insights for studying the features of protein families. The better performance of the proposed methods demonstrates that position-dependent approaches are efficient for protein remote homology detection. Another advantage of our methods arises from the explicit feature space representation, which can be used to analyze the characteristic features of protein families. The source code of SVM-DT and SVM-DR is available at http://bioinformatics.hitsz.edu.cn/DistanceSVM/index.jsp.

  20. Evolutionary sequences and isochrones for low- and intermediate-mass stars

    NASA Astrophysics Data System (ADS)

    Panei, J.; Baume, G.

    2016-08-01

    We present theoretical evolutionary sequences for low- and intermediate-mass stars. The calculated masses range from 1.7 to 10 solar masses. The initial chemical composition is . In addition, we have taken into account a nuclear network with 17 isotopes and 34 nuclear reactions. With respect to mixing, we considered overshooting with a parameter . The evolutionary calculations were initialized from the Hayashi instability region, in order to also calculate pre-main-sequence isochrones.

  1. Calculating the quality of public high-throughput sequencing data to obtain a suitable subset for reanalysis from the Sequence Read Archive

    PubMed Central

    Nakazato, Takeru; Bono, Hidemasa

    2017-01-01

    It is important for public data repositories to promote the reuse of archived data. In the growing field of omics science, however, the increasing number of submissions of high-throughput sequencing (HTSeq) data to public repositories prevents users from choosing a suitable data set from among the large number of search results. Repository users need to be able to set a threshold to reduce the number of results and obtain a suitable subset of high-quality data for reanalysis. We calculated the quality of sequencing data archived in a public data repository, the Sequence Read Archive (SRA), by using the quality control software FastQC. We obtained quality values for 1 171 313 experiments, which can be used to evaluate the suitability of data for reuse. We also visualized the data distribution in SRA by integrating the quality information and metadata of experiments and samples. We provide quality information for all of the archived sequencing data, enabling users to obtain sequencing data of sufficient quality for reanalyses. The calculated quality data are available to the public in various formats. Our data also provide an example of enhancing the reuse of public data by adding metadata to published research data by a third party. PMID:28449062

  2. SU-E-T-605: Performance Evaluation of MLC Leaf-Sequencing Algorithms in Head-And-Neck IMRT

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jing, J; Lin, H; Chow, J

    2015-06-15

    Purpose: To investigate the efficiency of three multileaf collimator (MLC) leaf-sequencing algorithms proposed by Galvin et al, Chen et al and Siochi et al using external beam treatment plans for head-and-neck intensity modulated radiation therapy (IMRT). Methods: IMRT plans for head-and-neck were created using the CORVUS treatment planning system. The plans were optimized and the fluence maps for all photon beams determined. Three different MLC leaf-sequencing algorithms, based on Galvin et al, Chen et al and Siochi et al, were used to calculate the final photon segmental fields and their monitor units in delivery. For comparison purposes, the maximum intensity of the fluence map was kept constant across plans. The number of beam segments and the total number of monitor units were calculated for the three algorithms. Results: From the numbers of beam segments and total monitor units, we found that the algorithm of Galvin et al had the largest total number of monitor units, about 70% larger than the other two algorithms. Moreover, both the algorithms of Galvin et al and Siochi et al produced a relatively low number of beam segments compared to that of Chen et al. Although the number of beam segments and total number of monitor units calculated by the different algorithms varied with the head-and-neck plans, the algorithms of Galvin et al and Siochi et al performed well with a lower number of beam segments, though the algorithm of Galvin et al had a larger total number of monitor units than that of Siochi et al. Conclusion: Although the performance of a leaf-sequencing algorithm varies with different IMRT plans having different fluence maps, an evaluation is possible based on the calculated number of beam segments and monitor units. In this study, the algorithm by Siochi et al was found to be more efficient for head-and-neck IMRT. The Project Sponsored by the Fundamental Research Funds for the Central Universities (J2014HGXJ0094) and the Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry.

  3. Amino acid positions subject to multiple coevolutionary constraints can be robustly identified by their eigenvector network centrality scores.

    PubMed

    Parente, Daniel J; Ray, J Christian J; Swint-Kruse, Liskin

    2015-12-01

    As proteins evolve, amino acid positions key to protein structure or function are subject to mutational constraints. These positions can be detected by analyzing sequence families for amino acid conservation or for coevolution between pairs of positions. Coevolutionary scores are usually rank-ordered and thresholded to reveal the top pairwise scores, but they can also be treated as weighted networks. Here, we used network analyses to bypass a major complication of coevolution studies: for a given sequence alignment, alternative algorithms usually identify different top pairwise scores. We reconciled results from five commonly used, mathematically divergent algorithms (ELSC, McBASC, OMES, SCA, and ZNMI), using the LacI/GalR and 1,6-bisphosphate aldolase protein families as models. Calculations used unthresholded coevolution scores from which column-specific properties such as sequence entropy and random noise were subtracted; "central" positions were identified by calculating various network centrality scores. When compared among algorithms, network centrality methods, particularly eigenvector centrality, showed markedly better agreement than comparisons of the top pairwise scores. Positions with large centrality scores occurred at key structural locations and/or were functionally sensitive to mutations. Further, the top central positions often differed from those with top pairwise coevolution scores: instead of a few strong scores, central positions often had multiple, moderate scores. We conclude that eigenvector centrality calculations reveal a robust evolutionary pattern of constraints, detectable by divergent algorithms, that occurs at key protein locations. Finally, we discuss the fact that multiple patterns coexist in evolutionary data that, together, give rise to emergent protein functions. © 2015 Wiley Periodicals, Inc.
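
    Treating the unthresholded coevolution scores as a weighted network, eigenvector centrality is simply the principal eigenvector of the score matrix, which power iteration finds in a few lines. A minimal Python sketch, assuming W is a symmetric nonnegative matrix of noise-corrected scores between alignment columns:

      import numpy as np

      def eigenvector_centrality(W, iters=1000, tol=1e-10):
          # Power iteration for the principal eigenvector of W; large entries
          # mark positions with many moderate-to-strong couplings.
          x = np.ones(W.shape[0]) / W.shape[0]
          for _ in range(iters):
              x_new = W @ x
              x_new /= np.linalg.norm(x_new)
              if np.linalg.norm(x_new - x) < tol:
                  break
              x = x_new
          return x_new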

  4. NGS-based likelihood ratio for identifying contributors in two- and three-person DNA mixtures.

    PubMed

    Chan Mun Wei, Joshua; Zhao, Zicheng; Li, Shuai Cheng; Ng, Yen Kaow

    2018-06-01

    DNA fingerprinting, also known as DNA profiling, serves as a standard procedure in forensics to identify a person by the short tandem repeat (STR) loci in their DNA. By comparing the STR loci between DNA samples, practitioners can calculate a probability of match to identify the contributors of a DNA mixture. Most existing methods are based on the 13 core STR loci identified by the Federal Bureau of Investigation (FBI). Analyses of DNA mixtures based on these loci for forensic purposes are highly variable in their procedures and suffer from subjectivity as well as bias in complex mixture interpretation. With the emergence of next-generation sequencing (NGS) technologies, the sequencing of billions of DNA molecules can be parallelized, greatly increasing throughput and reducing the associated costs. This allows the creation of new techniques that incorporate more loci to enable complex mixture interpretation. In this paper, we propose a likelihood ratio computation that uses NGS data for DNA testing on mixed samples. We applied the method to 4480 simulated DNA mixtures, consisting of various mixture proportions of 8 unrelated whole-genome sequencing data sets. The results confirm the feasibility of utilizing NGS data in DNA mixture interpretation. We observed an average likelihood ratio as high as 285,978 for two-person mixtures. Using our method, all 224 identity tests for two-person and three-person mixtures were correctly identified. Copyright © 2018 Elsevier Ltd. All rights reserved.

  5. Rapid comparison of protein binding site surfaces with Property Encoded Shape Distributions (PESD)

    PubMed Central

    Das, Sourav; Kokardekar, Arshad

    2009-01-01

    Patterns in shape and property distributions on the surface of binding sites are often conserved across functional proteins without significant conservation of the underlying amino-acid residues. To explore similarities of these sites from the viewpoint of a ligand, a sequence- and fold-independent method was created to rapidly and accurately compare binding sites of proteins represented by property-mapped triangulated Gauss-Connolly surfaces. Within this paradigm, signatures for each binding site surface are produced by calculating their property-encoded shape distributions (PESD), a measure of the probability that a particular property will be at a specific distance to another on the molecular surface. Similarity between the signatures can then be treated as a measure of similarity between binding sites. As postulated, the PESD method rapidly detected high levels of similarity in binding site surface characteristics even in cases where there was very low similarity at the sequence level. In a screening experiment involving each member of the PDBBind 2005 dataset as a query against the rest of the set, PESD was able to retrieve a binding site with identical E.C. (Enzyme Commission) numbers as the top match in 79.5% of cases. The method's ability to detect similarity in binding sites with low sequence conservation was compared with state-of-the-art binding site comparison methods. PMID:19919089

  6. SU-G-BRB-05: Automation of the Photon Dosimetric Quality Assurance Program of a Linear Accelerator

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lebron, S; Lu, B; Yan, G

    Purpose: To develop an automated method to calculate a linear accelerator (LINAC) photon radiation field size, flatness, symmetry, output and beam quality in a single delivery for flattened (FF) and flattening-filter-free (FFF) beams using an ionization chamber array. Methods: The proposed method consists of three control points that deliver 30×30, 10×10 and 5×5 cm² fields (FF or FFF) in a step-and-shoot sequence where the number of monitor units is weighted for each field size. The IC Profiler (Sun Nuclear Inc.) with 5 mm detector spacing was used for this study. The corrected counts (CCs) were calculated, and the locations of the maxima and minima of the first-order gradient were determined from the data of each subfield. All CCs for each field size were then summed to obtain the final profiles. For each profile, the radiation field size, symmetry, flatness, output factor and beam quality were calculated. For the field size calculation, a parameterized gradient method was used. For method validation, profiles were collected in the detector array, both individually and as part of the step-and-shoot plan, with 9.9 cm buildup for FF and FFF beams at 90 cm source-to-surface distance. The same data were collected with the device (plus buildup) placed on a movable platform to achieve a 1 mm resolution. Results: The differences between the dosimetric quantities calculated from the two deliveries, individual and step-and-shoot, were within 0.31±0.20% and 0.04±0.02 mm. The differences between the field sizes calculated with 5 mm and 1 mm resolution were ±0.1 mm. Conclusion: The proposed single-delivery method proved to be simple and efficient in automating the photon dosimetric monthly and annual quality assurance.

  7. A statistical method for the conservative adjustment of false discovery rate (q-value).

    PubMed

    Lai, Yinglei

    2017-03-14

    q-value is a widely used statistical method for estimating false discovery rate (FDR), which is a conventional significance measure in the analysis of genome-wide expression data. q-value is a random variable and it may underestimate FDR in practice. An underestimated FDR can lead to unexpected false discoveries in the follow-up validation experiments. This issue has not been well addressed in literature, especially in the situation when the permutation procedure is necessary for p-value calculation. We proposed a statistical method for the conservative adjustment of q-value. In practice, it is usually necessary to calculate p-value by a permutation procedure. This was also considered in our adjustment method. We used simulation data as well as experimental microarray or sequencing data to illustrate the usefulness of our method. The conservativeness of our approach has been mathematically confirmed in this study. We have demonstrated the importance of conservative adjustment of q-value, particularly in the situation that the proportion of differentially expressed genes is small or the overall differential expression signal is weak.
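
    For orientation, the construction the adjustment builds on is the standard q-value with pi0 taken as 1 (itself a conservative choice): q_i = min over j >= i of p_(j) * m / j. A minimal Python sketch of that baseline (not the paper's proposed adjustment, and without the permutation-based p-value step):

      import numpy as np

      def bh_qvalues(pvalues):
          # Benjamini-Hochberg style q-values with pi0 = 1: sort p-values,
          # scale by m/rank, then enforce monotonicity from the largest down.
          p = np.asarray(pvalues, dtype=float)
          m = p.size
          order = np.argsort(p)
          scaled = p[order] * m / np.arange(1, m + 1)
          q_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
          q = np.empty(m)
          q[order] = np.clip(q_sorted, 0.0, 1.0)
          return q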

  8. Free energy determinants of secondary structure formation: III. beta-turns and their role in protein folding.

    PubMed

    Yang, A S; Hitz, B; Honig, B

    1996-06-21

    The stability of beta-turns is calculated as a function of sequence and turn type with a Monte Carlo sampling technique. The conformational energy of four internal hydrogen-bonded turn types, I, I', II and II', is obtained by evaluating their gas phase energy with the CHARMM force field and accounting for solvation effects with the Finite Difference Poisson-Boltzmann (FDPB) method. All four turn types are found to be less stable than the coil state, independent of the sequence in the turn. The free-energy penalties associated with turn formation vary between 1.6 kcal/mol and 7.7 kcal/mol, depending on the sequence and turn type. Differences in turn stability arise mainly from intraresidue interactions within the two central residues of the turn. For each combination of the two central residues, except for -Gly-Gly-, the most stable beta-turn type is always found to occur most commonly in native proteins. The fact that a model based on local interactions accounts for the observed preference of specific sequences suggests that long-range tertiary interactions tend to play a secondary role in determining turn conformation. In contrast, for beta-hairpins, long-range interactions appear to dominate. Specifically, due to the right-handed twist of beta-strands, type I' turns for -Gly-Gly- are found to occur with high frequency, even when local energetics would dictate otherwise. The fact that any combination of two residues is found able to adopt a relatively low-energy turn structure explains why the amino acid sequence in turns is highly variable. The calculated free-energy cost of turn formation, when combined with related numbers obtained for alpha-helices and beta-sheets, suggests a model for the initiation of protein folding based on metastable fragments of secondary structure.

  9. Method for depleting BWRs using optimal control rod patterns

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Taner, M.S.; Levine, S.H.; Hsiao, M.Y.

    1991-01-01

    Control rod (CR) programming is an essential core management activity for boiling water reactors (BWRs). After establishing a core reload design for a BWR, CR programming is performed to develop a sequence of exposure-dependent CR patterns that assure the safe and effective depletion of the core through a reactor cycle. A time-variant target power distribution approach has been assumed in this study. The authors have developed OCTOPUS to implement a new two-step method for designing semioptimal CR programs for BWRs. The optimization procedure of OCTOPUS is based on the method of approximation programming and uses the SIMULATE-E code for nucleonics calculations.

  10. Solving for source parameters using nested array data: A case study from the Canterbury, New Zealand earthquake sequence

    USGS Publications Warehouse

    Neighbors, Corrie; Cochran, Elizabeth S.; Ryan, Kenneth; Kaiser, Anna E.

    2017-01-01

    The seismic spectrum can be constructed by assuming a Brune spectral model and estimating the parameters of seismic moment (M0), corner frequency (fc), and high-frequency site attenuation (κ). Using seismic data collected during the 2010–2011 Canterbury, New Zealand, earthquake sequence, we apply the non-linear least-squares Gauss–Newton method, a deterministic downhill optimization technique, to simultaneously determine M0, fc, and κ for each event-station pair. We fit the Brune spectral acceleration model to Fourier-transformed S-wave records following application of path and site corrections to the data. For each event, we solve for a single M0 and fc, while any remaining residual kappa, κr, is allowed to differ per station record to reflect varying high-frequency falloff due to path and site attenuation. We use a parametric forward modeling method, calculating initial M0 and fc values from the local GNS New Zealand catalog magnitudes (Mw,GNS) and measuring an initial κr using an automated high-frequency linear regression method. Final solutions for M0, fc, and κr are iteratively computed through minimization of the residual function, and the Brune model stress drop is then calculated from the final, best-fit fc. We perform the spectral fitting routine on nested array seismic data that include the permanent GeoNet accelerometer network as well as a dense network of nearly 200 Quake Catcher Network (QCN) MEMS accelerometers, analyzing over 180 aftershocks of Mw,GNS ≥ 3.5 that occurred from 9 September 2010 to 31 July 2011. QCN stations were hosted by public volunteers and served to fill spatial gaps between existing GeoNet stations. Moment magnitudes determined using the spectral fitting procedure (Mw,SF) range from 3.5 to 5.7 and agree well with Mw,GNS, with a median difference of 0.09 and 0.17 for GeoNet and QCN records, respectively, and 0.11 when data from both networks are combined. The majority of events are calculated to have stress drops between 1.7 and 13 MPa (20th and 80th percentiles, respectively) for the combined networks. The overall median stress drop for the combined networks is 3.2 MPa, which is similar to median stress drops previously reported for the Canterbury sequence. We do not observe a correlation between stress drop and depth for this region, nor a relationship between stress drop and magnitude over the catalog considered. Lateral spatial patterns in stress drop, such as a cluster of aftershocks near the eastern extent of the Greendale fault with higher stress drops and lower stress drops for aftershocks of the 2011 Mw,GNS 6.2 Christchurch mainshock, are found to be in agreement with previous reports. As stress drop is arguably a method-dependent calculation and subject to high spatial variability, our results using the parametric Gauss–Newton algorithm strengthen conclusions that the Canterbury sequence has stress drops more similar to those found in intraplate regions, overall higher than those typically observed in tectonically active areas.
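
    A sketch of this kind of spectral fit, assuming the textbook Brune omega-square acceleration spectrum with an exponential kappa term, and using scipy's trust-region least-squares solver in place of the authors' Gauss–Newton implementation (bounds are illustrative):

        import numpy as np
        from scipy.optimize import least_squares

        def brune_accel(f, omega0, fc, kappa):
            # Omega-square acceleration spectrum with exponential kappa falloff.
            return (omega0 * (2 * np.pi * f) ** 2 / (1 + (f / fc) ** 2)
                    * np.exp(-np.pi * kappa * f))

        def fit_brune(f, observed, omega0_0, fc_0, kappa_0):
            # Misfit in log-amplitude; bounds keep the solver in a
            # physically plausible region.
            def residuals(theta):
                return np.log(brune_accel(f, *theta)) - np.log(observed)
            sol = least_squares(residuals, x0=[omega0_0, fc_0, kappa_0],
                                bounds=([1e-12, 0.01, 0.0],
                                        [np.inf, 100.0, 0.2]))
            return sol.x  # omega0, fc, kappa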

  11. Long-range correlation properties of coding and noncoding DNA sequences: GenBank analysis.

    PubMed

    Buldyrev, S V; Goldberger, A L; Havlin, S; Mantegna, R N; Matsa, M E; Peng, C K; Simons, M; Stanley, H E

    1995-05-01

    An open question in computational molecular biology is whether long-range correlations are present in both coding and noncoding DNA or only in the latter. To answer this question, we consider all 33301 coding and all 29453 noncoding eukaryotic sequences--each of length larger than 512 base pairs (bp)--in the present release of GenBank to determine whether there is any statistically significant distinction in their long-range correlation properties. Standard fast Fourier transform (FFT) analysis indicates that coding sequences have practically no correlations in the range from 10 bp to 100 bp (spectral exponent beta=0.00 +/- 0.04, where the uncertainty is two standard deviations). In contrast, for noncoding sequences, the average value of the spectral exponent beta is positive (0.16 +/- 0.05), which unambiguously shows the presence of long-range correlations. We also separately analyze the 874 coding and the 1157 noncoding sequences that have more than 4096 bp and find a larger region of power-law behavior. We calculate the probability that these two data sets (coding and noncoding) were drawn from the same distribution and we find that it is less than 10(-10). We obtain independent confirmation of these findings using the method of detrended fluctuation analysis (DFA), which is designed to treat sequences with statistical heterogeneity, such as DNA's known mosaic structure ("patchiness") arising from the nonstationarity of nucleotide concentration. The near-perfect agreement between the two independent analysis methods, FFT and DFA, increases the confidence in the reliability of our conclusion.
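
    A compact DFA sketch for a numeric series (for DNA, the sequence is first mapped to a walk, e.g. purine = +1, pyrimidine = -1; the window sizes and linear detrending order are illustrative choices):

        import numpy as np

        def dfa_exponent(x, scales=(16, 32, 64, 128, 256)):
            # Integrate the zero-mean series, detrend linearly in
            # non-overlapping windows, and regress log F(n) on log n.
            y = np.cumsum(np.asarray(x, dtype=float) - np.mean(x))
            flucts = []
            for n in scales:
                t = np.arange(n)
                f2 = []
                for i in range(len(y) // n):
                    seg = y[i * n:(i + 1) * n]
                    coef = np.polyfit(t, seg, 1)
                    f2.append(np.mean((seg - np.polyval(coef, t)) ** 2))
                flucts.append(np.sqrt(np.mean(f2)))
            slope, _ = np.polyfit(np.log(scales), np.log(flucts), 1)
            return slope  # ~0.5: uncorrelated; >0.5: long-range correlations

        # e.g. walk = [1 if c in "AG" else -1 for c in dna_string]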

  12. Long-range correlation properties of coding and noncoding DNA sequences: GenBank analysis

    NASA Technical Reports Server (NTRS)

    Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Matsa, M. E.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1995-01-01

    An open question in computational molecular biology is whether long-range correlations are present in both coding and noncoding DNA or only in the latter. To answer this question, we consider all 33301 coding and all 29453 noncoding eukaryotic sequences--each of length larger than 512 base pairs (bp)--in the present release of GenBank to determine whether there is any statistically significant distinction in their long-range correlation properties. Standard fast Fourier transform (FFT) analysis indicates that coding sequences have practically no correlations in the range from 10 bp to 100 bp (spectral exponent beta=0.00 +/- 0.04, where the uncertainty is two standard deviations). In contrast, for noncoding sequences, the average value of the spectral exponent beta is positive (0.16 +/- 0.05), which unambiguously shows the presence of long-range correlations. We also separately analyze the 874 coding and the 1157 noncoding sequences that have more than 4096 bp and find a larger region of power-law behavior. We calculate the probability that these two data sets (coding and noncoding) were drawn from the same distribution and we find that it is less than 10(-10). We obtain independent confirmation of these findings using the method of detrended fluctuation analysis (DFA), which is designed to treat sequences with statistical heterogeneity, such as DNA's known mosaic structure ("patchiness") arising from the nonstationarity of nucleotide concentration. The near-perfect agreement between the two independent analysis methods, FFT and DFA, increases the confidence in the reliability of our conclusion.

  13. The practical evaluation of DNA barcode efficacy.

    PubMed

    Spouge, John L; Mariño-Ramírez, Leonardo

    2012-01-01

    This chapter describes a workflow for measuring the efficacy of a barcode in identifying species. First, assemble individual sequence databases corresponding to each barcode marker. A controlled collection of taxonomic data is preferable to GenBank data, because GenBank data can be problematic, particularly when comparing barcodes based on more than one marker. To ensure proper controls when evaluating species identification, specimens not having a sequence in every marker database should be discarded. Second, select a computer algorithm for assigning species to barcode sequences. No algorithm has yet improved notably on assigning a specimen to the species of its nearest neighbor within a barcode database. Because global sequence alignments (e.g., with the Needleman-Wunsch algorithm, or some related algorithm) examine entire barcode sequences, they generally produce better species assignments than local sequence alignments (e.g., with BLAST). No neighboring method (e.g., global sequence similarity, global sequence distance, or evolutionary distance based on a global alignment) has yet shown a notable superiority in identifying species. Finally, "the probability of correct identification" (PCI) provides an appropriate measurement of barcode efficacy. The overall PCI for a data set is the average of the species PCIs, taken over all species in the data set. This chapter states explicitly how to calculate PCI, how to estimate its statistical sampling error, and how to use data on PCR failure to set limits on how much improvements in PCR technology can improve species identification.
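
    The overall PCI, as described above, is the unweighted mean of per-species identification rates; a minimal sketch (the per-species success criterion is assumed here to be nearest-neighbor correctness):

        def overall_pci(assignments):
            # assignments: species -> list of booleans, True where a specimen
            # was identified correctly (e.g. by its nearest neighbour in the
            # barcode database). Overall PCI = unweighted mean of species PCIs.
            per_species = [sum(v) / len(v) for v in assignments.values()]
            return sum(per_species) / len(per_species)

        # overall_pci({"sp_a": [True, True, False], "sp_b": [True, True]})
        # -> 0.8333...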

  14. Quantification of glomerular filtration rate by measurement of gadobutrol clearance from the extracellular fluid volume: comparison of a TurboFLASH and a TrueFISP approach

    NASA Astrophysics Data System (ADS)

    Boss, Andreas; Martirosian, Petros; Artunc, Ferruh; Risler, Teut; Claussen, Claus D.; Schlemmer, Heinz-Peter; Schick, Fritz

    2007-03-01

    Purpose: As the MR contrast medium gadobutrol is completely eliminated via glomerular filtration, the glomerular filtration rate (GFR) can be quantified after bolus injection of gadobutrol and complete mixing in the extracellular fluid volume (ECFV) by measuring the signal decrease within the liver parenchyma. Two different navigator-gated single-shot saturation-recovery sequences have been tested for suitability for GFR quantification: a TurboFLASH and a TrueFISP readout technique. Materials and Methods: Ten healthy volunteers (mean age 26.1+/-3.6) were equally divided into two subgroups. After bolus injection of 0.05 mmol/kg gadobutrol, coronal single-slice images of the liver were recorded every 4-5 seconds during free breathing using either the TurboFLASH or the TrueFISP technique. Time-intensity curves were determined from manually drawn regions of interest over the liver parenchyma. Both sequences were subsequently evaluated regarding signal-to-noise ratio (SNR) and the behaviour of the signal intensity curves. The calculated GFR values were compared to an iopromide clearance gold standard. Results: The TrueFISP sequence exhibited a 3.4-fold higher SNR than the TurboFLASH sequence and markedly lower variability of the recorded time-intensity curves. The calculated mean GFR values were 107.0+/-16.1 ml/min/1.73m2 (iopromide: 92.1+/-14.5 ml/min/1.73m2) for the TrueFISP technique and 125.6+/-24.1 ml/min/1.73m2 (iopromide: 97.7+/-6.3 ml/min/1.73m2) for the TurboFLASH approach. The mean paired difference was lower with TrueFISP (15.0 ml/min/1.73m2) than with the TurboFLASH method (27.9 ml/min/1.73m2). Conclusion: The global GFR can be quantified via measurement of gadobutrol clearance from the ECFV. A saturation-recovery TrueFISP sequence allows for more reliable GFR quantification than a saturation-recovery TurboFLASH technique.
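
    A sketch of the underlying clearance calculation: after complete mixing, tracer concentration in the ECFV decays mono-exponentially with rate k = GFR/ECFV, so a log-linear fit of the late-phase curve yields GFR. The conversion of liver signal to concentration and the ECFV estimate are assumptions outside this sketch:

        import numpy as np

        def gfr_from_clearance(t_min, conc, ecfv_ml):
            # Log-linear fit of the late-phase concentration decay;
            # the slope is -k, and GFR = k * ECFV.
            k, _ = np.polyfit(t_min, np.log(conc), 1)
            return -k * ecfv_ml  # ml/min; body-surface normalization separate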

  15. A New Method for Generating Probability Tables in the Unresolved Resonance Region

    DOE PAGES

    Holcomb, Andrew M.; Leal, Luiz C.; Rahnema, Farzad; ...

    2017-04-18

    A new method for constructing probability tables in the unresolved resonance region (URR) has been developed. This new methodology is an extensive modification of the single-level Breit-Wigner (SLBW) pseudo-resonance pair sequence method commonly used to generate probability tables in the URR. The new method uses a Monte Carlo process to generate many pseudo-resonance sequences by first sampling the average resonance parameter data in the URR and then converting the sampled resonance parameters to the more robust R-matrix limited (RML) format. Furthermore, for each sampled set of pseudo-resonance sequences, the temperature-dependent cross sections are reconstructed on a small grid around the energy of reference using the Reich-Moore formalism and the Leal-Hwang Doppler broadening methodology. We then use the effective cross sections calculated at the energies of reference to construct probability tables in the URR. The RML cross-section reconstruction algorithm has been rigorously tested for a variety of isotopes, including 16O, 19F, 35Cl, 56Fe, 63Cu, and 65Cu. The new URR method also produced normalized cross-section factor probability tables for 238U that were found to be in agreement with current standards. The modified 238U probability tables were shown to produce results in excellent agreement with several standard benchmarks, including the IEU-MET-FAST-007 (BIG TEN), IEU-MET-FAST-003, and IEU-COMP-FAST-004 benchmarks.

  16. Stark width regularities within spectral series of the lithium isoelectronic sequence

    NASA Astrophysics Data System (ADS)

    Tapalaga, Irinel; Trklja, Nora; Dojčinović, Ivan P.; Purić, Jagoš

    2018-03-01

    Stark width regularities within spectral series of the lithium isoelectronic sequence have been studied in an approach that includes both neutrals and ions. The influence of environmental conditions and certain atomic parameters on the Stark widths of spectral lines has been investigated. This study gives a simple model for the calculation of Stark broadening data for spectral lines within the lithium isoelectronic sequence. The proposed model requires fewer parameters than any other model. The obtained relations were used for predictions of Stark widths for transitions that have not yet been measured or calculated. In the framework of the present research, three algorithms for fast data processing have been made and they enable quality control and provide verification of the theoretically calculated results.

  17. Shifted termination assay (STA) fragment analysis to detect BRAF V600 mutations in papillary thyroid carcinomas

    PubMed Central

    2013-01-01

    Background BRAF mutation is an important diagnostic and prognostic marker in patients with papillary thyroid carcinoma (PTC). To be applicable in clinical laboratories with limited equipment, diverse testing methods are required to detect BRAF mutation. Methods A shifted termination assay (STA) fragment analysis was used to detect common V600 BRAF mutations in 159 PTCs with DNAs extracted from formalin-fixed paraffin-embedded tumor tissue. The results of STA fragment analysis were compared to those of direct sequencing. Serial dilutions of BRAF mutant cell line (SNU-790) were used to calculate limit of detection (LOD). Results BRAF mutations were detected in 119 (74.8%) PTCs by STA fragment analysis. In direct sequencing, BRAF mutations were observed in 118 (74.2%) cases. The results of STA fragment analysis had high correlation with those of direct sequencing (p < 0.00001, κ = 0.98). The LOD of STA fragment analysis and direct sequencing was 6% and 12.5%, respectively. In PTCs with pT3/T4 stages, BRAF mutation was observed in 83.8% of cases. In pT1/T2 carcinomas, BRAF mutation was detected in 65.9% and this difference was statistically significant (p = 0.007). Moreover, BRAF mutation was more frequent in PTCs with extrathyroidal invasion than tumors without extrathyroidal invasion (84.7% versus 62.2%, p = 0.001). To prepare and run the reactions, direct sequencing required 450 minutes while STA fragment analysis needed 290 minutes. Conclusions STA fragment analysis is a simple and sensitive method to detect BRAF V600 mutations in formalin-fixed paraffin-embedded clinical samples. Virtual Slides The virtual slide(s) for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/5684057089135749 PMID:23883275

  18. Comparison of Free-Breathing With Navigator-Triggered Technique in Diffusion Weighted Imaging for Evaluation of Small Hepatocellular Carcinoma: Effect on Image Quality and Intravoxel Incoherent Motion Parameters.

    PubMed

    Shan, Yan; Zeng, Meng-su; Liu, Kai; Miao, Xi-Yin; Lin, Jiang; Fu, Cai xia; Xu, Peng-ju

    2015-01-01

    To evaluate the effect on image quality and intravoxel incoherent motion (IVIM) parameters of small hepatocellular carcinoma (HCC) from choice of either free-breathing (FB) or navigator-triggered (NT) diffusion-weighted (DW) imaging. Thirty patients with 37 small HCCs underwent IVIM DW imaging using 12 b values (0-800 s/mm2) with 2 sequences: NT and FB. A biexponential analysis with the Bayesian method yielded the true diffusion coefficient (D), pseudodiffusion coefficient (D*), and perfusion fraction (f) in small HCCs and liver parenchyma. The apparent diffusion coefficient (ADC) was also calculated. The acquisition time and image quality scores were assessed for the 2 sequences. The independent-sample t test was used to compare image quality, signal intensity ratio, IVIM parameters, and ADC values between the 2 sequences; reproducibility of IVIM parameters and ADC values between the 2 sequences was assessed with the Bland-Altman method (BA-LA). Image quality with the NT sequence was superior to that with FB acquisition (P = 0.02). The mean acquisition time for the FB scheme was shorter than that of the NT sequence (6 minutes 14 seconds vs 10 minutes 21 seconds ± 10 seconds; P < 0.01). The signal intensity ratio of small HCCs did not vary significantly between the 2 sequences. The ADC and IVIM parameters from the 2 sequences showed no significant difference. Reproducibility of the D* and f parameters in small HCC was poor (BA-LA: 95% confidence interval, -180.8% to 189.2% for D* and -133.8% to 174.9% for f). A moderate reproducibility of the D and ADC parameters was observed (BA-LA: 95% confidence interval, -83.5% to 76.8% for D and -74.4% to 88.2% for ADC) between the 2 sequences. The NT DW imaging technique offers no advantage in IVIM parameter measurements of small HCC except better image quality, whereas the FB technique offers greater confidence in fitted diffusion parameters for matched acquisition periods.
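
    The biexponential IVIM model referred to above, fitted here with ordinary least squares rather than the Bayesian scheme used in the study (starting values and bounds are illustrative):

        import numpy as np
        from scipy.optimize import curve_fit

        def ivim(b, f, d_star, d):
            # S(b)/S(0) = f*exp(-b*D*) + (1-f)*exp(-b*D)
            return f * np.exp(-b * d_star) + (1 - f) * np.exp(-b * d)

        def fit_ivim(bvals, signal):
            # signal must be normalized to S(b=0); diffusivities in mm^2/s.
            p0 = [0.1, 0.02, 0.001]                           # f, D*, D
            bounds = ([0.0, 0.003, 1e-5], [0.5, 0.5, 0.003])
            popt, _ = curve_fit(ivim, bvals, signal, p0=p0, bounds=bounds)
            return dict(zip(("f", "D*", "D"), popt))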

  19. SSHscreen and SSHdb, generic software for microarray based gene discovery: application to the stress response in cowpea

    PubMed Central

    2010-01-01

    Background Suppression subtractive hybridization is a popular technique for gene discovery from non-model organisms without an annotated genome sequence, such as cowpea (Vigna unguiculata (L.) Walp). We aimed to use this method to enrich for genes expressed during drought stress in a drought tolerant cowpea line. However, current methods were inefficient in screening libraries and management of the sequence data, and thus there was a need to develop software tools to facilitate the process. Results Forward and reverse cDNA libraries enriched for cowpea drought response genes were screened on microarrays, and the R software package SSHscreen 2.0.1 was developed (i) to normalize the data effectively using spike-in control spot normalization, and (ii) to select clones for sequencing based on the calculation of enrichment ratios with associated statistics. Enrichment ratio 3 values for each clone showed that 62% of the forward library and 34% of the reverse library clones were significantly differentially expressed by drought stress (adjusted p value < 0.05). Enrichment ratio 2 calculations showed that > 88% of the clones in both libraries were derived from rare transcripts in the original tester samples, thus supporting the notion that suppression subtractive hybridization enriches for rare transcripts. A set of 118 clones were chosen for sequencing, and drought-induced cowpea genes were identified, the most interesting encoding a late embryogenesis abundant Lea5 protein, a glutathione S-transferase, a thaumatin, a universal stress protein, and a wound induced protein. A lipid transfer protein and several components of photosynthesis were down-regulated by the drought stress. Reverse transcriptase quantitative PCR confirmed the enrichment ratio values for the selected cowpea genes. SSHdb, a web-accessible database, was developed to manage the clone sequences and combine the SSHscreen data with sequence annotations derived from BLAST and Blast2GO. The self-BLAST function within SSHdb grouped redundant clones together and illustrated that the SSHscreen plots are a useful tool for choosing anonymous clones for sequencing, since redundant clones cluster together on the enrichment ratio plots. Conclusions We developed the SSHscreen-SSHdb software pipeline, which greatly facilitates gene discovery using suppression subtractive hybridization by improving the selection of clones for sequencing after screening the library on a small number of microarrays. Annotation of the sequence information and collaboration was further enhanced through a web-based SSHdb database, and we illustrated this through identification of drought responsive genes from cowpea, which can now be investigated in gene function studies. SSH is a popular and powerful gene discovery tool, and therefore this pipeline will have application for gene discovery in any biological system, particularly non-model organisms. SSHscreen 2.0.1 and a link to SSHdb are available from http://microarray.up.ac.za/SSHscreen. PMID:20359330

  20. Creation of parallel algorithms for the solution of problems of gas dynamics on multi-core computers and GPU

    NASA Astrophysics Data System (ADS)

    Rybakin, B.; Bogatencov, P.; Secrieru, G.; Iliuha, N.

    2013-10-01

    The paper deals with a parallel algorithm for calculations on multiprocessor computers and GPU accelerators. Results for the interaction of shock waves with a low-density bubble and for the problem of gas flow under gravity are presented. This algorithm combines the ability to capture shock waves at high resolution, second-order accuracy for TVD schemes, and the low numerical diffusion of the advection scheme. Many complex problems of continuum mechanics are numerically solved on structured or unstructured grids. Improving the accuracy of the calculations requires a sufficiently fine grid (small cell size), which has the drawback of substantially increasing the computation time. Therefore, for the calculation of complex problems it is reasonable to use the method of Adaptive Mesh Refinement (AMR). That is, the grid refinement is performed only in the areas of interest of the structure where, for example, shock waves are generated, or where complex geometry or other such features exist. Thus, the computing time is greatly reduced. In addition, the execution of the application on the resulting sequence of nested, successively refined grids can be parallelized. The proposed algorithm is based on the AMR method. AMR can significantly improve the resolution of the difference grid in areas of high interest and, at the same time, accelerate the calculation of multi-dimensional problems. Parallel algorithms for the analyzed difference models were implemented for calculation on graphics processors using CUDA technology [1].

  1. Classification of circulation type sequences applied to snow avalanches over the eastern Pyrenees (Andorra and Catalonia)

    NASA Astrophysics Data System (ADS)

    Esteban, Pere; Beck, Christoph; Philipp, Andreas

    2010-05-01

    Using data associated with accidents or damage caused by snow avalanches over the eastern Pyrenees (Andorra and Catalonia), several atmospheric circulation type catalogues have been obtained. For this purpose, different circulation type classification methods based on Principal Component Analysis (T-mode and S-mode using the extreme scores) and on optimization procedures (improved K-means and SANDRA) were applied. Considering the characteristics of the phenomena studied, not only single-day circulation patterns were taken into account but also sequences of circulation types of varying length. Thus, different classifications with different numbers of types and for different sequence lengths were obtained using the different classification methods. Simple between-type variability, within-type variability, and outlier detection procedures were applied to select the best-performing classification for snow avalanches. Furthermore, days without occurrence of the hazard were also related to the avalanche centroids using pattern correlations, facilitating the calculation of the anomalies between hazardous and non-hazardous days, as well as frequencies of occurrence of hazardous events for each circulation type. Finally, the catalogues ranked statistically best are evaluated using avalanche forecasters' expert knowledge. A consistent explanation of snow avalanche occurrence by means of circulation sequences is obtained, but always considering results from classifications with different sequence lengths. This work has been developed in the framework of COST Action 733 (Harmonisation and Applications of Weather Type Classifications for European Regions).

  2. Verification of Ribosomal Proteins of Aspergillus fumigatus for Use as Biomarkers in MALDI-TOF MS Identification.

    PubMed

    Nakamura, Sayaka; Sato, Hiroaki; Tanaka, Reiko; Yaguchi, Takashi

    2016-01-01

    We have previously proposed a rapid identification method for bacterial strains based on the profiles of their ribosomal subunit proteins (RSPs), observed using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS). This method can perform phylogenetic characterization based on the masses of housekeeping RSP biomarkers, ideally calculated from amino acid sequence information registered in public protein databases. With the aim of extending its field of application to medical mycology, this study investigates the actual state of the information on RSPs of eukaryotic fungi registered in public protein databases, through the characterization of ribosomal protein fractions extracted from the genome-sequenced Aspergillus fumigatus strains Af293 and A1163 as a model. In this process, we have found that the public protein databases harbor problems. The RSP names are in confusion, so we have provisionally unified them using the yeast naming system. The most serious problem is that many incorrect sequences are registered in the public protein databases. Surprisingly, more than half of the sequences are incorrect, due chiefly to mis-annotation of exon/intron structures. These errors could be corrected by a combination of in silico inspection by sequence homology analysis and MALDI-TOF MS measurements. We were also able to confirm conserved post-translational modifications in eleven RSPs. After these verifications, the masses of 31 expressed RSPs under 20,000 Da could be accurately confirmed. These RSPs have the potential to be useful biomarkers for identifying clinical isolates of A. fumigatus.

  3. Statistical properties of filtered pseudorandom digital sequences formed from the sum of maximum-length sequences

    NASA Technical Reports Server (NTRS)

    Wallace, G. R.; Weathers, G. D.; Graf, E. R.

    1973-01-01

    The statistics of filtered pseudorandom digital sequences called hybrid-sum sequences, formed from the modulo-two sum of several maximum-length sequences, are analyzed. The results indicate that a relation exists between the statistics of the filtered sequence and the characteristic polynomials of the component maximum length sequences. An analysis procedure is developed for identifying a large group of sequences with good statistical properties for applications requiring the generation of analog pseudorandom noise. By use of the analysis approach, the filtering process is approximated by the convolution of the sequence with a sum of unit step functions. A parameter reflecting the overall statistical properties of filtered pseudorandom sequences is derived. This parameter is called the statistical quality factor. A computer algorithm to calculate the statistical quality factor for the filtered sequences is presented, and the results for two examples of sequence combinations are included. The analysis reveals that the statistics of the signals generated with the hybrid-sum generator are potentially superior to the statistics of signals generated with maximum-length generators. Furthermore, fewer calculations are required to evaluate the statistics of a large group of hybrid-sum generators than are required to evaluate the statistics of the same size group of approximately equivalent maximum-length sequences.
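
    A minimal sketch of forming a hybrid-sum sequence: two maximum-length sequences from 7-bit Fibonacci LFSRs (taps chosen from primitive trinomials) are combined by modulo-two addition. The statistical-quality-factor computation itself is not reproduced:

        def lfsr(taps, state, n):
            # Fibonacci LFSR: output the last bit, feed back the XOR of the
            # tapped bits; primitive feedback polynomials give m-sequences.
            out = []
            for _ in range(n):
                out.append(state[-1])
                fb = 0
                for t in taps:
                    fb ^= state[t]
                state = [fb] + state[:-1]
            return out

        n = 127                                          # period of a 7-bit m-sequence
        seq_a = lfsr([6, 2], [1, 0, 0, 0, 0, 0, 0], n)   # x^7 + x^4 + 1
        seq_b = lfsr([6, 5], [1, 0, 0, 0, 0, 0, 0], n)   # x^7 + x + 1
        hybrid = [a ^ b for a, b in zip(seq_a, seq_b)]   # modulo-two sum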

  4. MR Fingerprinting Using The Quick Echo Splitting NMR Imaging Technique

    PubMed Central

    Jiang, Yun; Ma, Dan; Jerecic, Renate; Duerk, Jeffrey; Seiberlich, Nicole; Gulani, Vikas; Griswold, Mark A.

    2016-01-01

    Purpose The purpose of this study is to develop a quantitative method for measuring relaxation properties with reduced radio frequency (RF) power deposition by combining the Magnetic Resonance Fingerprinting (MRF) technique with the Quick Echo Splitting NMR Imaging Technique (QUEST). Methods A QUEST-based MRF sequence was implemented to acquire high-order echoes by increasing the gaps between RF pulses. Bloch simulations were used to calculate a dictionary containing the range of physically plausible signal evolutions using a range of T1 and T2 values based on the pulse sequence. MRF-QUEST was evaluated by comparing to the results of spin-echo methods. The specific absorption rate (SAR) of MRF-QUEST was compared to clinically available methods. Results MRF-QUEST quantifies the relaxation properties with good accuracy at an estimated head SAR of 0.03 W/kg. T1 and T2 values estimated by MRF-QUEST are in good agreement with the traditional methods. Conclusion The combination of MRF and QUEST provides an accurate quantification of T1 and T2 simultaneously with reduced RF power deposition. The resulting lower SAR may provide a new acquisition strategy for MRF when RF energy deposition is problematic. PMID:26924639
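
    Parameter estimation in MRF reduces to dictionary matching; a sketch assuming the dictionary rows are Bloch-simulated signal evolutions for a grid of (T1, T2) pairs (the QUEST-specific simulation is outside this sketch):

        import numpy as np

        def mrf_match(signal, dictionary, t1_grid, t2_grid):
            # Normalized inner-product matching against Bloch-simulated
            # signal evolutions (one dictionary row per (T1, T2) pair).
            sig = signal / np.linalg.norm(signal)
            dic = dictionary / np.linalg.norm(dictionary, axis=1, keepdims=True)
            best = int(np.argmax(np.abs(dic @ np.conj(sig))))
            return t1_grid[best], t2_grid[best]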

  5. Improved telescope focus using only two focus images

    NASA Astrophysics Data System (ADS)

    Barrick, Gregory; Vermeulen, Tom; Thomas, James

    2008-07-01

    In an effort to reduce the amount of time spent focusing the telescope and to improve the quality of the focus, a new procedure has been investigated and implemented at the Canada-France-Hawaii Telescope (CFHT). The new procedure is based on a paper by Tokovinin and Heathcote and requires only two out-of-focus images to determine the best focus for the telescope. Using only two images provides a great time savings over the five or more images required for a standard through-focus sequence. In addition, it has been found that this method is significantly less sensitive to seeing variations than the traditional through-focus procedure, so the quality of the resulting focus is better. Finally, the new procedure relies on a second moment calculation and so is computationally easier and more robust than methods using a FWHM calculation. The new method has been implemented for WIRCam for the past 18 months, for MegaPrime for the past year, and has recently been implemented for ESPaDOnS.
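
    A sketch of the second-moment width measure on which the two-image method relies (background handling is simplified; the Tokovinin-Heathcote mapping from the two widths to a focus correction is not reproduced):

        import numpy as np

        def second_moment_width(image):
            # RMS spot radius from intensity-weighted second moments.
            img = np.asarray(image, dtype=float) - np.median(image)
            img[img < 0] = 0.0                    # crude background removal
            y, x = np.indices(img.shape)
            total = img.sum()
            cx, cy = (x * img).sum() / total, (y * img).sum() / total
            m2 = (((x - cx) ** 2 + (y - cy) ** 2) * img).sum() / total
            return np.sqrt(m2)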

  6. a Novel Approach to Camera Calibration Method for Smart Phones Under Road Environment

    NASA Astrophysics Data System (ADS)

    Lee, Bijun; Zhou, Jian; Ye, Maosheng; Guo, Yuan

    2016-06-01

    Monocular vision-based lane departure warning systems have been increasingly used in advanced driver assistance systems (ADAS). Using lane marker detection and identification, we propose an automatic and efficient camera calibration method for smart phones. First, we detect lane marker features in perspective space and calculate the edges of lane markers in image sequences. Second, because the widths of lane markers and road lanes are fixed under a standard structured road environment, we can automatically build a transformation matrix between perspective space and 3D space and obtain a local map in the vehicle coordinate system. In order to verify the validity of this method, we installed a smart phone in the `Tuzhi' self-driving car of Wuhan University and recorded more than 100 km of image data on the road in Wuhan. The results show that the calculated lane marker positions are accurate enough for the self-driving car to run smoothly on the road.
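
    A sketch of the perspective-to-road-plane mapping described above, assuming four lane-marker edge points with ground coordinates fixed by a known lane geometry (the 3.5 m lane width and 6 m marker spacing are illustrative values, not the paper's):

        import numpy as np
        import cv2

        # Four lane-marker edge points in the image (pixels) and their
        # ground-plane coordinates in meters.
        img_pts = np.float32([[420, 700], [860, 700], [560, 500], [760, 500]])
        road_pts = np.float32([[0.0, 0.0], [3.5, 0.0], [0.0, 6.0], [3.5, 6.0]])

        # Homography from image to local road plane (vehicle frame).
        H = cv2.getPerspectiveTransform(img_pts, road_pts)

        # Map any detected marker pixel into the local map.
        ground = cv2.perspectiveTransform(np.float32([[[640, 600]]]), H)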

  7. GuiTope: an application for mapping random-sequence peptides to protein sequences.

    PubMed

    Halperin, Rebecca F; Stafford, Phillip; Emery, Jack S; Navalkar, Krupa Arun; Johnston, Stephen Albert

    2012-01-03

    Random-sequence peptide libraries are a commonly used tool to identify novel ligands for binding antibodies, other proteins, and small molecules. It is often of interest to compare the selected peptide sequences to the natural protein binding partners to infer the exact binding site or the importance of particular residues. The ability to search a set of sequences for similarity to a set of peptides may sometimes enable the prediction of an antibody epitope or a novel binding partner. We have developed a software application designed specifically for this task. GuiTope provides a graphical user interface for aligning peptide sequences to protein sequences. All alignment parameters are accessible to the user including the ability to specify the amino acid frequency in the peptide library; these frequencies often differ significantly from those assumed by popular alignment programs. It also includes a novel feature to align di-peptide inversions, which we have found improves the accuracy of antibody epitope prediction from peptide microarray data and shows utility in analyzing phage display datasets. Finally, GuiTope can randomly select peptides from a given library to estimate a null distribution of scores and calculate statistical significance. GuiTope provides a convenient method for comparing selected peptide sequences to protein sequences, including flexible alignment parameters, novel alignment features, ability to search a database, and statistical significance of results. The software is available as an executable (for PC) at http://www.immunosignature.com/software and ongoing updates and source code will be available at sourceforge.net.

  8. Evaluation and Selection of Best Priority Sequencing Rule in Job Shop Scheduling using Hybrid MCDM Technique

    NASA Astrophysics Data System (ADS)

    Kiran Kumar, Kalla; Nagaraju, Dega; Gayathri, S.; Narayanan, S.

    2017-05-01

    Priority sequencing rules provide guidance for the order in which jobs are to be processed at a workstation. The application of different priority rules in job shop scheduling gives different orders of scheduling. More experimentation is needed before a final choice of the best priority sequencing rule can be made; hence, a comprehensive selection method is essential from a managerial decision-making perspective. This paper considers seven different priority sequencing rules in job shop scheduling. For evaluation and selection of the best priority sequencing rule, a set of eight criteria is considered. The aim of this work is to demonstrate the methodology of evaluating and selecting the best priority sequencing rule using a hybrid multi-criteria decision making (MCDM) technique, i.e., the analytical hierarchy process (AHP) with the technique for order preference by similarity to ideal solution (TOPSIS). The criteria weights are calculated using AHP, whereas the relative closeness values of all priority sequencing rules are computed based on TOPSIS with the help of data acquired from the shop floor of a manufacturing firm. Finally, from the findings of this work, the priority sequencing rules are ranked from most important to least important. The comprehensive methodology presented in this paper is essential for the management of a workstation to choose the best priority sequencing rule among the available alternatives for processing jobs with maximum benefit.
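
    A compact implementation of the TOPSIS ranking step described above (the AHP weight derivation is assumed to be done separately; the criterion directions are illustrative):

        import numpy as np

        def topsis(matrix, weights, benefit):
            # Vector-normalize, weight, and score alternatives by relative
            # closeness to the ideal solution; benefit[j] is True when
            # criterion j is to be maximized.
            v = matrix / np.linalg.norm(matrix, axis=0) * weights
            ideal = np.where(benefit, v.max(axis=0), v.min(axis=0))
            anti = np.where(benefit, v.min(axis=0), v.max(axis=0))
            d_pos = np.linalg.norm(v - ideal, axis=1)
            d_neg = np.linalg.norm(v - anti, axis=1)
            return d_neg / (d_pos + d_neg)        # higher = better rank

        # e.g. 7 rules x 8 criteria, AHP weights summing to 1:
        # closeness = topsis(perf, ahp_weights, benefit=[True] * 8)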

  9. Calculating the quality of public high-throughput sequencing data to obtain a suitable subset for reanalysis from the Sequence Read Archive.

    PubMed

    Ohta, Tazro; Nakazato, Takeru; Bono, Hidemasa

    2017-06-01

    It is important for public data repositories to promote the reuse of archived data. In the growing field of omics science, however, the increasing number of submissions of high-throughput sequencing (HTSeq) data to public repositories prevents users from choosing a suitable data set from among the large number of search results. Repository users need to be able to set a threshold to reduce the number of results and obtain a suitable subset of high-quality data for reanalysis. We calculated the quality of sequencing data archived in a public data repository, the Sequence Read Archive (SRA), using the quality control software FastQC. We obtained quality values for 1 171 313 experiments, which can be used to evaluate the suitability of data for reuse. We also visualized the data distribution in SRA by integrating the quality information and the metadata of experiments and samples. We provide quality information for all of the archived sequencing data, which enables users to obtain sequencing data of sufficient quality for reanalyses. The calculated quality data are available to the public in various formats. Our data also provide an example of enhancing the reuse of public data through the addition of metadata to published research data by a third party. © The Authors 2017. Published by Oxford University Press.

  10. Interim Reliability Evaluation Program: analysis of the Browns Ferry, Unit 1, nuclear plant. Main report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mays, S.E.; Poloski, J.P.; Sullivan, W.H.

    1982-07-01

    A probabilistic risk assessment (PRA) was made of the Browns Ferry, Unit 1, nuclear plant as part of the Nuclear Regulatory Commission's Interim Reliability Evaluation Program (IREP). Specific goals of the study were to identify the dominant contributors to core melt, develop a foundation for more extensive use of PRA methods, expand the cadre of experienced PRA practitioners, and apply procedures for extension of IREP analyses to other domestic light water reactors. Event tree and fault tree analyses were used to estimate the frequency of accident sequences initiated by transients and loss of coolant accidents. External events such as floods, fires, earthquakes, and sabotage were beyond the scope of this study and were, therefore, excluded. From these sequences, the dominant contributors to probable core melt frequency were chosen. Uncertainty and sensitivity analyses were performed on these sequences to better understand the limitations associated with the estimated sequence frequencies. Dominant sequences were grouped according to common containment failure modes and corresponding release categories on the basis of comparison with analyses of similar designs rather than on the basis of detailed plant-specific calculations.

  11. Modeling backbone flexibility to achieve sequence diversity: The design of novel alpha-helical ligands for Bcl-xL

    PubMed Central

    Fu, Xiaoran; Apgar, James R.; Keating, Amy E.

    2007-01-01

    Computational protein design can be used to select sequences that are compatible with a fixed-backbone template. This strategy has been used in numerous instances to engineer novel proteins. However, the fixed-backbone assumption severely restricts the sequence space that is accessible via design. For challenging problems, such as the design of functional proteins, this may not be acceptable. In this paper, we present a method for introducing backbone flexibility into protein design calculations and apply it to the design of diverse helical BH3 ligands that bind to the anti-apoptotic protein Bcl-xL, a member of the Bcl-2 protein family. We demonstrate how normal mode analysis can be used to sample different BH3 backbones, and show that this leads to a larger and more diverse set of low-energy solutions than can be achieved using a native high-resolution Bcl-xL complex crystal structure as a template. We tested several of the designed solutions experimentally and found that this approach worked well when normal mode calculations were used to deform a native BH3 helix structure, but less well when they were used to deform an idealized helix. A subsequent round of design and testing identified a likely source of the problem as inadequate sampling of the helix pitch. In all, we tested seventeen designed BH3 peptide sequences, including several point mutants. Of these, eight bound well to Bcl-xL and four others showed weak but detectable binding. The successful designs showed a diversity of sequences that would have been difficult or impossible to achieve using only a fixed backbone. Thus, introducing backbone flexibility via normal mode analysis effectively broadened the set of sequences identified by computational design, and provided insight into positions important for binding Bcl-xL. PMID:17597151

  12. General Framework for Meta-analysis of Rare Variants in Sequencing Association Studies

    PubMed Central

    Lee, Seunggeun; Teslovich, Tanya M.; Boehnke, Michael; Lin, Xihong

    2013-01-01

    We propose a general statistical framework for meta-analysis of gene- or region-based multimarker rare variant association tests in sequencing association studies. In genome-wide association studies, single-marker meta-analysis has been widely used to increase statistical power by combining results via regression coefficients and standard errors from different studies. In analysis of rare variants in sequencing studies, region-based multimarker tests are often used to increase power. We propose meta-analysis methods for commonly used gene- or region-based rare variants tests, such as burden tests and variance component tests. Because estimation of regression coefficients of individual rare variants is often unstable or not feasible, the proposed method avoids this difficulty by calculating score statistics instead that only require fitting the null model for each study and then aggregating these score statistics across studies. Our proposed meta-analysis rare variant association tests are conducted based on study-specific summary statistics, specifically score statistics for each variant and between-variant covariance-type (linkage disequilibrium) relationship statistics for each gene or region. The proposed methods are able to incorporate different levels of heterogeneity of genetic effects across studies and are applicable to meta-analysis of multiple ancestry groups. We show that the proposed methods are essentially as powerful as joint analysis by directly pooling individual level genotype data. We conduct extensive simulations to evaluate the performance of our methods by varying levels of heterogeneity across studies, and we apply the proposed methods to meta-analysis of rare variant effects in a multicohort study of the genetics of blood lipid levels. PMID:23768515
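
    A sketch of the core idea for the burden-test case: study-level score vectors and covariance matrices are summed and collapsed with variant weights (the heterogeneous-effects and variance-component variants of the framework are not reproduced):

        import numpy as np
        from scipy.stats import chi2

        def meta_burden(scores, covs, weights):
            # Sum per-study score vectors U_k and covariance matrices V_k
            # (homogeneous-effects case), collapse with variant weights w,
            # and return the 1-df chi-square p-value.
            u = np.sum(scores, axis=0)
            v = np.sum(covs, axis=0)
            w = np.asarray(weights, dtype=float)
            q = (w @ u) ** 2 / (w @ v @ w)
            return chi2.sf(q, df=1)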

  13. Using digital inpainting to estimate incident light intensity for the calculation of red blood cell oxygen saturation from microscopy images.

    PubMed

    Sové, Richard J; Drakos, Nicole E; Fraser, Graham M; Ellis, Christopher G

    2018-05-25

    Red blood cell oxygen saturation is an important indicator of oxygen supply to tissues in the body. Oxygen saturation can be measured by taking advantage of spectroscopic properties of hemoglobin. When this technique is applied to transmission microscopy, the calculation of saturation requires determination of incident light intensity at each pixel occupied by the red blood cell; this value is often approximated from a sequence of images as the maximum intensity over time. This method often fails when the red blood cells are moving too slowly, or if hematocrit is too large since there is not a large enough gap between the cells to accurately calculate the incident intensity value. A new method of approximating incident light intensity is proposed using digital inpainting. This novel approach estimates incident light intensity with an average percent error of approximately 3%, which exceeds the accuracy of the maximum intensity based method in most cases. The error in incident light intensity corresponds to a maximum error of approximately 2% saturation. Therefore, though this new method is computationally more demanding than the traditional technique, it can be used in cases where the maximum intensity-based method fails (e.g. stationary cells), or when higher accuracy is required. This article is protected by copyright. All rights reserved.
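
    A sketch of the inpainting step using OpenCV; the cell-mask generation and the spectroscopic conversion from optical density to saturation are assumptions outside this sketch (`frame` is assumed to be an 8-bit transmission image):

        import cv2
        import numpy as np

        def incident_intensity(frame, cell_mask):
            # Fill the red-blood-cell regions from the surrounding
            # background; cell_mask is nonzero over cell pixels.
            return cv2.inpaint(frame, cell_mask.astype(np.uint8), 5,
                               cv2.INPAINT_TELEA)

        # Optical density per cell pixel, the input to the spectroscopic
        # saturation calculation (conversion to SO2 not shown):
        # filled = incident_intensity(frame, cell_mask)
        # od = -np.log10(frame[cell_mask > 0] / filled[cell_mask > 0].astype(float))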

  14. The Stability Analysis Method of the Cohesive Granular Slope on the Basis of Graph Theory.

    PubMed

    Guan, Yanpeng; Liu, Xiaoli; Wang, Enzhi; Wang, Sijing

    2017-02-27

    This paper attempts to provide a method to calculate the progressive failure of cohesive-frictional granular geomaterials and the spatial distribution of the stability of a cohesive granular slope. The methodology can be divided into two parts: the characterization method of macro-contacts and the analysis of slope stability. Based on graph theory, vertexes, edges and edge sequences are abstracted out to characterize the voids, the particle contacts and the macro-contacts, respectively, bridging the gap between the mesoscopic and macro scales of granular materials. This paper adopts this characterization method to extract a graph from a granular slope and characterize the macro sliding surface; the weighted graph is then analyzed to calculate the slope safety factor. Each edge has three weights representing the sliding moment, the anti-sliding moment and the braking index of the contact bond, respectively. The safety factor of the slope is calculated by presupposing a certain number of sliding routes, repeatedly reducing the weights, and counting the mesoscopic failures of the edges. Because it analyzes the slope from a mesoscopic perspective, the method can present more detail of the mesoscopic properties of the granular slope. At the macro scale, the spatial distribution of the stability of the granular slope is in agreement with the theoretical solution.

  15. Benefits of Applying Hierarchical Models to the Empirical Green's Function Approach

    NASA Astrophysics Data System (ADS)

    Denolle, M.; Van Houtte, C.

    2017-12-01

    Stress drops calculated from source spectral studies currently show larger variability than what is implied by empirical ground motion models. One of the potential origins of the inflated variability is the simplified model-fitting techniques used in most source spectral studies. This study improves upon these existing methods, and shows that the fitting method may explain some of the discrepancy. In particular, Bayesian hierarchical modelling is shown to be a method that can reduce bias, better quantify uncertainties and allow additional effects to be resolved. The method is applied to the Mw7.1 Kumamoto, Japan earthquake, and other global, moderate-magnitude, strike-slip earthquakes between Mw5 and Mw7.5. It is shown that the variation of the corner frequency, fc, and the falloff rate, n, across the focal sphere can be reliably retrieved without overfitting the data. Additionally, it is shown that methods commonly used to calculate corner frequencies can give substantial biases. In particular, if fc were calculated for the Kumamoto earthquake using a model with a falloff rate fixed at 2 instead of the best fit 1.6, the obtained fc would be as large as twice its realistic value. The reliable retrieval of the falloff rate allows deeper examination of this parameter for a suite of global, strike-slip earthquakes, and its scaling with magnitude. The earthquake sequences considered in this study are from Japan, New Zealand, Haiti and California.

  16. Visual Attention Modeling for Stereoscopic Video: A Benchmark and Computational Model.

    PubMed

    Fang, Yuming; Zhang, Chi; Li, Jing; Lei, Jianjun; Perreira Da Silva, Matthieu; Le Callet, Patrick

    2017-10-01

    In this paper, we investigate the visual attention modeling for stereoscopic video from the following two aspects. First, we build one large-scale eye tracking database as the benchmark of visual attention modeling for stereoscopic video. The database includes 47 video sequences and their corresponding eye fixation data. Second, we propose a novel computational model of visual attention for stereoscopic video based on Gestalt theory. In the proposed model, we extract the low-level features, including luminance, color, texture, and depth, from discrete cosine transform coefficients, which are used to calculate feature contrast for the spatial saliency computation. The temporal saliency is calculated by the motion contrast from the planar and depth motion features in the stereoscopic video sequences. The final saliency is estimated by fusing the spatial and temporal saliency with uncertainty weighting, which is estimated by the laws of proximity, continuity, and common fate in Gestalt theory. Experimental results show that the proposed method outperforms the state-of-the-art stereoscopic video saliency detection models on our built large-scale eye tracking database and one other database (DML-ITRACK-3D).

  17. Pressure-induced structural transformations and polymerization in ThC2

    PubMed Central

    Guo, Yongliang; Yu, Cun; Lin, Jun; Wang, Changying; Ren, Cuilan; Sun, Baoxing; Huai, Ping; Xie, Ruobing; Ke, Xuezhi; Zhu, Zhiyuan; Xu, Hongjie

    2017-01-01

    Thorium-carbon systems have been regarded as promising nuclear fuels for Generation IV reactors, which require high-burnup and safe nuclear fuel. Existing knowledge of thorium carbides under extreme conditions remains insufficient, and some of it is controversial due to limited studies. Here we systematically predict all stable structures of thorium dicarbide (ThC2) under pressures ranging from ambient to 300 GPa by merging ab initio total energy calculations and an unbiased structure searching method; the phases occur in the sequence C2/c, C2/m, Cmmm, Immm and P6/mmm. Among these phases, the C2/m is successfully observed for the first time via in situ synchrotron XRD measurements, which exhibits an excellent structural correspondence to our theoretical predictions. The transition sequence and the critical pressures are predicted. The calculated results also reveal the polymerization behaviors of the carbon atoms and the corresponding characteristic C-C bonding under various pressures. Our work provides key information on the fundamental material behavior and insights into the underlying mechanisms that lay the foundation for further exploration and application of ThC2. PMID:28383571

  18. Pressure-induced structural transformations and polymerization in ThC2

    NASA Astrophysics Data System (ADS)

    Guo, Yongliang; Yu, Cun; Lin, Jun; Wang, Changying; Ren, Cuilan; Sun, Baoxing; Huai, Ping; Xie, Ruobing; Ke, Xuezhi; Zhu, Zhiyuan; Xu, Hongjie

    2017-04-01

    Thorium-carbon systems have been regarded as promising nuclear fuels for Generation IV reactors, which require high-burnup and safe nuclear fuel. Existing knowledge of thorium carbides under extreme conditions remains insufficient, and some of it is controversial due to limited studies. Here we systematically predict all stable structures of thorium dicarbide (ThC2) under pressures ranging from ambient to 300 GPa by merging ab initio total energy calculations and an unbiased structure searching method; the phases occur in the sequence C2/c, C2/m, Cmmm, Immm and P6/mmm. Among these phases, the C2/m is successfully observed for the first time via in situ synchrotron XRD measurements, which exhibits an excellent structural correspondence to our theoretical predictions. The transition sequence and the critical pressures are predicted. The calculated results also reveal the polymerization behaviors of the carbon atoms and the corresponding characteristic C-C bonding under various pressures. Our work provides key information on the fundamental material behavior and insights into the underlying mechanisms that lay the foundation for further exploration and application of ThC2.

  19. Pressure-induced structural transformations and polymerization in ThC2.

    PubMed

    Guo, Yongliang; Yu, Cun; Lin, Jun; Wang, Changying; Ren, Cuilan; Sun, Baoxing; Huai, Ping; Xie, Ruobing; Ke, Xuezhi; Zhu, Zhiyuan; Xu, Hongjie

    2017-04-06

    Thorium-carbon systems have been regarded as promising nuclear fuels for Generation IV reactors, which require high-burnup and safe nuclear fuel. Existing knowledge of thorium carbides under extreme conditions remains insufficient, and some of it is controversial due to limited studies. Here we systematically predict all stable structures of thorium dicarbide (ThC2) under pressures ranging from ambient to 300 GPa by merging ab initio total energy calculations and an unbiased structure searching method; the phases occur in the sequence C2/c, C2/m, Cmmm, Immm and P6/mmm. Among these phases, the C2/m is successfully observed for the first time via in situ synchrotron XRD measurements, which exhibits an excellent structural correspondence to our theoretical predictions. The transition sequence and the critical pressures are predicted. The calculated results also reveal the polymerization behaviors of the carbon atoms and the corresponding characteristic C-C bonding under various pressures. Our work provides key information on the fundamental material behavior and insights into the underlying mechanisms that lay the foundation for further exploration and application of ThC2.

  20. On new classes of solutions of nonlinear partial differential equations in the form of convergent special series

    NASA Astrophysics Data System (ADS)

    Filimonov, M. Yu.

    2017-12-01

    The method of special series with recursively calculated coefficients is used to solve nonlinear partial differential equations. The recurrence of finding the coefficients of the series is achieved due to a special choice of functions, in powers of which the solution is expanded in a series. We obtain a sequence of linear partial differential equations to find the coefficients of the series constructed. In many cases, one can deal with a sequence of linear ordinary differential equations. We construct classes of solutions in the form of convergent series for a certain class of nonlinear evolution equations. A new class of solutions of generalized Boussinesque equation with an arbitrary function in the form of a convergent series is constructed.

  1. OrthoANI: An improved algorithm and software for calculating average nucleotide identity.

    PubMed

    Lee, Imchang; Ouk Kim, Yeong; Park, Sang-Cheol; Chun, Jongsik

    2016-02-01

    Species demarcation in Bacteria and Archaea is mainly based on overall genome relatedness, which serves as a framework for modern microbiology. Current practice for obtaining these measures between two strains is shifting from experimentally determined similarity obtained by DNA-DNA hybridization (DDH) to genome-sequence-based similarity. Average nucleotide identity (ANI) is a simple algorithm that mimics DDH. Like DDH, ANI values between two genome sequences may differ when reciprocal calculations are compared. We compared 63 690 pairs of genome sequences and found that the differences in reciprocal ANI values are significantly high, exceeding 1 % in some cases. To resolve this lack of symmetry, a new algorithm, named OrthoANI, was developed to accommodate the concept of orthology, in which both genome sequences are fragmented and only orthologous fragment pairs are taken into consideration for calculating nucleotide identities. OrthoANI is highly correlated with ANI (using BLASTn), and the former showed approximately 0.1 % higher values than the latter. In conclusion, OrthoANI provides a more robust and faster means of calculating average nucleotide identity for taxonomic purposes. The standalone software tools are freely available at http://www.ezbiocloud.net/sw/oat.
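
    A sketch of the orthologous-fragment logic described above; `best_hits` stands in for a BLASTn wrapper (a hypothetical helper, not part of the OrthoANI distribution) returning each fragment's best hit and percent identity:

        def fragments(genome, size=1020):
            # Non-overlapping fragments; the trailing partial piece is dropped.
            return [genome[i:i + size]
                    for i in range(0, len(genome) - size + 1, size)]

        def orthoani_like(genome_a, genome_b, best_hits):
            # best_hits(frags_x, frags_y) returns {i: (j, pct_identity)} for
            # each query fragment's best hit; only reciprocal best-hit
            # (orthologous) pairs contribute to the averaged identity.
            fa, fb = fragments(genome_a), fragments(genome_b)
            ab, ba = best_hits(fa, fb), best_hits(fb, fa)
            idents = [pid for i, (j, pid) in ab.items()
                      if j in ba and ba[j][0] == i]
            return sum(idents) / len(idents) if idents else 0.0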

  2. SU-D-207A-01: Female Pelvic Synthetic CT Generation Based On Joint Shape and Intensity Analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Liu, L; Jolly, S; Cao, Y

    Purpose: To develop a method for generating female pelvic synthetic CT (MRCT) images from a single MR scan and evaluate its utility in radiotherapy. Methods: Under IRB approval, an imaging sequence (T1-VIBE-Dixon) was acquired for 10 patients. This sequence yields 3 useful image volumes of different contrast ("in-phase" T1-weighted, fat and water). A previously published pelvic bone shape model was used to generate a rough bone mask for each patient. A modified fuzzy c-means classification was performed on the multispectral MR data, with a regularization term that utilizes the prior knowledge provided by the bone mask and addresses the intensity overlap between different tissue types. A weighted sum of classification probabilities with attenuation values yielded MRCT volumes. The mean absolute error (MAE) between MRCT and real CT on various regions was calculated following deformable alignment (Velocity). Intensity-modulated treatment plans based on actual CT and MRCT were made and compared. Results: The average/standard deviation of MAE across 10 patients was 10.1/6.7 HU for muscle, 6.7/4.6 HU for fat, 136.9/53.5 HU for bony tissues under 850 HU (97% of total bone volume), 188.9/119.3 HU for bony tissues above 850 HU and 17.3/13.3 HU for intrapelvic soft tissues. Calculated doses were comparable for plans generated on CT and calculated using MRCT densities or vice versa, with differences in PTV D99% (mean/σ) of (-0.1/0.2 Gy) and (0.3/0.2 Gy), and PTV D0.5cc of (-0.3/0.2 Gy) and (-0.4/1.7 Gy). OAR differences were similarly small for comparable structures, with differences in bowel V50Gy of (-0.3/0.2%) and (0.0/0.2%), femur V30Gy of (0.7/1.2%) and (0.2/1.2%), sacrum V20Gy of (0.0/0.1%) and (-0.1/1.1%) and mean pelvic V20Gy of (0.0/0.1%) and (0.6/1.8%). Conclusion: MRCT based on a single imaging sequence in the female pelvis is feasible, with acceptably small variations in attenuation estimates and calculated doses to target and critical organs. Work supported by NIH R01EB016079.
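
    The final synthesis step described above is a probability-weighted sum of attenuation values; a minimal numpy sketch, with illustrative class HU values (not the study's):

        import numpy as np

        class_hu = np.array([-100.0, 40.0, 800.0])   # fat, soft tissue, bone (illustrative)
        # probs: fuzzy c-means memberships, one row per voxel, one column per class
        probs = np.array([[0.7, 0.2, 0.1],
                          [0.1, 0.3, 0.6]])
        mrct = probs @ class_hu                       # synthetic HU per voxel
        print(mrct)                                   # [ 18. 482.]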

  3. Accurate Simulation and Detection of Coevolution Signals in Multiple Sequence Alignments

    PubMed Central

    Ackerman, Sharon H.; Tillier, Elisabeth R.; Gatti, Domenico L.

    2012-01-01

    Background While the conserved positions of a multiple sequence alignment (MSA) are clearly of interest, non-conserved positions can also be important because, for example, destabilizing effects at one position can be compensated by stabilizing effects at another position. Different methods have been developed to recognize the evolutionary relationship between amino acid sites, and to disentangle functional/structural dependencies from historical/phylogenetic ones. Methodology/Principal Findings We have used two complementary approaches to test the efficacy of these methods. In the first approach, we have used a new program, MSAvolve, for the in silico evolution of MSAs, which records a detailed history of all covarying positions, and builds a global coevolution matrix as the accumulated sum of individual matrices for the positions forced to co-vary, the recombinant coevolution, and the stochastic coevolution. We have simulated over 1600 MSAs for 8 protein families, which reflect sequences of different sizes and proteins with widely different functions. The calculated coevolution matrices were compared with the coevolution matrices obtained for the same evolved MSAs with different coevolution detection methods. In a second approach we have evaluated the capacity of the different methods to predict close contacts in the representative X-ray structures of an additional 150 protein families using only experimental MSAs. Conclusions/Significance Methods based on the identification of global correlations between pairs were found to be generally superior to methods based only on local correlations in their capacity to identify coevolving residues using either simulated or experimental MSAs. However, the significant variability in the performance of different methods with different proteins suggests that the simulation of MSAs that replicate the statistical properties of the experimental MSA can be a valuable tool to identify the coevolution detection method that is most effective in each case. PMID:23091608
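
    A compact version of one building block used by such local methods, mutual information between two alignment columns (the generic statistic, not MSAvolve itself):

        import math
        from collections import Counter

        def column_mutual_information(col_i, col_j):
            """MI (bits) between two MSA columns given as equal-length strings."""
            n = len(col_i)
            pi, pj = Counter(col_i), Counter(col_j)
            pij = Counter(zip(col_i, col_j))
            return sum((c / n) * math.log2((c / n) / ((pi[a] / n) * (pj[b] / n)))
                       for (a, b), c in pij.items())

        print(column_mutual_information("AAGG", "CCTT"))  # perfect covariation: 1.0 bit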

  4. Universal multiplex PCR and CE for quantification of SMN1/SMN2 genes in spinal muscular atrophy.

    PubMed

    Wang, Chun-Chi; Chang, Jan-Gowth; Jong, Yuh-Jyh; Wu, Shou-Mei

    2009-04-01

    We established a universal multiplex PCR and CE method to calculate the copy number of the survival motor neuron genes (SMN1 and SMN2) for clinical screening of spinal muscular atrophy (SMA). In this study, one universal fluorescent primer was designed and applied for multiplex PCR of SMN1, SMN2 and two internal standards (CYBB and KRIT1). These amplicons were separated by conformation-sensitive CE. A mixture of hydroxyethyl cellulose and hydroxypropyl cellulose was used in this CE system. Our method was able to separate two 390-bp PCR products that differ by a single nucleotide. Differentiation and quantification of SMN1 and SMN2 are essential for clinical screening of SMA patients and carriers. The DNA samples included 22 SMA patients, 45 parents of SMA patients (obligatory carriers) and 217 controls. To evaluate accuracy, all 284 samples were blind-analyzed by this method and by denaturing high-performance liquid chromatography (DHPLC). Eight of the samples showed discrepant results. Among them, two samples were diagnosed by DHPLC as carrying only the SMN2 gene, whereas our method detected both SMN1 and SMN2; DNA sequencing confirmed our result, showing good agreement with our method. Multiplex ligation-dependent probe amplification (MLPA) was used to check the other five samples and gave the same results as our CE method. For only one sample did our CE method disagree with MLPA and DNA sequencing; thus one out of 284 samples (0.35%) was discordant. Our method provides an accurate and convenient approach to clinical genotyping of SMA.
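
    The copy-number call in assays of this kind reduces to a peak-area ratio against a two-copy internal standard; an illustrative calculation with hypothetical numbers (not the paper's calibration):

        smn1_area, cybb_area = 1520.0, 3100.0   # peak areas from the electropherogram
        control_ratio = 0.49                    # SMN1/CYBB ratio in a 2-copy control
        copies = round(2 * (smn1_area / cybb_area) / control_ratio)
        print(copies)   # -> 2; a carrier (1 copy) would give roughly half the ratio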

  5. Non-Rigid Structure Estimation in Trajectory Space from Monocular Vision

    PubMed Central

    Wang, Yaming; Tong, Lingling; Jiang, Mingfeng; Zheng, Junbao

    2015-01-01

    In this paper, the problem of non-rigid structure estimation in trajectory space from monocular vision is investigated. As in the Point Trajectory Approach (PTA), the structure matrix is calculated by a factorization method, with the characteristic points' trajectories described by a predefined Discrete Cosine Transform (DCT) basis. To further optimize non-rigid structure estimation from monocular vision, a rank-minimization problem on the structure matrix is formulated by introducing the basic low-rank condition. Moreover, the Accelerated Proximal Gradient (APG) algorithm is used to solve the rank-minimization problem, optimizing the initial structure matrix calculated by the PTA method. The APG algorithm converges to efficient solutions quickly and noticeably reduces the reconstruction error. Reconstruction results on real image sequences indicate that the proposed approach runs reliably and effectively improves the accuracy of non-rigid structure estimation from monocular vision. PMID:26473863

  6. Hylleraas-Configuration Interaction study of the 1S ground state of the negative Li ion.

    PubMed

    Sims, James S

    2017-12-28

    In a previous work, Sims and Hagstrom [J. Chem. Phys. 140, 224312 (2014)] reported Hylleraas-Configuration Interaction (Hy-CI) method variational calculations for the neutral-atom and positive-ion 1S ground states of the beryllium isoelectronic sequence. The Li- ion, nominally the first member of this series, has a decidedly different electronic structure. This paper reports the results of a large, comparable calculation for the Li- ground state to explore how well the Hy-CI method can represent the more diffuse L shell of Li-, which is representative of the Be(2sns) excited states as well. The best non-relativistic energy obtained was -7.500 776 596 hartree, indicating that 10-20 nanohartree accuracy is attainable in Hy-CI, that convergence of the r12r34 double cusp is fast, and that this correlation type can be accurately represented within the Hy-CI model.

  7. Applying Ancestry and Sex Computation as a Quality Control Tool in Targeted Next-Generation Sequencing.

    PubMed

    Mathias, Patrick C; Turner, Emily H; Scroggins, Sheena M; Salipante, Stephen J; Hoffman, Noah G; Pritchard, Colin C; Shirts, Brian H

    2016-03-01

    To apply techniques for ancestry and sex computation from next-generation sequencing (NGS) data as an approach to confirm sample identity and detect sample processing errors. We combined a principal component analysis method with k-nearest neighbors classification to compute the ancestry of patients undergoing NGS testing. By combining this calculation with X chromosome copy number data, we determined the sex and ancestry of patients for comparison with self-report. We also modeled the sensitivity of this technique in detecting sample processing errors. We applied this technique to 859 patient samples with reliable self-report data. Our k-nearest neighbors ancestry screen had an accuracy of 98.7% for patients reporting a single ancestry. Visual inspection of principal component plots was consistent with self-report in 99.6% of single-ancestry and mixed-ancestry patients. Our model demonstrates that approximately two-thirds of potential sample swaps could be detected in our patient population using this technique. Patient ancestry can be estimated from NGS data incidentally sequenced in targeted panels, enabling an inexpensive quality control method when coupled with patient self-report. © American Society for Clinical Pathology, 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
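
    A minimal sklearn sketch of the screening idea, using a toy genotype matrix and self-reported labels; the features, component count and k used here are placeholders, not the published pipeline's parameters.

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.neighbors import KNeighborsClassifier

        rng = np.random.default_rng(0)
        genotypes = rng.integers(0, 3, size=(200, 500)).astype(float)  # toy 0/1/2 calls
        labels = rng.choice(["AFR", "EUR", "EAS"], size=200)           # self-report

        pcs = PCA(n_components=4).fit_transform(genotypes)
        knn = KNeighborsClassifier(n_neighbors=5).fit(pcs, labels)
        flags = knn.predict(pcs) != labels   # disagreement with self-report
        print(flags.sum(), "samples flagged for review")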

  8. Segmentation and tracking in echocardiographic sequences: active contours guided by optical flow estimates

    NASA Technical Reports Server (NTRS)

    Mikic, I.; Krucinski, S.; Thomas, J. D.

    1998-01-01

    This paper presents a method for segmentation and tracking of cardiac structures in ultrasound image sequences. The developed algorithm is based on the active contour framework. This approach requires initial placement of the contour close to the desired position in the image, usually an object outline. Best contour shape and position are then calculated, assuming that at this configuration a global energy function, associated with a contour, attains its minimum. Active contours can be used for tracking by selecting a solution from a previous frame as an initial position in a present frame. Such an approach, however, fails for large displacements of the object of interest. This paper presents a technique that incorporates the information on pixel velocities (optical flow) into the estimate of initial contour to enable tracking of fast-moving objects. The algorithm was tested on several ultrasound image sequences, each covering one complete cardiac cycle. The contour successfully tracked boundaries of mitral valve leaflets, aortic root and endocardial borders of the left ventricle. The algorithm-generated outlines were compared against manual tracings by expert physicians. The automated method resulted in contours that were within the boundaries of intraobserver variability.

  9. A Pipeline for High-Throughput Concentration Response Modeling of Gene Expression for Toxicogenomics

    PubMed Central

    House, John S.; Grimm, Fabian A.; Jima, Dereje D.; Zhou, Yi-Hui; Rusyn, Ivan; Wright, Fred A.

    2017-01-01

    Cell-based assays are an attractive option to measure gene expression response to exposure, but the cost of whole-transcriptome RNA sequencing has been a barrier to the use of gene expression profiling for in vitro toxicity screening. In addition, standard RNA sequencing adds variability due to variable transcript length and amplification. Targeted probe-sequencing technologies such as TempO-Seq, with transcriptomic representation that can vary from hundreds of genes to the entire transcriptome, may reduce some components of variation. Analyses of high-throughput toxicogenomics data require renewed attention to read-calling algorithms and simplified dose–response modeling for datasets with relatively few samples. Using data from induced pluripotent stem cell-derived cardiomyocytes treated with chemicals at varying concentrations, we describe here and make available a pipeline for handling expression data generated by TempO-Seq to align reads, clean and normalize raw count data, identify differentially expressed genes, and calculate transcriptomic concentration–response points of departure. The methods are extensible to other forms of concentration–response gene-expression data, and we discuss the utility of the methods for assessing variation in susceptibility and the diseased cellular state. PMID:29163636
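
    Transcriptomic points of departure are often summarized from a per-gene concentration-response fit; a minimal scipy sketch of a Hill-type fit (the pipeline's actual models and POD definitions may differ):

        import numpy as np
        from scipy.optimize import curve_fit

        def hill(conc, bottom, top, ec50, slope):
            return bottom + (top - bottom) / (1.0 + (ec50 / conc) ** slope)

        conc = np.array([0.1, 0.3, 1.0, 3.0, 10.0, 30.0])   # toy concentrations
        expr = np.array([1.0, 1.1, 1.6, 2.9, 3.8, 4.0])     # toy fold changes
        params, _ = curve_fit(hill, conc, expr, p0=[1.0, 4.0, 2.0, 1.0], maxfev=5000)
        print("EC50 ~", round(params[2], 3))   # one possible point-of-departure summary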

  10. Multiple ECG Fiducial Points-Based Random Binary Sequence Generation for Securing Wireless Body Area Networks.

    PubMed

    Zheng, Guanglou; Fang, Gengfa; Shankaran, Rajan; Orgun, Mehmet A; Zhou, Jie; Qiao, Li; Saleem, Kashif

    2017-05-01

    Generating random binary sequences (BSes) is a fundamental requirement in cryptography. A BS is a sequence of N bits, and each bit has a value of 0 or 1. For securing sensors within wireless body area networks (WBANs), electrocardiogram (ECG)-based BS generation methods have been widely investigated in which interpulse intervals (IPIs) from each heartbeat cycle are processed to produce BSes. Using these IPI-based methods to generate a 128-bit BS in real time normally takes around half a minute. In order to improve the time efficiency of such methods, this paper presents an ECG multiple fiducial-points based binary sequence generation (MFBSG) algorithm. The technique of discrete wavelet transforms is employed to detect arrival time of these fiducial points, such as P, Q, R, S, and T peaks. Time intervals between them, including RR, RQ, RS, RP, and RT intervals, are then calculated based on this arrival time, and are used as ECG features to generate random BSes with low latency. According to our analysis on real ECG data, these ECG feature values exhibit the property of randomness and, thus, can be utilized to generate random BSes. Compared with the schemes that solely rely on IPIs to generate BSes, this MFBSG algorithm uses five feature values from one heart beat cycle, and can be up to five times faster than the solely IPI-based methods. So, it achieves a design goal of low latency. According to our analysis, the complexity of the algorithm is comparable to that of fast Fourier transforms. These randomly generated ECG BSes can be used as security keys for encryption or authentication in a WBAN system.
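
    The step from fiducial intervals to key bits is, in essence, quantization of each interval followed by concatenation of its low-order (most variable) bits; a schematic version, not the MFBSG specification:

        def intervals_to_bits(intervals_ms, nbits=4):
            """Quantize each interval and keep its low-order bits."""
            out = []
            for iv in intervals_ms:
                q = int(round(iv)) % (1 << nbits)
                out.extend(int(c) for c in format(q, f"0{nbits}b"))
            return out

        # RR, RQ, RS, RP and RT intervals from one beat (hypothetical, in ms):
        print(intervals_to_bits([812, 43, 31, 158, 301]))   # 20 key bits per beat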

  11. Using time-delayed mutual information to discover and interpret temporal correlation structure in complex populations

    NASA Astrophysics Data System (ADS)

    Albers, D. J.; Hripcsak, George

    2012-03-01

    This paper addresses how to calculate and interpret the time-delayed mutual information (TDMI) for a complex, diversely and sparsely measured, possibly non-stationary population of time series of unknown composition and origin. The primary vehicle for this analysis is a comparison between the time-delayed mutual information averaged over the population and the time-delayed mutual information of an aggregated population (here, aggregation implies the population is conjoined before any statistical estimates are implemented). Through the use of information-theoretic tools, a sequence of practically implementable calculations is detailed that allows the average and aggregate time-delayed mutual information to be interpreted. Moreover, these calculations can also be used to understand the degree of homo- or heterogeneity present in the population. To demonstrate that the proposed methods can be used in nearly any situation, they are applied and demonstrated on time series of glucose measurements from two different subpopulations of individuals from the Columbia University Medical Center electronic health record repository, revealing a picture of the composition of the population as well as physiological features.
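
    A minimal histogram estimator of the TDMI for one series; the paper's average and aggregate variants apply this same quantity across a population of series.

        import numpy as np

        def tdmi(x, lag, bins=16):
            """Histogram estimate of I(x(t); x(t+lag)) in nats."""
            a, b = x[:-lag], x[lag:]
            joint, _, _ = np.histogram2d(a, b, bins=bins)
            pxy = joint / joint.sum()
            px, py = pxy.sum(axis=1), pxy.sum(axis=0)
            mask = pxy > 0
            return float((pxy[mask] * np.log(pxy[mask] / np.outer(px, py)[mask])).sum())

        x = np.sin(np.linspace(0, 40, 2000)) + 0.1 * np.random.randn(2000)
        print([round(tdmi(x, lag), 3) for lag in (1, 10, 100)])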

  12. Efficient mixing scheme for self-consistent all-electron charge density

    NASA Astrophysics Data System (ADS)

    Shishidou, Tatsuya; Weinert, Michael

    2015-03-01

    In standard ab initio density-functional theory calculations, the charge density ρ is gradually updated using the "input" and "output" densities of the current and previous iteration steps. To accelerate the convergence, Pulay mixing has been widely used with great success. It expresses an "optimal" input density ρopt and its "residual" Ropt as a linear combination of the densities of the iteration sequence. In large-scale metallic systems, however, the long-range nature of the Coulomb interaction often causes the "charge sloshing" phenomenon and significantly impacts the convergence. Two treatments, represented in reciprocal space, are known to suppress the sloshing: (i) the inverse Kerker metric for Pulay optimization and (ii) Kerker-type preconditioning in mixing Ropt. In all-electron methods, where the charge density does not have a converging Fourier representation, treatments equivalent or similar to (i) and (ii) have not been described so far. In this work, we show that, by going through the calculation of the Hartree potential, one can accomplish procedures (i) and (ii) without entering reciprocal space. Test calculations are done with a FLAPW method.

  13. Model-based coefficient method for calculation of N leaching from agricultural fields applied to small catchments and the effects of leaching reducing measures

    NASA Astrophysics Data System (ADS)

    Kyllmar, K.; Mårtensson, K.; Johnsson, H.

    2005-03-01

    A method to calculate N leaching from arable fields using model-calculated N leaching coefficients (NLCs) was developed. Using the process-based modelling system SOILNDB, leaching of N was simulated for four leaching regions in southern Sweden with 20-year climate series and a large number of randomised crop sequences based on regional agricultural statistics. To obtain N leaching coefficients, mean values of annual N leaching were calculated for each combination of main crop, following crop and fertilisation regime for each leaching region and soil type. The field-NLC method developed could be useful for following up water-quality goals in, for example, small monitoring catchments, since it allows normal leaching from actual crop rotations and fertilisation to be determined regardless of the weather. The method was tested using field data from nine small, intensively monitored agricultural catchments. The agreement between calculated field N leaching and measured N transport in catchment stream outlets (19-47 and 8-38 kg ha-1 yr-1, respectively) was satisfactory in most catchments when contributions from land uses other than arable land and uncertainties in groundwater flows were considered. The possibility of calculating effects of crop combinations (crop and following crop) is of considerable value, since changes in crop rotation offer a large potential for reducing N leaching. When the effect of a number of potential measures to reduce N leaching (i.e. applying manure in spring instead of autumn; postponing ploughing-in of ley and green fallow in autumn; undersowing a catch crop in cereals and oilseeds; and increasing the area of catch crops by substituting winter cereals and winter oilseeds with corresponding spring crops) was calculated for the arable fields in the catchments using field-NLCs, N leaching was reduced by between 34 and 54% for the separate catchments when the best possible effect on the entire potential area was assumed.

  14. Principles of Quantitative MR Imaging with Illustrated Review of Applicable Modular Pulse Diagrams.

    PubMed

    Mills, Andrew F; Sakai, Osamu; Anderson, Stephan W; Jara, Hernan

    2017-01-01

    Continued improvements in diagnostic accuracy using magnetic resonance (MR) imaging will require development of methods for tissue analysis that complement traditional qualitative MR imaging studies. Quantitative MR imaging is based on measurement and interpretation of tissue-specific parameters independent of experimental design, compared with qualitative MR imaging, which relies on interpretation of tissue contrast that results from experimental pulse sequence parameters. Quantitative MR imaging represents a natural next step in the evolution of MR imaging practice, since quantitative MR imaging data can be acquired using currently available qualitative imaging pulse sequences without modifications to imaging equipment. The article presents a review of the basic physical concepts used in MR imaging and how quantitative MR imaging is distinct from qualitative MR imaging. Subsequently, the article reviews the hierarchical organization of major applicable pulse sequences used in this article, with the sequences organized into conventional, hybrid, and multispectral sequences capable of calculating the main tissue parameters of T1, T2, and proton density. While this new concept offers the potential for improved diagnostic accuracy and workflow, awareness of this extension to qualitative imaging is generally low. This article reviews the basic physical concepts in MR imaging, describes commonly measured tissue parameters in quantitative MR imaging, and presents the major available pulse sequences used for quantitative MR imaging, with a focus on the hierarchical organization of these sequences. © RSNA, 2017.

  15. Regular Pentagons and the Fibonacci Sequence.

    ERIC Educational Resources Information Center

    French, Doug

    1989-01-01

    Illustrates how to draw a regular pentagon. Shows the sequence of a succession of regular pentagons formed by extending the sides. Calculates the general formula of the Lucas and Fibonacci sequences. Presents a regular icosahedron as an example of the golden ratio. (YP)
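
    The convergence of successive term ratios to the golden ratio, for both the Fibonacci and Lucas sequences, is easy to verify numerically:

        def ratio_sequence(a, b, n):
            """Ratios of successive terms of a Fibonacci-like sequence starting a, b."""
            out = []
            for _ in range(n):
                a, b = b, a + b
                out.append(b / a)
            return out

        print(ratio_sequence(1, 1, 12))   # Fibonacci -> 1.6180...
        print(ratio_sequence(2, 1, 12))   # Lucas     -> the same golden ratio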

  16. Optimization of Multilocus Sequence Analysis for Identification of Species in the Genus Vibrio

    PubMed Central

    Gabriel, Michael W.; Matsui, George Y.; Friedman, Robert

    2014-01-01

    Multilocus sequence analysis (MLSA) is an important method for identification of taxa that are not well differentiated by 16S rRNA gene sequences alone. In this procedure, concatenated sequences of selected genes are constructed and then analyzed. The effects that the number and the order of genes used in MLSA have on reconstruction of phylogenetic relationships were examined. The recA, rpoA, gapA, 16S rRNA gene, gyrB, and ftsZ sequences from 56 species of the genus Vibrio were used to construct molecular phylogenies, and these were evaluated individually and using various gene combinations. Phylogenies from two-gene sequences employing recA and rpoA in both possible gene orders were different. The addition of the gapA gene sequence, producing all six possible concatenated sequences, reduced the differences in phylogenies to degrees of statistical (bootstrap) support for some nodes. The overall statistical support for the phylogenetic tree, assayed on the basis of a reliability score (calculated from the number of nodes having bootstrap values of ≥80 divided by the total number of nodes) increased with increasing numbers of genes used, up to a maximum of four. No further improvement was observed from addition of the fifth gene sequence (ftsZ), and addition of the sixth gene (gyrB) resulted in lower proportions of strongly supported nodes. Reductions in the numbers of strongly supported nodes were also observed when maximum parsimony was employed for tree construction. Use of a small number of gene sequences in MLSA resulted in accurate identification of Vibrio species. PMID:24951781

  17. Dual-threshold segmentation using Arimoto entropy based on chaotic bee colony optimization

    NASA Astrophysics Data System (ADS)

    Li, Li

    2018-03-01

    In order to extract targets from complex backgrounds more quickly and accurately, and to further improve defect detection, a dual-threshold segmentation method using Arimoto entropy based on chaotic bee colony optimization is proposed. First, single-threshold selection based on Arimoto entropy is extended to dual-threshold selection so as to separate the target from the background more accurately. Then the intermediate variables in the Arimoto-entropy dual-threshold selection formulae are calculated recursively, effectively eliminating redundant computation and reducing the amount of calculation. Finally, the local search phase of the artificial bee colony algorithm is improved by a chaotic sequence based on the tent map. A fast search for the two optimal thresholds is achieved using the improved bee colony optimization algorithm, noticeably accelerating the search. A large number of experimental results show that, compared with existing segmentation methods such as multi-threshold segmentation using maximum Shannon entropy, two-dimensional Shannon entropy segmentation, two-dimensional Tsallis gray entropy segmentation and multi-threshold segmentation using reciprocal gray entropy, the proposed method segments the target more quickly and accurately, with a superior segmentation effect. It proves to be a fast and effective method for image segmentation.
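
    The chaotic driver of the improved local search is the tent map; a minimal generator, with an illustrative mapping onto the gray-level range:

        def tent_sequence(x0, n, mu=2.0):
            """Tent-map iterates on (0, 1); mu = 2 is the classic chaotic regime."""
            xs, x = [], x0
            for _ in range(n):
                x = mu * x if x < 0.5 else mu * (1.0 - x)
                xs.append(x)
            return xs

        # Scatter candidate thresholds across gray levels [0, 255]:
        print([round(255 * x) for x in tent_sequence(0.37, 8)])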

  18. Probabilistic arithmetic automata and their applications.

    PubMed

    Marschall, Tobias; Herms, Inke; Kaltenbach, Hans-Michael; Rahmann, Sven

    2012-01-01

    We present a comprehensive review on probabilistic arithmetic automata (PAAs), a general model to describe chains of operations whose operands depend on chance, along with two algorithms to numerically compute the distribution of the results of such probabilistic calculations. PAAs provide a unifying framework to approach many problems arising in computational biology and elsewhere. We present five different applications, namely 1) pattern matching statistics on random texts, including the computation of the distribution of occurrence counts, waiting times, and clump sizes under hidden Markov background models; 2) exact analysis of window-based pattern matching algorithms; 3) sensitivity of filtration seeds used to detect candidate sequence alignments; 4) length and mass statistics of peptide fragments resulting from enzymatic cleavage reactions; and 5) read length statistics of 454 and IonTorrent sequencing reads. The diversity of these applications indicates the flexibility and unifying character of the presented framework. While the construction of a PAA depends on the particular application, we single out a frequently applicable construction method: We introduce deterministic arithmetic automata (DAAs) to model deterministic calculations on sequences, and demonstrate how to construct a PAA from a given DAA and a finite-memory random text model. This procedure is used for all five discussed applications and greatly simplifies the construction of PAAs. Implementations are available as part of the MoSDi package. Its application programming interface facilitates the rapid development of new applications based on the PAA framework.

  19. Layered motion segmentation and depth ordering by tracking edges.

    PubMed

    Smith, Paul; Drummond, Tom; Cipolla, Roberto

    2004-04-01

    This paper presents a new Bayesian framework for motion segmentation--dividing a frame from an image sequence into layers representing different moving objects--by tracking edges between frames. Edges are found using the Canny edge detector, and the Expectation-Maximization algorithm is then used to fit motion models to these edges and also to calculate the probabilities of the edges obeying each motion model. The edges are also used to segment the image into regions of similar color. The most likely labeling for these regions is then calculated by using the edge probabilities, in association with a Markov Random Field-style prior. The identification of the relative depth ordering of the different motion layers is also determined, as an integral part of the process. An efficient implementation of this framework is presented for segmenting two motions (foreground and background) using two frames. It is then demonstrated how, by tracking the edges into further frames, the probabilities may be accumulated to provide an even more accurate and robust estimate, and segment an entire sequence. Further extensions are then presented to address the segmentation of more than two motions. Here, a hierarchical method of initializing the Expectation-Maximization algorithm is described, and it is demonstrated that the Minimum Description Length principle may be used to automatically select the best number of motion layers. The results from over 30 sequences (demonstrating both two and three motions) are presented and discussed.

  20. Incorporating sequence information into the scoring function: a hidden Markov model for improved peptide identification.

    PubMed

    Khatun, Jainab; Hamlett, Eric; Giddings, Morgan C

    2008-03-01

    The identification of peptides by tandem mass spectrometry (MS/MS) is a central method of proteomics research, but due to the complexity of MS/MS data and the large databases searched, the accuracy of peptide identification algorithms remains limited. To improve the accuracy of identification we applied a machine-learning approach using a hidden Markov model (HMM) to capture the complex and often subtle links between a peptide sequence and its MS/MS spectrum. Our model, HMM_Score, represents ion types as HMM states and calculates the maximum joint probability for a peptide/spectrum pair using emission probabilities from three factors: the amino acids adjacent to each fragmentation site, the mass dependence of ion types and the intensity dependence of ion types. The Viterbi algorithm is used to calculate the most probable assignment between ion types in a spectrum and a peptide sequence, then a correction factor is added to account for the propensity of the model to favor longer peptides. An expectation value is calculated based on the model score to assess the significance of each peptide/spectrum match. We trained and tested HMM_Score on three data sets generated by two different mass spectrometer types. For a reference data set recently reported in the literature and validated using seven identification algorithms, HMM_Score produced 43% more positive identification results at a 1% false positive rate than the best of two other commonly used algorithms, Mascot and X!Tandem. HMM_Score is a highly accurate platform for peptide identification that works well for a variety of mass spectrometer and biological sample types. The program is freely available on ProteomeCommons via an OpenSource license. See http://bioinfo.unc.edu/downloads/ for the download link.
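
    The decoding step rests on the standard Viterbi recursion; a generic log-space version with a toy two-state model (not HMM_Score's emission factors):

        import math

        def viterbi(obs, states, log_start, log_trans, log_emit):
            """Most probable state path for an observation sequence."""
            V = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
            back = []
            for o in obs[1:]:
                col, ptr = {}, {}
                for s in states:
                    prev = max(states, key=lambda r: V[-1][r] + log_trans[r][s])
                    col[s] = V[-1][prev] + log_trans[prev][s] + log_emit[s][o]
                    ptr[s] = prev
                V.append(col)
                back.append(ptr)
            last = max(states, key=lambda s: V[-1][s])
            path = [last]
            for ptr in reversed(back):
                path.append(ptr[path[-1]])
            return list(reversed(path)), V[-1][last]

        lg = math.log
        states = ("b", "y")   # e.g. b-ion vs y-ion emitting states
        print(viterbi(["hi", "lo", "lo"], states,
                      {"b": lg(0.5), "y": lg(0.5)},
                      {"b": {"b": lg(0.7), "y": lg(0.3)},
                       "y": {"b": lg(0.3), "y": lg(0.7)}},
                      {"b": {"hi": lg(0.8), "lo": lg(0.2)},
                       "y": {"hi": lg(0.3), "lo": lg(0.7)}}))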

  1. A universal protocol to generate consensus level genome sequences for foot-and-mouth disease virus and other positive-sense polyadenylated RNA viruses using the Illumina MiSeq.

    PubMed

    Logan, Grace; Freimanis, Graham L; King, David J; Valdazo-González, Begoña; Bachanek-Bankowska, Katarzyna; Sanderson, Nicholas D; Knowles, Nick J; King, Donald P; Cottam, Eleanor M

    2014-09-30

    Next-Generation Sequencing (NGS) is revolutionizing molecular epidemiology by providing new approaches to undertake whole genome sequencing (WGS) in diagnostic settings for a variety of human and veterinary pathogens. Previous sequencing protocols have been subject to biases such as those encountered during PCR amplification and cell culture, or are restricted by the need for large quantities of starting material. We describe here a simple and robust methodology for the generation of whole genome sequences on the Illumina MiSeq. This protocol is specific for foot-and-mouth disease virus (FMDV) or other polyadenylated RNA viruses and circumvents both the use of PCR and the requirement for large amounts of initial template. The protocol was successfully validated using five FMDV positive clinical samples from the 2001 epidemic in the United Kingdom, as well as a panel of representative viruses from all seven serotypes. In addition, this protocol was successfully used to recover 94% of an FMDV genome that had previously been identified as cell culture negative. Genome sequences from three other non-FMDV polyadenylated RNA viruses (EMCV, ERAV, VESV) were also obtained with minor protocol amendments. We calculated that a minimum coverage depth of 22 reads was required to produce an accurate consensus sequence for FMDV O. This was achieved in 5 FMDV/O/UKG isolates and the type O FMDV from the serotype panel with the exception of the 5' genomic termini and area immediately flanking the poly(C) region. We have developed a universal WGS method for FMDV and other polyadenylated RNA viruses. This method works successfully from a limited quantity of starting material and eliminates the requirement for genome-specific PCR amplification. This protocol has the potential to generate consensus-level sequences within a routine high-throughput diagnostic environment.

  2. A novel Multi-Agent Ada-Boost algorithm for predicting protein structural class with the information of protein secondary structure.

    PubMed

    Fan, Ming; Zheng, Bin; Li, Lihua

    2015-10-01

    Knowledge of the structural class of a given protein is important for understanding its folding patterns. Despite considerable effort, predicting the structural class of a protein solely from its sequence remains a challenging problem. Feature extraction and classification are the two main problems in such prediction. In this research, we extended our earlier work regarding both aspects. For protein feature extraction, we proposed a scheme that calculates word frequency and word position from sequences of amino acids, reduced amino acids, and secondary structure. For accurate classification of protein structural class, we developed a novel Multi-Agent Ada-Boost (MA-Ada) method by integrating features of a Multi-Agent system into the Ada-Boost algorithm. Extensive experiments were conducted to test and compare the proposed method using four benchmark datasets of low homology. The results showed classification accuracies of 88.5%, 96.0%, 88.4%, and 85.5%, respectively, which are much better than those of existing methods. The source code and dataset are available on request.
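
    The word-frequency part of such a feature scheme reduces to counting k-letter words over a (possibly reduced) alphabet; a schematic fragment with a hypothetical 3-letter reduced alphabet:

        from collections import Counter

        def word_frequency_features(sequence, alphabet, k=2):
            """Normalized k-word frequencies in a fixed alphabet ordering (k = 2)."""
            words = [a + b for a in alphabet for b in alphabet]
            counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
            total = max(sum(counts.values()), 1)
            return [counts[w] / total for w in words]

        print(word_frequency_features("HPNHHPNP", "HPN"))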

  3. Bovine leukaemia virus genotypes 5 and 6 are circulating in cattle from the state of São Paulo, Brazil.

    PubMed

    Gregory, Lilian; Carrillo Gaeta, Natália; Araújo, Jansen; Matsumiya Thomazelli, Luciano; Harakawa, Ricardo; Ikuno, Alice A; Hiromi Okuda, Liria; de Stefano, Eliana; Pituco, Edviges Maristela

    2017-12-01

    Enzootic bovine leucosis (EBL) is a silent disease caused by a retrovirus, the bovine leukaemia virus (BLV). BLV is classified into nearly 10 genotypes that are distributed across several countries. The present research aimed to describe two BLV gp51 env sequences of strains detected in the state of São Paulo, Brazil, and to perform a phylogenetic analysis comparing them to other BLV gp51 env sequences of strains from around the world. Two bovines from different herds were admitted to the Bovine and Small Ruminant Hospital, School of Veterinary Medicine and Animal Science, University of São Paulo, Brazil. In both, lymphosarcoma was detected and the presence of BLV was confirmed by nested PCR. The neighbour-joining distance method was used to genotype the BLV sequences, and the maximum likelihood method was used for the phylogenetic reconstruction. The phylogeny estimates were calculated by performing 1000 bootstrap replicates. Analysis of the partial envelope glycoprotein (env) gene sequences from two isolates (25 and 31) revealed two different genotypes of BLV. Isolate 25 clustered with ten genotype 6 isolates from Brazil, Argentina, Thailand and Paraguay. Isolate 31, on the other hand, clustered with two genotype 5 isolates (one also from São Paulo and one from Costa Rica). The detected genotypes corroborate the results of previous studies conducted in the state of São Paulo, Brazil. The prediction of amino acids showed substitutions, particularly between positions 136 and 150, in 11 of the 13 sequences analysed, including sequences from GenBank. BLV remains important in Brazil and this research should be continued.

  4. NMR-based automated protein structure determination.

    PubMed

    Würz, Julia M; Kazemi, Sina; Schmidt, Elena; Bagaria, Anurag; Güntert, Peter

    2017-08-15

    NMR spectra analysis for protein structure determination can now in many cases be performed by automated computational methods. This overview of the computational methods for NMR protein structure analysis presents recent automated methods for signal identification in multidimensional NMR spectra, sequence-specific resonance assignment, collection of conformational restraints, and structure calculation, as implemented in the CYANA software package. These algorithms are sufficiently reliable and integrated into one software package to enable the fully automated structure determination of proteins starting from NMR spectra without manual interventions or corrections at intermediate steps, with an accuracy of 1-2 Å backbone RMSD in comparison with manually solved reference structures. Copyright © 2017 Elsevier Inc. All rights reserved.

  5. An Unconditional Test for Change Point Detection in Binary Sequences with Applications to Clinical Registries.

    PubMed

    Ellenberger, David; Friede, Tim

    2016-08-05

    Methods for change point (also sometimes referred to as threshold or breakpoint) detection in binary sequences are not new and were introduced as early as 1955. Much of the research in this area has focussed on asymptotic and exact conditional methods. Here we develop an exact unconditional test, which treats the total number of events as random instead of conditioning on the number of observed events. The new test is shown to be uniformly more powerful than Worsley's exact conditional test, and means for its efficient numerical calculation are given. Adaptations of methods by Berger and Boos are made to deal with the fact that the unknown event probability imposes a nuisance parameter. The methods are compared in a Monte Carlo simulation study and applied to a cohort of patients undergoing traumatic orthopaedic surgery involving external fixators, in which a change in pin-site infections is investigated. The unconditional test controls the type I error rate at the nominal level and is uniformly more powerful than (or, to be more precise, uniformly at least as powerful as) Worsley's exact conditional test, which is very conservative for small sample sizes. In the application, a beneficial effect associated with the introduction of a new treatment procedure for pin-site care could be revealed. We consider the new test an effective and easy-to-use exact test, recommended for small-sample change point problems in binary sequences.
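
    The core of a change-point analysis in a binary sequence is a comparison of binomial log-likelihoods with and without a split; a minimal likelihood-ratio scan in the spirit of Worsley's statistic (not the new unconditional test itself):

        import math

        def loglik(k, n):
            """Binomial log-likelihood at the MLE p = k/n (0*log 0 := 0)."""
            if k in (0, n):
                return 0.0
            p = k / n
            return k * math.log(p) + (n - k) * math.log(1 - p)

        def best_change_point(bits):
            n, total = len(bits), sum(bits)
            null = loglik(total, n)
            scores = []
            for t in range(1, n):             # candidate split after position t
                k1 = sum(bits[:t])
                scores.append((loglik(k1, t) + loglik(total - k1, n - t) - null, t))
            return max(scores)                # (log-likelihood ratio, split index)

        print(best_change_point([0, 0, 1, 0, 0, 0, 1, 1, 1, 1]))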

  6. A computational method for detecting copy number variations using scale-space filtering

    PubMed Central

    2013-01-01

    Background: As next-generation sequencing technology has made rapid and cost-effective sequencing available, the importance of computational approaches for finding and analyzing copy number variations (CNVs) has grown. Furthermore, most genome projects need to accurately analyze sequences with fairly low-coverage read data. A method to detect the exact types and locations of CNVs from low-coverage read data is urgently needed. Results: Here, we propose a new CNV detection method, CNV_SS, which uses scale-space filtering. The scale-space filtering is evaluated by applying to the read coverage data the Gaussian convolution for various scales, according to a given scaling parameter. Next, by differentiating twice and finding zero-crossing points, inflection points of the scale-space-filtered read coverage data are calculated per scale. Then, the types and the exact locations of CNVs are obtained by analyzing the fingerprint map, the contours of zero-crossing points across scales. Conclusions: The performance of CNV_SS showed that the FNR and FPR stay in the ranges of 1.27% to 2.43% and 1.14% to 2.44%, respectively, even at relatively low coverage (0.5x ≤ C ≤ 2x). CNV_SS also gave much more effective results than the conventional methods in terms of FNR, by 3.82% at least and 76.97% at most, even when the coverage of the read data is low. The CNV_SS source code is freely available from http://dblab.hallym.ac.kr/CNV_SS/. PMID:23418726
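
    The filtering core of the approach, smoothing coverage with Gaussians of increasing scale and locating zero-crossings of the second derivative, can be sketched compactly (the generic machinery, not the full CNV_SS pipeline):

        import numpy as np
        from scipy.ndimage import gaussian_filter1d

        def zero_crossings_by_scale(coverage, scales=(2, 4, 8, 16)):
            """Inflection points of Gaussian-smoothed coverage at each scale."""
            result = {}
            for s in scales:
                d2 = gaussian_filter1d(coverage.astype(float), sigma=s, order=2)
                result[s] = np.where(np.diff(np.sign(d2)) != 0)[0]
            return result

        cov = np.r_[np.full(200, 30.0), np.full(100, 60.0), np.full(200, 30.0)]
        cov += np.random.randn(len(cov))     # a noisy duplicated segment
        for s, idx in zero_crossings_by_scale(cov).items():
            print(f"scale {s}: edges near {idx}")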

  7. Through-the-Wall Localization of a Moving Target by Two Independent Ultra Wideband (UWB) Radar Systems

    PubMed Central

    Kocur, Dušan; Švecová, Mária; Rovňáková, Jana

    2013-01-01

    In the case of through-the-wall localization of moving targets by ultra wideband (UWB) radars, there are applications in which handheld sensors equipped only with one transmitting and two receiving antennas are applied. Sometimes, the radar using such a small antenna array is not able to localize the target with the required accuracy. With a view to improve through-the-wall target localization, cooperative positioning based on a fusion of data retrieved from two independent radar systems can be used. In this paper, the novel method of the cooperative localization referred to as joining intersections of the ellipses is introduced. This method is based on a geometrical interpretation of target localization where the target position is estimated using a properly created cluster of the ellipse intersections representing potential positions of the target. The performance of the proposed method is compared with the direct calculation method and two alternative methods of cooperative localization using data obtained by measurements with the M-sequence UWB radars. The direct calculation method is applied for the target localization by particular radar systems. As alternative methods of cooperative localization, the arithmetic average of the target coordinates estimated by two single independent UWB radars and the Taylor series method is considered. PMID:24021968

  8. Efficient and precise calculation of the b-matrix elements in diffusion-weighted imaging pulse sequences.

    PubMed

    Zubkov, Mikhail; Stait-Gardner, Timothy; Price, William S

    2014-06-01

    Precise NMR diffusion measurements require detailed knowledge of the cumulative dephasing effect caused by the numerous gradient pulses present in most NMR pulse sequences. This effect, which ultimately manifests itself as the diffusion-related NMR signal attenuation, is usually described by the b-value or, in the case of multidirectional diffusion weighting, by the b-matrix, the latter being common in diffusion-weighted NMR imaging. Neglecting some of the gradient pulses introduces an error into the calculated diffusion coefficient, in some cases reaching 100% of the expected value. Ensuring that the b-matrix calculation includes all the known gradient pulses therefore leads to a significant error reduction. Calculation of the b-matrix for simple gradient waveforms is rather straightforward, yet it grows cumbersome when complexly shaped and/or numerous gradient pulses are introduced. Making three broad assumptions about the gradient pulse arrangement in a sequence results in an efficient framework for the calculation of b-matrices, as well as providing some insight into optimal gradient pulse placement. The framework allows the diffusion-sensitising effect of complexly shaped gradient waveforms to be accounted for with modest computational time and power. This is achieved by using the b-matrix elements of the simple unmodified pulse sequence and minimising the integration of the complexly shaped gradient waveform in the modified sequence. Such re-evaluation of the b-matrix elements retains all the analytical relevance of the straightforward approach, yet at least halves the amount of symbolic integration required. The application of the framework is demonstrated by evaluating the expression describing the diffusion-sensitising effect of different bipolar gradient pulse modules. Copyright © 2014 Elsevier Inc. All rights reserved.
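
    For the simplest case, a single pair of rectangular gradient pulses, the diffusion weighting reduces to the Stejskal-Tanner expression b = gamma^2 g^2 delta^2 (Delta - delta/3); a small calculation in SI units:

        GAMMA = 2.6752218744e8   # 1H gyromagnetic ratio, rad s^-1 T^-1
        g     = 30e-3            # gradient amplitude, T/m
        delta = 3e-3             # gradient pulse duration, s
        Delta = 40e-3            # separation of the two pulses, s

        b = GAMMA**2 * g**2 * delta**2 * (Delta - delta / 3.0)   # in s m^-2
        print(b * 1e-6, "s/mm^2")   # about 22.6 s/mm^2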

  9. The Application of Some Hartree-Fock Model Calculations to the Analysis of Atomic and Free-Ion Optical Spectra

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hayhurst, Thomas Laine

    1980-08-06

    Techniques for applying ab initio calculations to the analysis of atomic spectra are investigated, along with the relationship between the semi-empirical and ab initio forms of Slater-Condon theory. Slater-Condon theory is reviewed with a focus on the essential features that lead to the effective Hamiltonians associated with the semi-empirical form of the theory. Ab initio spectroscopic parameters are calculated from wavefunctions obtained via self-consistent field methods, while multi-configuration Hamiltonian matrices are constructed and diagonalized with computer codes written by Robert Cowan of Los Alamos Scientific Laboratory. Group-theoretical analysis demonstrates that wavefunctions more general than Slater determinants (i.e. wavefunctions with radial correlations between electrons) lead to essentially the same parameterization of effective Hamiltonians. In the spirit of this analysis, a strategy is developed for adjusting ab initio values of the spectroscopic parameters, reproducing parameters obtained by fitting the corresponding effective Hamiltonian. Secondary parameters are used to "screen" the calculated (primary) spectroscopic parameters, their values being determined by least squares. Extrapolations of the secondary parameters determined from analyzed spectra are attempted in order to correct calculations for atoms and ions without experimental levels. The adjustment strategy and extrapolations are tested on the K I isoelectronic sequence from neutral K through Fe7+, fitting to experimental levels for V4+ and Cr5+; unobserved levels and spectra are predicted for several members of the sequence. A related problem is also discussed: energy levels of the uranium hexahalide complexes (UX6)2- for X = F, Cl, Br, and I are fit to an effective Hamiltonian (the f2 configuration in Oh symmetry) with corrections proposed by Brian Judd.

  10. An automated and universal method for measuring mean grain size from a digital image of sediment

    USGS Publications Warehouse

    Buscombe, Daniel D.; Rubin, David M.; Warrick, Jonathan A.

    2010-01-01

    Existing methods for estimating mean grain size of sediment in an image require either complicated sequences of image processing (filtering, edge detection, segmentation, etc.) or statistical procedures involving calibration. We present a new approach which uses Fourier methods to calculate grain size directly from the image without requiring calibration. Based on analysis of over 450 images, we found the accuracy to be within approximately 16% across the full range from silt to pebbles. Accuracy is comparable to, or better than, existing digital methods. The new method, in conjunction with recent advances in technology for taking appropriate images of sediment in a range of natural environments, promises to revolutionize the logistics and speed at which grain-size data may be obtained from the field.
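
    The Fourier idea can be illustrated crudely: grain size tracks the dominant wavelength in the image's radially averaged power spectrum. The sketch below is a toy proxy, not the authors' calibrated estimator.

        import numpy as np

        def dominant_wavelength(image):
            """Wavelength (pixels) of the strongest non-DC ring in the spectrum."""
            f = np.fft.fftshift(np.abs(np.fft.fft2(image - image.mean())) ** 2)
            ny, nx = image.shape
            y, x = np.indices((ny, nx))
            r = np.hypot(y - ny // 2, x - nx // 2).astype(int)
            radial = np.bincount(r.ravel(), weights=f.ravel())
            k = radial[1:].argmax() + 1      # skip the DC bin
            return min(ny, nx) / k

        img = np.sin(2 * np.pi * np.arange(128) / 8.0)[None, :] * np.ones((128, 1))
        print(dominant_wavelength(img))      # -> 8.0 pixels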

  11. Phenotypic and genotypic detection of Candida albicans and Candida dubliniensis strains isolated from oral mucosa of AIDS pediatric patients

    PubMed Central

    Livério, Harisson Oliveira; Ruiz, Luciana da Silva; de Freitas, Roseli Santos; Nishikaku, Angela; de Souza, Ana Clara; Paula, Claudete Rodrigues; Domaneschi, Carina

    2017-01-01

    The aim of this study was to assess a collection of yeasts to verify the presence of Candida dubliniensis among strains isolated from the oral mucosa of AIDS pediatric patients that were initially characterized as Candida albicans by the traditional phenotypic method, as well as to evaluate the main phenotypic methods used to discriminate between the two species and to confirm the identification through genotypic techniques, i.e., DNA sequencing. Twenty-nine samples of C. albicans isolated from this population and kept in a fungal collection were evaluated and re-characterized. In order to differentiate the two species, phenotypic tests (thermotolerance tests, chromogenic medium, Staib agar, tobacco agar, hypertonic medium) were performed, and genotypic techniques using DNA sequencing were employed for confirmation of the isolated species. Sensitivity and specificity were calculated for each test. No phenotypic test alone was sufficient to provide definitive identification of C. dubliniensis or C. albicans, as opposed to the results of the molecular tests. After amplification and sequencing of specific regions of the 29 studied strains, 93.1% of the isolates were identified as C. albicans and 6.9% as C. dubliniensis. The Staib agar assay showed a higher sensitivity (96.3%) in comparison with the other phenotypic techniques. Therefore, genotypic methods are indispensable for the conclusive identification and differentiation of these species. PMID:28423089

  12. Phylo-mLogo: an interactive and hierarchical multiple-logo visualization tool for alignment of many sequences

    PubMed Central

    Shih, Arthur Chun-Chieh; Lee, DT; Peng, Chin-Lin; Wu, Yu-Wei

    2007-01-01

    Background When aligning several hundreds or thousands of sequences, such as epidemic virus sequences or homologous/orthologous sequences of some big gene families, to reconstruct the epidemiological history or their phylogenies, how to analyze and visualize the alignment results of many sequences has become a new challenge for computational biologists. Although there are several tools available for visualization of very long sequence alignments, few of them are applicable to the alignments of many sequences. Results A multiple-logo alignment visualization tool, called Phylo-mLogo, is presented in this paper. Phylo-mLogo calculates the variabilities and homogeneities of alignment sequences by base frequencies or entropies. Different from the traditional representations of sequence logos, Phylo-mLogo not only displays the global logo patterns of the whole alignment of multiple sequences, but also demonstrates their local homologous logos for each clade hierarchically. In addition, Phylo-mLogo also allows the user to focus only on the analysis of some important, structurally or functionally constrained sites in the alignment selected by the user or by built-in automatic calculation. Conclusion With Phylo-mLogo, the user can symbolically and hierarchically visualize hundreds of aligned sequences simultaneously and easily check the changes of their amino acid sites when analyzing many homologous/orthologous or influenza virus sequences. More information of Phylo-mLogo can be found at URL . PMID:17319966

  13. Identification of characteristic oligonucleotides in the bacterial 16S ribosomal RNA sequence dataset

    NASA Technical Reports Server (NTRS)

    Zhang, Zhengdong; Willson, Richard C.; Fox, George E.

    2002-01-01

    MOTIVATION: The phylogenetic structure of the bacterial world has been intensively studied by comparing sequences of 16S ribosomal RNA (16S rRNA). This database of sequences is now widely used to design probes for the detection of specific bacteria or groups of bacteria one at a time. The success of such methods reflects the fact that there are local sequence segments that are highly characteristic of particular organisms or groups of organisms. It is not clear, however, to what extent such signature sequences exist in the 16S rRNA dataset. A better understanding of the numbers and distribution of highly informative oligonucleotide sequences may facilitate the design of hybridization arrays that can characterize the phylogenetic position of an unknown organism or serve as the basis for the development of novel approaches to bacterial identification. RESULTS: A computer-based algorithm that characterizes the extent to which any individual oligonucleotide sequence in 16S rRNA is characteristic of any particular bacterial grouping was developed. A measure of signature quality, Q(s), was formulated and subsequently calculated for every individual oligonucleotide sequence in the size range of 5-11 nucleotides, and for 15mers, with reference to each cluster and subcluster in a 929-organism representative phylogenetic tree. Subsequently, the perfect signature sequences were compared to the full set of 7322 sequences to see how common false positives were. The work completed here establishes beyond any doubt that highly characteristic oligonucleotides exist in the bacterial 16S rRNA sequence dataset in large numbers. Over 16,000 15mers were identified that might be useful as signatures. Signature oligonucleotides are available for over 80% of the nodes in the representative tree.

  16. A survey and evaluations of histogram-based statistics in alignment-free sequence comparison.

    PubMed

    Luczak, Brian B; James, Benjamin T; Girgis, Hani Z

    2017-12-06

    Since the dawn of the bioinformatics field, sequence alignment scores have been the main method for comparing sequences. However, alignment algorithms are quadratic, requiring long execution time. As alternatives, scientists have developed tens of alignment-free statistics for measuring the similarity between two sequences. We surveyed tens of alignment-free k-mer statistics. Additionally, we evaluated 33 statistics and multiplicative combinations between the statistics and/or their squares. These statistics are calculated on two k-mer histograms representing two sequences. Our evaluations using global alignment scores revealed that the majority of the statistics are sensitive and capable of finding similar sequences to a query sequence. Therefore, any of these statistics can filter out dissimilar sequences quickly. Further, we observed that multiplicative combinations of the statistics are highly correlated with the identity score. Furthermore, combinations involving sequence length difference or Earth Mover's distance, which takes the length difference into account, are always among the highest correlated paired statistics with identity scores. Similarly, paired statistics including length difference or Earth Mover's distance are among the best performers in finding the K-closest sequences. Interestingly, similar performance can be obtained using histograms of shorter words, resulting in reducing the memory requirement and increasing the speed remarkably. Moreover, we found that simple single statistics are sufficient for processing next-generation sequencing reads and for applications relying on local alignment. Finally, we measured the time requirement of each statistic. The survey and the evaluations will help scientists with identifying efficient alternatives to the costly alignment algorithm, saving thousands of computational hours. The source code of the benchmarking tool is available as Supplementary Materials. © The Author 2017. Published by Oxford University Press.
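
    As a concrete illustration of the histogram-based statistics this survey covers, the following sketch builds k-mer count histograms for two sequences and evaluates two of the simpler measures mentioned above (Euclidean distance and sequence length difference); the statistics actually benchmarked in the paper are defined there.

        from collections import Counter
        import math

        def kmer_histogram(seq, k):
            # Count all overlapping k-mers in a sequence.
            return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

        def euclidean_distance(h1, h2):
            # Euclidean distance between two k-mer count vectors;
            # words missing from a histogram count as zero.
            words = set(h1) | set(h2)
            return math.sqrt(sum((h1[w] - h2[w]) ** 2 for w in words))

        s1, s2 = "ACGTACGTGGTAACC", "ACGTACGAGGTAACC"
        h1, h2 = kmer_histogram(s1, 3), kmer_histogram(s2, 3)
        # A paired statistic could combine this distance with the
        # length difference, as the survey suggests.
        print(euclidean_distance(h1, h2), abs(len(s1) - len(s2)))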

  17. Wavelengths and energy levels for the Zn I isoelectronic sequence Sn^20+ through U^62+

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brown, C.M.; Seely, J.F.; Kania, D.R.

    Calculated and experimentally determined transition energies are presented for the Zn I isoelectronic sequence for the elements with atomic numbers Z = 50-92. The excitation energies were calculated for the 84 levels belonging to the 10 configurations of the type 4l4l′ by using the Hebrew University Lawrence Livermore Atomic Code (HULLAC). The analysis of the energy level structure along the isoelectronic sequence accounted for 20 avoided level crossings. The differences between the calculated and experimental transition energies were determined for 16 transitions, and the excitation energies of the levels belonging to the 4s4p, 4p^2, 4s4d, and 4s4f configurations were derived from the semiempirically corrected transition energies. 16 refs., 3 figs., 1 tab.

  18. Experimental and analytical study of high velocity impact on Kevlar/Epoxy composite plates

    NASA Astrophysics Data System (ADS)

    Sikarwar, Rahul S.; Velmurugan, Raman; Madhu, Velmuri

    2012-12-01

    In the present study, the impact behavior of Kevlar/Epoxy composite plates has been investigated experimentally for different thicknesses and lay-up sequences and compared with analytical results. The effect of thickness and lay-up sequence on energy absorbing capacity has been studied for high velocity impact. Four lay-up sequences and four thickness values have been considered. Initial and residual velocities were measured experimentally to calculate the energy absorbing capacity of the laminates. The residual velocity of the projectile and the energy absorbed by the laminates were also calculated analytically. The results obtained from the analytical study are found to be in good agreement with the experimental results. It is observed from the study that the 0/90 lay-up sequence is most effective for impact resistance. Delamination area is maximum on the back side of the plate for all thickness values and lay-up sequences. The delamination area on the back is maximum for 0/90/45/-45 laminates compared to the other lay-up sequences.
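
    The energy absorbed by a laminate in such tests is the projectile's kinetic-energy loss between the measured initial and residual velocities. A minimal Python sketch of that bookkeeping (mass and velocities are illustrative values, not the paper's data):

        def energy_absorbed(mass_kg, v_initial, v_residual):
            # Kinetic energy lost by the projectile:
            # E = 0.5 * m * (vi**2 - vr**2), in joules.
            return 0.5 * mass_kg * (v_initial ** 2 - v_residual ** 2)

        # An 8 g projectile slowed from 350 m/s to 200 m/s
        # deposits 330 J in the laminate.
        print(energy_absorbed(0.008, 350.0, 200.0))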

  19. Variational Dirac-Hartree-Fock calculation of the Breit interaction

    NASA Astrophysics Data System (ADS)

    Goldman, S. P.

    1988-04-01

    The calculation of the retarded version of the Breit interaction in the context of the VDHF method is discussed. With the use of Slater-type basis functions, all the terms involved can be calculated in closed form. The results are expressed as an expansion in powers of one-electron energy differences and linear combinations of hypergeometric functions. Convergence is fast and high accuracy is obtained with a small number of terms in the expansion, even for high values of the nuclear charge. An added advantage is that the lowest-order cancellations occurring in the retardation terms are accounted for exactly a priori. A comparison of the number of terms in the total expansion needed for an accuracy of 12 significant digits in the total energy, as well as a comparison of the results with and without retardation and in the local potential approximation, are presented for the carbon isoelectronic sequence.

  20. Verification of Ribosomal Proteins of Aspergillus fumigatus for Use as Biomarkers in MALDI-TOF MS Identification

    PubMed Central

    Nakamura, Sayaka; Sato, Hiroaki; Tanaka, Reiko; Yaguchi, Takashi

    2016-01-01

    We have previously proposed a rapid identification method for bacterial strains based on the profiles of their ribosomal subunit proteins (RSPs), observed using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS). This method can perform phylogenetic characterization based on the mass of housekeeping RSP biomarkers, ideally calculated from amino acid sequence information registered in public protein databases. With the aim of extending its field of application to medical mycology, this study investigates the actual state of information of RSPs of eukaryotic fungi registered in public protein databases through the characterization of ribosomal protein fractions extracted from genome-sequenced Aspergillus fumigatus strains Af293 and A1163 as a model. In this process, we have found that the public protein databases harbor problems. The RSP names are in confusion, so we have provisionally unified them using the yeast naming system. The most serious problem is that many incorrect sequences are registered in the public protein databases. Surprisingly, more than half of the sequences are incorrect, due chiefly to mis-annotation of exon/intron structures. These errors could be corrected by a combination of in silico inspection by sequence homology analysis and MALDI-TOF MS measurements. We were also able to confirm conserved post-translational modifications in eleven RSPs. After these verifications, the masses of 31 expressed RSPs under 20,000 Da could be accurately confirmed. These RSPs have a potential to be useful biomarkers for identifying clinical isolates of A. fumigatus. PMID:27843740

  1. A high-throughput approach to profile RNA structure.

    PubMed

    Delli Ponti, Riccardo; Marti, Stefanie; Armaos, Alexandros; Tartaglia, Gian Gaetano

    2017-03-17

    Here we introduce the Computational Recognition of Secondary Structure (CROSS) method to calculate the structural profile of an RNA sequence (single- or double-stranded state) at single-nucleotide resolution and without sequence length restrictions. We trained CROSS using data from high-throughput experiments such as Selective 2′-Hydroxyl Acylation analyzed by Primer Extension (SHAPE; mouse and HIV transcriptomes) and Parallel Analysis of RNA Structure (PARS; human and yeast transcriptomes), as well as high-quality NMR/X-ray structures (PDB database). The algorithm uses primary structure information alone to predict experimental structural profiles with >80% accuracy, showing high performance on large RNAs such as Xist (17,900 nucleotides; area under the ROC curve (AUC) of 0.75 on dimethyl sulfate (DMS) experiments). We integrated CROSS into thermodynamics-based methods to predict secondary structure and observed an increase in their predictive power by up to 30%. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  2. Using a local low rank plus sparse reconstruction to accelerate dynamic hyperpolarized 13C imaging using the bSSFP sequence.

    PubMed

    Milshteyn, Eugene; von Morze, Cornelius; Reed, Galen D; Shang, Hong; Shin, Peter J; Larson, Peder E Z; Vigneron, Daniel B

    2018-05-01

    Acceleration of dynamic 2D (T2 mapping) and 3D hyperpolarized 13C MRI acquisitions using the balanced steady-state free precession sequence was achieved with a specialized reconstruction method based on the combination of low rank plus sparse and local low rank reconstructions. Methods were validated using both retrospectively and prospectively undersampled in vivo data from normal rats and tumor-bearing mice. Four-fold acceleration of 1-2 mm isotropic 3D dynamic acquisitions with 2-5 s temporal resolution and two-fold acceleration of 0.25-1 mm^2 2D dynamic acquisitions were achieved. This enabled visualization of the biodistribution of [2-13C]pyruvate, [1-13C]lactate, [13C,15N2]urea, and HP001 within the heart, kidneys, vasculature, and tumor, as well as calculation of high-resolution T2 maps. Copyright © 2018 Elsevier Inc. All rights reserved.

  3. Rotational dynamics of bases in the gene coding interferon alpha 17 (IFNA17).

    PubMed

    Krasnobaeva, L A; Yakushevich, L V

    2015-02-01

    In the present work, rotational oscillations of nitrogenous bases in DNA with the sequence of the gene coding for interferon alpha 17 (IFNA17) are investigated. As a mathematical model simulating oscillations of the bases, we use a system of two coupled nonlinear partial differential equations that takes into account effects of dissipation, the action of external fields, and the dependence of the equation coefficients on the sequence of bases. We apply the methods of the theory of oscillations to solve the equations in the linear approximation and to construct the dispersion curves determining the dependence of the frequency of the plane waves (ω) on the wave vector (q). In the nonlinear case, solutions in the form of a kink are considered, and the main characteristics of the kink, the rest energy (E0), the rest mass (m0), the size (d) and the sound velocity (C0), are calculated. With the help of the energetic method, the kink velocity (υ), the path (S), and the lifetime (τ) are also obtained.

  4. Structural and theoretical study of 1-[1-oxo-3-phenyl-(2-benzosulfonamide)-propyl amido]-anthracene-9,10-dione as an i-motif inhibitor

    NASA Astrophysics Data System (ADS)

    Vatsal, Manu; Devi, Vandna; Awasthi, Pamita

    2018-04-01

    1-[1-Oxo-3-phenyl-(2-benzosulfonamide)-propyl amido]-anthracene-9,10-dione (BPAQ), an analogue of the anthracenedione class of antibiotics, has been synthesized. To characterize its molecular functional groups, FT-IR and FT-Raman spectra were recorded and vibrational frequencies were assigned accordingly. The optimized geometrical parameters, vibrational assignments, chemical shifts and thermodynamic properties of the title compound were computed by density functional theory (DFT) calculations with the 6-31G(d,p) basis set. The calculated harmonic vibrational frequencies of the molecule were then compared with the experimental FT-IR and Raman spectra. The gauge-independent atomic orbital (GIAO) method was used to determine the proton (1H) and carbon (13C) nuclear magnetic resonance (NMR) spectra of the molecule. Molecular parameters were calculated along with a periodic boundary conditions (PBC) analysis, supported by X-ray diffraction studies. Frontier molecular orbital (HOMO, LUMO) analysis describes the charge distribution and stability of the molecule and indicates that nucleophilic substitution is preferred; Mulliken charge analysis confirms the same. Furthermore, the title compound showed inhibitory action at d(TCCCCC), an intermolecular i-motif sequence, and a molecular docking study supported the inhibitory activity of the compound at this junction.

  5. Hidden symmetries in N-layer dielectric stacks

    NASA Astrophysics Data System (ADS)

    Liu, Haihao; Shoufie Ukhtary, M.; Saito, Riichiro

    2017-11-01

    The optical properties of a multilayer system with arbitrary N layers of dielectric media are investigated. Each layer is one of two dielectric media, with a thickness of one-quarter the wavelength of light in that medium, corresponding to a central frequency f0. Using the transfer matrix method, the transmittance T is calculated for all possible 2^N sequences for small N. Unexpectedly, it is found that instead of 2^N different values of T at f0 (T0), there are only (N/2 + 1) discrete values of T0 for even N, and (N + 1) for odd N. We explain this high degeneracy in T0 values by finding symmetry operations on the sequences that do not change T0. Analytical formulae were derived for the T0 values and their degeneracies as functions of N and an integer parameter for each sequence we call ‘charge’. Additionally, the bandwidth at f0 and the filter response of the transmission spectra are investigated, revealing asymptotic behavior at large N.
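
    The transmittance calculation itself is a standard transfer-matrix computation; the sketch below reproduces the quarter-wave setting at f0 for all 2^N binary sequences (the refractive indices and the normal-incidence formulation are illustrative assumptions, not values from the paper).

        import numpy as np

        def layer_matrix(n, delta):
            # Characteristic matrix of one dielectric layer at normal incidence.
            return np.array([[np.cos(delta), 1j * np.sin(delta) / n],
                             [1j * n * np.sin(delta), np.cos(delta)]])

        def transmittance(sequence, n_a=1.5, n_b=2.5, n_in=1.0, n_out=1.0):
            # T at the central frequency f0, where every quarter-wave layer
            # has phase thickness delta = pi/2; `sequence` is e.g. "ABBA".
            M = np.eye(2, dtype=complex)
            for layer in sequence:
                M = M @ layer_matrix(n_a if layer == "A" else n_b, np.pi / 2)
            denom = (n_in * M[0, 0] + n_in * n_out * M[0, 1]
                     + M[1, 0] + n_out * M[1, 1])
            return 4 * n_in * n_out / abs(denom) ** 2

        # For N = 4, the 2^N = 16 sequences collapse to N/2 + 1 = 3 distinct T0.
        values = {round(transmittance(f"{i:04b}".replace("0", "A").replace("1", "B")), 6)
                  for i in range(16)}
        print(sorted(values))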

  6. Static electric polarizabilities and first hyperpolarizabilities of molecular ions RgH + (Rg = He, Ne, Ar, Kr, Xe): ab initio study

    NASA Astrophysics Data System (ADS)

    Cukras, Janusz; Antušek, Andrej; Holka, Filip; Sadlej, Joanna

    2009-06-01

    Extensive ab initio calculations of the static electric properties of molecular ions of general formula RgH+ (Rg = He, Ne, Ar, Kr, Xe), involving the finite field method and the coupled cluster CCSD(T) approach, have been performed. Relativistic effects were taken into account by the Douglas-Kroll-Hess approximation. The numerical stability and reliability of the calculated values have been tested using the systematic sequence of Dunning's cc-pVXZ-DK and ANO-RCC-VQZP basis sets. The influence of the ZPE and the pure vibrational contribution is discussed. The component αzz shows an increasing trend along the RgH+ series, while the relativistic effect on αzz leads to a small increase of this molecular parameter.

  7. Electrochemical Behavior of Sulfur in Aqueous Alkaline Solutions

    NASA Astrophysics Data System (ADS)

    Mamyrbekova, Aigul; Mamitova, A. D.; Mamyrbekova, Aizhan

    2018-03-01

    The kinetics and mechanism of the electrode oxidation-reduction of sulfur on an electrically conductive sulfur-graphite electrode in an alkaline solution were studied by the potentiodynamic method. To examine the mechanism of electrode processes occurring during AC polarization on a sulfur-graphite electrode, cyclic polarization in both directions and anodic polarization curves were recorded. The kinetic parameters: charge transfer coefficients (α), diffusion coefficients (D), heterogeneous rate constants of the electrode process (ks), and effective activation energies of the process (Ea), were calculated from the results of the polarization measurements. An analysis of the results and the calculated kinetic parameters of the electrode processes showed that discharge ionization of sulfur in alkaline solutions occurs as a sequence of two stages and is a quasireversible process.

  8. modlAMP: Python for antimicrobial peptides.

    PubMed

    Müller, Alex T; Gabernet, Gisela; Hiss, Jan A; Schneider, Gisbert

    2017-09-01

    We have implemented the molecular design laboratory's antimicrobial peptides package (modlAMP), a Python-based software package for the design, classification and visual representation of peptide data. modlAMP offers functions for molecular descriptor calculation and the retrieval of amino acid sequences from public or local sequence databases, and provides instant access to precompiled datasets for machine learning. The package also contains methods for the analysis and representation of circular dichroism spectra. The modlAMP Python package is available under the BSD license from URL http://doi.org/10.5905/ethz-1007-72 or via pip from the Python Package Index (PyPI). gisbert.schneider@pharma.ethz.ch. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  9. Binocular video ophthalmoscope for simultaneous recording of sequences of the human retina to compare dynamic parameters

    NASA Astrophysics Data System (ADS)

    Tornow, Ralf P.; Milczarek, Aleksandra; Odstrcilik, Jan; Kolar, Radim

    2017-07-01

    A parallel video ophthalmoscope was developed to acquire short video sequences (25 fps, 250 frames) of both eyes simultaneously with exact synchronization. Video sequences were registered off-line to compensate for eye movements. From registered video sequences dynamic parameters like cardiac cycle induced reflection changes and eye movements can be calculated and compared between eyes.

  10. Fast selection of miRNA candidates based on large-scale pre-computed MFE sets of randomized sequences.

    PubMed

    Warris, Sven; Boymans, Sander; Muiser, Iwe; Noback, Michiel; Krijnen, Wim; Nap, Jan-Peter

    2014-01-13

    Small RNAs are important regulators of genome function, yet their prediction in genomes is still a major computational challenge. Statistical analyses of pre-miRNA sequences indicated that their 2D structure tends to have a minimal free energy (MFE) significantly lower than the MFE values of equivalently randomized sequences with the same nucleotide composition, in contrast to other classes of non-coding RNA. The computation of many MFEs is, however, too intensive to allow for genome-wide screenings. Using a local grid infrastructure, MFE distributions of random sequences were pre-calculated on a large scale. These distributions follow a normal distribution and can be used to determine the MFE distribution for any given sequence composition by interpolation, allowing on-the-fly calculation of the normal distribution for any candidate sequence composition. The speedup achieved makes genome-wide screening with this characteristic of a pre-miRNA sequence practical. Although this property alone is not sufficiently discriminative to distinguish miRNAs from other sequences, the MFE-based P-value should be added to the parameters of choice to be included in the selection of potential miRNA candidates for experimental verification.
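
    A minimal sketch of the resulting selection step: given precomputed normal parameters of the MFE distribution for randomized sequences of a given composition, the observed MFE of a candidate is converted into a z-score and P-value. The table and its keying by GC fraction alone are placeholders, not the authors' precomputed sets.

        import math

        # Hypothetical precomputed (mean, sd) of MFE for random 100-nt
        # sequences, keyed by GC fraction; the real sets are also keyed
        # by length and full nucleotide composition.
        MFE_TABLE = {0.4: (-22.0, 4.0), 0.5: (-26.0, 4.2), 0.6: (-31.0, 4.5)}

        def normal_params(gc):
            # Linear interpolation between tabulated compositions.
            lo = max(g for g in MFE_TABLE if g <= gc)
            hi = min(g for g in MFE_TABLE if g >= gc)
            if lo == hi:
                return MFE_TABLE[lo]
            w = (gc - lo) / (hi - lo)
            (m1, s1), (m2, s2) = MFE_TABLE[lo], MFE_TABLE[hi]
            return m1 + w * (m2 - m1), s1 + w * (s2 - s1)

        def mfe_p_value(mfe, gc):
            # P(MFE_random <= observed MFE) under the interpolated normal.
            mean, sd = normal_params(gc)
            z = (mfe - mean) / sd
            return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

        print(mfe_p_value(-38.0, 0.55))  # low P suggests a miRNA-like fold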

  11. An improved parallel fuzzy connected image segmentation method based on CUDA.

    PubMed

    Wang, Liansheng; Li, Dong; Huang, Shaohui

    2016-05-12

    The fuzzy connectedness method (FC) is an effective method for extracting fuzzy objects from medical images. However, when FC is applied to large medical image datasets, its running time becomes very long. Therefore, a parallel CUDA version of FC (CUDA-kFOE) was proposed by Ying et al. to accelerate the original FC. Unfortunately, CUDA-kFOE does not consider the edges between GPU blocks, which causes miscalculation of edge points. In this paper, an improved algorithm is proposed that adds a correction step on the edge points, which greatly enhances the calculation accuracy. The improved method is iterative: in the first iteration, the affinity computation strategy is changed and a look-up table is employed to reduce memory use; in the second iteration, the voxels miscalculated because of asynchronism are updated again. Three CT sequences of the hepatic vasculature, of different sizes, were used in experiments with three different seeds. An NVIDIA Tesla C2075 was used to evaluate the improved method on these three datasets. Experimental results show that the improved algorithm achieves faster segmentation than the CPU version and higher accuracy than CUDA-kFOE. The calculation results were consistent with the CPU version, which demonstrates that the method corrects the edge point calculation error of the original CUDA-kFOE, at a comparable time cost and with fewer errors. In the future, we will focus on automatic seed acquisition and automatic processing.

  12. MsLDR-creator: a web service to design msLDR assays.

    PubMed

    Bormann, Felix; Dahl, Andreas; Sers, Christine

    2012-03-01

    MsLDR-creator is a free web service for designing assays for the new DNA methylation detection method msLDR. The service provides the user with all necessary information about the oligonucleotides required for the measurement of a given CpG within a sequence of interest. The parameters are calculated by the nearest-neighbour approach to achieve optimal behaviour during the experimental procedure. In addition, to guarantee a good start with msLDR, further information, such as protocols and hints and tricks, is provided.

  13. Exact calculation of distributions on integers, with application to sequence alignment.

    PubMed

    Newberg, Lee A; Lawrence, Charles E

    2009-01-01

    Computational biology is replete with high-dimensional discrete prediction and inference problems. Dynamic programming recursions can be applied to several of the most important of these, including sequence alignment, RNA secondary-structure prediction, phylogenetic inference, and motif finding. In these problems, attention is frequently focused on some scalar quantity of interest, a score, such as an alignment score or the free energy of an RNA secondary structure. In many cases, score is naturally defined on integers, such as a count of the number of pairing differences between two sequence alignments, or else an integer score has been adopted for computational reasons, such as in the test of significance of motif scores. The probability distribution of the score under an appropriate probabilistic model is of interest, such as in tests of significance of motif scores, or in calculation of Bayesian confidence limits around an alignment. Here we present three algorithms for calculating the exact distribution of a score of this type; then, in the context of pairwise local sequence alignments, we apply the approach so as to find the alignment score distribution and Bayesian confidence limits.
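
    The common core of such algorithms, propagating an exact probability distribution over integer scores through a recursion, can be sketched for the simplest case, a sum of independent integer-valued steps; the paper's algorithms extend this idea to alignment recursions.

        from collections import defaultdict

        def exact_score_distribution(step_pmfs):
            # Exact distribution of a sum of independent integer scores.
            # step_pmfs: list of dicts mapping integer score -> probability.
            dist = {0: 1.0}
            for pmf in step_pmfs:
                new = defaultdict(float)
                for total, p in dist.items():
                    for score, q in pmf.items():
                        new[total + score] += p * q  # DP convolution step
                dist = dict(new)
            return dist

        # Score +1 with probability 0.25, -1 with probability 0.75,
        # accumulated over three independent positions.
        pmf = {+1: 0.25, -1: 0.75}
        print(exact_score_distribution([pmf] * 3))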

  14. Translational resistivity/conductivity of coding sequences during exponential growth of Escherichia coli.

    PubMed

    Takai, Kazuyuki

    2017-01-21

    Codon adaptation index (CAI) has been widely used for predicting the expression of recombinant genes in Escherichia coli and other organisms. However, CAI has no mechanistic basis that rationalizes its application to the estimation of translational efficiency. Here, I propose a model for considering how codon usage is related to the level of expression during exponential growth of bacteria. In this model, translation of a gene is considered as an analog of electric current, and an analog of electric resistance corresponding to each gene is considered. "Translational resistance" is dependent on the steady-state concentration and the sequence of the mRNA species, and "translational resistivity" is dependent only on the mRNA sequence. The latter is the sum of two parts: one is the resistivity for the elongation reaction (coding sequence resistivity), and the other comes from all of the other steps of the decoding reaction. This electric circuit model clearly shows that some conditions should be met for the codon composition of a coding sequence to correlate well with its expression level. In addition, I calculated the relative frequency of each of the 61 sense codon triplets translated during exponential growth of E. coli from a proteomic dataset covering over 2600 proteins. A tentative method for estimating relative coding sequence resistivity based on these data is presented. Copyright © 2016. Published by Elsevier Ltd.
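
    The circuit analogy can be made concrete with a loose sketch: if each codon contributes a resistivity, the coding sequence resistivity is their sum, and the translation rate behaves like current = driving force / resistance, with resistance scaled down by mRNA concentration. All numbers below are made up for illustration; the paper estimates its quantities from proteomic codon frequencies.

        # Illustrative per-codon resistivities (arbitrary units); the paper
        # derives its estimates from codon frequencies in a proteomic dataset.
        CODON_RESISTIVITY = {"AAA": 1.0, "GAA": 1.1, "CUA": 3.5, "AGG": 4.0}

        def coding_sequence_resistivity(cds):
            # Sum of per-codon resistivities over the coding sequence.
            codons = (cds[i:i + 3] for i in range(0, len(cds) - 2, 3))
            return sum(CODON_RESISTIVITY.get(c, 2.0) for c in codons)

        def translation_rate(cds, mrna_conc, other_res=5.0, driving_force=100.0):
            # Ohm's-law analogue: rate = force / resistance, where resistance
            # combines coding resistivity with the resistivity of all other
            # decoding steps, divided by the mRNA concentration.
            resistance = (coding_sequence_resistivity(cds) + other_res) / mrna_conc
            return driving_force / resistance

        print(translation_rate("AAAGAAAGG", mrna_conc=2.0))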

  15. The Stability Analysis Method of the Cohesive Granular Slope on the Basis of Graph Theory

    PubMed Central

    Guan, Yanpeng; Liu, Xiaoli; Wang, Enzhi; Wang, Sijing

    2017-01-01

    This paper presents a method to calculate the progressive failure of cohesive-frictional granular geomaterial and the spatial distribution of the stability of a cohesive granular slope. The methodology has two parts: a characterization method for macro-contacts and the analysis of slope stability. Based on graph theory, vertexes, edges and edge sequences are abstracted to characterize the voids, the particle contacts and the macro-contacts, respectively, bridging the gap between the mesoscopic and macro scales of granular materials. This characterization is used to extract a graph from a granular slope and characterize the macro sliding surface; the weighted graph is then analyzed to calculate the slope safety factor. Each edge has three weights, E1, E2 and E3, representing the sliding moment, the anti-sliding moment and the braking index of the contact bond, respectively. The safety factor of the slope is calculated by presupposing a certain number of sliding routes, repeatedly reducing weight E3, and counting the mesoscopic failures of the edges. Because it analyzes the slope from a mesoscopic perspective, the method can present more detail of the mesoscopic properties of the granular slope. At the macro scale, the spatial distribution of the stability of the granular slope is in agreement with the theoretical solution. PMID:28772596
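
    A greatly simplified sketch of the final bookkeeping: if a candidate sliding route is a list of edges, each carrying the three weights E1, E2 and E3, one crude safety factor for the route is the ratio of total resisting to total sliding moments. The actual method reduces E3 iteratively and counts mesoscopic edge failures; the weights below are placeholders.

        def route_safety_factor(route):
            # route: list of (E1, E2, E3) tuples per edge, where E1 is the
            # sliding moment, E2 the anti-sliding moment and E3 the braking
            # index of the contact bond.
            driving = sum(e1 for e1, _, _ in route)
            resisting = sum(e2 + e3 for _, e2, e3 in route)
            return resisting / driving

        # A hypothetical three-edge sliding route through the slope graph.
        route = [(10.0, 8.0, 3.0), (12.0, 9.0, 2.5), (8.0, 7.0, 2.0)]
        print(route_safety_factor(route))  # > 1 suggests the route holds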

  16. Monte Carlo Method for Determining Earthquake Recurrence Parameters from Short Paleoseismic Catalogs: Example Calculations for California

    USGS Publications Warehouse

    Parsons, Tom

    2008-01-01

    Paleoearthquake observations often lack enough events at a given site to directly define a probability density function (PDF) for earthquake recurrence. Sites with fewer than 10-15 intervals do not provide enough information to reliably determine the shape of the PDF using standard maximum-likelihood techniques [e.g., Ellsworth et al., 1999]. In this paper I present a method that attempts to fit wide ranges of distribution parameters to short paleoseismic series. From repeated Monte Carlo draws, it becomes possible to quantitatively estimate most likely recurrence PDF parameters, and a ranked distribution of parameters is returned that can be used to assess uncertainties in hazard calculations. In tests on short synthetic earthquake series, the method gives results that cluster around the mean of the input distribution, whereas maximum likelihood methods return the sample means [e.g., NIST/SEMATECH, 2006]. For short series (fewer than 10 intervals), sample means tend to reflect the median of an asymmetric recurrence distribution, possibly leading to an overestimate of the hazard should they be used in probability calculations. Therefore a Monte Carlo approach may be useful for assessing recurrence from limited paleoearthquake records. Further, the degree of functional dependence among parameters like mean recurrence interval and coefficient of variation can be established. The method is described for use with time-independent and time-dependent PDFs, and results from 19 paleoseismic sequences on strike-slip faults throughout the state of California are given.
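
    The approach can be sketched as repeated random draws: sample candidate recurrence parameters, score each against the short observed interval series by its likelihood, and keep a ranked list. The lognormal form and prior ranges below are illustrative choices, not the paper's.

        import math, random

        def lognormal_pdf(x, mu, sigma):
            # Density of a lognormal recurrence-interval distribution.
            return math.exp(-(math.log(x) - mu) ** 2 / (2 * sigma ** 2)) \
                / (x * sigma * math.sqrt(2 * math.pi))

        def monte_carlo_fit(intervals, n_draws=50_000, seed=0):
            # Draw (mu, sigma) pairs at random and rank them by the
            # likelihood of the observed paleoseismic interval series.
            rng = random.Random(seed)
            ranked = []
            for _ in range(n_draws):
                mu = rng.uniform(math.log(50), math.log(1000))  # interval prior
                sigma = rng.uniform(0.1, 1.5)                   # variability prior
                like = math.prod(lognormal_pdf(t, mu, sigma) for t in intervals)
                ranked.append((like, mu, sigma))
            ranked.sort(reverse=True)
            return ranked[:10]  # the most likely parameter sets

        intervals = [140.0, 95.0, 210.0, 160.0]  # observed intervals, years
        best = monte_carlo_fit(intervals)[0]
        print(math.exp(best[1]), best[2])  # median interval (yr) and sigma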

  17. HIV-1 pol mutation frequency by subtype and treatment experience

    PubMed Central

    Rhee, Soo-Yon; Kantor, Rami; Katzenstein, David A.; Camacho, Ricardo; Morris, Lynn; Sirivichayakul, Sunee; Jorgensen, Louise; Brigido, Luis F.; Schapiro, Jonathan M.; Shafer, Robert W.

    2008-01-01

    Objective HIVseq was developed in 2000 to make published data on the frequency of HIV-1 group M protease and reverse transcriptase (RT) mutations available in real time to laboratories and researchers sequencing these genes. Because most published protease and RT sequences belonged to subtype B, the initial version of HIVseq was based on this subtype. As additional non-B sequences from persons with well-characterized antiretroviral treatment histories have become available, the program has been extended to subtypes A, C, D, F, G, CRF01, and CRF02. Methods The latest frequency of each protease and RT mutation according to subtype and drug-class exposure was calculated using published sequences in the Stanford HIV RT and Protease Sequence Database. Each mutation was hyperlinked to published reports of viruses containing the mutation. Results As of September 2005, the mean number of protease sequences per non-B subtype was 534 from protease inhibitor-naive persons and 133 from protease inhibitor-treated persons, representing 13.2% and 2.3%, respectively, of the data available for subtype B. The mean number of RT sequences per non-B subtype was 373 from RT inhibitor-naive persons and 288 from RT inhibitor-treated persons, representing 17.9% and 3.8%, respectively, of the data available for subtype B. Conclusions HIVseq allows users to examine protease and RT mutations within the context of previously published sequences of these genes. The publication of additional non-B protease and RT sequences from persons with well-characterized treatment histories, however, will be required to perform the same types of analysis possible with the much larger number of subtype B sequences. PMID:16514293

  18. Study of base pair mutations in proline-rich homeodomain (PRH)-DNA complexes using molecular dynamics.

    PubMed

    Jalili, Seifollah; Karami, Leila; Schofield, Jeremy

    2013-06-01

    Proline-rich homeodomain (PRH) is a regulatory protein controlling transcription and gene expression by binding to specific DNA sequences, especially the sequence 5'-TAATNN-3'. The impact of base pair mutations on binding between the PRH protein and DNA is investigated using molecular dynamics and free energy simulations to identify DNA sequences that form stable complexes with PRH. Three 20-ns molecular dynamics simulations (of the PRH-TAATTG, PRH-TAATTA and PRH-TAATGG complexes) in explicit solvent water were performed to investigate the three complexes structurally. Structural analysis shows that the native TAATTG sequence forms a complex that is more stable than the complexes with base pair mutations. It is also observed that upon mutation, the number and occupancy of the direct and water-mediated hydrogen bonds decrease. Free energy calculations performed with the thermodynamic integration method predict relative binding free energies of 0.64 and 2 kcal/mol for the GC to AT and TA to GC mutations, respectively, suggesting that among the three DNA sequences, the PRH-TAATTG complex is more stable than the two mutated complexes. In addition, it is demonstrated that the stability of the PRH-TAATTA complex is greater than that of the PRH-TAATGG complex.
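
    Thermodynamic integration estimates such a relative free energy by integrating the ensemble average of dU/dλ over the alchemical coupling parameter λ. A minimal numerical sketch with made-up window averages (not the paper's data):

        # <dU/dlambda> window averages in kcal/mol (placeholder values).
        lambdas = [0.0, 0.25, 0.5, 0.75, 1.0]
        du_dl = [12.4, 7.1, 1.8, -2.9, -6.0]

        def thermodynamic_integration(lams, means):
            # Trapezoidal integration of <dU/dlambda> over lambda gives
            # the free energy difference between the two end states.
            return sum(0.5 * (means[i] + means[i + 1]) * (lams[i + 1] - lams[i])
                       for i in range(len(lams) - 1))

        print(thermodynamic_integration(lambdas, du_dl))  # delta-G, kcal/mol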

  19. Extended RF shimming: Sequence-level parallel transmission optimization applied to steady-state free precession MRI of the heart.

    PubMed

    Beqiri, Arian; Price, Anthony N; Padormo, Francesco; Hajnal, Joseph V; Malik, Shaihan J

    2017-06-01

    Cardiac magnetic resonance imaging (MRI) at high field presents challenges because of the high specific absorption rate and significant transmit field (B1+) inhomogeneities. Parallel transmission MRI offers the ability to correct for both issues at the level of individual radiofrequency (RF) pulses, but must operate within strict hardware and safety constraints. The constraints are themselves affected by sequence parameters, such as the RF pulse duration and TR, meaning that an overall optimal operating point exists for a given sequence. This work seeks to obtain optimal performance by performing a 'sequence-level' optimization in which pulse sequence parameters are included as part of an RF shimming calculation. The method is applied to balanced steady-state free precession cardiac MRI with the objective of minimizing TR, hence reducing the imaging duration. Results are demonstrated using an eight-channel parallel transmit system operating at 3 T, with an in vivo study carried out on seven male subjects of varying body mass index (BMI). Compared with single-channel operation, a mean-squared-error shimming approach leads to reduced imaging durations of 32 ± 3% with simultaneous improvement in flip angle homogeneity of 32 ± 8% within the myocardium. © 2017 The Authors. NMR in Biomedicine published by John Wiley & Sons Ltd.

  1. The 2016 Mihoub (north-central Algeria) earthquake sequence: Seismological and tectonic aspects

    NASA Astrophysics Data System (ADS)

    Khelif, M. F.; Yelles-Chaouche, A.; Benaissa, Z.; Semmane, F.; Beldjoudi, H.; Haned, A.; Issaadi, A.; Chami, A.; Chimouni, R.; Harbi, A.; Maouche, S.; Dabbouz, G.; Aidi, C.; Kherroubi, A.

    2018-06-01

    On 28 May 2016 at 23:54 (UTC), an Mw5.4 earthquake occurred in Mihoub village, Algeria, 60 km southeast of Algiers. This earthquake was the largest event in a sequence recorded from 10 April to 15 July 2016. In addition to the permanent national network, a temporary network was installed in the epicentral region after this shock. Recorded event locations allow us to give a general overview of the sequence and reveal the existence of two main fault segments. The first segment, on which the first event in the sequence was located, is near-vertical and trends E-W. The second fault plane, on which the largest event of the sequence was located, dips to the southeast and strikes NE-SW. A total of 46 well-constrained focal mechanisms were calculated. The events located on the E-W-striking fault segment show mainly right-lateral strike-slip (strike N70°E, dip 77° to the SSE, rake 150°). The events located on the NE-SW-striking segment show mainly reverse faulting (strike N60°E, dip 70° to the SE, rake 130°). We calculated the static stress change caused by the first event (Md4.9) of the sequence; the result shows that the fault plane of the largest event in the sequence (Mw5.4) and most of the aftershocks occurred within an area of increased Coulomb stress. Moreover, using the focal mechanisms calculated in this work, we estimated the orientations of the main axes of the local stress tensor ellipsoid. The results confirm previous findings that the general stress field in this area shows orientations aligned NNW-SSE to NW-SE. The 2016 Mihoub earthquake sequence study thus improves our understanding of seismic hazard in north-central Algeria.

  2. Applicability of the quantification of genetically modified organisms to foods processed from maize and soy.

    PubMed

    Yoshimura, Tomoaki; Kuribara, Hideo; Matsuoka, Takeshi; Kodama, Takashi; Iida, Mayu; Watanabe, Takahiro; Akiyama, Hiroshi; Maitani, Tamio; Furui, Satoshi; Hino, Akihiro

    2005-03-23

    The applicability of quantification of genetically modified (GM) maize and soy in processed foods was investigated using heat treatment processing models. The detection methods were based on real-time quantitative polymerase chain reaction (PCR) analysis. Ground seeds of insect-resistant GM maize (MON810) and glyphosate-tolerant Roundup Ready (RR) soy were dissolved in water and heat treated by autoclaving for various time intervals. The calculated copy numbers of the recombinant and taxon-specific deoxyribonucleic acid (DNA) sequences in the extracted DNA solution were found to decrease with time, and this decrease was influenced by the size of the PCR amplicon. The conversion factor (Cf), which is the ratio of the recombinant DNA sequence to the taxon-specific DNA sequence and is used as a constant for calculating GM% for each event, tended to be stable when the PCR products of the two DNA sequences were nearly equal in size. The results suggested that the size of the PCR product plays a key role in the quantification of GM organisms in processed foods. The Cf of the endosperm (3n) is believed to be influenced by whether the GM trait originated from a paternal or maternal source. The embryos and endosperms were separated from the F1 generation seeds of five GM maize events, and their Cf values were measured. In the paternal GM events, the endosperm Cf was lower than that of the embryo, whereas in the maternal events the embryo Cf was lower than that of the endosperm. These results demonstrate the difficulties encountered in the determination of GM% in maize grains (F2 generation) and in foods processed from maize and soy.
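
    The GM% calculation this record relies on is a ratio of measured copy numbers corrected by the event-specific conversion factor. A minimal arithmetic sketch (the copy numbers and Cf value are illustrative):

        def gm_percent(recombinant_copies, taxon_copies, cf):
            # GM% = (recombinant / taxon-specific copy number) / Cf * 100,
            # where Cf is the ratio measured in pure GM seed for the event.
            return (recombinant_copies / taxon_copies) / cf * 100.0

        # E.g. 1500 recombinant vs 60000 taxon-specific copies, Cf = 0.5.
        print(gm_percent(1500, 60000, cf=0.5))  # -> 5.0 % GM content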

  3. Wavelengths and energy levels for the Zn I isoelectronic sequence Ga^1+ through Xe^24+

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Seely, J.F.; Bar-Shalom, A.

    Calculated and experimentally determined transition energies were compared for the Zn I isoelectronic sequence for the elements with atomic numbers Z = 31-54. Using the Hebrew University Lawrence Livermore Atomic Code (HULLAC), the excitation energies were calculated for the 109 levels belonging to the lowest 16 configurations of the types 4l4l′ and 4l5l′. The analysis of the energy-level structure along the isoelectronic sequence accounted for a number of avoided level crossings. The differences between the calculated and experimental transition energies were determined for 24 transitions among the 4s^2, 4s4p, 4p^2, 4s4d, and 4s4f configurations. Wavelengths were predicted for previously unobserved transitions in the highly charged ions. 15 refs., 4 figs., 3 tabs.

  4. Improved hybrid optimization algorithm for 3D protein structure prediction.

    PubMed

    Zhou, Changjun; Hou, Caixia; Wei, Xiaopeng; Zhang, Qiang

    2014-07-01

    A new improved hybrid optimization algorithm, the PGATS algorithm, based on the toy off-lattice model, is presented for three-dimensional protein structure prediction problems. The algorithm combines particle swarm optimization (PSO), a genetic algorithm (GA), and tabu search (TS), together with several improvement strategies: a stochastic disturbance factor is introduced into the particle swarm optimization to improve its search ability; the crossover and mutation operations of the genetic algorithm are replaced by a random linear method; and the tabu search is improved by appending a mutation operator. Through this combination of strategies and algorithms, protein structure prediction (PSP) in a 3D off-lattice model is achieved. The PSP problem is NP-hard, but it can be formulated as a global optimization problem with multiple extrema and multiple parameters; this is the theoretical principle of the hybrid optimization algorithm proposed in this paper. The algorithm combines local and global search, which overcomes the shortcomings of any single algorithm and gives full play to the advantages of each. The method is validated on the universal standard Fibonacci sequences and on real protein sequences. Experiments show that the proposed method outperforms single algorithms in the accuracy of the calculated protein sequence energy value, proving it an effective way to predict the structure of proteins.

  5. Real life identification of partially occluded weapons in video frames

    NASA Astrophysics Data System (ADS)

    Hempelmann, Christian F.; Arslan, Abdullah N.; Attardo, Salvatore; Blount, Grady P.; Sirakov, Nikolay M.

    2016-05-01

    We empirically test the capacity of an improved system to identify not just images of individual guns, but partially occluded guns and their parts appearing in a video frame. This approach combines low-level geometrical information gleaned from the visual images and high-level semantic information stored in an ontology enriched with meronymic part-whole relations. The main improvements of the system are handling occlusion, new algorithms, and an emerging meronomy. Although well known and commonly deployed in ontologies, actual meronomies need to be engineered and populated with unique solutions; here, this includes adjacency of weapon parts and the essentiality of parts to the threat of, and their diagnosticity for, a weapon. In this study, video sequences are processed frame by frame. The extraction method separates colors and removes the background; image subtraction of the next frame then determines moving targets, before morphological closing is applied to the current frame in order to clean up noise and fill gaps. Next, the method calculates the boundary coordinates of each object and uses them to create a finite numerical sequence as a descriptor. Parts identification is done by cyclic sequence alignment and matching against the nodes of the weapons ontology. From the identified parts, the most likely weapon is determined using the weapon ontology.

  6. Predicting the helix packing of globular proteins by self-correcting distance geometry.

    PubMed

    Mumenthaler, C; Braun, W

    1995-05-01

    A new self-correcting distance geometry method for predicting the three-dimensional structure of small globular proteins was assessed with a test set of 8 helical proteins. With the knowledge of the amino acid sequence and the helical segments, our completely automated method calculated the correct backbone topology of six proteins. The accuracy of the predicted structures ranged from 2.3 Å to 3.1 Å for the helical segments compared to the experimentally determined structures. For two proteins, the predicted constraints were not restrictive enough to yield a conclusive prediction. The method can be applied to all small globular proteins, provided the secondary structure is known from NMR analysis or can be predicted with high reliability.

  7. Accuracy and Reproducibility of Adipose Tissue Measurements in Young Infants by Whole Body Magnetic Resonance Imaging

    PubMed Central

    Bauer, Jan Stefan; Noël, Peter Benjamin; Vollhardt, Christiane; Much, Daniela; Degirmenci, Saliha; Brunner, Stefanie; Rummeny, Ernst Josef; Hauner, Hans

    2015-01-01

    Purpose MR might be well suited to obtain reproducible and accurate measures of fat tissues in infants. This study evaluates MR measurements of adipose tissue in young infants in vitro and in vivo. Material and Methods MR images of ten phantoms simulating the subcutaneous fat of an infant’s torso were obtained using a 1.5T MR scanner with and without simulated breathing. Scans consisted of a Cartesian water-suppression turbo spin echo (wsTSE) sequence and a PROPELLER wsTSE sequence. Fat volume was quantified directly and by MR imaging using k-means clustering and threshold-based segmentation procedures to calculate accuracy in vitro. Whole body MR was obtained in sleeping young infants (average age 67±30 days). This study was approved by the local review board. All parents gave written informed consent. To obtain reproducibility in vivo, the Cartesian and PROPELLER wsTSE sequences were repeated in seven and four young infants, respectively. Overall, 21 repetitions were performed for the Cartesian sequence and 13 repetitions for the PROPELLER sequence. Results In vitro accuracy errors depended on the chosen segmentation procedure, ranging from 5.4% to 76%, while the sequence showed no significant influence. Artificial breathing increased the minimal accuracy error to 9.1%. In vivo reproducibility errors for total fat volume of the sleeping infants ranged from 2.6% to 3.4%. Neither segmentation nor sequence significantly influenced reproducibility. Conclusion With both Cartesian and PROPELLER sequences an accurate and reproducible measure of body fat was achieved. Adequate segmentation was mandatory for high accuracy. PMID:25706876

  8. NMR and computational methods applied to the 3- dimensional structure determination of DNA and ligand-DNA complexes in solution

    NASA Astrophysics Data System (ADS)

    Smith, Jarrod Anson

    2D homonuclear 1H NMR methods and restrained molecular dynamics (rMD) calculations have been applied to determining the three-dimensional structures of DNA and minor groove-binding ligand-DNA complexes in solution. The structure of the DNA decamer sequence d(GCGTTAACGC)2 has been solved both with a distance-based rMD protocol and with an NOE relaxation matrix back-calculation-based protocol in order to probe the relative merits of the different refinement methods. In addition, three minor groove binding ligand-DNA complexes have been examined. The solution structure of the oligosaccharide moiety of the antitumor DNA scission agent calicheamicin γ1I has been determined in complex with a decamer duplex containing its high affinity 5'-TCCT-3' binding sequence. The structure of the complex reinforces the belief that the oligosaccharide moiety is responsible for the sequence selective minor-groove binding activity of the agent, and critical intermolecular contacts are revealed. The solution structures of both the (+) and (-) enantiomers of the minor groove binding DNA alkylating agent duocarmycin SA have been determined in covalent complex with the undecamer DNA duplex d(GACTAATTGTC).d(GACAATTAGTC). The results support the proposal that the alkylation activity of the duocarmycin antitumor antibiotics is catalyzed by a binding-induced conformational change in the ligand which activates the cyclopropyl group for reaction with the DNA. Comparisons between the structures of the two enantiomers covalently bound to the same DNA sequence at the same 5'-AATTA-3' site have provided insight into the binding orientation and site selectivity, as well as the relative rates of reactivity of these two agents.

  9. TRX-LOGOS - a graphical tool to demonstrate DNA information content dependent upon backbone dynamics in addition to base sequence.

    PubMed

    Fortin, Connor H; Schulze, Katharina V; Babbitt, Gregory A

    2015-01-01

    It is now widely accepted that DNA sequences defining DNA-protein interactions functionally depend upon local biophysical features of the DNA backbone that are important in defining sites of binding interaction in the genome (e.g. DNA shape, charge and intrinsic dynamics). However, these physical features of the DNA polymer are not directly apparent when analyzing and viewing Shannon information content calculated at single nucleobases in a traditional sequence logo plot. Thus, sequence logo plots are severely limited in that they convey no explicit information regarding the structural dynamics of the DNA backbone, a feature often critical to binding specificity. We present TRX-LOGOS, an R software package and Perl wrapper code that interfaces the JASPAR database for computational regulatory genomics. TRX-LOGOS extends the traditional sequence logo plot to include Shannon information content calculated with regard to the dinucleotide-based BI-BII conformation shifts in phosphate linkages on the DNA backbone, thereby adding a visual measure of intrinsic DNA flexibility that can be critical for many DNA-protein interactions. TRX-LOGOS is available as an R graphics module offered both at SourceForge and as a download supplement to this journal. To demonstrate the general utility of TRX logo plots, we first calculated the information content for 416 Saccharomyces cerevisiae transcription factor binding sites functionally confirmed in the Yeastract database and matched to previously published yeast genomic alignments. We discovered that flanking regions contain significantly higher information content at phosphate linkages than at nucleobases. We also examined broader transcription factor classifications defined by the JASPAR database, and discovered that many general signatures of transcription factor binding are locally more information-rich at the level of DNA backbone dynamics than at the nucleobase sequence. We used TRX-LOGOS in combination with MEGA 6.0 software for molecular evolutionary genetics analysis to visually compare human Forkhead box/FOX protein evolution to its binding site evolution. We also compared the DNA binding signatures of the human TP53 tumor suppressor determined by two different laboratory methods (SELEX and ChIP-seq). Further analysis of the entire yeast genome, center-aligned at the start codon, also revealed a distinct sequence-independent 3 bp periodic pattern in information content, present only in the coding region, and perhaps indicative of the non-random organization of the genetic code. TRX-LOGOS is useful in any situation in which important information content in DNA can be better visualized at the positions of phosphate linkages (i.e. dinucleotides), where the dynamic properties of the DNA backbone function to facilitate DNA-protein interaction.
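
    The per-position quantity behind both the traditional logo and its TRX extension is Shannon information content: R = log2(alphabet size) - H, where H is the observed entropy of the alignment column, with a 4-letter alphabet for nucleobases and a 2-state alphabet for the BI/BII backbone conformations. A minimal sketch of that calculation:

        import math
        from collections import Counter

        def information_content(column, alphabet_size=4):
            # Shannon information of one alignment column:
            # R = log2(|alphabet|) - H, H being the observed entropy.
            counts = Counter(column)
            total = sum(counts.values())
            entropy = -sum((c / total) * math.log2(c / total)
                           for c in counts.values())
            return math.log2(alphabet_size) - entropy

        print(information_content("AAAT"))                        # nucleobases
        print(information_content(["BI", "BI", "BII", "BI"], 2))  # backbone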

  10. Conformational Entropy of Intrinsically Disordered Proteins from Amino Acid Triads

    PubMed Central

    Baruah, Anupaul; Rani, Pooja; Biswas, Parbati

    2015-01-01

    This work quantitatively characterizes intrinsic disorder in proteins in terms of sequence composition and backbone conformational entropy. Analysis of the normalized relative composition of the amino acid triads highlights a distinct boundary between globular and disordered proteins. The conformational entropy is calculated from the dihedral angles of the middle amino acid in the amino acid triad for the conformational ensemble of the globular, partially and completely disordered proteins relative to the non-redundant database. Both Monte Carlo (MC) and Molecular Dynamics (MD) simulations are used to characterize the conformational ensemble of the representative proteins of each group. The results show that the globular proteins span approximately half of the allowed conformational states in the Ramachandran space, while the amino acid triads in disordered proteins sample the entire range of the allowed dihedral angle space following Flory’s isolated-pair hypothesis. Therefore, only the sequence information in terms of the relative amino acid triad composition may be sufficient to predict protein disorder and the backbone conformational entropy, even in the absence of well-defined structure. The predicted entropies are found to agree with those calculated using mutual information expansion and the histogram method. PMID:26138206
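
    The histogram method mentioned at the end estimates backbone conformational entropy by binning the sampled (phi, psi) dihedral pairs and applying S = -k * sum(p * ln p). A minimal sketch, with uniform random angles standing in for MC/MD samples:

        import math, random
        from collections import Counter

        def dihedral_entropy(phi_psi, bins=36):
            # Histogram estimate of conformational entropy (units of k_B)
            # from sampled (phi, psi) dihedral pairs in degrees.
            width = 360.0 / bins
            counts = Counter((int((phi + 180) // width), int((psi + 180) // width))
                             for phi, psi in phi_psi)
            n = len(phi_psi)
            return -sum((c / n) * math.log(c / n) for c in counts.values())

        rng = random.Random(1)
        samples = [(rng.uniform(-180, 180), rng.uniform(-180, 180))
                   for _ in range(10000)]
        # A sampler covering the whole Ramachandran space approaches
        # ln(36 * 36) ~ 7.17; restricted ensembles score lower.
        print(dihedral_entropy(samples))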

  11. Free-energy calculations reveal the subtle differences in the interactions of DNA bases with α-hemolysin.

    PubMed

    Manara, Richard M A; Guy, Andrew T; Wallace, E Jayne; Khalid, Syma

    2015-02-10

    Next generation DNA sequencing methods that utilize protein nanopores have the potential to revolutionize this area of biotechnology. While the technique is underpinned by simple physics, the wild-type protein pores do not have all of the desired properties for efficient and accurate DNA sequencing. Much of the research effort has focused on protein nanopores, such as α-hemolysin from Staphylococcus aureus. However, the speed of DNA translocation has historically been an issue, hampered in part by incomplete knowledge of the energetics of translocation. Here we have utilized atomistic molecular dynamics simulations of nucleotide fragments in order to calculate the potential of mean force (PMF) through α-hemolysin. Our results reveal specific regions within the pore that play a key role in the interaction with DNA. In particular, charged residues such as D127 and K131 provide stabilizing interactions with the anionic DNA and therefore are likely to reduce the speed of translocation. These regions provide rational targets for pore optimization. Furthermore, we show that the energetic contributions to the protein-DNA interactions are a complex combination of electrostatics and short-range interactions, often mediated by water molecules.

  12. Molecular dynamics of 17α- and 21-hydroxy progesterone studied by NMR. Relation between molecule conformation and height of the barrier for methyl group reorientations in steroid compounds

    NASA Astrophysics Data System (ADS)

    Szyczewski, A.; Hołderna-Natkaniec, K.

    2005-01-01

    For the two steroid compounds 17αOH-progesterone and 21OH-progesterone, the activation energies of reorientations of the methyl groups have been determined. Their values, together with the results of quantum chemical calculations, permitted establishment of the sequence of the onset of the methyl group reorientations about the three-fold symmetry axis of the C-C bond. On the basis of the asymmetry parameters, the conformations of the hitherto studied pregnane derivatives and testosterone have been determined. It has been found that the conformation of ring A has a dominant effect on the activation energies of the reorientation of C(19)H3. The reorientation of the methyl group C(18)H3 depends significantly on the conformation of the side chain 17β (torsional angle C(13)-C(17)-C(20)-O(20)) and the distance between C18 and O20. The study has proved that the 1H NMR method, in combination with quantum chemistry calculations and inelastic incoherent neutron scattering (IINS), is effective for prediction of the sequence of the methyl group reorientations about the three-fold symmetry axis.

  13. Development and Application of Quantitative Detection Method for Viral Hemorrhagic Septicemia Virus (VHSV) Genogroup IVa

    PubMed Central

    Kim, Jong-Oh; Kim, Wi-Sik; Kim, Si-Woo; Han, Hyun-Ja; Kim, Jin Woo; Park, Myoung Ae; Oh, Myung-Joo

    2014-01-01

    Viral hemorrhagic septicemia virus (VHSV) is a problematic pathogen in olive flounder (Paralichthys olivaceus) aquaculture farms in Korea. Thus, it is necessary to develop a rapid and accurate diagnostic method to detect this virus. We developed a quantitative RT-PCR (qRT-PCR) method based on the nucleocapsid (N) gene sequence of a Korean VHSV isolate (genogroup IVa). The slope and R2 values of the primer set developed in this study were −0.2928 (96% efficiency) and 0.9979, respectively. Comparison with viral infectivity calculated by the traditional quantification method (TCID50) showed a similar pattern of kinetic changes in vitro and in vivo. The qRT-PCR method reduced detection time compared to that of TCID50, making it a very useful tool for VHSV diagnosis. PMID:24859343
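
    The link between standard-curve slope and amplification efficiency can be made explicit. With Ct regressed on log10 copy number, the usual relation is E = 10^(-1/slope) - 1; the slope quoted above (-0.2928) is consistent with the inverse regression (log10 copies per cycle), for which E = 10^|slope| - 1, about 0.96, matching the stated 96%. A small sketch; the reading of the reported slope is our assumption:

        def efficiency_from_ct_slope(slope):
            # Standard curve with Ct versus log10(copies):
            # E = 10**(-1/slope) - 1.
            return 10 ** (-1.0 / slope) - 1.0

        def efficiency_from_inverse_slope(slope):
            # Inverse regression, log10(copies) per cycle: E = 10**|slope| - 1.
            # Assumed reading of the -0.2928 slope reported in this record.
            return 10 ** abs(slope) - 1.0

        print(efficiency_from_ct_slope(-3.415))        # ~0.96
        print(efficiency_from_inverse_slope(-0.2928))  # ~0.96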

  14. The “curved lead pathway” method to enable a single lead to reach any two intracranial targets

    NASA Astrophysics Data System (ADS)

    Ding, Chen-Yu; Yu, Liang-Hong; Lin, Yuan-Xiang; Chen, Fan; Lin, Zhang-Ya; Kang, De-Zhi

    2017-01-01

    Deep brain stimulation is an effective way to treat movement disorders and a powerful research tool for exploring brain functions. This report proposes a “curved lead pathway” method of lead implantation such that a single lead can reach any two intracranial targets in sequence. A new type of stereotaxic system for implanting a curved lead into the brain of humans/primates was designed; the auxiliary device needed to apply the method in rats/mice was fabricated and verified in rats; and an Excel algorithm for automatically calculating the necessary parameters was implemented. This “curved lead pathway” method of lead implantation may complement the current method, make lead implantation for multiple targets more convenient, and expand the experimental techniques of brain function research.

  15. Protein contact prediction using patterns of correlation.

    PubMed

    Hamilton, Nicholas; Burrage, Kevin; Ragan, Mark A; Huber, Thomas

    2004-09-01

    We describe a new method for using neural networks to predict residue contact pairs in a protein. The main inputs to the neural network are a set of 25 measures of correlated mutation between all pairs of residues in two "windows" of size 5 centered on the residues of interest. While the individual pair-wise correlations are a relatively weak predictor of contact, by training the network on windows of correlation the accuracy of prediction is significantly improved. The neural network is trained on a set of 100 proteins and then tested on a disjoint set of 1033 proteins of known structure. An average predictive accuracy of 21.7% is obtained taking the best L/2 predictions for each protein, where L is the sequence length. Taking the best L/10 predictions gives an average accuracy of 30.7%. The predictor is also tested on a set of 59 proteins from the CASP5 experiment. The accuracy is found to be relatively consistent across different sequence lengths, but to vary widely according to the secondary structure. Predictive accuracy is also found to improve by using multiple sequence alignments containing many sequences to calculate the correlations. Copyright 2004 Wiley-Liss, Inc.
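
    To make the evaluation metric concrete, top-L/k accuracy ranks all residue pairs by predicted score and reports the fraction of the best L/k pairs that are true contacts. A small sketch (not the authors' code):

        import numpy as np

        def top_lk_accuracy(scores, contacts, frac=2):
            """scores, contacts: symmetric (L, L) arrays; contacts holds 0/1.

            Sequence-separation filters used in real benchmarks are omitted."""
            L = scores.shape[0]
            iu = np.triu_indices(L, k=1)            # unique residue pairs
            order = np.argsort(scores[iu])[::-1]    # best predictions first
            top = order[:max(1, L // frac)]
            return float(contacts[iu][top].mean())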

  16. A bimetallic nanocomposite modified genosensor for recognition and determination of thalassemia gene.

    PubMed

    Hamidi-Asl, Ezat; Raoof, Jahan Bakhsh; Naghizadeh, Nahid; Akhavan-Niaki, Haleh; Ojani, Reza; Banihashemi, Ali

    2016-10-01

    The main roles of DNA in the cells are to maintain and properly express genetic information. It is important to have analytical methods capable of fast and sensitive detection of DNA damage. DNA hybridization sensors are well suited for diagnostics and other purposes, including determination of bacteria and viruses. Beta thalassemias (βth) are due to mutations in the β-globin gene. In this study, an electrochemical biosensor which detects the sequences related to the β-globin gene issued from real samples amplified by polymerase chain reaction (PCR) is described for the first time. The biosensor relies on the immobilization of a 20-mer single stranded oligonucleotide (probe) related to the βth sequence on a carbon paste electrode (CPE) modified by 15% silver (Ag) and platinum (Pt) nanoparticles to prepare the bimetallic nanocomposite electrode, and on the hybridization of this oligonucleotide with its complementary sequence (target). The extent of hybridization between the probe and target sequences was shown by using linear sweep voltammetry (LSV) with methylene blue (MB) as the hybridization indicator. The selectivity of the sensor was investigated using PCR samples containing non-complementary oligonucleotides. The detection limit of the biosensor was calculated to be about 470.0 pg/μL. Copyright © 2016 Elsevier B.V. All rights reserved.

  17. Normalization of Complete Genome Characteristics: Application to Evolution from Primitive Organisms to Homo sapiens.

    PubMed

    Sorimachi, Kenji; Okayasu, Teiji; Ohhira, Shuji

    2015-04-01

    Normalized nucleotide and amino acid contents of complete genome sequences can be visualized as radar charts. The shapes of these charts depict the characteristics of an organism's genome. The normalized values calculated from the genome sequence theoretically exclude experimental errors. Further, because normalization is independent of both target size and kind, this procedure is applicable not only to single genes but also to whole genomes, which consist of a huge number of different genes. In this review, we discuss the applications of the normalization of the nucleotide and predicted amino acid contents of complete genomes to the investigation of genome structure and to evolutionary research from primitive organisms to Homo sapiens. Some of the results could never have been obtained from the analysis of individual nucleotide or amino acid sequences but were revealed only after the normalization of nucleotide and amino acid contents was applied to genome research. The discovery that genome structure was homogeneous was obtained only after normalization methods were applied to the nucleotide or predicted amino acid contents of genome sequences. Normalization procedures are also applicable to evolutionary research. Thus, normalization of the contents of whole genomes is a useful procedure that can help to characterize organisms.
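
    To make the normalization concrete: the radar-chart values are compositional fractions, which are independent of genome length. A trivial sketch (hypothetical, not the authors' pipeline):

        from collections import Counter

        def normalized_nucleotide_content(genome):
            """A/T/G/C fractions of a sequence; ambiguous bases are ignored."""
            counts = Counter(genome.upper())
            total = sum(counts[b] for b in "ATGC")
            return {b: counts[b] / total for b in "ATGC"}

        print(normalized_nucleotide_content("ATGCGGCCATAT"))
        # {'A': 0.25, 'T': 0.25, 'G': 0.25, 'C': 0.25}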

  18. Quantitation of fetal DNA fraction in maternal plasma using circulating single molecule amplification and re-sequencing technology (cSMART).

    PubMed

    Song, Yijun; Zhou, Xiya; Huang, Saiqiong; Li, Xiaohong; Qi, Qingwei; Jiang, Yulin; Liu, Yiqian; Ma, Chengcheng; Li, Zhifeng; Xu, Mengnan; Cram, David S; Liu, Juntao

    2016-05-01

    Calculation of the fetal DNA fraction (FF) is important for reliable and accurate noninvasive prenatal testing (NIPT) for fetal genetic abnormalities. The aim of the study was to develop and validate a novel method for FF determination. FF was calculated using the chromosome Y (ChrY) sequence read assay and by circulating single molecule amplification and re-sequencing technology of 76 autosomal SNPs. By Pearson correlation for FF (4.73-22.11%) in 33 male pregnancy samples, the R² coefficient for the 76-SNP versus the ChrY assay was 0.9572 (p<0.001). In addition, the coefficient of variation (CV) of FF measurement by the 76-SNP assay was low (0.15-0.35). As a control, the FF measurement for four non-pregnant plasma samples was virtually zero. In prospective longitudinal studies of 14 women with normal pregnancies, FF generally increased with gestational age. However, in eight women (71%) there was a significant decrease in FF between the first trimester (11-13 weeks) and the second trimester (15-19 weeks), and this was attributable to significant maternal weight gain. The novel 76-SNP cSMART assay has the precision to accurately measure FF in all pregnancies at a detection threshold of 5%. Based on FF trends in individual pregnancies, our results suggest that the end of the first trimester may be a more optimal window for performing NIPT. Copyright © 2016 Elsevier B.V. All rights reserved.

  19. Orebody Modelling for Exploration: The Western Mineralisation, Broken Hill, NSW

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lotfolah Hamedani, Mohammad, E-mail: mlotfham@gmail.com; Plimer, Ian Rutherford; Xu Chaoshui

    2012-09-15

    The Western Mineralisation in the Broken Hill deposit was studied to identify the zonation sequence of lithogeochemical haloes along and across the strike of the orebody. Samples used are from 77 drill holes and the samples were assayed for Pb, Zn, Fe, S, Cu, Ag, Cd, Sb, Bi and As. Variogram analyses were calculated for all the elements and kriging was used to construct the 3D block model. Analysis of cross sections along and across the strike of the orebody shows that Bi and Sb form broader halos around sulphide masses and this suggests that they are pathfinder elements for the Pb and Zn elements of this orebody. The threshold concentrations (minimum anomaly) of the 10 elements were determined using the concentration-area analysis. On east-west vertical cross sections, the values of linear productivity, variability gradient and zonality index were calculated for each element. Based on the maximum zonality index of each element, the sequence of geochemical zonation pattern was determined from top to bottom of the orebody. The result shows that S, Pb, Zn and Cd tend to concentrate in the upper part of the mineralisation whereas Ag, Cu, Bi and As have a tendency to concentrate in the lower part of the mineralised rocks. Also, an empirical product ratio index was developed based on the position of the elements in the zonation sequence. The methods and results of this research are applicable to exploration of similar Zn and Pb sulphide ore deposits.

  20. MISTIC2: comprehensive server to study coevolution in protein families.

    PubMed

    Colell, Eloy A; Iserte, Javier A; Simonetti, Franco L; Marino-Buslje, Cristina

    2018-06-14

    Correlated mutations between residue pairs in evolutionarily related proteins arise from constraints needed to maintain a functional and stable protein. Identifying these inter-related positions narrows down the search for structurally or functionally important sites. MISTIC is a server designed to assist users to calculate covariation in protein families and provide them with an interactive tool to visualize the results. Here, we present MISTIC2, an update to the previous server, that allows users to calculate four covariation methods (MIp, mfDCA, plmDCA and gaussianDCA). The results visualization framework has been reworked for improved performance, compatibility and user experience. It includes a circos representation of the information contained in the alignment, an interactive covariation network, a 3D structure viewer and a sequence logo. Other components provide additional information such as residue annotations, a ROC curve for assessing contact prediction, data tables and different ways of filtering the data and exporting figures. Comparison of different methods is easily done and combining scores is also possible. A newly implemented web service allows users to access MISTIC2 programmatically using an API to calculate covariation and retrieve results. MISTIC2 is available at: https://mistic2.leloir.org.ar.
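
    As a sketch of the simplest of the four methods, mutual information between two alignment columns can be computed directly from residue co-occurrence counts; MIp then subtracts the average-product correction, MIp(a, b) = MI(a, b) − MI(a, ·)·MI(·, b)/⟨MI⟩, to suppress phylogenetic background. A minimal, ungapped illustration (the server's own implementation adds sequence weighting and other refinements):

        import math
        from collections import Counter

        def column_mi(col_a, col_b):
            """Mutual information (nats) between two MSA columns (residue lists)."""
            n = len(col_a)
            pa, pb = Counter(col_a), Counter(col_b)
            mi = 0.0
            for (a, b), c in Counter(zip(col_a, col_b)).items():
                mi += (c / n) * math.log(c * n / (pa[a] * pb[b]))
            return mi

        print(column_mi(list("AAAW"), list("CCCY")))  # ~0.56 nats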

  1. Metric Scale Calculation for Visual Mapping Algorithms

    NASA Astrophysics Data System (ADS)

    Hanel, A.; Mitschke, A.; Boerner, R.; Van Opdenbosch, D.; Hoegner, L.; Brodie, D.; Stilla, U.

    2018-05-01

    Visual SLAM algorithms allow localizing the camera by mapping its environment as a point cloud based on visual cues. To obtain the camera locations in a metric coordinate system, the metric scale of the point cloud has to be known. This contribution describes a method to calculate the metric scale for a point cloud of an indoor environment, like a parking garage, by fusing multiple individual scale values. The individual scale values are calculated from structures and objects with a-priori known metric extension, which can be identified in the unscaled point cloud. Extensions of building structures, like the driving lane or the room height, are derived from density peaks in the point distribution. The extensions of objects, like traffic signs with a known metric size, are derived using projections of their detections in images onto the point cloud. The method is tested with synthetic image sequences of a drive with a front-looking mono camera through a virtual 3D model of a parking garage. It has been shown that each individual scale value either improves the robustness of the fused scale value or reduces its error. The error of the fused scale is comparable to other recent works.
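
    The abstract does not spell out the fusion rule, so as a labeled assumption: one standard way to fuse several noisy scale estimates is an inverse-variance weighted mean, sketched below.

        import numpy as np

        def fuse_scales(scales, variances):
            """Inverse-variance weighted fusion of individual scale estimates."""
            w = 1.0 / np.asarray(variances, dtype=float)
            fused = float(np.sum(w * np.asarray(scales, dtype=float)) / np.sum(w))
            return fused, float(1.0 / np.sum(w))  # fused value and its variance

        # e.g. lane-width, room-height and sign-size estimates of the scale:
        print(fuse_scales([1.02, 0.97, 1.05], [0.010, 0.040, 0.020]))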

  2. Grand canonical electronic density-functional theory: Algorithms and applications to electrochemistry

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sundararaman, Ravishankar; Goddard, III, William A.; Arias, Tomas A.

    First-principles calculations combining density-functional theory and continuum solvation models enable realistic theoretical modeling and design of electrochemical systems. When a reaction proceeds in such systems, the number of electrons in the portion of the system treated quantum mechanically changes continuously, with a balancing charge appearing in the continuum electrolyte. A grand-canonical ensemble of electrons at a chemical potential set by the electrode potential is therefore the ideal description of such systems that directly mimics the experimental condition. We present two distinct algorithms: a self-consistent field method and a direct variational free energy minimization method using auxiliary Hamiltonians (GC-AuxH), to solve the Kohn-Sham equations of electronic density-functional theory directly in the grand canonical ensemble at fixed potential. Both methods substantially improve performance compared to a sequence of conventional fixed-number calculations targeting the desired potential, with the GC-AuxH method additionally exhibiting reliable and smooth exponential convergence of the grand free energy. Lastly, we apply grand-canonical density-functional theory to the under-potential deposition of copper on platinum from chloride-containing electrolytes and show that chloride desorption, not partial copper monolayer formation, is responsible for the second voltammetric peak.

  3. Grand canonical electronic density-functional theory: Algorithms and applications to electrochemistry.

    PubMed

    Sundararaman, Ravishankar; Goddard, William A; Arias, Tomas A

    2017-03-21

    First-principles calculations combining density-functional theory and continuum solvation models enable realistic theoretical modeling and design of electrochemical systems. When a reaction proceeds in such systems, the number of electrons in the portion of the system treated quantum mechanically changes continuously, with a balancing charge appearing in the continuum electrolyte. A grand-canonical ensemble of electrons at a chemical potential set by the electrode potential is therefore the ideal description of such systems that directly mimics the experimental condition. We present two distinct algorithms: a self-consistent field method and a direct variational free energy minimization method using auxiliary Hamiltonians (GC-AuxH), to solve the Kohn-Sham equations of electronic density-functional theory directly in the grand canonical ensemble at fixed potential. Both methods substantially improve performance compared to a sequence of conventional fixed-number calculations targeting the desired potential, with the GC-AuxH method additionally exhibiting reliable and smooth exponential convergence of the grand free energy. Finally, we apply grand-canonical density-functional theory to the under-potential deposition of copper on platinum from chloride-containing electrolytes and show that chloride desorption, not partial copper monolayer formation, is responsible for the second voltammetric peak.

  4. Grand canonical electronic density-functional theory: Algorithms and applications to electrochemistry

    DOE PAGES

    Sundararaman, Ravishankar; Goddard, III, William A.; Arias, Tomas A.

    2017-03-16

    First-principles calculations combining density-functional theory and continuum solvation models enable realistic theoretical modeling and design of electrochemical systems. When a reaction proceeds in such systems, the number of electrons in the portion of the system treated quantum mechanically changes continuously, with a balancing charge appearing in the continuum electrolyte. A grand-canonical ensemble of electrons at a chemical potential set by the electrode potential is therefore the ideal description of such systems that directly mimics the experimental condition. We present two distinct algorithms: a self-consistent field method and a direct variational free energy minimization method using auxiliary Hamiltonians (GC-AuxH), to solve the Kohn-Sham equations of electronic density-functional theory directly in the grand canonical ensemble at fixed potential. Both methods substantially improve performance compared to a sequence of conventional fixed-number calculations targeting the desired potential, with the GC-AuxH method additionally exhibiting reliable and smooth exponential convergence of the grand free energy. Lastly, we apply grand-canonical density-functional theory to the under-potential deposition of copper on platinum from chloride-containing electrolytes and show that chloride desorption, not partial copper monolayer formation, is responsible for the second voltammetric peak.

  5. Sequence Factorial of "g"-Gonal Numbers

    ERIC Educational Resources Information Center

    Asiru, Muniru A.

    2013-01-01

    The gamma function, which has the property of interpolating the factorial whenever the argument is an integer, is a special case (the case "g" = 2) of the general term of the sequence factorial of "g"-gonal numbers. In relation to this special case, a formula for calculating the general term of the sequence factorial of any…

  6. BIOPEP database and other programs for processing bioactive peptide sequences.

    PubMed

    Minkiewicz, Piotr; Dziuba, Jerzy; Iwaniak, Anna; Dziuba, Marta; Darewicz, Małgorzata

    2008-01-01

    This review presents the potential for application of computational tools in peptide science, using the BIOPEP database and program as an example, together with other programs and databases available via the World Wide Web. The BIOPEP application contains a database of biologically active peptide sequences and a program enabling construction of profiles of the potential biological activity of protein fragments, calculation of quantitative descriptors as measures of the value of proteins as potential precursors of bioactive peptides, and prediction of bonds susceptible to hydrolysis by endopeptidases in a protein chain. Other bioactive and allergenic peptide sequence databases are also presented. Programs enabling the construction of binary and multiple alignments between peptide sequences, the construction of sequence motifs attributed to a given type of bioactivity, searching for potential precursors of bioactive peptides, and the prediction of sites susceptible to proteolytic cleavage in protein chains are available via the Internet, as are other approaches concerning secondary structure prediction and calculation of physicochemical features based on amino acid sequence. Programs for prediction of allergenic and toxic properties have also been developed. This review explores the possibilities of cooperation between various programs.
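
    One of the quantitative descriptors referred to above is the occurrence frequency of bioactive fragments, defined in the BIOPEP literature as A = a/N, where a is the number of fragments with a given activity found in the protein and N is the number of residues. A toy sketch (the protein and dipeptide list are illustrative only):

        def find_fragments(protein, fragments):
            """Positions (1-based) of known bioactive fragments in a protein."""
            hits = []
            for frag in fragments:
                start = protein.find(frag)
                while start != -1:
                    hits.append((frag, start + 1))
                    start = protein.find(frag, start + 1)  # allow overlaps
            return hits

        def occurrence_frequency(protein, fragments):
            """BIOPEP-style descriptor A = a / N."""
            return len(find_fragments(protein, fragments)) / len(protein)

        print(occurrence_frequency("GLPWVYKA", ["VY", "KA", "GL"]))  # 3/8 = 0.375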

  7. Focal point determination in magnetic resonance-guided focused ultrasound using tracking coils.

    PubMed

    Svedin, Bryant T; Beck, Michael J; Hadley, J Rock; Merrill, Robb; de Bever, Joshua T; Bolster, Bradley D; Payne, Allison; Parker, Dennis L

    2017-06-01

    To develop a method for rapid prediction of the geometric focus location in MR coordinates of a focused ultrasound (US) transducer with arbitrary position and orientation without sonicating. Three small tracker coil circuits were designed, constructed, attached to the transducer housing of a breast-specific MR-guided focused US (MRgFUS) system with 5 degrees of freedom, and connected to receiver channel inputs of an MRI scanner. A one-dimensional sequence applied in three orthogonal directions determined the position of each tracker, which was then corrected for gradient nonlinearity. In a calibration step, low-level heating located the US focus in one transducer position/orientation where the tracker positions were also known. Subsequent US focus locations were determined from the isometric transformation of the trackers. The accuracy of this method was verified by comparing the tracking coil predictions to the thermal center of mass calculated using MR thermometry data acquired at 16 different transducer positions for MRgFUS sonications in a homogeneous gelatin phantom. The tracker coil predicted focus was an average distance of 2.1 ± 1.1 mm from the thermal center of mass. The one-dimensional locator sequence and prediction calculations took less than 1 s to perform. This technique accurately predicts the geometric focus for a transducer with arbitrary position and orientation without sonicating. Magn Reson Med 77:2424-2430, 2017. © 2016 International Society for Magnetic Resonance in Medicine.
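
    The isometric-transformation step can be illustrated with the standard Kabsch/Procrustes solution: solve for the rigid transform taking the three tracker positions from the calibration pose to the current pose, then apply it to the calibrated focus. A sketch (not the authors' implementation):

        import numpy as np

        def rigid_transform(P, Q):
            """Rotation R and translation t best mapping points P onto Q (Kabsch)."""
            P, Q = np.asarray(P, float), np.asarray(Q, float)
            Pc, Qc = P.mean(axis=0), Q.mean(axis=0)
            U, _, Vt = np.linalg.svd((P - Pc).T @ (Q - Qc))
            D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
            R = Vt.T @ D @ U.T                 # proper rotation (det = +1)
            return R, Qc - R @ Pc

        def predict_focus(trackers_cal, focus_cal, trackers_now):
            R, t = rigid_transform(trackers_cal, trackers_now)
            return R @ np.asarray(focus_cal, float) + t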

  8. VIP Barcoding: composition vector-based software for rapid species identification based on DNA barcoding.

    PubMed

    Fan, Long; Hui, Jerome H L; Yu, Zu Guo; Chu, Ka Hou

    2014-07-01

    Species identification based on short sequences of DNA markers, that is, DNA barcoding, has emerged as an integral part of modern taxonomy. However, software for the analysis of large and multilocus barcoding data sets is scarce. The Basic Local Alignment Search Tool (BLAST) is currently the fastest tool capable of handling large databases (e.g. >5000 sequences), but its accuracy is a concern and has been criticized for its local optimization. However, more accurate current software requires sequence alignment or complex calculations, which are time-consuming when dealing with large data sets during data preprocessing or during the search stage. Therefore, it is imperative to develop a practical program for both accurate and scalable species identification for DNA barcoding. In this context, we present VIP Barcoding: a user-friendly software in graphical user interface for rapid DNA barcoding. It adopts a hybrid, two-stage algorithm. First, an alignment-free composition vector (CV) method is utilized to reduce searching space by screening a reference database. The alignment-based K2P distance nearest-neighbour method is then employed to analyse the smaller data set generated in the first stage. In comparison with other software, we demonstrate that VIP Barcoding has (i) higher accuracy than Blastn and several alignment-free methods and (ii) higher scalability than alignment-based distance methods and character-based methods. These results suggest that this platform is able to deal with both large-scale and multilocus barcoding data with accuracy and can contribute to DNA barcoding for modern taxonomy. VIP Barcoding is free and available at http://msl.sls.cuhk.edu.hk/vipbarcoding/. © 2014 John Wiley & Sons Ltd.
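
    The second-stage Kimura two-parameter (K2P) distance separates transition (P) and transversion (Q) proportions: d = −(1/2)·ln(1 − 2P − Q) − (1/4)·ln(1 − 2Q). A minimal sketch for two aligned barcodes:

        import math

        TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}

        def k2p_distance(seq1, seq2):
            """K2P distance for two aligned, equal-length sequences."""
            pairs = [(a, b) for a, b in zip(seq1, seq2) if "-" not in (a, b)]
            n = len(pairs)
            ts = sum((a, b) in TRANSITIONS for a, b in pairs)
            tv = sum(a != b and (a, b) not in TRANSITIONS for a, b in pairs)
            P, Q = ts / n, tv / n
            return -0.5 * math.log(1 - 2 * P - Q) - 0.25 * math.log(1 - 2 * Q)

        print(k2p_distance("ACGTACGT", "ACGCACGA"))  # ~0.307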

  9. Comprehensive Cardiovascular magnetic resonance of myocardial mechanics in mice using three-dimensional cine DENSE

    PubMed Central

    2011-01-01

    Background Quantitative noninvasive imaging of myocardial mechanics in mice enables studies of the roles of individual genes in cardiac function. We sought to develop comprehensive three-dimensional methods for imaging myocardial mechanics in mice. Methods A 3D cine DENSE pulse sequence was implemented on a 7T small-bore scanner. The sequence used three-point phase cycling for artifact suppression and a stack-of-spirals k-space trajectory for efficient data acquisition. A semi-automatic 2D method was adapted for 3D image segmentation, and automated 3D methods to calculate strain, twist, and torsion were employed. A scan protocol that covered the majority of the left ventricle in a scan time of less than 25 minutes was developed, and seven healthy C57Bl/6 mice were studied. Results Using these methods, multiphase normal and shear strains were measured, as were myocardial twist and torsion. Peak end-systolic values for the normal strains at the mid-ventricular level were 0.29 ± 0.17, -0.13 ± 0.03, and -0.18 ± 0.14 for Err, Ecc, and Ell, respectively. Peak end-systolic values for the shear strains were 0.00 ± 0.08, 0.04 ± 0.12, and 0.03 ± 0.07 for Erc, Erl, and Ecl, respectively. The peak end-systolic normalized torsion was 5.6 ± 0.9°. Conclusions Using a 3D cine DENSE sequence tailored for cardiac imaging in mice at 7 T, a comprehensive assessment of 3D myocardial mechanics can be achieved with a scan time of less than 25 minutes and an image analysis time of approximately 1 hour. PMID:22208954

  10. 1 Tbit/inch² Recording in Angular-Multiplexing Holographic Memory with Constant Signal-to-Scatter Ratio Schedule

    NASA Astrophysics Data System (ADS)

    Hosaka, Makoto; Ishii, Toshiki; Tanaka, Asato; Koga, Shogo; Hoshizawa, Taku

    2013-09-01

    We developed an iterative method for optimizing the exposure schedule to obtain a constant signal-to-scatter ratio (SSR) to accommodate various recording conditions and achieve high-density recording. 192 binary images were recorded in the same location of a medium in approximately 300×300 µm² using an experimental system embedded with a blue laser diode with a 405 nm wavelength and an objective lens with a 0.85 numerical aperture. The recording density of this multiplexing corresponds to 1 Tbit/in². The recording exposure time was optimized through the iteration of a three-step sequence consisting of total reproduced intensity measurement, target signal calculation, and recording energy density calculation. The SSR of pages recorded with this method was almost constant throughout the entire range of the reference beam angle. The signal-to-noise ratio of the sampled pages was over 2.9 dB, which is higher than the reproducible limit of 1.5 dB in our experimental system.

  11. Isoelectronic studies of the 5s² ¹S₀-5s5p ¹,³P_J intervals in the Cd sequence

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Curtis, L.J.

    1986-02-01

    The 5s² ¹S₀-5s5p ¹,³P_J energy intervals in the Cd isoelectronic sequence have been investigated through a semiempirical systematization of recent measurements and through the performance of ab initio multiconfiguration Dirac-Fock calculations. Screening-parameter reductions of the spin-orbit and exchange energies both for the observed data and for the theoretically computed values establish the existence of empirical linearities similar to those exploited earlier for the Be, Mg, and Zn sequences. This permits extrapolative isoelectronic predictions of the relative energies of the 5s5p levels, which can be connected to 5s² using intersinglet intervals obtained from empirically corrected ab initio calculations. These linearities have also been examined homologously for the Zn, Cd, and Hg sequences, and common relationships have been found that accurately describe all three of these sequences.

  12. DNA/RNA transverse current sequencing: intrinsic structural noise from neighboring bases

    PubMed Central

    Alvarez, Jose R.; Skachkov, Dmitry; Massey, Steven E.; Kalitsov, Alan; Velev, Julian P.

    2015-01-01

    Nanopore DNA sequencing via transverse current has emerged as a promising candidate for third-generation sequencing technology. It produces long read lengths which could alleviate problems with assembly errors inherent in current technologies. However, the high error rates of nanopore sequencing have to be addressed. A very important source of the error is the intrinsic noise in the current arising from carrier dispersion along the chain of the molecule, i.e., from the influence of neighboring bases. In this work we perform calculations of the transverse current within an effective multi-orbital tight-binding model derived from first-principles calculations of the DNA/RNA molecules, to study the effect of this structural noise on the error rates in DNA/RNA sequencing via transverse current in nanopores. We demonstrate that a statistical technique, utilizing not only the currents through the nucleotides but also the correlations in the currents, can in principle reduce the error rate below any desired precision. PMID:26150827

  13. Numerical Simulation of Stress evolution and earthquake sequence of the Tibetan Plateau

    NASA Astrophysics Data System (ADS)

    Dong, Peiyu; Hu, Caibo; Shi, Yaolin

    2015-04-01

    The India-Eurasia collision produces N-S compression and results in large thrust faults on the southern edge of the Tibetan Plateau. Differential eastward flow of the lower crust of the plateau leads to large strike-slip and normal faults within the plateau. From 1904 to 2014, more than 30 earthquakes of Mw > 6.5 occurred sequentially in this distinctive tectonic environment. How did the stresses evolve during the last 110 years, and how did the earthquakes interact with each other? Can this knowledge help us to forecast future seismic hazards? In this study, we simulated the evolution of the stress field and the earthquake sequence in the Tibetan Plateau over the last 110 years with a 2-D finite element model. Given an initial state of stress, the boundary condition was constrained by present-day GPS observations, assumed to represent a constant rate over the 110 years. We calculated the stress evolution year by year; in the model, an earthquake occurs when the stress exceeds the crustal strength. The stress change due to each large earthquake in the sequence was calculated and contributed to the stress evolution. A key issue is the choice of the initial stress state of the model, which is actually unknown. Usually, in studies of earthquake triggering, the initial stress is assumed to be zero and only the stress changes caused by large earthquakes, the Coulomb failure stress changes (ΔCFS), are calculated. To some extent this simplified method is a powerful tool, because it can reveal which fault, or which part of a fault, becomes relatively more risky or safer. Nonetheless, it does not utilize all the information available to us. The earthquake sequence reveals, though far from completely, some information about the stress state in the region. If the entire region is close to a self-organized critical or subcritical state, earthquake stress drops provide an estimate of the lower limit of the initial state. For locations where no earthquakes occurred during the period, the initial stress has to be lower than a certain value. For locations where large earthquakes occurred during the 110 years, the initial stresses can be inverted if the strength is estimated and the tectonic loading is assumed constant. Therefore, although the initial stress state is unknown, we can estimate a range for it. In this study, we estimated a reasonable range of initial stress and then used the Coulomb-Mohr criterion to regenerate the earthquake sequence, starting from the Daofu earthquake of 1904. We calculated the stress field evolution of the sequence, considering both the tectonic loading and the interaction between the earthquakes, and ultimately obtained a sketch of the present stress. Of course, a single model with a particular initial stress is just one possible model, and a potential seismic hazard distribution based on a single model is not convincing. We therefore tested hundreds of possible initial stress states, all of which reproduce the historical earthquake sequence, and summarized the calculated probabilities of future seismic activity. Although we cannot provide the exact future state, we can narrow the estimate of the regions at high probability of risk. Our preliminary results indicate that the Xianshuihe fault and its adjacent area form one such zone, at higher risk than other regions in the future. During 2014, six earthquakes (M > 5.0) occurred in this region, which corresponds with our result to some degree. We emphasize the importance of the initial stress field for the earthquake sequence and provide a probabilistic assessment of future seismic hazards. This study may bring new insight into estimating the initial stress, earthquake triggering, and stress field evolution.
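
    For reference, the Coulomb failure stress change invoked above is commonly written ΔCFS = Δτ + μ′·Δσn, where Δτ is the shear-stress change resolved in the slip direction, Δσn is the normal-stress change (positive for unclamping) and μ′ is the effective friction coefficient. A minimal sketch:

        def delta_cfs(d_shear_mpa, d_normal_mpa, mu_eff=0.4):
            """Coulomb failure stress change; d_normal > 0 means unclamping."""
            return d_shear_mpa + mu_eff * d_normal_mpa

        # A receiver fault unclamped by 0.05 MPa and loaded by 0.1 MPa of shear:
        print(delta_cfs(0.1, 0.05))  # 0.12 MPa, i.e. moved toward failure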

  14. Fast selection of miRNA candidates based on large-scale pre-computed MFE sets of randomized sequences

    PubMed Central

    2014-01-01

    Background Small RNAs are important regulators of genome function, yet their prediction in genomes is still a major computational challenge. Statistical analyses of pre-miRNA sequences indicated that their 2D structure tends to have a minimal free energy (MFE) significantly lower than the MFE values of equivalently randomized sequences with the same nucleotide composition, in contrast to other classes of non-coding RNA. The computation of many MFEs is, however, too intensive to allow for genome-wide screenings. Results Using a local grid infrastructure, MFE distributions of random sequences were pre-calculated on a large scale. These distributions follow a normal distribution and can be used to determine the MFE distribution for any given sequence composition by interpolation, allowing on-the-fly calculation of the normal distribution for any candidate sequence composition. Conclusion The speedup achieved makes genome-wide screening with this characteristic of a pre-miRNA sequence practical. Although this property alone is not sufficiently discriminative to distinguish miRNAs from other sequences, the MFE-based P-value should be added to the parameters of choice to be included in the selection of potential miRNA candidates for experimental verification. PMID:24418292
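
    The screening criterion reduces to a z-score: with the interpolated mean μ and standard deviation σ of MFE values for randomized sequences of the same composition, the P-value of a candidate's MFE under the normal model is Φ((MFE − μ)/σ). A sketch, assuming the pre-computed lookup supplies μ and σ:

        import math

        def mfe_p_value(mfe, mu, sigma):
            """P(random MFE <= observed MFE) under the fitted normal model."""
            z = (mfe - mu) / sigma
            return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

        # candidate MFE of -42.1 kcal/mol vs randomized mean -30.0, sd 4.5:
        print(mfe_p_value(-42.1, -30.0, 4.5))  # ~0.0036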

  15. Spatio-temporal alignment of multiple sensors

    NASA Astrophysics Data System (ADS)

    Zhang, Tinghua; Ni, Guoqiang; Fan, Guihua; Sun, Huayan; Yang, Biao

    2018-01-01

    Aiming to achieve the spatio-temporal alignment of multiple sensors on the same platform for space target observation, a joint spatio-temporal alignment method is proposed. To calibrate the parameters and measure the attitude of the cameras, an astronomical calibration method is proposed based on star chart simulation and collinear-invariant features of quadrilateral diagonals in the observed star chart. To satisfy temporal correspondence and spatial alignment similarity simultaneously, the method builds on the astronomical calibration and attitude measurement to fold spatial and temporal alignment into a joint alignment framework. The advantage of this method is reinforced by exploiting the similarities and prior knowledge of the velocity vector field between adjacent frames, calculated with the SIFT Flow algorithm. The proposed method provides the highest spatio-temporal alignment accuracy compared to state-of-the-art methods on sequences recorded from multiple sensors at different times.

  16. Role of conformational sampling in computing mutation-induced changes in protein structure and stability.

    PubMed

    Kellogg, Elizabeth H; Leaver-Fay, Andrew; Baker, David

    2011-03-01

    The prediction of changes in protein stability and structure resulting from single amino acid substitutions is both a fundamental test of macromolecular modeling methodology and an important current problem as high throughput sequencing reveals sequence polymorphisms at an increasing rate. In principle, given the structure of a wild-type protein and a point mutation whose effects are to be predicted, an accurate method should recapitulate both the structural changes and the change in the folding-free energy. Here, we explore the performance of protocols which sample an increasing diversity of conformations. We find that surprisingly similar performances in predicting changes in stability are achieved using protocols that involve very different amounts of conformational sampling, provided that the resolution of the force field is matched to the resolution of the sampling method. Methods involving backbone sampling can in some cases closely recapitulate the structural changes accompanying mutations but not surprisingly tend to do more harm than good in cases where structural changes are negligible. Analysis of the outliers in the stability change calculations suggests areas needing particular improvement; these include the balance between desolvation and the formation of favorable buried polar interactions, and unfolded state modeling. Copyright © 2010 Wiley-Liss, Inc.

  17. Analysis of evolutionary conservation patterns and their influence on identifying protein functional sites.

    PubMed

    Fang, Chun; Noguchi, Tamotsu; Yamana, Hayato

    2014-10-01

    Evolutionary conservation information included in position-specific scoring matrices (PSSMs) has been widely adopted by sequence-based methods for identifying protein functional sites, because all functional sites, whether in ordered or disordered proteins, are found to be conserved to some extent. However, different functional sites have different conservation patterns: some are linearly contextual, some are mingled with highly variable residues, and others seem to be conserved independently. Every value in a PSSM is calculated independently of the others, without carrying the contextual information of residues in the sequence. Therefore, adopting the direct output of a PSSM for prediction fails to consider the relationship between the conservation patterns of residues and the distribution of conservation scores in the PSSM. To demonstrate the importance of combining PSSMs with the specific conservation patterns of functional sites for prediction, three different PSSM-based methods for identifying three kinds of functional sites have been analyzed. The results suggest that different PSSM-based methods differ in their capability to identify different patterns of functional sites, and that better combining PSSMs with the specific conservation patterns of residues would greatly facilitate prediction.

  18. 40 CFR 1065.650 - Emission calculations.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... following sequence of preliminary calculations on recorded concentrations: (i) Correct all THC and CH4.... (iii) Calculate all THC and NMHC concentrations, including dilution air background concentrations, as... NMHC to background corrected mass of THC. If the background corrected mass of NMHC is greater than 0.98...

  19. Analysis of drug binding pockets and repurposing opportunities for twelve essential enzymes of ESKAPE pathogens

    PubMed Central

    Naz, Sadia; Ngo, Tony; Farooq, Umar

    2017-01-01

    Background The rapid increase in antibiotic resistance by various bacterial pathogens underlies the significance of developing new therapies and exploring different drug targets. A fraction of bacterial pathogens, abbreviated as ESKAPE by the European Center for Disease Prevention and Control, have been considered a major threat due to the rise in nosocomial infections. Here, we compared putative drug binding pockets of twelve essential and mostly conserved metabolic enzymes in numerous bacterial pathogens, including those of the ESKAPE group and Mycobacterium tuberculosis. The comparative analysis provides guidelines for the likelihood of transferability of inhibitors from one species to another. Methods Nine bacterial species, comprising six ESKAPE pathogens and Mycobacterium tuberculosis together with Mycobacterium smegmatis and Escherichia coli, two non-pathogenic bacteria, were selected for drug binding pocket analysis of twelve essential enzymes. The amino acid sequences were obtained from UniProt, aligned using ICM v3.8-4a and matched against the Pocketome encyclopedia. We used known co-crystal structures of selected target enzyme orthologs to evaluate the location of their active sites and binding pockets and to calculate a matrix of pairwise sequence identities for each target enzyme across the different species. This was used to generate sequence maps. Results High sequence identity of enzyme binding pockets, derived from experimentally determined co-crystallized structures, was observed among the various species. Comparison at both the full-sequence level and at the drug binding pockets of key metabolic enzymes showed that binding pockets are highly conserved (sequence similarity up to 100%) among the various ESKAPE pathogens as well as Mycobacterium tuberculosis. Enzyme orthologs with conserved binding sites may interact with inhibitors in a similar way, which can aid the design of a similar class of inhibitors for a particular species. The derived pocket alignments and distance-based maps provide guidelines for drug discovery and repurposing, and also suggest the relevant model bacteria that may be used for initial drug testing. Discussion Comparing ligand binding sites through sequence identity calculation could be an effective approach to identify conserved orthologs, as drug binding pockets show a higher level of conservation than full sequences among various species. By using this approach we can avoid the problems associated with full-sequence comparison. We identified essential metabolic enzymes among ESKAPE pathogens that share high sequence identity in their putative drug binding pockets (up to 100%), such that known inhibitors can potentially antagonize these identical pockets in the various species in a similar manner. PMID:28948099

  20. FY11 Report on Metagenome Analysis using Pathogen Marker Libraries

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gardner, Shea N.; Allen, Jonathan E.; McLoughlin, Kevin S.

    2011-06-02

    A method, sequence library, and software suite was invented to rapidly assess whether any member of a pre-specified list of threat organisms or their near neighbors is present in a metagenome. The system was designed to handle mega- to giga-bases of FASTA-formatted raw sequence reads from short or long read next generation sequencing platforms. The approach is to pre-calculate a viral and a bacterial "Pathogen Marker Library" (PML) containing sub-sequences specific to pathogens or their near neighbors. A list of expected matches comparing every bacterial or viral genome against the PML sequences is also pre-calculated. To analyze a metagenome, reads are compared to the PML, observed PML-metagenome matches are compared to the expected PML-genome matches, and the ratio of observed relative to expected matches is reported. In other words, a 3-way comparison among the PML, metagenome, and existing genome sequences is used to quickly assess which (if any) species included in the PML is likely to be present in the metagenome, based on available sequence data. Our tests showed that the species with the most PML matches correctly indicated the organism sequenced for empirical metagenomes consisting of a cultured, relatively pure isolate. These runs completed in 1 minute to 3 hours on 12 CPU (1 thread/CPU), depending on the metagenome and PML. Using more threads on the same number of CPU resulted in speed improvements roughly proportional to the number of threads. Simulations indicated that detection sensitivity depends on both sequencing coverage levels for a species and the size of the PML: species were correctly detected even at ~0.003x coverage by the large PMLs, and at ~0.03x coverage by the smaller PMLs. Matches to true positive species were 3-4 orders of magnitude higher than to false positives. Simulations with short reads (36 nt and ~260 nt) showed that species were usually detected for metagenome coverage above 0.005x and coverage in the PML above 0.05x, and detection probability appears to be a function of both coverages. Multiple species could be detected simultaneously in a simulated low-coverage, complex metagenome, and the largest PML gave no false negative species and no false positive genera. The presence of multiple species was predicted in a complex metagenome from a human gut microbiome with 1.9 GB of short reads (75 nt); the species predicted were reasonable gut flora and no biothreat agents were detected, showing the feasibility of PML analysis of empirical complex metagenomes.

  1. An integrative time-varying frequency detection and channel sounding method for dynamic plasma sheath

    NASA Astrophysics Data System (ADS)

    Shi, Lei; Yao, Bo; Zhao, Lei; Liu, Xiaotong; Yang, Min; Liu, Yanming

    2018-01-01

    The plasma sheath surrounding a hypersonic vehicle is a dynamic, time-varying medium, and it is almost impossible to calculate its time-varying physical parameters directly. In-flight detection of the degree of time variation is important for understanding the dynamic nature of the physical parameters and their effect on re-entry communication. In this paper, a constant envelope zero autocorrelation (CAZAC) sequence-based time-varying frequency detection and channel sounding method is proposed to detect the time-varying properties of the plasma sheath electron density and the wireless channel characteristics. The proposed method utilizes the CAZAC sequence, which has excellent autocorrelation and spreading gain characteristics, to realize dynamic time-varying detection/channel sounding under low signal-to-noise ratio in the plasma sheath environment. Theoretical simulation under a typical time-varying radio channel shows that the proposed method is capable of detecting time-variation frequencies up to 200 kHz and can trace the channel amplitude and phase in the time domain well at -10 dB. Experimental results obtained in an RF modulation discharge plasma device verified the time-variation detection ability in a practical dynamic plasma sheath. Meanwhile, nonlinear effects of the dynamic plasma sheath on the communication signal were observed through the channel sounding results.
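
    CAZAC sequences are commonly realized as Zadoff-Chu sequences, x[n] = exp(−jπ·u·n(n+1)/N) for odd length N and root u coprime to N; the constant envelope plus impulse-like circular autocorrelation is exactly what makes them good sounding probes. An illustrative sketch (not the authors' exact waveform):

        import numpy as np

        def zadoff_chu(N, u=1):
            """Zadoff-Chu CAZAC sequence of odd length N, root u, gcd(u, N) = 1."""
            n = np.arange(N)
            return np.exp(-1j * np.pi * u * n * (n + 1) / N)

        x = zadoff_chu(63)
        acf = np.fft.ifft(np.fft.fft(x) * np.conj(np.fft.fft(x)))
        print(np.round(np.abs(acf[:4]), 3))  # [63. 0. 0. 0.]: one peak at lag 0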

  2. Classification of protein quaternary structure by functional domain composition

    PubMed Central

    Yu, Xiaojing; Wang, Chuan; Li, Yixue

    2006-01-01

    Background The number and the arrangement of subunits that form a protein are referred to as quaternary structure. Quaternary structure is an important protein attribute that is closely related to its function. Proteins with quaternary structure are called oligomeric proteins. Oligomeric proteins are involved in various biological processes, such as metabolism, signal transduction, and chromosome replication. Thus, it is highly desirable to develop some computational methods to automatically classify the quaternary structure of proteins from their sequences. Results To explore this problem, we adopted an approach based on the functional domain composition of proteins. Every protein was represented by a vector calculated from the domains in the PFAM database. The nearest neighbor algorithm (NNA) was used for classifying the quaternary structure of proteins from this information. The jackknife cross-validation test was performed on the non-redundant protein dataset in which the sequence identity was less than 25%. The overall success rate obtained is 75.17%. Additionally, to demonstrate the effectiveness of this method, we predicted the proteins in an independent dataset and achieved an overall success rate of 84.11% Conclusion Compared with the amino acid composition method and Blast, the results indicate that the domain composition approach may be a more effective and promising high-throughput method in dealing with this complicated problem in bioinformatics. PMID:16584572

  3. Correlation time and diffusion coefficient imaging: application to a granular flow system.

    PubMed

    Caprihan, A; Seymour, J D

    2000-05-01

    A parametric method for spatially resolved measurements of velocity autocorrelation functions, R_u(τ) = ⟨u(t)u(t + τ)⟩, expressed as a sum of exponentials, is presented. The method is applied to a granular flow system of 2-mm oil-filled spheres rotated in a half-filled horizontal cylinder, which is an Ornstein-Uhlenbeck process with velocity autocorrelation function R_u(τ) = ⟨u²⟩e^(−|τ|/τ_c), where τ_c is the correlation time and D = ⟨u²⟩τ_c is the diffusion coefficient. The pulsed-field-gradient NMR method consists of applying three different gradient pulse sequences of varying motion sensitivity to distinguish the range of correlation times present for particle motion. Time-dependent apparent diffusion coefficients are measured for these three sequences, and τ_c and D are then calculated from the apparent diffusion coefficient images. For the cylinder rotation rate of 2.3 rad/s, the axial diffusion coefficient at the top center of the free surface was 5.5 × 10⁻⁶ m²/s, the correlation time was 3 ms, and the velocity fluctuation or granular temperature was 1.8 × 10⁻³ m²/s². This method is also applicable to study transport in systems involving turbulence and porous media flows. Copyright 2000 Academic Press.

  4. MR fingerprinting using the quick echo splitting NMR imaging technique.

    PubMed

    Jiang, Yun; Ma, Dan; Jerecic, Renate; Duerk, Jeffrey; Seiberlich, Nicole; Gulani, Vikas; Griswold, Mark A

    2017-03-01

    The purpose of the study is to develop a quantitative method for measuring relaxation properties with reduced radio frequency (RF) power deposition by combining the magnetic resonance fingerprinting (MRF) technique with the quick echo splitting NMR imaging technique (QUEST). A QUEST-based MRF sequence was implemented to acquire high-order echoes by increasing the gaps between RF pulses. Bloch simulations were used to calculate a dictionary containing the range of physically plausible signal evolutions using a range of T1 and T2 values based on the pulse sequence. MRF-QUEST was evaluated by comparing to the results of spin-echo methods. The specific absorption rate (SAR) of MRF-QUEST was compared with clinically available methods. MRF-QUEST quantifies the relaxation properties with good accuracy at an estimated head SAR of 0.03 W/kg. T1 and T2 values estimated by MRF-QUEST are in good agreement with the traditional methods. The combination of MRF and QUEST provides an accurate quantification of T1 and T2 simultaneously with reduced RF power deposition. The resulting lower SAR may provide a new acquisition strategy for MRF when RF energy deposition is problematic. Magn Reson Med 77:979-988, 2017. © 2016 International Society for Magnetic Resonance in Medicine.
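
    MRF reconstruction ultimately reduces to dictionary matching: each measured signal evolution is compared against the Bloch-simulated evolutions over a grid of (T1, T2) values, and the entry with the largest normalized inner product wins. A sketch of the matching step only (the dictionary itself would come from Bloch simulation of the sequence):

        import numpy as np

        def match_fingerprint(signal, dictionary, t1_t2_grid):
            """dictionary: (n_entries, n_timepoints) complex array;
            t1_t2_grid: list of (T1, T2) pairs, one per dictionary row."""
            d = dictionary / np.linalg.norm(dictionary, axis=1, keepdims=True)
            s = signal / np.linalg.norm(signal)
            return t1_t2_grid[int(np.argmax(np.abs(d @ np.conj(s))))]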

  5. SIMAP—the database of all-against-all protein sequence similarities and annotations with new interfaces and increased coverage

    PubMed Central

    Arnold, Roland; Goldenberg, Florian; Mewes, Hans-Werner; Rattei, Thomas

    2014-01-01

    The Similarity Matrix of Proteins (SIMAP, http://mips.gsf.de/simap/) database has been designed to massively accelerate computationally expensive protein sequence analysis tasks in bioinformatics. It provides pre-calculated sequence similarities interconnecting the entire known protein sequence universe, complemented by pre-calculated protein features and domains, similarity clusters and functional annotations. SIMAP covers all major public protein databases as well as many consistently re-annotated metagenomes from different repositories. As of September 2013, SIMAP contains >163 million proteins corresponding to ∼70 million non-redundant sequences. SIMAP uses the sensitive FASTA search heuristics, the Smith–Waterman alignment algorithm, the InterPro database of protein domain models and the BLAST2GO functional annotation algorithm. SIMAP assists biologists by facilitating the interactive exploration of the protein sequence universe. Web-Service and DAS interfaces allow connecting SIMAP with any other bioinformatic tool and resource. All-against-all protein sequence similarity matrices of project-specific protein collections are generated on request. Recent improvements allow SIMAP to cover the rapidly growing sequenced protein sequence universe. New Web-Service interfaces enhance the connectivity of SIMAP. Novel tools for interactive extraction of protein similarity networks have been added. Open access to SIMAP is provided through the web portal; the portal also contains instructions and links for software access and flat file downloads. PMID:24165881

  6. High-resolution correlation

    NASA Astrophysics Data System (ADS)

    Nelson, D. J.

    2007-09-01

    In the basic correlation process, a sequence of time-lag-indexed correlation coefficients is computed as the inner (dot) product of segments of two signals. The time lag(s) for which the magnitude of the correlation coefficient sequence is maximized is the estimated relative time delay of the two signals. For discrete sampled signals, the delay estimated in this manner is quantized with the same relative accuracy as the clock used in sampling the signals. In addition, the correlation coefficients are real if the input signals are real. Many methods have been proposed to estimate signal delay with greater accuracy than the sample interval of the digitizer clock, with some success. These methods include interpolation of the correlation coefficients, estimation of the signal delay from the group delay function, and beam-forming techniques such as the MUSIC algorithm. For spectral estimation, techniques based on phase differentiation have been popular, but these techniques have apparently not been applied to the correlation problem. We propose a phase-based delay estimation method (PBDEM) that uses the phase of the correlation function to provide a significant improvement in the accuracy of time delay estimation. In this process, the standard correlation function is first calculated. A time-lag error function is then calculated from the correlation phase and is used to interpolate the correlation function. The signal delay is shown to be accurately estimated as the zero crossing of the correlation phase near the index of the peak correlation magnitude. This process is nearly as fast as the conventional correlation function on which it is based. For real-valued signals, a simple modification is provided, which results in the same correlation accuracy as is obtained for complex-valued signals.
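
    A rough sketch of a phase-based refinement in this spirit (an illustration of the general idea, not the PBDEM algorithm itself): take the integer-lag peak from the ordinary cross-correlation, then recover the fractional part of the delay from the slope of the residual cross-spectrum phase.

        import numpy as np

        def estimate_delay(x, y, fs=1.0):
            """Sub-sample delay of y relative to x (seconds), real signals."""
            n = len(x)
            X, Y = np.fft.rfft(x), np.fft.rfft(y)
            cross = np.conj(X) * Y              # phase is -2*pi*f*delay
            r = np.fft.irfft(cross, n)          # ordinary cross-correlation
            k0 = int(np.argmax(np.abs(r)))
            lag0 = k0 if k0 < n // 2 else k0 - n
            f = np.fft.rfftfreq(n, d=1.0 / fs)
            # remove the integer lag, then fit the leftover phase slope vs f
            phase = np.unwrap(np.angle(cross * np.exp(2j * np.pi * f * lag0 / fs)))
            slope = np.polyfit(f[1:], phase[1:], 1)[0]
            return lag0 / fs - slope / (2.0 * np.pi)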

  7. Internal twisting motion dependent conductance of an aperiodic DNA molecule

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wiliyanti, Vandan, E-mail: vandan.wiliyanti@ui.ac.id; Yudiarsah, Efta

    The influence of the internal twisting motion of base pairs on the conductance of an aperiodic DNA molecule has been studied. A double-stranded DNA molecule with the sequence GCTAGTACGTGACGTAGCTAGGATATGCCTGA on one chain and its complement on the other chain is used. The molecule is modeled with a tight-binding Hamiltonian in which the effects of the twisting motion on the base onsite energies and on the base-to-base electron hopping constants are taken into account. The semi-empirical Slater-Koster theory is employed to bring the twisting-motion effect into the hopping constants. In addition to hopping from one base to another, an electron can also hop from a base to the sugar-phosphate backbone and vice versa. The current flowing through the DNA molecule is calculated with the Landauer–Büttiker formula from the transmission probability, which is computed using the transfer matrix technique together with the scattering matrix method. The differential conductance is then calculated from the I-V curve. The results show that in some voltage regions the conductance increases with frequency, while in other regions it decreases with frequency.
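
    For reference, the Landauer–Büttiker current is I(V) = (2e/h)·∫T(E)[fL(E) − fR(E)]dE, with T(E) the transmission probability and fL, fR the Fermi functions of the two contacts. A minimal numerical sketch with a placeholder transmission (the real T(E) would come from the transfer/scattering matrix calculation):

        import numpy as np

        E_CHARGE, H_PLANCK, KT_EV = 1.602e-19, 6.626e-34, 0.025

        def fermi(E, mu, kT=KT_EV):
            return 1.0 / (1.0 + np.exp((E - mu) / kT))

        def landauer_current(transmission, bias_v, E=np.linspace(-2.0, 2.0, 4001)):
            """I = (2e/h) * integral of T(E)[f_L - f_R] dE; E and bias in eV."""
            window = fermi(E, +bias_v / 2.0) - fermi(E, -bias_v / 2.0)
            integral_ev = float(np.sum(transmission(E) * window) * (E[1] - E[0]))
            return 2.0 * E_CHARGE / H_PLANCK * integral_ev * E_CHARGE  # amperes

        # placeholder: one Lorentzian resonance 0.5 eV above the Fermi level
        print(landauer_current(lambda E: 0.01 / ((E - 0.5) ** 2 + 0.01), 1.2))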

  8. TU-H-CAMPUS-IeP3-02: Neurovascular 4D Parametric Imaging Using Co-Registration of Biplane DSA Sequences with 3D Vascular Geometry Obtained From Cone Beam CT

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Balasubramoniam, A; Bednarek, D; Rudin, S

    Purpose: To create 4D parametric images using biplane Digital Subtraction Angiography (DSA) sequences co-registered with the 3D vascular geometry obtained from Cone Beam-CT (CBCT). Methods: We investigated a method to derive multiple 4D Parametric Imaging (PI) maps using only one CBCT acquisition. During this procedure a 3D-DSA geometry is stored and used subsequently for all 4D images. Each time a biplane DSA is acquired, we calculate 2D parametric maps of Bolus Arrival Time (BAT), Mean Transit Time (MTT) and Time to Peak (TTP). Arterial segments which are nearly parallel with one of the biplane imaging planes in the 2D parametric maps are co-registered with the 3D geometry. The values in the remaining vascular network are found using spline interpolation, since the points chosen for co-registration on the vasculature are discrete and the remaining regions need to be interpolated. To evaluate the method we used a patient CT volume data set for 3D printing a neurovascular phantom containing a complete Circle of Willis. We connected the phantom to a flow loop with a peristaltic pump, simulating physiological flow conditions. Contrast media was injected with an automatic injector at 10 ml/sec. Images were acquired with a Toshiba Infinix C-arm and 4D parametric image maps of the vasculature were calculated. Results: 4D BAT, MTT, and TTP parametric image maps of the Circle of Willis were derived. We generated color-coded 3D geometries which avoided artifacts due to vessel overlap or foreshortening in the projection direction. Conclusion: The software was tested successfully and multiple 4D parametric images were obtained from biplane DSA sequences without the need to acquire additional 3D-DSA runs. This can benefit the patient by reducing the contrast media and the radiation dose normally associated with these procedures. Partial support from NIH Grant R01-EB002873 and Toshiba Medical Systems Corp.

  9. Evaluating Imputation Algorithms for Low-Depth Genotyping-By-Sequencing (GBS) Data

    PubMed Central

    2016-01-01

    Well-powered genomic studies require genome-wide marker coverage across many individuals. For non-model species with few genomic resources, high-throughput sequencing (HTS) methods, such as Genotyping-By-Sequencing (GBS), offer an inexpensive alternative to array-based genotyping. Although affordable, datasets derived from HTS methods suffer from sequencing error, alignment errors, and missing data, all of which introduce noise and uncertainty to variant discovery and genotype calling. Under such circumstances, meaningful analysis of the data is difficult. Our primary interest lies in the issue of how one can accurately infer or impute missing genotypes in HTS-derived datasets. Many of the existing genotype imputation algorithms and software packages were primarily developed by and optimized for the human genetics community, a field where a complete and accurate reference genome has been constructed and SNP arrays have, in large part, been the common genotyping platform. We set out to answer two questions: 1) can we use existing imputation methods developed by the human genetics community to impute missing genotypes in datasets derived from non-human species and 2) are these methods, which were developed and optimized to impute ascertained variants, amenable for imputation of missing genotypes at HTS-derived variants? We selected Beagle v.4, a widely used algorithm within the human genetics community with reportedly high accuracy, to serve as our imputation contender. We performed a series of cross-validation experiments, using GBS data collected from the species Manihot esculenta by the Next Generation (NEXTGEN) Cassava Breeding Project. NEXTGEN currently imputes missing genotypes in their datasets using a LASSO-penalized, linear regression method (denoted ‘glmnet’). We selected glmnet to serve as a benchmark imputation method for this reason. We obtained estimates of imputation accuracy by masking a subset of observed genotypes, imputing, and calculating the sample Pearson correlation between observed and imputed genotype dosages at the site and individual level; computation time served as a second metric for comparison. We then set out to examine factors affecting imputation accuracy, such as levels of missing data, read depth, minor allele frequency (MAF), and reference panel composition. PMID:27537694

  10. Evaluating Imputation Algorithms for Low-Depth Genotyping-By-Sequencing (GBS) Data.

    PubMed

    Chan, Ariel W; Hamblin, Martha T; Jannink, Jean-Luc

    2016-01-01

    Well-powered genomic studies require genome-wide marker coverage across many individuals. For non-model species with few genomic resources, high-throughput sequencing (HTS) methods, such as Genotyping-By-Sequencing (GBS), offer an inexpensive alternative to array-based genotyping. Although affordable, datasets derived from HTS methods suffer from sequencing error, alignment errors, and missing data, all of which introduce noise and uncertainty to variant discovery and genotype calling. Under such circumstances, meaningful analysis of the data is difficult. Our primary interest lies in the issue of how one can accurately infer or impute missing genotypes in HTS-derived datasets. Many of the existing genotype imputation algorithms and software packages were primarily developed by and optimized for the human genetics community, a field where a complete and accurate reference genome has been constructed and SNP arrays have, in large part, been the common genotyping platform. We set out to answer two questions: 1) can we use existing imputation methods developed by the human genetics community to impute missing genotypes in datasets derived from non-human species and 2) are these methods, which were developed and optimized to impute ascertained variants, amenable for imputation of missing genotypes at HTS-derived variants? We selected Beagle v.4, a widely used algorithm within the human genetics community with reportedly high accuracy, to serve as our imputation contender. We performed a series of cross-validation experiments, using GBS data collected from the species Manihot esculenta by the Next Generation (NEXTGEN) Cassava Breeding Project. NEXTGEN currently imputes missing genotypes in their datasets using a LASSO-penalized, linear regression method (denoted 'glmnet'). We selected glmnet to serve as a benchmark imputation method for this reason. We obtained estimates of imputation accuracy by masking a subset of observed genotypes, imputing, and calculating the sample Pearson correlation between observed and imputed genotype dosages at the site and individual level; computation time served as a second metric for comparison. We then set out to examine factors affecting imputation accuracy, such as levels of missing data, read depth, minor allele frequency (MAF), and reference panel composition.
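
    The masking-based accuracy estimate used in both records above can be sketched as follows. The trivial per-site mean imputer is a placeholder standing in for Beagle or glmnet; everything else (masking observed dosages, imputing, and taking the sample Pearson correlation at the masked cells) follows the procedure described.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def masked_imputation_accuracy(dosages, impute_fn, mask_frac=0.1):
        """Hide a fraction of observed genotype dosages (0/1/2), impute,
        and correlate imputed with observed values at the masked cells."""
        observed = ~np.isnan(dosages)
        idx = np.argwhere(observed)
        n_mask = int(mask_frac * len(idx))
        masked_cells = idx[rng.choice(len(idx), n_mask, replace=False)]

        masked = dosages.copy()
        masked[tuple(masked_cells.T)] = np.nan
        imputed = impute_fn(masked)

        truth = dosages[tuple(masked_cells.T)]
        guess = imputed[tuple(masked_cells.T)]
        return np.corrcoef(truth, guess)[0, 1]   # sample Pearson r

    def mean_impute(d):
        """Trivial baseline imputer: fill each site with its mean dosage."""
        out = d.copy()
        site_means = np.nanmean(d, axis=0)
        rows, cols = np.where(np.isnan(d))
        out[rows, cols] = site_means[cols]
        return out

    dosages = rng.choice([0.0, 1.0, 2.0], size=(200, 50), p=[0.5, 0.3, 0.2])
    print(masked_imputation_accuracy(dosages, mean_impute))
    ```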

  11. A noninvasive method for measuring the velocity of diffuse hydrothermal flow by tracking moving refractive index anomalies

    NASA Astrophysics Data System (ADS)

    Mittelstaedt, Eric; Davaille, Anne; van Keken, Peter E.; Gracias, Nuno; Escartin, Javier

    2010-10-01

    Diffuse flow velocimetry (DFV) is introduced as a new, noninvasive, optical technique for measuring the velocity of diffuse hydrothermal flow. The technique uses images of a motionless, random medium (e.g., rocks) obtained through the lens of a moving refractive index anomaly (e.g., a hot upwelling). The method works in two stages. First, the changes in apparent background deformation are calculated using particle image velocimetry (PIV). The deformation vectors are determined by a cross correlation of pixel intensities across consecutive images. Second, the 2-D velocity field is calculated by cross correlating the deformation vectors between consecutive PIV calculations. The accuracy of the method is tested with laboratory and numerical experiments of a laminar, axisymmetric plume in fluids with both constant and temperature-dependent viscosity. Results show that average RMS errors are ~5%-7%, with best accuracy in regions of pervasive apparent background deformation, which are commonly encountered in regions of diffuse hydrothermal flow. The method is applied to a 25 s video sequence of diffuse flow from a small fracture captured during the Bathyluck'09 cruise to the Lucky Strike hydrothermal field (September 2009). The velocities of the ~10°C-15°C effluent reach ~5.5 cm/s, in strong agreement with previous measurements of diffuse flow. DFV is found to be most accurate for approximately 2-D flows where background objects have a small spatial scale, such as sand or gravel.
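
    The elementary operation in the first (PIV) stage is a cross-correlation of pixel intensities between consecutive images; the second stage repeats the same operation on the deformation vectors. A minimal sketch of the correlation step, assuming integer-pixel shifts and circular boundaries:

    ```python
    import numpy as np

    def patch_displacement(patch_a, patch_b):
        """Integer-pixel displacement of patch_a relative to patch_b via
        FFT-based cross correlation of mean-removed pixel intensities,
        the elementary operation of the PIV stage described above."""
        a = patch_a - patch_a.mean()
        b = patch_b - patch_b.mean()
        corr = np.fft.ifft2(np.fft.fft2(a) * np.conj(np.fft.fft2(b))).real
        dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
        # wrap shifts larger than half the patch to negative displacements
        ny, nx = corr.shape
        if dy > ny // 2: dy -= ny
        if dx > nx // 2: dx -= nx
        return dy, dx

    # Synthetic check: shift a random background by (3, -2) pixels
    rng = np.random.default_rng(1)
    bg = rng.random((64, 64))
    shifted = np.roll(bg, shift=(3, -2), axis=(0, 1))
    print(patch_displacement(shifted, bg))  # expect (3, -2)
    ```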

  12. Detecting and Characterizing Repeating Earthquake Sequences During Volcanic Eruptions

    NASA Astrophysics Data System (ADS)

    Tepp, G.; Haney, M. M.; Wech, A.

    2017-12-01

    A major challenge in volcano seismology is forecasting eruptions. Repeating earthquake sequences often precede volcanic eruptions or lava dome activity, providing an opportunity for short-term eruption forecasting. Automatic detection of these sequences can lead to timely eruption notification and aid in continuous monitoring of volcanic systems. However, repeating earthquake sequences may also occur after eruptions or along with magma intrusions that do not immediately lead to an eruption. This additional challenge requires a better understanding of the processes involved in producing these sequences to distinguish those that are precursory. Calculation of the inverse moment rate and concepts from the material failure forecast method can lead to such insights. The temporal evolution of the inverse moment rate is observed to differ for precursory and non-precursory sequences, and multiple earthquake sequences may occur concurrently. These observations suggest that sequences may occur in different locations or through different processes. We developed an automated repeating earthquake sequence detector and near real-time alarm to send alerts when an in-progress sequence is identified. Near real-time inverse moment rate measurements can further improve our ability to forecast eruptions by allowing for characterization of sequences. We apply the detector to eruptions of two Alaskan volcanoes: Bogoslof in 2016-2017 and Redoubt Volcano in 2009. The Bogoslof eruption produced almost 40 repeating earthquake sequences between its start in mid-December 2016 and early June 2017, 21 of which preceded an explosive eruption, and 2 sequences in the months before eruptive activity. Three of the sequences occurred after the implementation of the alarm in late March 2017 and successfully triggered alerts. The nearest seismometers to Bogoslof are over 45 km away, requiring a detector that can work with few stations and a relatively low signal-to-noise ratio. During the Redoubt eruption, earthquake sequences were observed in the months leading up to the eruptive activity beginning in March 2009 as well as immediately preceding 7 of the 19 explosive events. In contrast to Bogoslof, Redoubt has a local monitoring network which allows for better detection and more detailed analysis of the repeating earthquake sequences.

  13. Protein sequences bound to mineral surfaces persist into deep time

    PubMed Central

    Demarchi, Beatrice; Hall, Shaun; Roncal-Herrero, Teresa; Freeman, Colin L; Woolley, Jos; Crisp, Molly K; Wilson, Julie; Fotakis, Anna; Fischer, Roman; Kessler, Benedikt M; Rakownikow Jersie-Christensen, Rosa; Olsen, Jesper V; Haile, James; Thomas, Jessica; Marean, Curtis W; Parkington, John; Presslee, Samantha; Lee-Thorp, Julia; Ditchfield, Peter; Hamilton, Jacqueline F; Ward, Martyn W; Wang, Chunting Michelle; Shaw, Marvin D; Harrison, Terry; Domínguez-Rodrigo, Manuel; MacPhee, Ross DE; Kwekason, Amandus; Ecker, Michaela; Kolska Horwitz, Liora; Chazan, Michael; Kröger, Roland; Thomas-Oates, Jane; Harding, John H; Cappellini, Enrico; Penkman, Kirsty; Collins, Matthew J

    2016-01-01

    Proteins persist longer in the fossil record than DNA, but the longevity, survival mechanisms and substrates remain contested. Here, we demonstrate the role of mineral binding in preserving the protein sequence in ostrich (Struthionidae) eggshell, including from the palaeontological sites of Laetoli (3.8 Ma) and Olduvai Gorge (1.3 Ma) in Tanzania. By tracking protein diagenesis back in time we find consistent patterns of preservation, demonstrating authenticity of the surviving sequences. Molecular dynamics simulations of struthiocalcin-1 and -2, the dominant proteins within the eggshell, reveal that distinct domains bind to the mineral surface. It is the domain with the strongest calculated binding energy to the calcite surface that is selectively preserved. Thermal age calculations demonstrate that the Laetoli and Olduvai peptides are 50 times older than any previously authenticated sequence (equivalent to ~16 Ma at a constant 10°C). DOI: http://dx.doi.org/10.7554/eLife.17092.001 PMID:27668515

  14. Photonic-Doppler-Velocimetry, Paraxial-Scalar Diffraction Theory and Simulation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ambrose, W. P.

    2015-07-20

    In this report I describe current progress on a paraxial, scalar-field theory suitable for simulating what is measured in Photonic Doppler Velocimetry (PDV) experiments in three dimensions. I have introduced a number of approximations in this work in order to bring the total computation time for one experiment down to around 20 hours. My goals were: to develop an approximate method of calculating the peak frequency in a spectral sideband at an instant of time based on an optical diffraction theory for a moving target, to compare the ‘measured’ velocity to the ‘input’ velocity to gain insights into how and to what precision PDV measures the component of the mass velocity along the optical axis, and to investigate the effects of small amounts of roughness on the measured velocity. This report illustrates the progress I have made in describing how to perform such calculations with a full three dimensional picture including tilted target, tilted mass velocity (not necessarily in the same direction), and small amounts of surface roughness. With the method established for a calculation at one instant of time, measured velocities can be simulated for a sequence of times, similar to the process of sampling velocities in experiments. Improvements in these methods are certainly possible at hugely increased computational cost. I am hopeful that readers appreciate the insights possible at the current level of approximation.
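
    The conversion at the heart of PDV, from a spectral sideband peak frequency to a line-of-sight velocity, is the Doppler relation f_beat = 2v/λ. A one-line sketch, assuming the usual 1550 nm telecom laser wavelength (not stated in the report):

    ```python
    # Velocity from a PDV sideband peak frequency: f_beat = 2 v / lambda,
    # so v = lambda * f_beat / 2. The 1550 nm wavelength is the common
    # telecom-laser choice and an assumption here, not taken from the report.
    WAVELENGTH = 1550e-9          # m

    def pdv_velocity(f_beat_hz):
        """Line-of-sight velocity implied by a spectral sideband peak."""
        return 0.5 * WAVELENGTH * f_beat_hz

    print(pdv_velocity(1.29e9))   # ~1000 m/s for a ~1.29 GHz beat frequency
    ```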

  15. Computational drug discovery

    PubMed Central

    Ou-Yang, Si-sheng; Lu, Jun-yan; Kong, Xiang-qian; Liang, Zhong-jie; Luo, Cheng; Jiang, Hualiang

    2012-01-01

    Computational drug discovery is an effective strategy for accelerating and economizing drug discovery and development process. Because of the dramatic increase in the availability of biological macromolecule and small molecule information, the applicability of computational drug discovery has been extended and broadly applied to nearly every stage in the drug discovery and development workflow, including target identification and validation, lead discovery and optimization and preclinical tests. Over the past decades, computational drug discovery methods such as molecular docking, pharmacophore modeling and mapping, de novo design, molecular similarity calculation and sequence-based virtual screening have been greatly improved. In this review, we present an overview of these important computational methods, platforms and successful applications in this field. PMID:22922346

  16. Efficient Unsteady Flow Visualization with High-Order Access Dependencies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Jiang; Guo, Hanqi; Yuan, Xiaoru

    We present a novel high-order access dependencies based model for efficient pathline computation in unsteady flow visualization. By taking longer access sequences into account to model more sophisticated data access patterns in particle tracing, our method greatly improves the accuracy and reliability in data access prediction. In our work, high-order access dependencies are calculated by tracing uniformly-seeded pathlines in both forward and backward directions in a preprocessing stage. The effectiveness of our proposed approach is demonstrated through a parallel particle tracing framework with high-order data prefetching. Results show that our method achieves higher data locality and hence improves the efficiency of pathline computation.

  17. Method to amplify variable sequences without imposing primer sequences

    DOEpatents

    Bradbury, Andrew M.; Zeytun, Ahmet

    2006-11-14

    The present invention provides methods of amplifying target sequences without including regions flanking the target sequence in the amplified product or imposing amplification primer sequences on the amplified product. Also provided are methods of preparing a library from such amplified target sequences.

  18. Precise genotyping and recombination detection of Enterovirus

    PubMed Central

    2015-01-01

    Enteroviruses (EV) with different genotypes cause diverse infectious diseases in humans and mammals. A correct EV typing result is crucial for effective medical treatment and disease control; however, the emergence of novel viral strains has impaired the performance of available diagnostic tools. Here, we present a web-based tool, named EVIDENCE (EnteroVirus In DEep conception, http://symbiont.iis.sinica.edu.tw/evidence), for EV genotyping and recombination detection. We introduce the idea of using mixed-ranking scores to evaluate the fitness of prototypes based on relatedness and on the genome regions of interest. Using phylogenetic methods, the most possible genotype is determined based on the closest neighbor among the selected references. To detect possible recombination events, EVIDENCE calculates the sequence distance and phylogenetic relationship among sequences of all sliding windows scanning over the whole genome. Detected recombination events are plotted in an interactive figure for viewing of fine details. In addition, all EV sequences available in GenBank were collected and revised using the latest classification and nomenclature of EV in EVIDENCE. These sequences are built into the database and are retrieved in an indexed catalog, or can be searched for by keywords or by sequence similarity. EVIDENCE is the first web-based tool containing pipelines for genotyping and recombination detection, with updated, built-in, and complete reference sequences to improve sensitivity and specificity. The use of EVIDENCE can accelerate genotype identification, aiding clinical diagnosis and enhancing our understanding of EV evolution. PMID:26678286
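
    The recombination-scanning step can be sketched as pairwise distances computed in windows sliding along the genome; EVIDENCE additionally builds phylogenies per window, which is omitted here. Sequence names and window sizes below are placeholders.

    ```python
    from itertools import combinations

    def sliding_window_pdistance(seqs, window=600, step=100):
        """Pairwise uncorrected p-distance in windows sliding along an
        alignment, the scanning scheme used to flag candidate
        recombination breakpoints (distances only; trees omitted)."""
        length = min(len(s) for s in seqs.values())
        results = []
        for start in range(0, length - window + 1, step):
            dists = {}
            for (na, a), (nb, b) in combinations(seqs.items(), 2):
                wa, wb = a[start:start + window], b[start:start + window]
                diff = sum(x != y for x, y in zip(wa, wb))
                dists[(na, nb)] = diff / window
            results.append((start, dists))
        return results

    # Hypothetical aligned genome fragments (names are placeholders)
    seqs = {"query": "ACGT" * 300, "ref_A": "ACGT" * 300, "ref_B": "ACGA" * 300}
    for start, d in sliding_window_pdistance(seqs, window=400, step=200)[:2]:
        print(start, d)
    ```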

  19. Draft genome sequence of an extensively drug-resistant Pseudomonas aeruginosa isolate belonging to ST644 isolated from a footpad infection in a Magellanic penguin (Spheniscus magellanicus).

    PubMed

    Sellera, Fábio P; Fernandes, Miriam R; Moura, Quézia; Souza, Tiago A; Nascimento, Cristiane L; Cerdeira, Louise; Lincopan, Nilton

    2018-03-01

    The incidence of multidrug-resistant bacteria in wildlife animals has been investigated to improve our knowledge of the spread of clinically relevant antimicrobial resistance genes. The aim of this study was to report the first draft genome sequence of an extensively drug-resistant (XDR) Pseudomonas aeruginosa ST644 isolate recovered from a Magellanic penguin with a footpad infection (bumblefoot) undergoing rehabilitation. The genome was sequenced on an Illumina NextSeq® platform using 150-bp paired-end reads. De novo genome assembly was performed using Velvet v.1.2.10, and the whole genome sequence was evaluated using bioinformatics approaches from the Center of Genomic Epidemiology, whereas an in-house method (mapping of raw whole genome sequence reads) was used to identify chromosomal point mutations. The genome size was calculated at 6,436,450 bp, with 6357 protein-coding sequences and the presence of genes conferring resistance to aminoglycosides, β-lactams, phenicols, sulphonamides, tetracyclines, quinolones and fosfomycin; in addition, mutations in the genes gyrA (Thr83Ile), parC (Ser87Leu), phoQ (Arg61His) and pmrB (Tyr345His), conferring resistance to quinolones and polymyxins, respectively, were confirmed. This draft genome sequence can provide useful information for comparative genomic analysis regarding the dissemination of clinically significant antibiotic resistance genes and XDR bacterial species at the human-animal interface. Copyright © 2017 International Society for Chemotherapy of Infection and Cancer. Published by Elsevier Ltd. All rights reserved.

  20. Phylodynamic Analysis Revealed That Epidemic of CRF07_BC Strain in Men Who Have Sex with Men Drove Its Second Spreading Wave in China.

    PubMed

    Zhang, Min; Jia, Dijing; Li, Hanping; Gui, Tao; Jia, Lei; Wang, Xiaolin; Li, Tianyi; Liu, Yongjian; Bao, Zuoyi; Liu, Siyang; Zhuang, Daomin; Li, Jingyun; Li, Lin

    2017-10-01

    CRF07_BC was originally formed in Yunnan province of China in the 1980s and spread quickly among injecting drug users (IDUs). In recent years, it has been introduced into men who have sex with men (MSM) and has become the most dominant strain in China. In this study, we performed a comprehensive phylodynamic analysis of CRF07_BC sequences from China. All CRF07_BC sequences identified in China were retrieved from the database, and further sequences obtained in our laboratory were added to make the dataset more representative. A maximum-likelihood (ML) tree was constructed with PhyML3.0. The maximum clade credibility (MCC) tree and effective population size were estimated using Markov chain Monte Carlo sampling with the Beast software. A total of 610 CRF07_BC sequences covering 1,473 bp of the gag gene (positions 817 to 2,289 in HXB2 numbering) were included in the dataset. Three epidemic clusters were identified; two clusters comprised sequences from IDUs, while one cluster mainly contained sequences from MSMs. The time of the most recent common ancestor of the cluster composed of sequences from MSMs was estimated to be in 2000. Two rapid spreading waves of the effective population size of CRF07_BC infections were identified in the skyline plot. The second wave coincided with the expansion of the MSM cluster. The results indicate that the control of CRF07_BC infections in MSMs would help to decrease its epidemic in China.

  1. Modeling of Failure Mechanisms in Composites With Z-Pins-Damage Validation of Z-Pin Reinforced Co-Cured Composite Laminates

    DTIC Science & Technology

    2011-04-01

    there is a computer implementation of the method just introduced. It uses the Scilab® programming language, and the Young's modulus is calculated as final... laminate without Z-pins, its thickness, lamina stacking sequence and the laminae's engineering elastic constants, the second Scilab® code can be used to find... EL thickness, the second Scilab® code is employed once again; this time, though, a new Young's modulus estimate would be produced. On the other hand

  2. Instantaneous relationship between solar inertial and local vertical local horizontal attitudes

    NASA Technical Reports Server (NTRS)

    Vickery, S. A.

    1977-01-01

    The instantaneous relationship between the Solar Inertial (SI) and Local Vertical Local Horizontal (LVLH) coordinate systems is derived. A method is presented for computation of the LVLH to SI rotational transformation matrix as a function of an input LVLH attitude and the corresponding look angles to the sun. Logic is provided for conversion between LVLH and SI attitudes expressed in terms of a pitch, yaw, roll Euler sequence. Documentation is included for a program which implements the logic on the Hewlett-Packard 97 programmable calculator.
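
    A sketch of the Euler-sequence step, building an attitude matrix from a pitch, yaw, roll triple. The axis assignments (pitch about y, yaw about z, roll about x) and the rotation order are a conventional assumption; the report does not spell out its body-axis definitions.

    ```python
    import numpy as np

    def rot_x(a):
        c, s = np.cos(a), np.sin(a)
        return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

    def rot_y(a):
        c, s = np.cos(a), np.sin(a)
        return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

    def rot_z(a):
        c, s = np.cos(a), np.sin(a)
        return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

    def euler_pyr_to_matrix(pitch, yaw, roll):
        """Attitude matrix for a pitch (about y), yaw (about z), roll
        (about x) Euler sequence; axis conventions are assumed here."""
        return rot_x(roll) @ rot_z(yaw) @ rot_y(pitch)

    # The LVLH -> SI transformation would then chain this attitude matrix
    # with a rotation built from the sun look angles (not reproduced here).
    att = euler_pyr_to_matrix(np.radians(10), np.radians(-5), np.radians(2))
    print(np.round(att @ att.T, 12))   # orthonormality check -> identity
    ```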

  3. Promoter classifier: software package for promoter database analysis.

    PubMed

    Gershenzon, Naum I; Ioshikhes, Ilya P

    2005-01-01

    Promoter Classifier is a package of seven stand-alone Windows-based C++ programs allowing the following basic manipulations with a set of promoter sequences: (i) calculation of positional distributions of nucleotides averaged over all promoters of the dataset; (ii) calculation of the averaged occurrence frequencies of the transcription factor binding sites and their combinations; (iii) division of the dataset into subsets of sequences containing or lacking certain promoter elements or combinations; (iv) extraction of the promoter subsets containing or lacking CpG islands around the transcription start site; and (v) calculation of spatial distributions of the promoter DNA stacking energy and bending stiffness. All programs have a user-friendly interface and provide the results in a convenient graphical form. The Promoter Classifier package is an effective tool for various basic manipulations with eukaryotic promoter sequences that usually are necessary for analysis of large promoter datasets. The program Promoter Divider is described in more detail as a representative component of the package.
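
    Manipulation (i), the positional nucleotide distribution averaged over a promoter dataset, reduces to per-position counting over TSS-aligned sequences of equal length. A minimal sketch with toy data:

    ```python
    import numpy as np

    def positional_nucleotide_freqs(promoters):
        """Fraction of A/C/G/T at each position of equal-length,
        TSS-aligned promoter sequences (manipulation (i) above)."""
        L = len(promoters[0])
        counts = {nt: np.zeros(L) for nt in "ACGT"}
        for seq in promoters:
            for i, nt in enumerate(seq.upper()):
                if nt in counts:
                    counts[nt][i] += 1
        return {nt: c / len(promoters) for nt, c in counts.items()}

    promoters = ["TATAAAGC", "TATAATGC", "CATAAAGT"]   # toy TSS-aligned set
    freqs = positional_nucleotide_freqs(promoters)
    print(freqs["A"])   # A frequency per position
    ```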

  4. Leveraging transcript quantification for fast computation of alternative splicing profiles.

    PubMed

    Alamancos, Gael P; Pagès, Amadís; Trincado, Juan L; Bellora, Nicolás; Eyras, Eduardo

    2015-09-01

    Alternative splicing plays an essential role in many cellular processes and bears major relevance in the understanding of multiple diseases, including cancer. High-throughput RNA sequencing allows genome-wide analyses of splicing across multiple conditions. However, the increasing number of available data sets represents a major challenge in terms of computation time and storage requirements. We describe SUPPA, a computational tool to calculate relative inclusion values of alternative splicing events, exploiting fast transcript quantification. SUPPA's accuracy is comparable to, and sometimes better than, that of standard methods on simulated as well as real RNA-sequencing data, benchmarked against experimentally validated events. We assess the variability in terms of the choice of annotation and provide evidence that using complete transcripts rather than more transcripts per gene provides better estimates. Moreover, SUPPA coupled with de novo transcript reconstruction methods does not achieve accuracies as high as using quantification of known transcripts, but remains comparable to existing methods. Finally, we show that SUPPA is more than 1000 times faster than standard methods. Coupled with fast transcript quantification, SUPPA provides inclusion values at a much higher speed than existing methods without compromising accuracy, thereby facilitating the systematic splicing analysis of large data sets with limited computational resources. The software is implemented in Python 2.7 and is available under the MIT license at https://bitbucket.org/regulatorygenomicsupf/suppa. © 2015 Alamancos et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
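
    The core quantity SUPPA computes, the relative inclusion (PSI) of an event from transcript abundances, reduces to a ratio of summed TPMs. A minimal sketch with a hypothetical exon-skipping event:

    ```python
    def event_psi(inclusion_tpm, total_tpm):
        """Relative inclusion (PSI) of a splicing event from transcript
        quantification: abundance of transcripts containing the inclusion
        form over the abundance of all transcripts defining the event."""
        total = sum(total_tpm)
        return sum(inclusion_tpm) / total if total > 0 else float("nan")

    # Hypothetical exon-skipping event: two inclusion isoforms, one skipping
    inclusion = [12.0, 3.0]            # TPMs of inclusion transcripts
    everything = [12.0, 3.0, 5.0]      # TPMs of all transcripts in the event
    print(event_psi(inclusion, everything))   # 0.75
    ```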

  5. Human gait recognition by pyramid of HOG feature on silhouette images

    NASA Astrophysics Data System (ADS)

    Yang, Guang; Yin, Yafeng; Park, Jeanrok; Man, Hong

    2013-03-01

    As an uncommon biometric modality, human gait recognition has the great advantage of identifying people at a distance without requiring high-resolution images. It has attracted much attention in recent years, especially in the fields of computer vision and remote sensing. In this paper, we propose a human gait recognition framework that consists of a reliable background subtraction method followed by pyramid of Histogram of Gradient (pHOG) feature extraction on the silhouette image, and a Hidden Markov Model (HMM) based classifier. Through background subtraction, the silhouette of the human gait in each frame is extracted from the raw video sequence and normalized. After removing the shadow and noise in each region of interest (ROI), the pHOG feature is computed on the silhouette images. The pHOG features of each gait class are then used to train a corresponding HMM. In the test stage, the pHOG feature is extracted from each test sequence and used to calculate the posterior probability for each trained HMM. Experimental results on the CASIA Gait Dataset B1 demonstrate that our proposed method achieves a very competitive recognition rate.

  6. Adaptive metric learning with deep neural networks for video-based facial expression recognition

    NASA Astrophysics Data System (ADS)

    Liu, Xiaofeng; Ge, Yubin; Yang, Chao; Jia, Ping

    2018-01-01

    Video-based facial expression recognition has become increasingly important for many real-world applications. Although numerous efforts have been made for single sequences, balancing the complex distribution of intra- and interclass variations between sequences remains a great difficulty in this area. We propose the adaptive (N+M)-tuplet clusters loss function and optimize it together with the softmax loss in the training phase. The variations introduced by personal attributes are alleviated using similarity measurements of multiple samples in the feature space, with far fewer comparisons than conventional deep metric learning approaches, which enables metric calculation for large-scale data applications (e.g., videos). Both the spatial and temporal relations are well explored by a unified framework that consists of an Inception-ResNet network with long short-term memory and a two-branch fully connected layer structure. Our proposed method has been evaluated on three well-known databases, and the experimental results show that it outperforms many state-of-the-art approaches.

  7. Matching tire tracks on the head using forensic photogrammetry.

    PubMed

    Thali, M J; Braun, M; Brüschweiler, W; Dirnhofer, R

    2000-09-11

    In the documentation of forensically relevant injuries, forensic CAD-supported photogrammetry plays an important role from the reconstructive point of view, particularly when a detailed 3-D reconstruction is vital. This is demonstrated with a soft-tissue injury to the face caused by being run over by a car tire. Since the objects to be investigated (the injury and the surface of the tire) are evaluated in virtual space, they must be photographed in series. These photo sequences are then evaluated with the RolleiMetric multi-image evaluation system, which measures and calculates the spatial location of points shown in the photo sequences and creates 3-D data models of the objects. In a 3-D CAD program, the model of the injury is then compared against the model of the possible injury-causing instrument. The validation of forensic CAD-supported photogrammetry, shown by the perfect 3-D match between the tire tread and the facial injury, demonstrates how greatly this 3-D method surpasses the classic 2-D overlay method (one-to-one photography).

  8. Normal and compound poisson approximations for pattern occurrences in NGS reads.

    PubMed

    Zhai, Zhiyuan; Reinert, Gesine; Song, Kai; Waterman, Michael S; Luan, Yihui; Sun, Fengzhu

    2012-06-01

    Next generation sequencing (NGS) technologies are now widely used in many biological studies. In NGS, sequence reads are randomly sampled from the genome sequence of interest. Most computational approaches for NGS data first map the reads to the genome and then analyze the data based on the mapped reads. Since many organisms have unknown genome sequences and many reads cannot be uniquely mapped to the genomes even if the genome sequences are known, alternative analytical methods are needed for the study of NGS data. Here we suggest using word patterns to analyze NGS data. Word pattern counting (the study of the probabilistic distribution of the number of occurrences of word patterns in one or multiple long sequences) has played an important role in molecular sequence analysis. However, no studies are available on the distribution of the number of occurrences of word patterns in NGS reads. In this article, we build probabilistic models for the background sequence and the sampling process of the sequence reads from the genome. Based on the models, we provide normal and compound Poisson approximations for the number of occurrences of word patterns from the sequence reads, with bounds on the approximation error. The main challenge is to consider the randomness in generating the long background sequence, as well as in the sampling of the reads using NGS. We show the accuracy of these approximations under a variety of conditions for different patterns with various characteristics. Under realistic assumptions, the compound Poisson approximation seems to outperform the normal approximation in most situations. These approximate distributions can be used to evaluate the statistical significance of the occurrence of patterns from NGS data. The theory and the computational algorithm for calculating the approximate distributions are then used to analyze ChIP-Seq data using transcription factor GABP. Software is available online (www-rcf.usc.edu/∼fsun/Programs/NGS_motif_power/NGS_motif_power.html). In addition, Supplementary Material can be found online (www.liebertonline.com/cmb).
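
    A minimal sketch of word-pattern counting in reads, together with a naive normal approximation under an i.i.d. background. The i.i.d. model and the binomial variance below are simplifications; the paper's approximations additionally handle pattern overlap structure and the randomness of read sampling.

    ```python
    import math

    def pattern_count(reads, pattern):
        """Total occurrences (with overlaps) of a word pattern in a set of reads."""
        k, n = len(pattern), 0
        for r in reads:
            n += sum(1 for i in range(len(r) - k + 1) if r[i:i + k] == pattern)
        return n

    def normal_z(reads, pattern, p_char=0.25):
        """z-score of the observed count under an i.i.d. background model
        with a binomial variance, a deliberate simplification of the
        paper's normal and compound Poisson approximations."""
        k = len(pattern)
        positions = sum(max(len(r) - k + 1, 0) for r in reads)
        p = p_char ** k                    # occurrence probability per position
        mean, var = positions * p, positions * p * (1 - p)
        obs = pattern_count(reads, pattern)
        return (obs - mean) / math.sqrt(var)

    reads = ["ACGTACGTGACGT", "TTACGTTTACGA", "GGGACGTACGTA"]
    print(pattern_count(reads, "ACGT"), round(normal_z(reads, "ACGT"), 2))
    ```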

  9. A Homogenization Approach for Design and Simulation of Blast Resistant Composites

    NASA Astrophysics Data System (ADS)

    Sheyka, Michael

    Structural composites have been used in aerospace and structural engineering due to their high strength to weight ratio. Composite laminates have been successfully and extensively used in blast mitigation. This dissertation examines the use of the homogenization approach to design and simulate blast resistant composites. Three case studies are performed to examine the usefulness of different methods that may be used in designing and optimizing composite plates for blast resistance. The first case study utilizes a single degree of freedom system to simulate the blast and a reliability based approach. The first case study examines homogeneous plates and the optimal stacking sequence and plate thicknesses are determined. The second and third case studies use the homogenization method to calculate the properties of composite unit cell made of two different materials. The methods are integrated with dynamic simulation environments and advanced optimization algorithms. The second case study is 2-D and uses an implicit blast simulation, while the third case study is 3-D and simulates blast using the explicit blast method. Both case studies 2 and 3 rely on multi-objective genetic algorithms for the optimization process. Pareto optimal solutions are determined in case studies 2 and 3. Case study 3 is an integrative method for determining optimal stacking sequence, microstructure and plate thicknesses. The validity of the different methods such as homogenization, reliability, explicit blast modeling and multi-objective genetic algorithms are discussed. Possible extension of the methods to include strain rate effects and parallel computation is also examined.

  10. A sequence database allowing automated genotyping of Classical swine fever virus isolates.

    PubMed

    Dreier, Sabrina; Zimmermann, Bernd; Moennig, Volker; Greiser-Wilke, Irene

    2007-03-01

    Classical swine fever (CSF) is a highly contagious viral disease of pigs. According to the OIE classification of diseases it is a notifiable (previously List A) disease, with the potential to cause severe socio-economic problems and to severely affect international trade in pigs and pig products. Effective control measures are compulsory, and reliable tracing of the spread of the virus is necessary to expose weaknesses. Genetic typing has proved to be the method of choice. However, genotyping involves the use of multiple software applications, which is laborious and complex. We describe the implementation of a sequence database, accessible via the World Wide Web, with the option to automatically type new CSF virus isolates once their sequence is available. The sequence to be typed is checked for correct orientation and, if necessary, adjusted to the right length. An alignment and a neighbor-joining phylogenetic analysis with a standard set of sequences can then be calculated, and the results are displayed as a graph. As an example, we show the determination of the genetic subgroup of the isolate obtained from the outbreaks registered in Russia in 2005. After registration (Irene.greiser-wilke@tiho-hannover.de), the database, including the genotyping module, is accessible under http://viro08.tiho-hannover.de/eg/eurl_virus_db.htm.

  11. dbWGFP: a database and web server of human whole-genome single nucleotide variants and their functional predictions.

    PubMed

    Wu, Jiaxin; Wu, Mengmeng; Li, Lianshuo; Liu, Zhuo; Zeng, Wanwen; Jiang, Rui

    2016-01-01

    The recent advancement of the next generation sequencing technology has enabled the fast and low-cost detection of all genetic variants spreading across the entire human genome, making the application of whole-genome sequencing a tendency in the study of disease-causing genetic variants. Nevertheless, there still lacks a repository that collects predictions of functionally damaging effects of human genetic variants, though it has been well recognized that such predictions play a central role in the analysis of whole-genome sequencing data. To fill this gap, we developed a database named dbWGFP (a database and web server of human whole-genome single nucleotide variants and their functional predictions) that contains functional predictions and annotations of nearly 8.58 billion possible human whole-genome single nucleotide variants. Specifically, this database integrates 48 functional predictions calculated by 17 popular computational methods and 44 valuable annotations obtained from various data sources. Standalone software, user-friendly query services and free downloads of this database are available at http://bioinfo.au.tsinghua.edu.cn/dbwgfp. dbWGFP provides a valuable resource for the analysis of whole-genome sequencing, exome sequencing and SNP array data, thereby complementing existing data sources and computational resources in deciphering genetic bases of human inherited diseases. © The Author(s) 2016. Published by Oxford University Press.

  12. Network-based simulation of aircraft at gates in airport terminals

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cheng, Y.

    1998-03-01

    Simulation is becoming an essential tool for planning, design, and management of airport facilities. A simulation of aircraft at gates at an airport can be applied for various periodically performed applications, relating to the dynamic behavior of aircraft at gates in airport terminals for analyses, evaluations, and decision supports. Conventionally, such simulations are implemented using an event-driven method. For a more efficient simulation, this paper proposes a network-based method. The basic idea is to transform all the sequence constraint relations of aircraft at gates into a network. The simulation is done by calculating the longest path to all the nodes in the network. The effectiveness of the proposed method has been examined by experiments, and its superiority over the event-driven method is revealed through comprehensive comparisons of their overall simulation performance.
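
    The computational core described, longest paths to all nodes of the sequence-constraint network, can be sketched as a topological-order relaxation over a DAG. The edge weights and toy network below are illustrative assumptions.

    ```python
    from collections import defaultdict, deque

    def longest_paths(n_nodes, edges):
        """Longest path to every node of a DAG whose weighted edges encode
        sequence constraints (e.g., aircraft j may enter a gate only some
        time after aircraft i leaves), processed in topological order."""
        adj = defaultdict(list)
        indeg = [0] * n_nodes
        for u, v, w in edges:
            adj[u].append((v, w))
            indeg[v] += 1

        dist = [0.0] * n_nodes                 # event times; sources start at 0
        queue = deque(i for i in range(n_nodes) if indeg[i] == 0)
        while queue:
            u = queue.popleft()
            for v, w in adj[u]:
                dist[v] = max(dist[v], dist[u] + w)
                indeg[v] -= 1
                if indeg[v] == 0:
                    queue.append(v)
        return dist

    # Toy constraint network: node = event, weight = minutes between events
    edges = [(0, 1, 45), (0, 2, 30), (1, 3, 50), (2, 3, 40), (2, 4, 60)]
    print(longest_paths(5, edges))   # [0, 45, 30, 95, 90]
    ```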

  13. Determining the multi-scale hedge ratios of stock index futures using the lower partial moments method

    NASA Astrophysics Data System (ADS)

    Dai, Jun; Zhou, Haigang; Zhao, Shaoquan

    2017-01-01

    This paper considers a multi-scale futures hedge strategy that minimizes lower partial moments (LPM). To do this, wavelet analysis is adopted to decompose time series data into different components. Next, different parametric estimation methods with known distributions are applied to calculate the LPM of hedged portfolios, which is the key to determining multi-scale hedge ratios over different time scales. These parametric methods are then compared with the prevailing nonparametric kernel metric method. Empirical results indicate that in the China Securities Index 300 (CSI 300) index futures and spot markets, hedge ratios and hedge efficiency estimated by the nonparametric kernel metric method are inferior to those estimated by a parametric hedging model based on the features of the sequence distributions. In addition, if minimum LPM is selected as the hedge target, the hedging period, degree of risk aversion, and target return each affect the multi-scale hedge ratios and hedge efficiency.
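
    A minimal sketch of the target function, the sample lower partial moment, and of a minimum-LPM hedge ratio found by grid search. The grid search is a stand-in for the parametric and kernel estimators compared in the paper.

    ```python
    import numpy as np

    def lpm(returns, target=0.0, order=2):
        """Sample lower partial moment of a given order about a target
        return: LPM_n(tau) = E[ max(tau - R, 0)^n ]."""
        shortfall = np.maximum(target - np.asarray(returns), 0.0)
        return np.mean(shortfall ** order)

    def lpm_hedge_ratio(spot, futures, target=0.0, order=2, grid=None):
        """Hedge ratio h minimizing the LPM of the hedged portfolio
        R_h = R_spot - h * R_futures, by simple grid search."""
        grid = np.linspace(0.0, 2.0, 201) if grid is None else grid
        risks = [lpm(spot - h * futures, target, order) for h in grid]
        return grid[int(np.argmin(risks))]

    rng = np.random.default_rng(42)
    f = rng.normal(0, 0.01, 1000)                 # futures returns
    s = 0.9 * f + rng.normal(0, 0.004, 1000)      # correlated spot returns
    print(lpm_hedge_ratio(s, f))                  # close to 0.9
    ```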

  14. Effect of sequencing of complementary feeding in relation to breast-feeding on total intake in infants.

    PubMed

    Shah, Dheeraj; Singh, Meenakshi; Gupta, Piyush; Faridi, M M A

    2014-03-01

    The aim of the present study was to evaluate whether the order of complementary feeding in relation to breast-feeding affects breast milk, semisolid, or total energy intake in infants. The present study was designed as a randomized crossover trial. The study was conducted in a tertiary care hospital. The study participants were 25 healthy infants between the ages of 7 and 11 months who were exclusively breast-fed for at least 6 months and were now receiving complementary foods for at least 1 month in addition to breast-feeding. Infants were randomized to follow a sequence of either complementary feeding before breast-feeding (sequence A) or complementary feeding after breast-feeding (sequence B) for the first day (24 hours) of the study period using simple randomization. For the next day, the sequence was reversed for each child. All babies received 3 actively fed complementary food meals per day (morning, afternoon, and evening). A semisolid study diet was prepared in the hospital by cooking rice and pulse with oil using a standard method, ensuring the energy density of at least 0.6 kcal/g. The infants were allowed ad libitum breast-feeding during the observation period. Semisolid intake was directly measured and breast milk intake was quantified by test weighing method. Energy intake from complementary foods was calculated from the product of energy density of the diet served on that day and the total amount consumed. The total energy intake and energy intake from breast milk and complementary foods between the 2 sequences were compared. The mean (standard deviation) energy intake from breast milk during 12 hours of daytime by following sequence A (complementary feeding before breast-feeding) was 132.0 (67.4) kcal in comparison with 135.9 (56.2) kcal in sequence B, which was not statistically different (P = 0.83). The mean (standard deviation) energy consumed from semisolids in sequences A and B was also comparable (88.6 [75.5] kcal vs. 85.5 [89.7] kcal; P = 0.58). The total energy intake during daytime in sequence A was 220.6 (96.2) kcal in comparison with 221.5 (94.0) kcal in sequence B, which was also comparable (P = 0.97). The results related to energy intake through breast milk and total energy intake were not different when insensible losses during feeding were adjusted in both groups. Altering the sequence of complementary feeding in relation to breast-feeding does not affect total energy intake.

  15. Soil Microbial Functional and Fungal Diversity as Influenced by Municipal Sewage Sludge Accumulation

    PubMed Central

    Frąc, Magdalena; Oszust, Karolina; Lipiec, Jerzy; Jezierska-Tys, Stefania; Nwaichi, Eucharia Oluchi

    2014-01-01

    Safe disposal of municipal sewage sludge is a challenging global environmental concern. The aim of this study was to assess the response of soil microbial functional diversity to the accumulation of municipal sewage sludge during landfill storage. Soil samples of a municipal sewage sludge (SS) and from a sewage sludge landfill that was 3 m from a SS landfill (SS3) were analyzed relative to an undisturbed reference soil. Biolog EcoPlates™ were inoculated with a soil suspension, and the Average Well Color Development (AWCD), Richness (R) and Shannon-Weaver index (H) were calculated to interpret the results. The fungi isolated from the sewage sludge were identified using comparative rDNA sequencing of the LSU D2 region. The MicroSEQ® ID software was used to assess the raw sequence files, perform sequence matching to the MicroSEQ® ID-validated reference database and create Neighbor-Joining trees. Moreover, the genera of fungi isolated from the soil were identified using microscopic methods. Municipal sewage sludge can serve as a habitat for plant pathogens and as a source of pathogen strains for biotechnological applications. PMID:25170681

  16. Soil microbial functional and fungal diversity as influenced by municipal sewage sludge accumulation.

    PubMed

    Frąc, Magdalena; Oszust, Karolina; Lipiec, Jerzy; Jezierska-Tys, Stefania; Nwaichi, Eucharia Oluchi

    2014-08-28

    Safe disposal of municipal sewage sludge is a challenging global environmental concern. The aim of this study was to assess the response of soil microbial functional diversity to the accumulation of municipal sewage sludge during landfill storage. Soil samples of a municipal sewage sludge (SS) and from a sewage sludge landfill that was 3 m from a SS landfill (SS3) were analyzed relative to an undisturbed reference soil. Biolog EcoPlates™ were inoculated with a soil suspension, and the Average Well Color Development (AWCD), Richness (R) and Shannon-Weaver index (H) were calculated to interpret the results. The fungi isolated from the sewage sludge were identified using comparative rDNA sequencing of the LSU D2 region. The MicroSEQ® ID software was used to assess the raw sequence files, perform sequence matching to the MicroSEQ® ID-validated reference database and create Neighbor-Joining trees. Moreover, the genera of fungi isolated from the soil were identified using microscopic methods. Municipal sewage sludge can serve as a habitat for plant pathogens and as a source of pathogen strains for biotechnological applications.
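
    The three plate-level summaries used in both versions of this study (AWCD, Richness and the Shannon-Weaver index) can be sketched directly from blank-corrected well optical densities. The 0.25 OD richness threshold is a common convention and an assumption here.

    ```python
    import numpy as np

    def ecoplate_indices(od, threshold=0.25):
        """Functional-diversity summaries from one Biolog EcoPlate reading:
        od is the blank-corrected optical density of the 31 substrate wells.
        AWCD = mean well colour development; R = wells above threshold;
        H = Shannon-Weaver index over relative colour development."""
        od = np.clip(np.asarray(od, dtype=float), 0.0, None)
        awcd = od.mean()
        richness = int((od > threshold).sum())
        p = od[od > 0] / od.sum()
        shannon = float(-(p * np.log(p)).sum())
        return awcd, richness, shannon

    rng = np.random.default_rng(7)
    od = rng.gamma(shape=1.5, scale=0.4, size=31)   # synthetic plate reading
    print(ecoplate_indices(od))
    ```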

  17. INTERDISCIPLINARY PHYSICS AND RELATED AREAS OF SCIENCE AND TECHNOLOGY: Relaxation Property and Stability Analysis of the Quasispecies Models

    NASA Astrophysics Data System (ADS)

    Feng, Xiao-Li; Li, Yu-Xiao; Gu, Jian-Zhong; Zhuo, Yi-Zhong

    2009-10-01

    The relaxation properties of both the Eigen model and the Crow-Kimura model with a single-peak fitness landscape are studied from the phase-transition point of view. We first analyze the eigenvalue spectra of the replication-mutation matrices. For sufficiently long sequences, the near-crossing point between the largest and second-largest eigenvalues locates the error threshold, at which critical slowing down appears. We calculate the critical exponent in the limit of infinite sequence length and compare it with the result from numerical curve fitting at sufficiently long sequences. We find that for both models the relaxation time diverges with exponent 1 at the error (mutation) threshold. Results obtained from both methods agree quite well. The diverging correlation length further confirms the first-order phase transition. Finally, with linear stability theory, we show that the two model systems are stable over the whole range of mutation rates: the Eigen model is asymptotically stable in terms of mutant classes, and the Crow-Kimura model is completely stable.

  18. [Determination of genetic bases of auxotrophy in Yersinia pestis ssp. caucasica strains].

    PubMed

    Odinokov, G N; Eroshenko, G A; Kukleva, L M; Shavina, N Iu; Krasnov, Ia M; Kutyrev, V V

    2012-04-01

    Based on the results of computer analysis of nucleotide sequences of Yersinia pestis and Y. pseudotuberculosis strains recorded in the NCBI GenBank database, differences were found between the genes argA, aroG, aroF, thiH, and thiG of strain Pestoides F (subspecies caucasica) and those of other strains of the plague agent and the pseudotuberculosis microbe. Using PCR with calculated primers and sequence analysis, the structure of the variable regions of these genes was studied in 96 natural Y. pestis and Y. pseudotuberculosis strains. It was shown that all examined strains of subspecies caucasica, unlike strains of the plague agent of other subspecies and the pseudotuberculosis microbe, had identical mutations in the genes argA (integration of the insertion sequence IS100), aroG (insertion of ten nucleotides), aroF (insertion of IS100), thiH (insertion of nucleotide T), and thiG (deletion of 13 nucleotides). These mutations are the reason why strains of this subspecies lack the ability to synthesize arginine, phenylalanine, tyrosine, and vitamin B1 (thiamine), and they cause the strains' auxotrophy for these growth factors.

  19. Static-stress impact of the 1992 Landers earthquake sequence on nucleation and slip at the site of the 1999 M=7.1 Hector Mine earthquake, southern California

    USGS Publications Warehouse

    Parsons, Tom; Dreger, Douglas S.

    2000-01-01

    The proximity in time (∼7 years) and space (∼20 km) between the 1992 M=7.3 Landers earthquake and the 1999 M=7.1 Hector Mine event suggests a possible link between the quakes. We thus calculated the static stress changes following the 1992 Joshua Tree/Landers/Big Bear earthquake sequence on the 1999 M=7.1 Hector Mine rupture plane in southern California. Resolving the stress tensor into rake-parallel and fault-normal components and comparing with changes in the post-Landers seismicity rate allows us to estimate a coefficient of friction on the Hector Mine plane. Seismicity following the 1992 sequence increased at Hector Mine where the fault was unclamped. This increase occurred despite a calculated reduction in right-lateral shear stress. The dependence of seismicity change primarily on normal stress change implies a high coefficient of static friction (µ≥0.8). We calculated the Coulomb stress change using µ=0.8 and found that the Hector Mine hypocenter was mildly encouraged (0.5 bars) by the 1992 earthquake sequence. In addition, the region of peak slip during the Hector Mine quake occurred where Coulomb stress is calculated to have increased by 0.5–1.5 bars. In general, slip was more limited where Coulomb stress was reduced, though there was some slip where the strongest stress decrease was calculated. Interestingly, many smaller earthquakes nucleated at or near the 1999 Hector Mine hypocenter after 1992, but only in 1999 did an event spread to become a M=7.1 earthquake.
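
    The quantity at the centre of this analysis, the static Coulomb failure stress change on a receiver fault, is ΔCFS = Δτ + µΔσn, with shear stress positive in the rake direction and normal stress positive for unclamping. A one-line sketch using the friction value inferred above; the input stress values are illustrative only.

    ```python
    def coulomb_stress_change(d_shear, d_normal, mu=0.8):
        """Static Coulomb failure stress change on a receiver fault:
        dCFS = d_tau + mu * d_sigma_n, shear positive in the rake
        direction, normal stress positive for unclamping."""
        return d_shear + mu * d_normal

    # Illustrative numbers echoing the situation described above: a small
    # right-lateral shear reduction outweighed by unclamping of the fault.
    print(coulomb_stress_change(d_shear=-0.3, d_normal=1.0))   # +0.5 bars
    ```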

  20. Development of a real-time PCR method for the differential detection and quantification of four solanaceae in GMO analysis: potato (Solanum tuberosum), tomato (Solanum lycopersicum), eggplant (Solanum melongena), and pepper (Capsicum annuum).

    PubMed

    Chaouachi, Maher; El Malki, Redouane; Berard, Aurélie; Romaniuk, Marcel; Laval, Valérie; Brunel, Dominique; Bertheau, Yves

    2008-03-26

    The labeling of products containing genetically modified organisms (GMO) is linked to their quantification since a threshold for the presence of fortuitous GMOs in food has been established. This threshold is calculated from a combination of two absolute quantification values: one for the specific GMO target and the second for an endogenous reference gene specific to the taxon. Thus, the development of reliable methods to quantify GMOs using endogenous reference genes in complex matrixes such as food and feed is needed. Plant identification can be difficult in the case of closely related taxa, which moreover are subject to introgression events. Based on the homology of beta-fructosidase sequences obtained from public databases, two couples of consensus primers were designed for the detection, quantification, and differentiation of four Solanaceae: potato (Solanum tuberosum), tomato (Solanum lycopersicum), pepper (Capsicum annuum), and eggplant (Solanum melongena). Sequence variability was studied first using lines and cultivars (intraspecies sequence variability), then using taxa involved in gene introgressions, and finally, using taxonomically close taxa (interspecies sequence variability). This study allowed us to design four highly specific TaqMan-MGB probes. A duplex real time PCR assay was developed for simultaneous quantification of tomato and potato. For eggplant and pepper, only simplex real time PCR tests were developed. The results demonstrated the high specificity and sensitivity of the assays. We therefore conclude that beta-fructosidase can be used as an endogenous reference gene for GMO analysis.

  1. A computational framework to empower probabilistic protein design

    PubMed Central

    Fromer, Menachem; Yanover, Chen

    2008-01-01

    Motivation: The task of engineering a protein to perform a target biological function is known as protein design. A commonly used paradigm casts this functional design problem as a structural one, assuming a fixed backbone. In probabilistic protein design, positional amino acid probabilities are used to create a random library of sequences to be simultaneously screened for biological activity. Clearly, certain choices of probability distributions will be more successful in yielding functional sequences. However, since the number of sequences is exponential in protein length, computational optimization of the distribution is difficult. Results: In this paper, we develop a computational framework for probabilistic protein design following the structural paradigm. We formulate the distribution of sequences for a structure using the Boltzmann distribution over their free energies. The corresponding probabilistic graphical model is constructed, and we apply belief propagation (BP) to calculate marginal amino acid probabilities. We test this method on a large structural dataset and demonstrate the superiority of BP over previous methods. Nevertheless, since the results obtained by BP are far from optimal, we thoroughly assess the paradigm using high-quality experimental data. We demonstrate that, for small scale sub-problems, BP attains identical results to those produced by exact inference on the paradigmatic model. However, quantitative analysis shows that the distributions predicted significantly differ from the experimental data. These findings, along with the excellent performance we observed using BP on the smaller problems, suggest potential shortcomings of the paradigm. We conclude with a discussion of how it may be improved in the future. Contact: fromer@cs.huji.ac.il PMID:18586717
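
    The paradigmatic model assigns sequences Boltzmann probabilities p(s) ∝ exp(-E(s)/kT), and the positional amino acid marginals follow by summing over sequences. A sketch by exact enumeration on a toy alphabet; in the paper, belief propagation approximates these marginals because real sequence spaces are exponentially large. The energy function below is hypothetical.

    ```python
    import itertools
    import numpy as np

    AMINO = "AVL"   # tiny 3-letter alphabet so exact enumeration is feasible

    def boltzmann_marginals(energy_fn, n_pos, kT=1.0):
        """Exact positional amino acid marginals of the Boltzmann
        distribution p(seq) ~ exp(-E(seq)/kT). Enumeration stands in for
        the belief propagation used in the paper."""
        seqs = list(itertools.product(AMINO, repeat=n_pos))
        w = np.array([np.exp(-energy_fn(s) / kT) for s in seqs])
        w /= w.sum()
        marg = np.zeros((n_pos, len(AMINO)))
        for seq, p in zip(seqs, w):
            for i, aa in enumerate(seq):
                marg[i, AMINO.index(aa)] += p
        return marg

    def toy_energy(seq):
        """Hypothetical pairwise contact energy favouring V-V neighbours."""
        return sum(0.0 if (a, b) == ("V", "V") else 1.0
                   for a, b in zip(seq, seq[1:]))

    print(np.round(boltzmann_marginals(toy_energy, n_pos=4), 3))
    ```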

  2. Sequence information signal processor for local and global string comparisons

    DOEpatents

    Peterson, John C.; Chow, Edward T.; Waterman, Michael S.; Hunkapillar, Timothy J.

    1997-01-01

    A sequence information signal processing integrated circuit chip designed to perform high speed calculation of a dynamic programming algorithm based upon the algorithm defined by Waterman and Smith. The signal processing chip of the present invention is designed to be a building block of a linear systolic array, the performance of which can be increased by connecting additional sequence information signal processing chips to the array. The chip provides a high speed, low cost linear array processor that can locate highly similar global sequences or segments thereof such as contiguous subsequences from two different DNA or protein sequences. The chip is implemented in a preferred embodiment using CMOS VLSI technology to provide the equivalent of about 400,000 transistors or 100,000 gates. Each chip provides 16 processing elements, and is designed to provide 16 bit, two's complement operation for maximum score precision of between -32,768 and +32,767. It is designed to provide a comparison between sequences as long as 4,194,304 elements without external software and between sequences of unlimited numbers of elements with the aid of external software. Each sequence can be assigned different deletion and insertion weight functions. Each processor is provided with a similarity measure device which is independently variable. Thus, each processor can contribute to maximum value score calculation using a different similarity measure.
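
    The recurrence the chip's processing elements evaluate is the Smith-Waterman local alignment dynamic program. A plain-software sketch with linear gap weights (the hardware supports per-sequence deletion and insertion weight functions, which are omitted here):

    ```python
    def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
        """Local alignment score of Smith & Waterman with linear gap
        weights, the recurrence evaluated by the systolic array."""
        rows, cols = len(a) + 1, len(b) + 1
        H = [[0] * cols for _ in range(rows)]
        best = 0
        for i in range(1, rows):
            for j in range(1, cols):
                s = match if a[i - 1] == b[j - 1] else mismatch
                H[i][j] = max(0,
                              H[i - 1][j - 1] + s,   # substitution
                              H[i - 1][j] + gap,     # deletion
                              H[i][j - 1] + gap)     # insertion
                best = max(best, H[i][j])
        return best

    print(smith_waterman("ACACACTA", "AGCACACA"))   # classic test pair -> 12
    ```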

  3. Prediction of new high pressure structural sequence in thorium carbide: A first principles study

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sahoo, B. D., E-mail: bdsahoo@barc.gov.in; Joshi, K. D.; Gupta, Satish C.

    2015-05-14

    In the present work, we report detailed electronic band structure calculations on thorium monocarbide. The comparison of enthalpies, derived for various phases using an evolutionary structure search method in conjunction with first principles total energy calculations at several hydrostatic compressions, yielded a high pressure structural sequence of NaCl type (B1) → Pnma → Cmcm → CsCl type (B2) at hydrostatic pressures of ∼19 GPa, 36 GPa, and 200 GPa, respectively. However, the two high pressure experimental studies by Gerward et al. [J. Appl. Crystallogr. 19, 308 (1986); J. Less-Common Met. 161, L11 (1990)], one up to 36 GPa and the other up to 50 GPa, on substoichiometric thorium carbide samples with a carbon deficiency of ∼20%, do not report any structural transition. The discrepancy between theory and experiment could be due to the non-stoichiometry of the thorium carbide samples used in the experiments. Further, in order to substantiate the results of our static lattice calculations, we have determined the phonon dispersion relations for these structures from lattice dynamics calculations. The theoretically calculated phonon spectra reveal that the B1 phase fails dynamically at ∼33.8 GPa, whereas the Pnma phase appears as a dynamically stable structure around the B1 to Pnma transition pressure. Similarly, the Cmcm structure also displays dynamic stability in the regime of its structural stability. The B2 phase becomes dynamically stable much below the Cmcm to B2 transition pressure. Additionally, we have derived various thermophysical properties such as the zero pressure equilibrium volume, bulk modulus, its pressure derivative, Debye temperature, thermal expansion coefficient and Gruneisen parameter at 300 K and compared these with available experimental data. Further, the behavior of the zero pressure bulk modulus, heat capacity and Helmholtz free energy has been examined as a function of temperature and compared with the experimental data of Danan [J. Nucl. Mater. 57, 280 (1975)].

  4. Acoustic-Seismic Mixed Feature Extraction Based on Wavelet Transform for Vehicle Classification in Wireless Sensor Networks.

    PubMed

    Zhang, Heng; Pan, Zhongming; Zhang, Wenna

    2018-06-07

    An acoustic-seismic mixed feature extraction method based on the wavelet coefficient energy ratio (WCER) of the target signal is proposed in this study for classifying vehicle targets in wireless sensor networks. The signal was decomposed into a set of wavelet coefficients using the à trous algorithm, which is a concise method used to implement the wavelet transform of a discrete signal sequence. After the wavelet coefficients of the target acoustic and seismic signals were obtained, the energy ratio of each layer coefficient was calculated as the feature vector of the target signals. Subsequently, the acoustic and seismic features were merged into an acoustic-seismic mixed feature to improve the target classification accuracy after the acoustic and seismic WCER features of the target signal were simplified using the hierarchical clustering method. We selected the support vector machine method for classification and utilized the data acquired from a real-world experiment to validate the proposed method. The calculated results show that the WCER feature extraction method can effectively extract the target features from target signals. Feature simplification can reduce the time consumption of feature extraction and classification, with no effect on the target classification accuracy. The use of acoustic-seismic mixed features effectively improved target classification accuracy by approximately 12% compared with either acoustic signal or seismic signal alone.
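    The energy-ratio feature itself is straightforward to reproduce. The sketch below assumes the stationary (à trous) wavelet transform from the PyWavelets package; the db4 wavelet and four-level depth are illustrative choices, not necessarily those used in the paper.

    ```python
    # Sketch of the wavelet-coefficient energy-ratio (WCER) feature, using
    # the stationary (à trous) wavelet transform from PyWavelets.
    import numpy as np
    import pywt

    def wcer_features(signal, wavelet="db4", level=4):
        """Energy ratio of each decomposition layer as a feature vector."""
        n = len(signal) - len(signal) % 2**level   # swt needs length % 2^level == 0
        coeffs = pywt.swt(signal[:n], wavelet, level=level)  # [(cA, cD), ...]
        energies = np.array([np.sum(cD**2) for _, cD in coeffs])
        return energies / energies.sum()           # ratios sum to 1

    rng = np.random.default_rng(0)
    t = np.linspace(0, 1, 1024)
    x = np.sin(2 * np.pi * 5 * t) + 0.1 * rng.standard_normal(1024)
    print(wcer_features(x))
    ```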

  5. Evaluation of Techniques for Measuring Microbial Hazards in Bathing Waters: A Comparative Study

    PubMed Central

    Schang, Christelle; Henry, Rebekah; Kolotelo, Peter A.; Prosser, Toby; Crosbie, Nick; Grant, Trish; Cottam, Darren; O’Brien, Peter; Coutts, Scott; Deletic, Ana; McCarthy, David T.

    2016-01-01

    Recreational water quality is commonly monitored by means of culture based faecal indicator organism (FIOs) assays. However, these methods are costly and time-consuming; a serious disadvantage when combined with issues such as non-specificity and user bias. New culture and molecular methods have been developed to counter these drawbacks. This study compared industry-standard IDEXX methods (Colilert and Enterolert) with three alternative approaches: 1) TECTA™ system for E. coli and enterococci; 2) US EPA’s 1611 method (qPCR based enterococci enumeration); and 3) Next Generation Sequencing (NGS). Water samples (233) were collected from riverine, estuarine and marine environments over the 2014–2015 summer period and analysed by the four methods. The results demonstrated that E. coli and coliform densities, inferred by the IDEXX system, correlated strongly with the TECTA™ system. The TECTA™ system had further advantages in faster turnaround times (~12 hrs from sample receipt to result compared to 24 hrs); no staff time required for interpretation and less user bias (results are automatically calculated, compared to subjective colorimetric decisions). The US EPA Method 1611 qPCR method also showed significant correlation with the IDEXX enterococci method; but had significant disadvantages such as highly technical analysis and higher operational costs (330% of IDEXX). The NGS method demonstrated statistically significant correlations between IDEXX and the proportions of sequences belonging to FIOs, Enterobacteriaceae, and Enterococcaceae. While costs (3,000% of IDEXX) and analysis time (300% of IDEXX) were found to be significant drawbacks of NGS, rapid technological advances in this field will soon see it widely adopted. PMID:27213772

  6. Multi-laboratory evaluations of the performance of Catellicoccus marimammalium PCR assays developed to target gull fecal sources

    USGS Publications Warehouse

    Sinigalliano, Christopher D.; Ervin, Jared S.; Van De Werfhorst, Laurie C.; Badgley, Brian D.; Ballestée, Elisenda; Bartkowiaka, Jakob; Boehm, Alexandria B.; Byappanahalli, Muruleedhara N.; Goodwin, Kelly D.; Gourmelon, Michèle; Griffith, John; Holden, Patricia A.; Jay, Jenny; Layton, Blythe; Lee, Cheonghoon; Lee, Jiyoung; Meijer, Wim G.; Noble, Rachel; Raith, Meredith; Ryu, Hodon; Sadowsky, Michael J.; Schriewer, Alexander; Wang, Dan; Wanless, David; Whitman, Richard; Wuertz, Stefan; Santo Domingo, Jorge W.

    2013-01-01

    Here we report results from a multi-laboratory (n = 11) evaluation of four different PCR methods targeting the 16S rRNA gene of Catellicoccus marimammalium originally developed to detect gull fecal contamination in coastal environments. The methods included a conventional end-point PCR method, a SYBR® Green qPCR method, and two TaqMan® qPCR methods. Different techniques for data normalization and analysis were tested. Data analysis methods had a pronounced impact on assay sensitivity and specificity calculations. Across-laboratory standardization of metrics including the lower limit of quantification (LLOQ), target detected but not quantifiable (DNQ), and target not detected (ND) significantly improved results compared to results submitted by individual laboratories prior to definition standardization. The unit of measure used for data normalization also had a pronounced effect on measured assay performance. Data normalization to DNA mass improved quantitative method performance as compared to enterococcus normalization. The MST methods tested here were originally designed for gulls but were found in this study to also detect feces from other birds, particularly feces composited from pigeons. Sequencing efforts showed that some pigeon feces from California contained sequences similar to C. marimammalium found in gull feces. These data suggest that the prevalence, geographic scope, and ecology of C. marimammalium in host birds other than gulls require further investigation. This study represents an important first step in the multi-laboratory assessment of these methods and highlights the need to broaden and standardize additional evaluations, including environmentally relevant target concentrations in ambient waters from diverse geographic regions.

  7. Statistical alignment: computational properties, homology testing and goodness-of-fit.

    PubMed

    Hein, J; Wiuf, C; Knudsen, B; Møller, M B; Wibling, G

    2000-09-08

    The model of insertions and deletions in biological sequences, first formulated by Thorne, Kishino, and Felsenstein in 1991 (the TKF91 model), provides a basis for performing alignment within a statistical framework. Here we investigate this model. Firstly, we show how to accelerate the statistical alignment algorithms by several orders of magnitude. The main innovations are to confine likelihood calculations to a band close to the similarity based alignment, to get good initial guesses of the evolutionary parameters and to apply an efficient numerical optimisation algorithm for finding the maximum likelihood estimate. In addition, the recursions originally presented by Thorne, Kishino and Felsenstein can be simplified. Two proteins, about 1500 amino acids long, can be analysed with this method in less than five seconds on a fast desktop computer, which makes this method practical for actual data analysis. Secondly, we propose a new homology test based on this model, where homology means that an ancestor to a sequence pair can be found finitely far back in time. This test has statistical advantages relative to the traditional shuffle test for proteins. Finally, we describe a goodness-of-fit test that allows testing the proposed insertion-deletion (indel) process inherent to this model, and find that real sequences (here globins) probably experience indels longer than one, contrary to what is assumed by the model. Copyright 2000 Academic Press.

  8. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Marshall, William BJ J; Rearden, Bradley T

    The validation of neutron transport methods used in nuclear criticality safety analyses is required by consensus American National Standards Institute/American Nuclear Society (ANSI/ANS) standards. In the last decade, there has been an increased interest in correlations among critical experiments used in validation that have shared physical attributes and which impact the independence of each measurement. The statistical methods included in many of the frequently cited guidance documents on performing validation calculations incorporate the assumption that all individual measurements are independent, so little guidance is available to practitioners on the topic. Typical guidance includes recommendations to select experiments from multiple facilities and experiment series in an attempt to minimize the impact of correlations or common-cause errors in experiments. Recent efforts have been made both to determine the magnitude of such correlations between experiments and to develop and apply methods for adjusting the bias and bias uncertainty to account for the correlations. This paper describes recent work performed at Oak Ridge National Laboratory using the Sampler sequence from the SCALE code system to develop experimental correlations using a Monte Carlo sampling technique. Sampler will be available for the first time with the release of SCALE 6.2, and a brief introduction to the methods used to calculate experiment correlations within this new sequence is presented in this paper. Techniques to utilize these correlations in the establishment of upper subcritical limits are the subject of a companion paper and will not be discussed here. Example experimental uncertainties and correlation coefficients are presented for a variety of low-enriched uranium water-moderated lattice experiments selected for use in a benchmark exercise by the Working Party on Nuclear Criticality Safety Subgroup on Uncertainty Analysis in Criticality Safety Analyses. The results include studies on the effect of fuel rod pitch on the correlations, and some observations are also made regarding difficulties in determining experimental correlations using the Monte Carlo sampling technique.
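    The core idea of sampling-based correlation estimation can be illustrated in a few lines: shared (common-cause) uncertain parameters are sampled jointly, a response is computed per experiment and sample, and the correlation coefficient is taken across samples. In the toy numpy sketch below, the linear sensitivities stand in for full SCALE/Sampler transport calculations.

    ```python
    # Toy illustration of Monte Carlo sampling of experiment correlations:
    # shared uncertain parameters are sampled jointly, k-eff is recomputed
    # per experiment and sample, and the correlation is taken across samples.
    # The linear sensitivities below are invented placeholders.
    import numpy as np

    rng = np.random.default_rng(4)
    n = 1000
    shared = rng.normal(0.0, 1.0, n)               # common-cause uncertainty
    indep1 = rng.normal(0.0, 1.0, n)               # experiment-specific parts
    indep2 = rng.normal(0.0, 1.0, n)

    keff1 = 1.000 + 3e-3 * shared + 1e-3 * indep1  # toy sampled k-eff values
    keff2 = 0.998 + 3e-3 * shared + 2e-3 * indep2

    print(f"correlation coefficient = {np.corrcoef(keff1, keff2)[0, 1]:.2f}")
    ```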

  9. Probing DNA in nanopores via tunneling: from sequencing to ``quantum'' analogies

    NASA Astrophysics Data System (ADS)

    di Ventra, Massimiliano

    2012-02-01

    Fast and low-cost DNA sequencing methods would revolutionize medicine: a person could have his/her full genome sequenced so that drugs could be tailored to his/her specific illnesses; doctors could know in advance patients' likelihood to develop a given ailment; cures to major diseases could be found faster [1]. However, this goal of ``personalized medicine'' is hampered today by the high cost and slow speed of DNA sequencing methods. In this talk, I will discuss the sequencing protocol we suggest, which requires the measurement of the distributions of transverse currents during the translocation of single-stranded DNA into nanopores [2-5]. I will support our conclusions with a combination of molecular dynamics simulations coupled to quantum mechanical calculations of electrical current in experimentally realizable systems [2-5]. I will also discuss recent experiments that support these theoretical predictions. In addition, I will show how this relatively unexplored area of research at the interface between solids, liquids, and biomolecules at the nanometer length scale is a fertile ground to study quantum phenomena that have a classical counterpart, such as ionic quasi-particles, ionic ``quantized'' conductance [6,7] and Coulomb blockade [8]. Work supported in part by NIH. [1] M. Zwolak, M. Di Ventra, Physical Approaches to DNA Sequencing and Detection, Rev. Mod. Phys. 80, 141 (2008). [2] M. Zwolak and M. Di Ventra, Electronic signature of DNA nucleotides via transverse transport, Nano Lett. 5, 421 (2005). [3] J. Lagerqvist, M. Zwolak, and M. Di Ventra, Fast DNA sequencing via transverse electronic transport, Nano Lett. 6, 779 (2006). [4] J. Lagerqvist, M. Zwolak, and M. Di Ventra, Influence of the environment and probes on rapid DNA sequencing via transverse electronic transport, Biophys. J. 93, 2384 (2007). [5] M. Krems, M. Zwolak, Y.V. Pershin, and M. Di Ventra, Effect of noise on DNA sequencing via transverse electronic transport, Biophys. J. 97, 1990 (2009). [6] M. Zwolak, J. Lagerqvist, and M. Di Ventra, Ionic conductance quantization in nanopores, Phys. Rev. Lett. 103, 128102 (2009). [7] M. Zwolak, J. Wilson, and M. Di Ventra, Dehydration and ionic conductance quantization in nanopores, J. Phys. Cond. Matt. 22, 454126 (2011). [8] M. Krems and M. Di Ventra, Ionic Coulomb blockade in nanopores, arXiv:1103.2749.

  10. Ordered transport and identification of particles

    DOEpatents

    Shera, E.B.

    1993-05-11

    A method and apparatus are provided for application of electrical field gradients to induce particle velocities to enable particle sequence and identification information to be obtained. Particle sequence is maintained by providing electroosmotic flow for an electrolytic solution in a particle transport tube. The transport tube and electrolytic solution are selected to provide an electroosmotic radius of >100 so that a plug flow profile is obtained for the electrolytic solution in the transport tube. Thus, particles are maintained in the same order in which they are introduced in the transport tube. When the particles also have known electrophoretic velocities, the field gradients introduce an electrophoretic velocity component onto the electroosmotic velocity. The time that the particles pass selected locations along the transport tube may then be detected and the electrophoretic velocity component calculated for particle identification. One particular application is the ordered transport and identification of labeled nucleotides sequentially cleaved from a strand of DNA.

  11. Statistical Ring Opening Metathesis Copolymerization of Norbornene and Cyclopentene by Grubbs' 1st-Generation Catalyst.

    PubMed

    Nikovia, Christiana; Maroudas, Andreas-Philippos; Goulis, Panagiotis; Tzimis, Dionysios; Paraskevopoulou, Patrina; Pitsikalis, Marinos

    2015-08-27

    Statistical copolymers of norbornene (NBE) with cyclopentene (CP) were prepared by ring-opening metathesis polymerization, employing the 1st-generation Grubbs' catalyst, in the presence or absence of triphenylphosphine, PPh₃. The reactivity ratios were estimated using the Finemann-Ross, inverted Finemann-Ross, and Kelen-Tüdos graphical methods, along with the computer program COPOINT, which evaluates the parameters of binary copolymerizations from comonomer/copolymer composition data by integrating a given copolymerization equation in its differential form. Structural parameters of the copolymers were obtained by calculating the dyad sequence fractions and the mean sequence length, which were derived using the monomer reactivity ratios. The kinetics of thermal decomposition of the copolymers along with the respective homopolymers was studied by thermogravimetric analysis within the framework of the Ozawa-Flynn-Wall and Kissinger methodologies. Finally, the effect of triphenylphosphine on the kinetics of copolymerization, the reactivity ratios, and the kinetics of thermal decomposition were examined.
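    For the terminal copolymerization model these structural parameters follow directly from the reactivity ratios. The sketch below evaluates the dyad transition probabilities and mean sequence lengths for an assumed feed ratio; the r1, r2 values are placeholders, not the NBE/CP ratios reported in the paper.

    ```python
    # Terminal-model copolymer statistics: given reactivity ratios r1, r2 and
    # the comonomer feed ratio x = [M1]/[M2], the dyad transition
    # probabilities and mean sequence lengths follow in closed form.

    def sequence_stats(r1, r2, x):
        p11 = r1 * x / (r1 * x + 1)   # prob. that an M1 unit is followed by M1
        p22 = r2 / (r2 + x)           # prob. that an M2 unit is followed by M2
        n1 = 1 + r1 * x               # mean M1 sequence length (= 1 / (1 - p11))
        n2 = 1 + r2 / x               # mean M2 sequence length
        return p11, p22, n1, n2

    # Placeholder reactivity ratios and a 1:1 feed, for illustration only.
    p11, p22, n1, n2 = sequence_stats(r1=2.0, r2=0.4, x=1.0)
    print(f"P11={p11:.2f}  P22={p22:.2f}  <n1>={n1:.2f}  <n2>={n2:.2f}")
    ```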

  12. Entropy of finite random binary sequences with weak long-range correlations.

    PubMed

    Melnik, S S; Usatenko, O V

    2014-11-01

    We study the N-step binary stationary ergodic Markov chain and analyze its differential entropy. Supposing that the correlations are weak we express the conditional probability function of the chain through the pair correlation function and represent the entropy as a functional of the pair correlator. Since the model uses the two-point correlators instead of the block probability, it makes it possible to calculate the entropy of strings at much longer distances than using standard methods. A fluctuation contribution to the entropy due to finiteness of random chains is examined. This contribution can be of the same order as its regular part even at the relatively short lengths of subsequences. A self-similar structure of entropy with respect to the decimation transformations is revealed for some specific forms of the pair correlation function. Application of the theory to the DNA sequence of the R3 chromosome of Drosophila melanogaster is presented.

  13. Entropy of finite random binary sequences with weak long-range correlations

    NASA Astrophysics Data System (ADS)

    Melnik, S. S.; Usatenko, O. V.

    2014-11-01

    We study the N-step binary stationary ergodic Markov chain and analyze its differential entropy. Supposing that the correlations are weak we express the conditional probability function of the chain through the pair correlation function and represent the entropy as a functional of the pair correlator. Since the model uses the two-point correlators instead of the block probability, it makes it possible to calculate the entropy of strings at much longer distances than using standard methods. A fluctuation contribution to the entropy due to finiteness of random chains is examined. This contribution can be of the same order as its regular part even at the relatively short lengths of subsequences. A self-similar structure of entropy with respect to the decimation transformations is revealed for some specific forms of the pair correlation function. Application of the theory to the DNA sequence of the R3 chromosome of Drosophila melanogaster is presented.

  14. Ordered transport and identification of particles

    DOEpatents

    Shera, E. Brooks

    1993-01-01

    A method and apparatus are provided for application of electrical field gradients to induce particle velocities to enable particle sequence and identification information to be obtained. Particle sequence is maintained by providing electroosmotic flow for an electrolytic solution in a particle transport tube. The transport tube and electrolytic solution are selected to provide an electroosmotic radius of >100 so that a plug flow profile is obtained for the electrolytic solution in the transport tube. Thus, particles are maintained in the same order in which they are introduced in the transport tube. When the particles also have known electrophoretic velocities, the field gradients introduce an electrophoretic velocity component onto the electroosmotic velocity. The time that the particles pass selected locations along the transport tube may then be detected and the electrophoretic velocity component calculated for particle identification. One particular application is the ordered transport and identification of labeled nucleotides sequentially cleaved from a strand of DNA.

  15. Study of the 4p2, 5p2 and 5s5f excited configurations of the Zn and Cd isoelectronic sequences, using relativistic and non-relativistic semiempirical approaches

    NASA Astrophysics Data System (ADS)

    Di Rocco, Héctor O.; Raineri, Mónica; Reyna-Almandos, Jorge G.

    2016-11-01

    The consistency of the energy levels published for the configurations 4p2, 5p2 and 5s5f belonging to the Zn and Cd isoelectronic sequences is studied. Different semiempirical approaches are used, considering the linearity of the Slater integrals for large Zc, the smoothness of the sF screening parameters, the energy values in terms of Z (or Zc), and the differences of the Ecalc - Eexp values, where Ecalc values are energies calculated with a Hartree-Fock method with relativistic corrections and superposition of configurations (HFR-SOC), and Eexp are the experimental values. For the np2 configurations both LS and relativistic jj expressions are considered. Configuration 5s5f is also analyzed taking into account Landé's interval rule.

  16. High-order noise filtering in nontrivial quantum logic gates.

    PubMed

    Green, Todd; Uys, Hermann; Biercuk, Michael J

    2012-07-13

    Treating the effects of a time-dependent classical dephasing environment during quantum logic operations poses a theoretical challenge, as the application of noncommuting control operations gives rise to both dephasing and depolarization errors that must be accounted for in order to understand total average error rates. We develop a treatment based on effective Hamiltonian theory that allows us to efficiently model the effect of classical noise on nontrivial single-bit quantum logic operations composed of arbitrary control sequences. We present a general method to calculate the ensemble-averaged entanglement fidelity to arbitrary order in terms of noise filter functions, and provide explicit expressions to fourth order in the noise strength. In the weak noise limit we derive explicit filter functions for a broad class of piecewise-constant control sequences, and use them to study the performance of dynamically corrected gates, yielding good agreement with brute-force numerics.

  17. Implementation of the common phrase index method on the phrase query for information retrieval

    NASA Astrophysics Data System (ADS)

    Fatmawati, Triyah; Zaman, Badrus; Werdiningsih, Indah

    2017-08-01

    With the development of technology, finding information in news text has become easy, because news text is distributed not only in print media, such as newspapers, but also in electronic media that can be accessed using a search engine. In the process of finding relevant documents with a search engine, a phrase is often used as a query. The number of words that make up the phrase query and their positions obviously affect the relevance of the documents produced. As a result, the accuracy of the information obtained will be affected. Based on the outlined problem, the purpose of this research was to analyze the implementation of the common phrase index method in information retrieval. This research was conducted on English news text and implemented in a prototype to determine the relevance level of the documents produced. The system is built with the stages of pre-processing, indexing, term weighting calculation, and cosine similarity calculation, and then displays the document search results in a sequence based on the cosine similarity; a minimal sketch of these stages is given below. Furthermore, system testing was conducted using 100 documents and 20 queries, and the results were used for the evaluation stage. First, the relevant documents were determined using a kappa statistic calculation. Second, the system success rate was determined using precision, recall, and F-measure calculations. In this research, the result of the kappa statistic calculation was 0.71, so the relevant documents are eligible for the system evaluation. The calculation of precision, recall, and F-measure produced a precision of 0.37, a recall of 0.50, and an F-measure of 0.43. From this result it can be said that the success rate of the system in producing relevant documents is low.
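    The following toy Python sketch runs the stages named above on a three-document corpus: pre-processing (tokenization), TF-IDF term weighting, and cosine-similarity ranking. The corpus and query are invented for illustration.

    ```python
    # Toy TF-IDF + cosine-similarity ranking sketch; corpus and query are
    # invented, and pre-processing is reduced to lowercasing and splitting.
    import math
    from collections import Counter

    docs = ["rain floods the city", "city builds new flood defenses",
            "sports team wins the cup"]
    query = "city flood"

    def tokens(text):
        return text.lower().split()

    N = len(docs)
    df = Counter(t for d in docs for t in set(tokens(d)))   # document frequency
    idf = {t: math.log(N / df[t]) for t in df}

    def tfidf(text):
        tf = Counter(tokens(text))
        return {t: tf[t] * idf.get(t, 0.0) for t in tf}

    def cosine(u, v):
        dot = sum(u[t] * v.get(t, 0.0) for t in u)
        nu = math.sqrt(sum(w * w for w in u.values()))
        nv = math.sqrt(sum(w * w for w in v.values()))
        return dot / (nu * nv) if nu and nv else 0.0

    q = tfidf(query)
    for score, d in sorted(((cosine(q, tfidf(d)), d) for d in docs), reverse=True):
        print(f"{score:.3f}  {d}")
    ```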

  18. Designing optimal universal pulses using second-order, large-scale, non-linear optimization

    NASA Astrophysics Data System (ADS)

    Anand, Christopher Kumar; Bain, Alex D.; Curtis, Andrew Thomas; Nie, Zhenghua

    2012-06-01

    Recently, RF pulse design using first-order and quasi-second-order pulses has been actively investigated. We present a full second-order design method capable of incorporating relaxation and inhomogeneity in B0 and B1. Our model is formulated as a generic optimization problem, making it easy to incorporate diverse pulse sequence features. To tame the computational cost, we present a method of calculating second derivatives in at most a constant multiple of the first derivative calculation time; this is further accelerated by using symbolic solutions of the Bloch equations. We illustrate the relative merits and performance of quasi-Newton and full second-order optimization with a series of examples, showing that even a pulse already optimized using other methods can be visibly improved. To be useful in CPMG experiments, a universal refocusing pulse should be independent of the delay time and insensitive to the relaxation time and RF inhomogeneity. We design such a pulse and show that, using it, we can obtain reliable R2 measurements for offsets within ±γB1. Finally, we compare our optimal refocusing pulse with other published refocusing pulses by performing CPMG experiments.

  19. The influence of the dispersion corrections on the performance of DFT method in modeling HNgY noble gas molecules and their complexes

    NASA Astrophysics Data System (ADS)

    Cukras, Janusz; Sadlej, Joanna

    2018-01-01

    The letter reports a comparative assessment of the usefulness of two different Grimme corrections for evaluating the dispersion interaction (DFT-D3 and DFT-D3BJ) for representative molecules of the family of noble-gas hydrides HXeY and their complexes with HZ molecules, where Y and Z are F/Cl/OH/SH, with special regard to the dispersion term calculated by means of symmetry-adapted perturbation theory (at the SAPT0 level). The results indicate that despite differences in the total interaction energy (DFT + corrections) versus the SAPT0 results, the sequence of contributions of the individual dispersion terms is still maintained. Both dispersion corrections perform similarly and they improve the results, suggesting that it is worthwhile to include them in calculations.

  20. Systematization of experimental data on the spectra of 3s--3p and 3p--3d transitions of Ar IX (436--861 A)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kramida, A.E.

    1985-04-01

    The article correlates and systematizes published data on the wavelengths of lines in the spectra of argon obtained by various authors by the so-called beam-foil method. The energies of the levels are compared with the results of an extrapolation along an isoelectronic sequence, and the lifetimes of the levels are compared with theoretical calculations. The energies of all 26 energy levels of the 2p⁵3s, 3p and 3d configurations of Ar IX are determined and refined. Forty-five transitions between these configurations in the 436--861 A region are identified. The identifications are supported by combinations according to the Ritz principle and by a semiempirical calculation of the energies of the levels.

  1. Gas-Phase Reactions of Dimethyl Disulfide with Aliphatic Carbanions - A Mass Spectrometry and Computational Study

    NASA Astrophysics Data System (ADS)

    Franczuk, Barbara; Danikiewicz, Witold

    2018-03-01

    Ion-molecule reactions of Me2S2 with a wide range of aliphatic carbanions differing in structure and proton affinity values have been studied in the gas phase using mass spectrometry techniques and DFT calculations. The analysis of the spectra shows a variety of product ions formed via different reaction mechanisms, depending on the structure and proton affinity of the carbanion. Product ions of thiophilic reaction (m/z 47), SN2 (m/z 79), and an E2 elimination-addition reaction sequence (m/z 93) can be observed. Primary products of the thiophilic reaction can undergo subsequent SN2 and proton transfer reactions. Gibbs free energy profiles calculated for the experimentally observed reactions using the PBE0/6-311+G(2d,p) method show good agreement with experimental results.

  2. Avoidance of truncated proteins from unintended ribosome binding sites within heterologous protein coding sequences.

    PubMed

    Whitaker, Weston R; Lee, Hanson; Arkin, Adam P; Dueber, John E

    2015-03-20

    Genetic sequences ported into non-native hosts for synthetic biology applications can gain unexpected properties. In this study, we explored sequences functioning as ribosome binding sites (RBSs) within protein coding DNA sequences (CDSs) that cause internal translation, resulting in truncated proteins. Genome-wide prediction of bacterial RBSs, based on biophysical calculations employed by the RBS calculator, suggests a selection against internal RBSs within CDSs in Escherichia coli, but not those in Saccharomyces cerevisiae. Based on these calculations, silent mutations aimed at removing internal RBSs can effectively reduce truncation products from internal translation. However, a solution for complete elimination of internal translation initiation is not always feasible due to constraints of available coding sequences. Fluorescence assays and Western blot analysis showed that in genes with internal RBSs, increasing the strength of the intended upstream RBS had little influence on the internal translation strength. Another strategy to minimize truncated products from an internal RBS is to increase the relative strength of the upstream RBS with a concomitant reduction in promoter strength to achieve the same protein expression level. Unfortunately, lower transcription levels result in increased noise at the single cell level due to stochasticity in gene expression. At the low expression regimes desired for many synthetic biology applications, this problem becomes particularly pronounced. We found that balancing promoter strengths and upstream RBS strengths to intermediate levels can achieve the target protein concentration while avoiding both excessive noise and truncated protein.
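    A much cruder stand-in for the biophysical model can still illustrate the scan: look for internal ATG/GTG start codons preceded, at a plausible spacing, by a purine-rich Shine-Dalgarno-like motif. The sketch below is such a motif heuristic, not the RBS Calculator's thermodynamic scoring; the example CDS is invented.

    ```python
    # Crude sketch of scanning a CDS for internal translation starts: flag
    # ATG/GTG codons preceded ~5-12 nt upstream by a Shine-Dalgarno-like
    # motif. This is a simple heuristic stand-in for the thermodynamic model
    # of the RBS Calculator used in the paper.
    import re

    SD = re.compile(r"AGGAGG|GGAGG|AGGAG|GAGG")   # purine-rich SD-like motifs

    def internal_rbs_candidates(cds):
        hits = []
        for m in re.finditer(r"ATG|GTG", cds[3:]):         # skip the real start
            start = m.start() + 3
            window = cds[max(0, start - 16): start - 4]    # putative SD region
            if SD.search(window):
                hits.append(start)
        return hits

    cds = "ATGGCTAAAGGAGGTTTACATGGCAGCTGGTACTGAAACCGCTTAA"  # invented CDS
    print(internal_rbs_candidates(cds))   # positions of suspect internal starts
    ```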

  3. A DMAP Program for the Selection of Accelerometer Locations in MSC/NASTRAN

    NASA Technical Reports Server (NTRS)

    Peck, Jeff; Torres, Isaias

    2004-01-01

    A new program for selecting sensor locations has been written in the DMAP (Direct Matrix Abstraction Program) language of MSC/NASTRAN. The program implements the method of Effective Independence for selecting sensor locations, and is executed within a single NASTRAN analysis as a "rigid format alter" to the normal modes solution sequence (SOL 103). The user of the program is able to choose among various analysis options using Case Control and Bulk Data entries. Algorithms tailored for the placement of both uni-axial and tri-axial accelerometers are available, as well as several options for including the model's mass distribution in the calculations. Target modes for the Effective Independence analysis are selected from the MSC/NASTRAN ASET modes calculated by the "SOL 103" solution sequence. The initial candidate sensor set is also under user control, and is selected from the ASET degrees of freedom. Analysis results are printed to the MSC/NASTRAN output file (*.f06), and may include the current candidate sensor set, and its associated Effective Independence distribution, at user-specified iteration intervals. At the conclusion of the analysis, the model is reduced to the final sensor set, and frequencies and orthogonality checks are printed. Example results are given for a pre-test analysis of NASA's five-segment solid rocket booster modal test.
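    The Effective Independence iteration itself is compact: candidate degrees of freedom are ranked by their contribution to the determinant of the Fisher information matrix of the target modes, and the weakest is discarded until the sensor budget is met. Below is a numpy sketch under these assumptions, with random placeholder mode shapes standing in for the NASTRAN ASET output.

    ```python
    # Sketch of the Effective Independence (EfI) sensor-selection loop:
    # rank DOFs by the diagonal of the projector A (A^T A)^-1 A^T and drop
    # the least-informative DOF each iteration.
    import numpy as np

    def effective_independence(phi, n_sensors):
        """phi: (n_dof, n_modes) target mode shapes; returns kept DOF indices."""
        keep = np.arange(phi.shape[0])
        while len(keep) > n_sensors:
            A = phi[keep]
            # EfI value of each DOF: diagonal of A (A^T A)^-1 A^T
            E = np.einsum("ij,jk,ik->i", A, np.linalg.inv(A.T @ A), A)
            keep = np.delete(keep, np.argmin(E))   # drop weakest DOF
        return keep

    rng = np.random.default_rng(1)
    phi = rng.standard_normal((50, 6))             # 50 candidate DOFs, 6 modes
    print(effective_independence(phi, n_sensors=10))
    ```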

  4. PSS-3D1D: an improved 3D1D profile method of protein fold recognition for the annotation of twilight zone sequences.

    PubMed

    Ganesan, K; Parthasarathy, S

    2011-12-01

    Annotation of any newly determined protein sequence depends on the pairwise sequence identity with known sequences. However, for the twilight zone sequences which have only 15-25% identity, the pairwise comparison methods are inadequate and the annotation becomes a challenging task. Such sequences can be annotated by using methods that recognize their fold. Bowie et al. described a 3D1D profile method in which the amino acid sequences that fold into a known 3D structure are identified by their compatibility to that known 3D structure. We have improved the above method by using the predicted secondary structure information and employ it for fold recognition from the twilight zone sequences. In our Protein Secondary Structure 3D1D (PSS-3D1D) method, a score (w) for the predicted secondary structure of the query sequence is included in finding the compatibility of the query sequence to the known fold 3D structures. In the benchmarks, the PSS-3D1D method shows a maximum of 21% improvement in correctly predicting the α + β class of folds from sequences with twilight zone level of identity, when compared with the 3D1D profile method. Hence, the PSS-3D1D method could offer more clues than the 3D1D method for the annotation of twilight zone sequences. The web based PSS-3D1D method is freely available in the PredictFold server at http://bioinfo.bdu.ac.in/servers/.

  5. MODBASE, a database of annotated comparative protein structure models

    PubMed Central

    Pieper, Ursula; Eswar, Narayanan; Stuart, Ashley C.; Ilyin, Valentin A.; Sali, Andrej

    2002-01-01

    MODBASE (http://guitar.rockefeller.edu/modbase) is a relational database of annotated comparative protein structure models for all available protein sequences matched to at least one known protein structure. The models are calculated by MODPIPE, an automated modeling pipeline that relies on PSI-BLAST, IMPALA and MODELLER. MODBASE uses the MySQL relational database management system for flexible and efficient querying, and the MODVIEW Netscape plugin for viewing and manipulating multiple sequences and structures. It is updated regularly to reflect the growth of the protein sequence and structure databases, as well as improvements in the software for calculating the models. For ease of access, MODBASE is organized into different datasets. The largest dataset contains models for domains in 304,517 out of 539,171 unique protein sequences in the complete TrEMBL database (23 March 2001); only models based on significant alignments (PSI-BLAST E-value < 10⁻⁴) and models assessed to have the correct fold are included. Other datasets include models for target selection and structure-based annotation by the New York Structural Genomics Research Consortium, models for prediction of genes in the Drosophila melanogaster genome, models for structure determination of several ribosomal particles and models calculated by the MODWEB comparative modeling web server. PMID:11752309

  6. Restoration of distorted depth maps calculated from stereo sequences

    NASA Technical Reports Server (NTRS)

    Damour, Kevin; Kaufman, Howard

    1991-01-01

    A model-based Kalman estimator is developed for spatial-temporal filtering of noise and other degradations in velocity and depth maps derived from image sequences or cinema. As an illustration of the proposed procedures, edge information from image sequences of rigid objects is used in the processing of the velocity maps by selecting from a series of models for directional adaptive filtering. Adaptive filtering then allows for noise reduction while preserving sharpness in the velocity maps. Results from several synthetic and real image sequences are given.
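    As a simplified illustration of temporally filtering a depth-map sequence, the sketch below runs an independent scalar Kalman filter per pixel under a static-depth state model; the paper's estimator is richer (it adapts the filtering direction using edge information), so this is only a baseline.

    ```python
    # Minimal per-pixel scalar Kalman sketch for temporally smoothing noisy
    # depth maps; a static state model with small process noise q is assumed.
    import numpy as np

    def kalman_smooth(depth_frames, q=1e-3, r=5e-2):
        """depth_frames: (T, H, W) noisy depth maps; returns filtered sequence."""
        x = depth_frames[0].astype(float)          # state estimate per pixel
        p = np.full_like(x, 1.0)                   # estimate variance
        out = [x.copy()]
        for z in depth_frames[1:]:
            p = p + q                              # predict (depth drifts slowly)
            k = p / (p + r)                        # Kalman gain
            x = x + k * (z - x)                    # update with the new frame
            p = (1.0 - k) * p
            out.append(x.copy())
        return np.stack(out)

    rng = np.random.default_rng(3)
    truth = np.ones((2, 2)) * 4.0                  # toy constant-depth scene
    frames = truth + 0.2 * rng.standard_normal((30, 2, 2))
    print(kalman_smooth(frames)[-1])               # close to 4.0 everywhere
    ```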

  7. Comparative bioavailability study of cefuroxime axetil (equivalent to 500 mg cefuroxime/tablet) tablets (Zednad® versus Zinnat®) in healthy male volunteers.

    PubMed

    Asiri, Y A; Al-Hadiya, B M; Kadi, A A; Al-Khamis, K I; Mowafy, H A; El-Sayed, Y M

    2011-09-01

    This study was performed to investigate the bioequivalence of cefuroxime axetil tablets between a generic test product (A), Zednad® Tablet (500 mg cefuroxime/tablet, Diamond Pharma, Syria), and the reference product (B), Zinnat® Tablet (500 mg cefuroxime/tablet, GlaxoSmithKline, Saudi Arabia). The bioavailability study was carried out in 24 healthy male volunteers. The subjects received 1 Zednad® Tablet (500 mg/tablet) and 1 Zinnat® Tablet (500 mg/tablet) in a randomized, two-way crossover design on 2 treatment days, after an overnight fast of at least 10 h, with a washout period of 7 days. 24 volunteers plus 2 alternates completed the crossover. The bioanalysis of clinical plasma samples was accomplished by an HPLC method, which was developed and validated in accordance with international guidelines. Pharmacokinetic parameters, determined by standard non-compartmental methods, and ANOVA statistics were calculated using SAS Statistical Software. The significance of a sequence effect was tested using the subjects nested in sequence as the error term. The 90% confidence intervals for the ratio between the test and reference product pharmacokinetic parameters AUC0→t, AUC0→∞, and Cmax were calculated and found to be within the confidence limits of 80.00-125.00%. The study demonstrated that the test product (A) was bioequivalent to the reference product (B) following an oral dose of a 500 mg tablet. Therefore, the two formulations were considered to be bioequivalent.

  8. Phylo_dCor: distance correlation as a novel metric for phylogenetic profiling.

    PubMed

    Sferra, Gabriella; Fratini, Federica; Ponzi, Marta; Pizzi, Elisabetta

    2017-09-05

    Elaboration of powerful methods to predict functional and/or physical protein-protein interactions from genome sequence is one of the main tasks in the post-genomic era. Phylogenetic profiling allows the prediction of protein-protein interactions at a whole genome level in both Prokaryotes and Eukaryotes. For this reason it is considered one of the most promising methods. Here, we propose an improvement of phylogenetic profiling that enables the handling of large genomic datasets and the inference of global protein-protein interactions. This method uses the distance correlation as a new measure of phylogenetic profile similarity. We constructed robust reference sets and developed Phylo-dCor, a parallelized version of the algorithm for calculating the distance correlation that makes it applicable to large genomic data. Using Saccharomyces cerevisiae and Escherichia coli genome datasets, we showed that Phylo-dCor outperforms phylogenetic profiling methods previously described that are based on mutual information and Pearson's correlation as measures of profile similarity. In this work, we constructed and assessed robust reference sets and propose the distance correlation as a measure for comparing phylogenetic profiles. To make it applicable to large genomic data, we developed Phylo-dCor, a parallelized version of the algorithm for calculating the distance correlation. Two R scripts that can be run on a wide range of machines are available upon request.
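    Distance correlation itself is easy to compute for one pair of profiles: build the pairwise distance matrices, double-center them, and normalize their inner product. Below is a numpy sketch with toy presence/absence profiles; the paper's implementation is in R and parallelized, so this shows only the metric.

    ```python
    # Sketch of the distance correlation (dCor) between two profiles:
    # double-center the pairwise distance matrices, then normalize the
    # inner product of the centered matrices.
    import numpy as np

    def dcor(x, y):
        x, y = np.asarray(x, float), np.asarray(y, float)
        a = np.abs(x[:, None] - x[None, :])                 # pairwise distances
        b = np.abs(y[:, None] - y[None, :])
        A = a - a.mean(0) - a.mean(1)[:, None] + a.mean()   # double centering
        B = b - b.mean(0) - b.mean(1)[:, None] + b.mean()
        dcov2 = (A * B).mean()                              # squared dCov
        denom = np.sqrt((A * A).mean() * (B * B).mean())
        return np.sqrt(dcov2 / denom) if denom > 0 else 0.0

    p1 = [1, 1, 0, 1, 0, 0, 1, 1]   # toy presence/absence profiles
    p2 = [1, 0, 0, 1, 0, 0, 1, 1]
    print(f"dCor = {dcor(p1, p2):.3f}")
    ```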

  9. Method for phosphorothioate antisense DNA sequencing by capillary electrophoresis with UV detection.

    PubMed

    Froim, D; Hopkins, C E; Belenky, A; Cohen, A S

    1997-11-01

    The progress of antisense DNA therapy demands development of reliable and convenient methods for sequencing short single-stranded oligonucleotides. A method of phosphorothioate antisense DNA sequencing analysis using UV detection coupled to capillary electrophoresis (CE) has been developed based on a modified chain termination sequencing method. The proposed method reduces the sequencing cost since it uses affordable CE-UV instrumentation and requires no labeling with minimal sample processing before analysis. Cycle sequencing with ThermoSequenase generates quantities of sequencing products that are readily detectable by UV. Discrimination of undesired components from sequencing products in the reaction mixture, previously accomplished by fluorescent or radioactive labeling, is now achieved by bringing concentrations of undesired components below the UV detection range, which yields a 'clean', well-defined sequence. UV detection coupled with CE offers additional conveniences for sequencing since it can be accomplished with commercially available CE-UV equipment and is readily amenable to automation.

  10. Method for phosphorothioate antisense DNA sequencing by capillary electrophoresis with UV detection.

    PubMed Central

    Froim, D; Hopkins, C E; Belenky, A; Cohen, A S

    1997-01-01

    The progress of antisense DNA therapy demands development of reliable and convenient methods for sequencing short single-stranded oligonucleotides. A method of phosphorothioate antisense DNA sequencing analysis using UV detection coupled to capillary electrophoresis (CE) has been developed based on a modified chain termination sequencing method. The proposed method reduces the sequencing cost since it uses affordable CE-UV instrumentation and requires no labeling with minimal sample processing before analysis. Cycle sequencing with ThermoSequenase generates quantities of sequencing products that are readily detectable by UV. Discrimination of undesired components from sequencing products in the reaction mixture, previously accomplished by fluorescent or radioactive labeling, is now achieved by bringing concentrations of undesired components below the UV detection range, which yields a 'clean', well-defined sequence. UV detection coupled with CE offers additional conveniences for sequencing since it can be accomplished with commercially available CE-UV equipment and is readily amenable to automation. PMID:9336449

  11. Development of a Coordinate Transformation method for direct georeferencing in map projection frames

    NASA Astrophysics Data System (ADS)

    Zhao, Haitao; Zhang, Bing; Wu, Changshan; Zuo, Zhengli; Chen, Zhengchao

    2013-03-01

    This paper develops a novel Coordinate Transformation method (CT-method), with which the orientation angles (roll, pitch, heading) of the local tangent frame of the GPS/INS system are transformed into those (omega, phi, kappa) of the map projection frame for direct georeferencing (DG). In particular, the orientation angles in the map projection frame were derived from a sequence of coordinate transformations. The effectiveness of the orientation angle transformation was verified by comparing with DG results obtained from conventional methods (the Legat method and the POSPac method) using empirical data. Moreover, the CT-method was also validated with simulated data. One advantage of the proposed method is that the orientation angles can be acquired simultaneously while calculating the position elements of the exterior orientation (EO) parameters and auxiliary point coordinates by coordinate transformation. These three methods were demonstrated and compared using empirical data. Empirical results show that the CT-method is as sound and effective as the Legat method. Compared with the POSPac method, the CT-method is more suitable for calculating EO parameters for DG in map projection frames. The DG accuracy of the CT-method and the Legat method is at the same level. DG results of all three methods have systematic errors in height due to inconsistent length projection distortion in the vertical and horizontal components, and these errors can be significantly reduced using the EO height correction technique in Legat's approach. Similar to the results obtained with empirical data, the effectiveness of the CT-method was also proved with simulated data. The POSPac method is presented in an Applanix POSPac software technical note (Hutton and Savina, 1997) and is implemented in the POSEO module of the POSPac software.

  12. A variable pressure method for characterizing nanoparticle surface charge using pore sensors.

    PubMed

    Vogel, Robert; Anderson, Will; Eldridge, James; Glossop, Ben; Willmott, Geoff

    2012-04-03

    A novel method using resistive pulse sensors for electrokinetic surface charge measurements of nanoparticles is presented. This method involves recording the particle blockade rate while the pressure applied across a pore sensor is varied. This applied pressure acts in a direction which opposes transport due to the combination of electro-osmosis, electrophoresis, and inherent pressure. The blockade rate reaches a minimum when the velocity of nanoparticles in the vicinity of the pore approaches zero, and the forces on typical nanoparticles are in equilibrium. The pressure applied at this minimum rate can be used to calculate the zeta potential of the nanoparticles. The efficacy of this variable pressure method was demonstrated for a range of carboxylated 200 nm polystyrene nanoparticles with different surface charge densities. Results were of the same order as phase analysis light scattering (PALS) measurements. Unlike PALS results, the sequence of increasing zeta potential for different particle types agreed with conductometric titration.

  13. From metadynamics to dynamics.

    PubMed

    Tiwary, Pratyush; Parrinello, Michele

    2013-12-06

    Metadynamics is a commonly used and successful enhanced sampling method. By the introduction of a history dependent bias which depends on a restricted number of collective variables it can explore complex free energy surfaces characterized by several metastable states separated by large free energy barriers. Here we extend its scope by introducing a simple yet powerful method for calculating the rates of transition between different metastable states. The method does not rely on a previous knowledge of the transition states or reaction coordinates, as long as collective variables are known that can distinguish between the various stable minima in free energy space. We demonstrate that our method recovers the correct escape rates out of these stable states and also preserves the correct sequence of state-to-state transitions, with minimal extra computational effort needed over ordinary metadynamics. We apply the formalism to three different problems and in each case find excellent agreement with the results of long unbiased molecular dynamics runs.
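    The rate recovery can be summarized in one line of arithmetic: each biased timestep is rescaled by the exponential of the instantaneous bias, so the accumulated physical time is Σ Δt·exp(V(s(t),t)/kT). The sketch below applies this rescaling to placeholder bias values; in practice V would be the metadynamics bias recorded along the trajectory.

    ```python
    # Sketch of recovering physical escape times from a biased run via the
    # acceleration factor: each biased step of length dt counts as
    # dt * exp(V_bias / kT) of unbiased time. Bias values are placeholders.
    import numpy as np

    def physical_time(bias_along_traj, dt, kT):
        """bias_along_traj: V(s(t), t) at each MD step (same units as kT)."""
        return np.sum(dt * np.exp(np.asarray(bias_along_traj) / kT))

    rng = np.random.default_rng(2)
    bias = rng.uniform(0.0, 15.0, size=100_000)    # toy bias values, kJ/mol
    print(f"{physical_time(bias, dt=2e-15, kT=2.5):.3e} s of unbiased dynamics")
    ```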

  14. From Metadynamics to Dynamics

    NASA Astrophysics Data System (ADS)

    Tiwary, Pratyush; Parrinello, Michele

    2013-12-01

    Metadynamics is a commonly used and successful enhanced sampling method. By the introduction of a history dependent bias which depends on a restricted number of collective variables it can explore complex free energy surfaces characterized by several metastable states separated by large free energy barriers. Here we extend its scope by introducing a simple yet powerful method for calculating the rates of transition between different metastable states. The method does not rely on a previous knowledge of the transition states or reaction coordinates, as long as collective variables are known that can distinguish between the various stable minima in free energy space. We demonstrate that our method recovers the correct escape rates out of these stable states and also preserves the correct sequence of state-to-state transitions, with minimal extra computational effort needed over ordinary metadynamics. We apply the formalism to three different problems and in each case find excellent agreement with the results of long unbiased molecular dynamics runs.

  15. Approximate likelihood calculation on a phylogeny for Bayesian estimation of divergence times.

    PubMed

    dos Reis, Mario; Yang, Ziheng

    2011-07-01

    The molecular clock provides a powerful way to estimate species divergence times. If information on some species divergence times is available from the fossil or geological record, it can be used to calibrate a phylogeny and estimate divergence times for all nodes in the tree. The Bayesian method provides a natural framework to incorporate different sources of information concerning divergence times, such as information in the fossil and molecular data. Current models of sequence evolution are intractable in a Bayesian setting, and Markov chain Monte Carlo (MCMC) is used to generate the posterior distribution of divergence times and evolutionary rates. This method is computationally expensive, as it involves the repeated calculation of the likelihood function. Here, we explore the use of Taylor expansion to approximate the likelihood during MCMC iteration. The approximation is much faster than conventional likelihood calculation. However, the approximation is expected to be poor when the proposed parameters are far from the likelihood peak. We explore the use of parameter transforms (square root, logarithm, and arcsine) to improve the approximation to the likelihood curve. We found that the new methods, particularly the arcsine-based transform, provided very good approximations under relaxed clock models and also under the global clock model when the global clock is not seriously violated. The approximation is poorer for analysis under the global clock when the global clock is seriously wrong and should thus not be used. The results suggest that the approximate method may be useful for Bayesian dating analysis using large data sets.
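    The essence of the approximation is to pay for the expensive likelihood once, at its maximum, and evaluate a quadratic surrogate everywhere else. The toy one-parameter sketch below builds the surrogate from a finite-difference second derivative at the MLE; the likelihood function and numbers are invented, and the paper's transforms (square root, logarithm, arcsine) would be applied to the parameter before expanding.

    ```python
    # Toy sketch of the Taylor-approximation idea: expand lnL once around the
    # MLE and evaluate the cheap quadratic surrogate thereafter (as inside an
    # MCMC loop). The one-parameter log-likelihood below is invented.
    import numpy as np

    def lnL(t):
        return -50.0 * (np.sqrt(t) - np.sqrt(0.3))**2   # placeholder lnL

    t_hat = 0.3                                         # maximum of lnL
    h = 1e-5
    d2 = (lnL(t_hat + h) - 2 * lnL(t_hat) + lnL(t_hat - h)) / h**2

    def lnL_quad(t):
        # gradient vanishes at the MLE, so only the curvature term remains
        return lnL(t_hat) + 0.5 * d2 * (t - t_hat)**2

    for t in (0.15, 0.3, 0.45):
        print(f"t={t:.2f}  exact={lnL(t):8.4f}  quadratic={lnL_quad(t):8.4f}")
    ```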

  16. Walking tree heuristics for biological string alignment, gene location, and phylogenies

    NASA Astrophysics Data System (ADS)

    Cull, P.; Holloway, J. L.; Cavener, J. D.

    1999-03-01

    Basic biological information is stored in strings of nucleic acids (DNA, RNA) or amino acids (proteins). Teasing out the meaning of these strings is a central problem of modern biology. Matching and aligning strings brings out their shared characteristics. Although string matching is well-understood in the edit-distance model, biological strings with transpositions and inversions violate this model's assumptions. We propose a family of heuristics called walking trees to align biologically reasonable strings. Both edit-distance and walking tree methods can locate specific genes within a large string when the genes' sequences are given. When we attempt to match whole strings, the walking tree matches most genes, while the edit-distance method fails. We also give examples in which the walking tree matches substrings even if they have been moved or inverted. The edit-distance method was not designed to handle these problems. We include an example in which the walking tree "discovered" a gene. Calculating scores for whole genome matches gives a method for approximating evolutionary distance. We show two evolutionary trees for the picornaviruses which were computed by the walking tree heuristic. Both of these trees show great similarity to previously constructed trees. The point of this demonstration is that WHOLE genomes can be matched and distances calculated. The first tree was created on a Sequent parallel computer and demonstrates that the walking tree heuristic can be efficiently parallelized. The second tree was created using a network of workstations and demonstrates that there is sufficient parallelism in the phylogenetic tree calculation that the sequential walking tree can be used effectively on a network.

  17. An automated method for mapping human tissue permittivities by MRI in hyperthermia treatment planning.

    PubMed

    Farace, P; Pontalti, R; Cristoforetti, L; Antolini, R; Scarpa, M

    1997-11-01

    This paper presents an automatic method to obtain tissue complex permittivity values to be used as input data in the computer modelling for hyperthermia treatment planning. Magnetic resonance (MR) images were acquired and the tissue water content was calculated from the signal intensity of the image pixels. The tissue water content was converted into complex permittivity values by monotonic functions based on mixture theory. To obtain a water content map by MR imaging a gradient-echo pulse sequence was used and an experimental procedure was set up to correct for relaxation and radiofrequency field inhomogeneity effects on signal intensity. Two approaches were followed to assign the permittivity values to fat-rich tissues: (i) fat-rich tissue localization by a segmentation procedure followed by assignment of tabulated permittivity values; (ii) water content evaluation by chemical shift imaging followed by permittivity calculation. Tests were performed on phantoms of known water content to establish the reliability of the proposed method. MRI data were acquired and processed pixel-by-pixel according to the outlined procedure. The signal intensity in the phantom images correlated well with water content. Experiments were performed on volunteers' healthy tissue. In particular two anatomical structures were chosen to calculate permittivity maps: the head and the thigh. The water content and electric permittivity values were obtained from the MRI data and compared to others in the literature. A good agreement was found for muscle, cerebrospinal fluid (CSF) and white and grey matter. The advantages of the reported method are discussed in the light of possible application in hyperthermia treatment planning.

  18. Modeling landslide recurrence in Seattle, Washington, USA

    USGS Publications Warehouse

    Salciarini, Diana; Godt, Jonathan W.; Savage, William Z.; Baum, Rex L.; Conversini, Pietro

    2008-01-01

    To manage the hazard associated with shallow landslides, decision makers need an understanding of where and when landslides may occur. A variety of approaches have been used to estimate the hazard from shallow, rainfall-triggered landslides, such as empirical rainfall threshold methods or probabilistic methods based on historical records. The wide availability of Geographic Information Systems (GIS) and digital topographic data has led to the development of analytic methods for landslide hazard estimation that couple steady-state hydrological models with slope stability calculations. Because these methods typically neglect the transient effects of infiltration on slope stability, results cannot be linked with historical or forecasted rainfall sequences. Estimates of the frequency of conditions likely to cause landslides are critical for quantitative risk and hazard assessments. We present results to demonstrate how a transient infiltration model coupled with an infinite slope stability calculation may be used to assess shallow landslide frequency in the City of Seattle, Washington, USA. A module called CRF (Critical RainFall) for estimating deterministic rainfall thresholds has been integrated in the TRIGRS (Transient Rainfall Infiltration and Grid-based Slope-Stability) model that combines a transient, one-dimensional analytic solution for pore-pressure response to rainfall infiltration with an infinite slope stability calculation. Input data for the extended model include topographic slope, colluvial thickness, initial water-table depth, material properties, and rainfall durations. This approach is combined with a statistical treatment of rainfall using a GEV (General Extreme Value) probabilistic distribution to produce maps showing the shallow landslide recurrence induced, on a spatially distributed basis, as a function of rainfall duration and hillslope characteristics.

  19. High pressure behaviour of uranium dicarbide (UC₂): Ab-initio study

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sahoo, B. D., E-mail: bdsahoo@barc.gov.in; Mukherjee, D.; Joshi, K. D.

    2016-08-28

    The structural stability of uranium dicarbide has been examined under hydrostatic compression employing the evolutionary structure search algorithm implemented in the universal structure predictor: evolutionary Xtallography (USPEX) code in conjunction with an ab-initio electronic band structure calculation method. The ab-initio total energy calculations involved for this purpose have been carried out within both the generalized gradient approximation (GGA) and the GGA + U approximation. Our calculations under the GGA approximation predict the high pressure structural sequence tetragonal → monoclinic → orthorhombic for this material, with transition pressures of ∼8 GPa and 42 GPa, respectively. The same transition sequence is predicted by calculations within GGA + U, with transition pressures placed at ∼24 GPa and ∼50 GPa, respectively. Further, on the basis of a comparison of the zero pressure equilibrium volume and equation of state with available experimental data, we find that the GGA + U approximation with U = 2.5 eV describes this material better than the simple GGA approximation. The theoretically predicted high pressure structural phase transitions are in disagreement with the only high pressure experimental study on this compound, by Dancausse et al. [J. Alloys. Compd. 191, 309 (1993)], which reports a tetragonal to hexagonal phase transition at a pressure of ∼17.6 GPa. Interestingly, during the lowest-enthalpy structure search using USPEX, we do not see any hexagonal phase within even 0.2 eV/formula unit of the predicted monoclinic phase. More experiments with varying carbon contents in UC₂ samples are required to resolve this discrepancy. The existence of the high pressure phases predicted by static lattice calculations has been further substantiated by analyzing the elastic and lattice dynamic stability of these structures in the pressure regimes of their structural stability. Additionally, various thermo-physical quantities such as equilibrium volume, bulk modulus, Debye temperature, thermal expansion coefficient, Gruneisen parameter, and heat capacity at ambient conditions have been determined from these calculations and compared with the available experimental data.

  1. MRI-Only Based Radiotherapy Treatment Planning for the Rat Brain on a Small Animal Radiation Research Platform (SARRP)

    PubMed Central

    Gutierrez, Shandra; Descamps, Benedicte; Vanhove, Christian

    2015-01-01

    Computed tomography (CT) is the standard imaging modality in radiation therapy treatment planning (RTP). However, magnetic resonance (MR) imaging provides superior soft tissue contrast, increasing the precision of target volume selection. We present MR-only based RTP for a rat brain on a small animal radiation research platform (SARRP) using probabilistic voxel classification with multiple MR sequences. Six rat heads were imaged, each with one CT and five MR sequences. The MR sequences were: T1-weighted, T2-weighted, zero-echo time (ZTE), and two ultra-short echo time sequences with 20 μs (UTE1) and 2 ms (UTE2) echo times. CT data were manually segmented into air, soft tissue, and bone to obtain the RTP reference. Bias field corrected MR images were automatically segmented into the same tissue classes using a fuzzy c-means segmentation algorithm with multiple images as input. Similarities between segmented CT and automatic segmented MR (ASMR) images were evaluated using Dice coefficient. Three ASMR images with high similarity index were used for further RTP. Three beam arrangements were investigated. Dose distributions were compared by analysing dose volume histograms. The highest Dice coefficients were obtained for the ZTE-UTE2 combination and for the T1-UTE1-T2 combination when ZTE was unavailable. Both combinations, along with UTE1-UTE2, often used to generate ASMR images, were used for further RTP. Using 1 beam, MR based RTP underestimated the dose to be delivered to the target (range: 1.4%-7.6%). When more complex beam configurations were used, the calculated dose using the ZTE-UTE2 combination was the most accurate, with 0.7% deviation from CT, compared to 0.8% for T1-UTE1-T2 and 1.7% for UTE1-UTE2. The presented MR-only based workflow for RTP on a SARRP enables both accurate organ delineation and dose calculations using multiple MR sequences. This method can be useful in longitudinal studies where CT’s cumulative radiation dose might contribute to the total dose. PMID:26633302
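
    The Dice coefficient used above to score the overlap between segmented CT and ASMR images is simple to compute; here is a minimal sketch, assuming binary label masks stored as NumPy arrays (not the authors' actual pipeline):

        import numpy as np

        def dice(mask_a, mask_b):
            """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two boolean masks."""
            a = np.asarray(mask_a, dtype=bool)
            b = np.asarray(mask_b, dtype=bool)
            denom = a.sum() + b.sum()
            return 1.0 if denom == 0 else 2.0 * np.logical_and(a, b).sum() / denom

    A value of 1 indicates perfect agreement between the manual CT segmentation and the automatic MR segmentation for that tissue class, and 0 indicates no overlap.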

  2. What caused the outbreak of ESBL-producing Klebsiella pneumoniae in a neonatal intensive care unit, Germany 2009 to 2012? Reconstructing transmission with epidemiological analysis and whole-genome sequencing

    PubMed Central

    Haller, Sebastian; Eller, Christoph; Hermes, Julia; Kaase, Martin; Steglich, Matthias; Radonić, Aleksandar; Dabrowski, Piotr Wojtek; Nitsche, Andreas; Pfeifer, Yvonne; Werner, Guido; Wunderle, Werner; Velasco, Edward; Abu Sin, Muna; Eckmanns, Tim; Nübel, Ulrich

    2015-01-01

    Objective We aimed to retrospectively reconstruct the timing of transmission events and pathways in order to understand why extensive preventive measures and investigations were not sufficient to prevent new cases. Methods We extracted available information from patient charts to describe cases and to compare them to the normal population of the ward. We conducted a cohort study to identify risk factors for pathogen acquisition. We sequenced the available isolates to determine the phylogenetic relatedness of the Klebsiella pneumoniae isolates on the basis of their genome sequences. Results The investigation comprised 37 cases, 10 of them with ESBL (extended-spectrum beta-lactamase)-producing K. pneumoniae bloodstream infection. Descriptive epidemiology indicated that a continuous transmission from person to person was most likely. Results from the cohort study showed that ‘frequent manipulation’ (a proxy for increased exposure to medical procedures) was significantly associated with being a case (RR 1.44, 95% CI 1.02 to 2.19). Genome sequences revealed that all 48 bacterial isolates available for sequencing from 31 cases were closely related (maximum genetic distance, 12 single nucleotide polymorphisms). Based on our calculation of evolutionary rate and sequence diversity, we estimate that the outbreak strain had been endemic since 2008. Conclusions Epidemiological and phylogenetic analyses consistently indicated that there were additional, undiscovered cases prior to the onset of microbiological screening and that the spread of the pathogen remained undetected over several years, driven predominantly by person-to-person transmission. Whole-genome sequencing provided valuable information on the onset, course and size of the outbreak, and on possible ways of transmission. PMID:25967999

  3. A complex valued radial basis function network for equalization of fast time varying channels.

    PubMed

    Gan, Q; Saratchandran, P; Sundararajan, N; Subramanian, K R

    1999-01-01

    This paper presents a complex valued radial basis function (RBF) network for equalization of fast time varying channels. A new method for calculating the centers of the RBF network is given. The method allows the number of RBF centers to be fixed even as the equalizer order is increased, so that good performance is obtained by a high-order RBF equalizer with a small number of centers. Simulations are performed on time varying channels using a Rayleigh fading channel model to compare the performance of our RBF equalizer with an adaptive maximum-likelihood sequence estimator (MLSE) consisting of a channel estimator and an MLSE implemented by the Viterbi algorithm. The results show that the RBF equalizer produces superior performance with less computational complexity.

  4. Phylogenetic tree and community structure from a Tangled Nature model.

    PubMed

    Canko, Osman; Taşkın, Ferhat; Argın, Kamil

    2015-10-07

    In evolutionary biology, the taxonomy and origination of species are widely studied subjects. The evolutionary tree can be estimated from available DNA sequence data, and the calculation is made by well-known and frequently used methods such as maximum likelihood and neighbor-joining. In order to examine the results of these methods, an evolutionary tree is generated computationally from a mathematical model, called Tangled Nature. A relatively small genome space is investigated owing to the computational burden, and the actual and predicted trees are found to be in reasonably good agreement in terms of shape. Moreover, the speciation and the resulting community structure of the food-web are investigated by modularity. Copyright © 2015 Elsevier Ltd. All rights reserved.

  5. Methods of automatic nucleotide-sequence analysis. Multicomponent spectrophotometric analysis of mixtures of nucleic acid components by a least-squares procedure

    PubMed Central

    Lee, Sheila; McMullen, D.; Brown, G. L.; Stokes, A. R.

    1965-01-01

    1. A theoretical analysis of the errors in multicomponent spectrophotometric analysis of nucleoside mixtures, by a least-squares procedure, has been made to obtain an expression for the error coefficient, relating the error in calculated concentration to the error in extinction measurements. 2. The error coefficients, which depend only on the `library' of spectra used to fit the experimental curves, have been computed for a number of `libraries' containing the following nucleosides found in s-RNA: adenosine, guanosine, cytidine, uridine, 5-ribosyluracil, 7-methylguanosine, 6-dimethylaminopurine riboside, 6-methylaminopurine riboside and thymine riboside. 3. The error coefficients have been used to determine the best conditions for maximum accuracy in the determination of the compositions of nucleoside mixtures. 4. Experimental determinations of the compositions of nucleoside mixtures have been made and the errors found to be consistent with those predicted by the theoretical analysis. 5. It has been demonstrated that, with certain precautions, the multicomponent spectrophotometric method described is suitable as a basis for automatic nucleotide-composition analysis of oligonucleotides containing nine nucleotides. Used in conjunction with continuous chromatography and flow chemical techniques, this method can be applied to the study of the sequence of s-RNA. PMID:14346087
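
    The least-squares step described here amounts to solving an overdetermined linear system E = S c, where the columns of S are the library spectra, E the measured extinctions, and c the unknown concentrations; the propagation of extinction errors into concentration errors is then governed by (S^T S)^-1, which depends only on the library. A minimal sketch with made-up spectra (the published nucleoside extinction coefficients are not reproduced here):

        import numpy as np

        # Rows: wavelengths; columns: library spectra of the pure nucleosides
        # (illustrative numbers only -- not the published extinction coefficients).
        S = np.array([[10.2,  3.1,  7.4],
                      [ 6.8,  9.5,  2.2],
                      [ 2.1,  7.7,  8.9],
                      [ 1.0,  4.2,  6.5]])
        E = np.array([21.3, 18.6, 19.4, 12.1])  # measured extinctions of the mixture

        conc, residuals, rank, sv = np.linalg.lstsq(S, E, rcond=None)

        # Error coefficients: the diagonal of (S^T S)^{-1} scales measurement error
        # into concentration error, so it depends only on the chosen library.
        error_coeff = np.sqrt(np.diag(np.linalg.inv(S.T @ S)))

    Choosing the library (and the wavelengths measured) to minimize these error coefficients is exactly the "best conditions for maximum accuracy" question the paper addresses.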

  6. Effects of B1 inhomogeneity correction for three-dimensional variable flip angle T1 measurements in hip dGEMRIC at 3 T and 1.5 T.

    PubMed

    Siversson, Carl; Chan, Jenny; Tiderius, Carl-Johan; Mamisch, Tallal Charles; Jellus, Vladimir; Svensson, Jonas; Kim, Young-Jo

    2012-06-01

    Delayed gadolinium-enhanced MRI of cartilage is a technique for studying the development of osteoarthritis using quantitative T(1) measurements. Three-dimensional variable flip angle is a promising method for performing such measurements rapidly, by using two successive spoiled gradient echo sequences with different excitation pulse flip angles. However, the three-dimensional variable flip angle method is very sensitive to inhomogeneities in the transmitted B(1) field in vivo. In this study, a method for correcting for such inhomogeneities, using an additional B(1) mapping spin-echo sequence, was evaluated. Phantom studies concluded that three-dimensional variable flip angle with B(1) correction calculates accurate T(1) values even in areas with high B(1) deviation. Retrospective analysis of in vivo hip delayed gadolinium-enhanced MRI of cartilage data from 40 subjects showed the difference between three-dimensional variable flip angle with and without B(1) correction to be generally two to three times higher at 3 T than at 1.5 T. In conclusion, the B(1) variations should always be taken into account, both at 1.5 T and at 3 T. Copyright © 2011 Wiley-Liss, Inc.

  7. Low rank approximation methods for MR fingerprinting with large scale dictionaries.

    PubMed

    Yang, Mingrui; Ma, Dan; Jiang, Yun; Hamilton, Jesse; Seiberlich, Nicole; Griswold, Mark A; McGivney, Debra

    2018-04-01

    This work proposes new low rank approximation approaches with significant memory savings for large scale MR fingerprinting (MRF) problems. We introduce a compressed MRF with randomized singular value decomposition method to significantly reduce the memory requirement for calculating a low rank approximation of large sized MRF dictionaries. We further relax this requirement by exploiting the structures of MRF dictionaries in the randomized singular value decomposition space and fitting them to low-degree polynomials to generate high resolution MRF parameter maps. In vivo 1.5T and 3T brain scan data are used to validate the approaches. T1, T2, and off-resonance maps are in good agreement with those of the standard MRF approach. Moreover, the memory savings are up to 1000-fold for the MRF fast imaging with steady-state precession sequence and more than 15-fold for the MRF balanced steady-state free precession sequence. The proposed compressed MRF with randomized singular value decomposition and dictionary fitting methods are memory efficient low rank approximation methods, which can benefit the usage of MRF in clinical settings. They also have great potential in large scale MRF problems, such as problems considering multi-component MRF parameters or high resolution in the parameter space. Magn Reson Med 79:2392-2400, 2018. © 2017 International Society for Magnetic Resonance in Medicine.
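
    A compressed-dictionary pipeline of this kind can be sketched in a few lines: project the dictionary onto a small random subspace, take the SVD there, and match measured signals in the compressed space. This is a generic randomized-SVD illustration under assumed toy sizes, not the authors' MRF code:

        import numpy as np

        rng = np.random.default_rng(0)
        D = rng.standard_normal((500, 5000))  # dictionary: timepoints x atoms (toy sizes)

        def randomized_svd(A, k, oversample=10):
            """Rank-k SVD of A via random projection (Halko et al. style)."""
            omega = rng.standard_normal((A.shape[1], k + oversample))
            Q, _ = np.linalg.qr(A @ omega)    # orthonormal basis for the range of A
            U_small, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
            return (Q @ U_small)[:, :k], s[:k], Vt[:k]

        U, s, Vt = randomized_svd(D, k=25)
        D_compressed = U.T @ D                # 25 x atoms: far smaller than D
        # A measured fingerprint x is matched in the compressed space:
        x = D[:, 1234] + 0.01 * rng.standard_normal(500)
        best_atom = np.argmax((U.T @ x) @ D_compressed /
                              np.linalg.norm(D_compressed, axis=0))

    The memory saving comes from never holding the full low-rank factorization of the original dictionary in memory: only the small projected matrices are formed.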

  8. Confirmation of Two Sibling Species among Anopheles fluviatilis Mosquitoes in South and Southeastern Iran by Analysis of Cytochrome Oxidase I Gene.

    PubMed

    Naddaf, Saied Reza; Oshaghi, Mohammad Ali; Vatandoost, Hassan

    2012-12-01

    Anopheles fluviatilis, one of the major malaria vectors in Iran, is assumed to be a complex of sibling species. The aim of this study was to evaluate the Cytochrome oxidase I (COI) gene alongside 28S-D3 as a diagnostic tool for identification of An. fluviatilis sibling species in Iran. DNA samples from 24 An. fluviatilis mosquitoes collected in different geographical areas of south and southeastern Iran were used for amplification of the COI gene followed by sequencing. The 474-475 bp COI sequences obtained in this study were aligned with 59 similar sequences of An. fluviatilis and a sequence of Anopheles minimus, as an outgroup, from the GenBank database. The distances between group and individual sequences were calculated, and a phylogenetic tree for the obtained sequences was generated using the Kimura two-parameter (K2P) model with the neighbor-joining method. Phylogenetic analysis using the COI gene grouped members from Fars Province (central Iran) in two distinct clades separate from other Iranian members representing Hormozgan, Kerman, and Sistan va Baluchestan Provinces. The mean distance between Iranian and Indian individuals was 1.66%, whereas the value between Fars Province individuals and the group comprising individuals from other areas of Iran was 2.06%. The 2.06% mean distance between individuals from Fars Province and those from other areas of Iran indicates at least two sibling species among the An. fluviatilis mosquitoes of Iran. This finding confirms earlier results based on RAPD-PCR and 28S-D3 analysis.
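
    For reference, the Kimura two-parameter distance used here separates transitions (proportion P) from transversions (proportion Q): d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)]. A minimal sketch for two aligned sequences (illustrative, not the authors' pipeline):

        import math

        PURINES = {"A", "G"}

        def k2p_distance(seq1, seq2):
            """Kimura two-parameter distance for two aligned, equal-length sequences."""
            pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
                     if a in "ACGT" and b in "ACGT"]     # skip gaps/ambiguities
            n = len(pairs)
            transitions = sum(1 for a, b in pairs
                              if a != b and (a in PURINES) == (b in PURINES))
            transversions = sum(1 for a, b in pairs
                                if a != b and (a in PURINES) != (b in PURINES))
            p, q = transitions / n, transversions / n
            return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

        d = k2p_distance("ACGTACGTTGCA", "ACGAACGTTGTA")

    A matrix of such pairwise distances is then the direct input to neighbor-joining tree construction.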

  9. The TGA codons are present in the open reading frame of selenoprotein P cDNA

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hill, K.E.; Lloyd, R.S.; Read, R.

    1991-03-11

    The TGA codon in DNA has been shown to direct incorporation of selenocysteine into protein. Several proteins from bacteria and animals contain selenocysteine in their primary structures. Each of the cDNA clones of these selenoproteins contains one TGA codon in the open reading frame which corresponds to the selenocysteine in the protein. A cDNA clone for selenoprotein P (SeP), obtained from a λZAP rat liver library, was sequenced by the dideoxy termination method. The correct reading frame was determined by comparison of the deduced amino acid sequence with the amino acid sequence of several peptides from SeP. Using SeP labelled with 75Se in vivo, the selenocysteine content of the peptides was verified by collecting the carboxymethylated 77Se-selenocysteine as it eluted from the amino acid analyzer and determining the radioactivity contained in the collected samples. Ten TGA codons are present in the open reading frame of the cDNA. Peptide fragmentation studies and the deduced sequence indicate that selenium-rich regions are located close to the carboxy terminus. Nine of the 10 selenocysteines are located in the terminal 26% of the sequence, with four in the terminal 15 amino acids. The deduced sequence codes for a protein of 385 amino acids. Cleavage of the signal peptide gives the mature protein of 366 amino acids with a calculated mol wt of 41,052 Da. Searches of the PIR and SWISSPROT protein databases revealed no similarity with glutathione peroxidase or other selenoproteins.

  10. Long-range correlations and charge transport properties of DNA sequences

    NASA Astrophysics Data System (ADS)

    Liu, Xiao-liang; Ren, Yi; Xie, Qiong-tao; Deng, Chao-sheng; Xu, Hui

    2010-04-01

    By using Hurst's analysis and the transfer matrix approach, the rescaled range functions and Hurst exponents of human chromosome 22 and enterobacteria phage lambda DNA sequences are investigated, and the transmission coefficients, Landauer resistances and Lyapunov coefficients of finite segments based on the above genomic DNA sequences are calculated. In a comparison with quasiperiodic and random artificial DNA sequences, we find that λ-DNA exhibits anticorrelated behavior characterized by a Hurst exponent H < 0.5.
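
    Rescaled-range analysis estimates the Hurst exponent from the scaling R(n)/S(n) ∝ n^H of the range of cumulative deviations over the standard deviation; H < 0.5 signals anticorrelation, H > 0.5 persistence. A compact sketch on a numeric series (a DNA sequence would first be mapped to numbers, here purine/pyrimidine → ±1; the windowing choices are illustrative):

        import numpy as np

        def rescaled_range(x):
            """R/S statistic of a 1-D series."""
            x = np.asarray(x, dtype=float)
            y = np.cumsum(x - x.mean())     # cumulative deviations from the mean
            r = y.max() - y.min()           # range of the cumulative series
            s = x.std()
            return r / s

        def hurst_exponent(x, window_sizes=(16, 32, 64, 128, 256)):
            """Slope of log(R/S) vs log(n), averaged over non-overlapping windows."""
            x = np.asarray(x, dtype=float)
            rs = [np.mean([rescaled_range(x[i:i + n])
                           for i in range(0, len(x) - n + 1, n)])
                  for n in window_sizes]
            slope, _ = np.polyfit(np.log(window_sizes), np.log(rs), 1)
            return slope

        seq = "ACGT" * 1000                 # toy sequence
        signal = np.array([1 if b in "AG" else -1 for b in seq])
        H = hurst_exponent(signal)          # strongly periodic -> H well below 0.5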

  11. Quantitative evaluation of benign and malignant vertebral fractures with diffusion-weighted MRI: what is the optimum combination of b values for ADC-based lesion differentiation with the single-shot turbo spin-echo sequence?

    PubMed

    Geith, Tobias; Schmidt, Gerwin; Biffar, Andreas; Dietrich, Olaf; Duerr, Hans Roland; Reiser, Maximilian; Baur-Melnyk, Andrea

    2014-09-01

    The purpose of our study was to determine the optimum combination of b values for calculating the apparent diffusion coefficient (ADC) using a diffusion-weighted (DW) single-shot turbo spin-echo (TSE) sequence in the differentiation between acute benign and malignant vertebral body fractures. Twenty-six patients with osteoporotic (mean age, 69 years; range, 31.5-86.2 years) and 20 patients with malignant vertebral fractures (mean age, 63.4 years; range, 24.7-86.4 years) were studied. T1-weighted, STIR, and T2-weighted sequences were acquired at 1.5 T. A DW single-shot TSE sequence at different b values (100, 250, 400, and 600 s/mm(2)) was applied. On the DW images for each evaluated fracture, an ROI was manually adapted to the area of hyperintense signal intensity on STIR-hypointense signal on T1-weighted images. For each ROI, nine different combinations of two, three, and four b values were used to calculate the ADC using a least-squares algorithm. The Student t test and Mann-Whitney U test were used to determine significant differences between benign and malignant fractures. An ROC analysis and the Youden index were used to determine cutoff values for assessment of the highest sensitivity and specificity for the different ADC values. The positive (PPV) and negative predictive values (NPV) were also determined. All calculated ADCs (except the combination of b = 400 s/mm(2) and b = 600 s/mm(2)) showed statistically significant differences between benign and malignant vertebral body fractures, with benign fractures having higher ADCs than malignant ones. The use of higher b values resulted in lower ADCs than those calculated with low b values. The highest AUC (0.85) showed the ADCs calculated with b = 100 and 400 s/mm(2), and the second highest AUC (0.829) showed the ADCs calculated with b = 100, 250, and 400 s/mm(2). The Youden index with equal weight given to sensitivity and specificity suggests use of an ADC calculated with b = 100, 250, and 400 s/mm(2) (cutoff ADC, < 1.7 × 10(-3) mm(2)/s) to best diagnose malignancy (sensitivity, 85%; specificity, 84.6%; PPV, 81.0%; NPV, 88.0%). ADCs calculated with a combination of low to intermediate b values (b = 100, 250, and 400 s/mm(2)) provide the best diagnostic performance of a DW single-shot TSE sequence to differentiate acute benign and malignant vertebral body fractures.
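
    ADC estimation from a set of b values is a least-squares fit to the monoexponential decay S(b) = S0 exp(-b ADC); which b values enter the fit is exactly what this study varies. A minimal log-linearised sketch (synthetic signals, not patient data):

        import numpy as np

        def fit_adc(b_values, signals):
            """Least-squares ADC from S(b) = S0 * exp(-b * ADC), via log-linearisation."""
            b = np.asarray(b_values, dtype=float)
            log_s = np.log(np.asarray(signals, dtype=float))
            slope, intercept = np.polyfit(b, log_s, 1)
            return -slope                   # ADC in mm^2/s if b is in s/mm^2

        b = [100, 250, 400]                 # s/mm^2, one combination from the study
        true_adc = 1.7e-3                   # mm^2/s, near the reported cutoff
        s0 = 1000.0
        signals = [s0 * np.exp(-bi * true_adc) for bi in b]
        adc = fit_adc(b, signals)           # recovers ~1.7e-3 mm^2/s

    With noisy data, higher b values contribute lower signal, which is one reason combinations of low to intermediate b values performed best here.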

  12. CBS Genome Atlas Database: a dynamic storage for bioinformatic results and sequence data.

    PubMed

    Hallin, Peter F; Ussery, David W

    2004-12-12

    Currently, new bacterial genomes are published on a monthly basis. With the growing amount of genome sequence data, there is a demand for a flexible and easy-to-maintain structure for storing sequence data and results from bioinformatic analysis. More than 150 sequenced bacterial genomes are now available, and comparisons of properties for taxonomically similar organisms are not readily available to many biologists. In addition to the most basic information, such as AT content, chromosome length, tRNA count and rRNA count, a large number of more complex calculations are needed to perform detailed comparative genomics. DNA structural calculations like curvature and stacking energy, and DNA compositions like base skews, oligo skews and repeats at the local and global level, are just a few of the analyses presented on the CBS Genome Atlas web page. Complex analyses, changing methods and the frequent addition of new models are factors that require a dynamic database layout. Using basic tools like the GNU Make system, csh, Perl and MySQL, we have created a flexible database environment for storing and maintaining such results for a collection of complete microbial genomes. Currently, these results amount to more than 220 pieces of information. The backbone of this solution is a program package written in Perl, which enables administrators to synchronize and update the database content. The MySQL database has been connected to the CBS web server via PHP4 to present dynamic web content to users outside the center. This solution is tightly fitted to the existing server infrastructure, and the solutions proposed here can perhaps serve as a template for other research groups to solve database issues. A web-based user interface, dynamically linked to the Genome Atlas Database, can be accessed via www.cbs.dtu.dk/services/GenomeAtlas/. This paper has a supplemental information page which links to the examples presented: www.cbs.dtu.dk/services/GenomeAtlas/suppl/bioinfdatabase.

  13. Comparison of clinical semi-quantitative assessment of muscle fat infiltration with quantitative assessment using chemical shift-based water/fat separation in MR studies of the calf of post-menopausal women

    PubMed Central

    Nardo, Lorenzo; Karampinos, Dimitrios C.; Joseph, Gabby B.; Yap, Samuel P.; Baum, Thomas; Krug, Roland; Majumdar, Sharmila; Link, Thomas M.

    2013-01-01

    Objective The goal of this study was to compare the semi-quantitative Goutallier classification for fat infiltration with quantitative fat-fraction derived from a magnetic resonance imaging (MRI) chemical shift-based water/fat separation technique. Methods Sixty-two women (age 61±6 years), 27 of whom had diabetes, underwent MRI of the calf using a T1-weighted fast spin-echo sequence and a six-echo spoiled gradient-echo sequence at 3 T. Water/fat images and fat fraction maps were reconstructed using the IDEAL algorithm with T2* correction and a multi-peak model for the fat spectrum. Two radiologists scored fat infiltration on the T1-weighted images using the Goutallier classification in six muscle compartments. Spearman correlations between the Goutallier grades and the fat fraction were calculated; in addition, intra-observer and inter-observer agreement were calculated. Results A significant correlation between the clinical grading and the fat fraction values was found for all muscle compartments (P<0.0001, R values ranging from 0.79 to 0.88). Goutallier grades 0–4 had a fat fraction ranging from 3.5 to 19%. Intra-observer and inter-observer agreement values of 0.83 and 0.81 were calculated for the semi-quantitative grading. Conclusion Semi-quantitative grading of intramuscular fat and quantitative fat fraction were significantly correlated and both techniques had excellent reproducibility. However, the clinical grading was found to overestimate muscle fat. PMID:22411305

  15. PHYSICO: An UNIX based Standalone Procedure for Computation of Individual and Group Properties of Protein Sequences

    PubMed Central

    Gupta, Parth Sarthi Sen; Banerjee, Shyamashree; Islam, Rifat Nawaz Ul; Mondal, Sudipta; Mondal, Buddhadev; Bandyopadhyay, Amal K

    2014-01-01

    In the genomic and proteomic era, efficient and automated analysis of the sequence properties of proteins has become an important task in bioinformatics. There are general public licensed (GPL) software tools that perform part of the job. However, computation of mean properties for large numbers of orthologous sequences is not possible with the above-mentioned GPL tools. Further, there is no GPL software or server which can calculate window-dependent sequence properties for a large number of sequences in a single run. To overcome these limitations, we have developed a standalone procedure, PHYSICO, which performs all stages of computation in a single run, based on the type of input provided in either RAW-FASTA or BLOCK-FASTA format, and produces Excel output for: a) composition, class composition, mean molecular weight, isoelectric point, aliphatic index and GRAVY; b) column-based compositions, variability and difference matrix; c) 25 kinds of window-dependent sequence properties. The program is fast, efficient, error free and user friendly. Calculation of the mean and standard deviation of homologous sequence sets, for comparison purposes where relevant, is another attribute of the program, one seldom seen in existing GPL software. Availability: PHYSICO is freely available to non-commercial/academic users on formal request to the corresponding author (akbanerjee@biotech.buruniv.ac.in). PMID:24616564
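
    Of the whole-sequence properties listed above, GRAVY is the simplest: the mean Kyte-Doolittle hydropathy over all residues, and the window-dependent properties follow the same pattern over a sliding slice. A minimal sketch using the standard hydropathy scale (not PHYSICO's own code):

        # Kyte-Doolittle hydropathy values for the 20 standard amino acids.
        KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
              "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
              "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
              "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2}

        def gravy(seq):
            """Grand average of hydropathy: mean KD value over the sequence."""
            residues = [KD[a] for a in seq.upper() if a in KD]
            return sum(residues) / len(residues)

        def windowed_gravy(seq, window=9):
            """GRAVY over a sliding window, as in window-dependent properties."""
            return [gravy(seq[i:i + window]) for i in range(len(seq) - window + 1)]

        print(gravy("MKTAYIAKQR"))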

  17. A generalized theoretical framework for the description of spin decoupling in solid-state MAS NMR: Offset effect on decoupling performance.

    PubMed

    Tan, Kong Ooi; Agarwal, Vipin; Meier, Beat H; Ernst, Matthias

    2016-09-07

    We present a generalized theoretical framework that allows the approximate but rapid analysis of residual couplings of arbitrary decoupling sequences in solid-state NMR under magic-angle spinning conditions. It is a generalization of the tri-modal Floquet analysis of TPPM decoupling [Scholz et al., J. Chem. Phys. 130, 114510 (2009)] where three characteristic frequencies are used to describe the pulse sequence. Such an approach can be used to describe arbitrary periodic decoupling sequences that differ only in the magnitude of the Fourier coefficients of the interaction-frame transformation. It allows a ∼100 times faster calculation of second-order residual couplings as a function of pulse sequence parameters than full spin-dynamics simulations. By comparing the theoretical calculations with full numerical simulations, we show the potential of the new approach to examine the performance of decoupling sequences. We exemplify the usefulness of this framework by analyzing the performance of commonly used high-power decoupling sequences and low-power decoupling sequences such as amplitude-modulated XiX (AM-XiX) and its super-cycled variant SC-AM-XiX. In addition, the effect of chemical-shift offset is examined for both high- and low-power decoupling sequences. The results show that the cross-terms between the dipolar couplings are the main contributions to the line broadening when offset is present. We also show that SC-AM-XiX shows better offset compensation.

  18. K-shell photoabsorption and photoionization of trace elements. II. Isoelectronic sequences with electron number 12 ≤N ≤ 18

    NASA Astrophysics Data System (ADS)

    Mendoza, C.; Bautista, M. A.; Palmeri, P.; Quinet, P.; Witthoeft, M. C.; Kallman, T. R.

    2017-08-01

    Context. We are concerned with improving the diagnostic potential of the K lines and edges of elements with low cosmic abundances, namely F, Na, P, Cl, K, Sc, Ti, V, Cr, Mn, Co, Cu, and Zn, that are observed in the X-ray spectra of supernova remnants, galaxy clusters, and accreting black holes and neutron stars. Aims: Since accurate photoabsorption and photoionization cross sections are needed in their spectral models, they have been computed for isoelectronic sequences with electron number 12 ≤ N ≤ 18 using a multi-channel method. Methods: Target representations are obtained with the atomic structure code autostructure, and ground-state cross sections are computed with the Breit-Pauli R-matrix method (bprm) in intermediate coupling, including damping (radiative and Auger) effects. Results: Following the findings in our earlier work on sequences with 2 ≤ N ≤ 11, the contributions from channels associated with the 2s-hole [2s]μ target configurations and those containing 3d orbitals are studied in the Mg (N = 12) and Ar (N = 18) isoelectronic sequences. Cross sections for the latter ions are also calculated in the isolated-resonance approximation as implemented in autostructure and compared with bprm to test their accuracy. Conclusions: It is confirmed that the collisional channels associated with the [2s]μ target configurations must be taken into account owing to significant increases in the monotonic background cross section between the L and K edges. Target configurations with 3d orbitals give rise to fairly conspicuous unresolved transition arrays in the L-edge region, but to a much lesser extent in the K-edge region that is our main concern; they have therefore been neglected throughout owing to their computationally intractable channel inventory, thus allowing the computation of cross sections for all the ions with 12 ≤ N ≤ 18 in intermediate coupling with bprm. We find that the isolated-resonance approximation performs satisfactorily and will be our best choice to tackle systems with ground configuration 3p^6 3d^m (3 ≤ m ≤ 8) in isoelectronic sequences with N > 20.

  19. GazeAppraise v. 0.1

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wilson, Andrew; Haass, Michael; Rintoul, Mark Daniel

    GazeAppraise advances the state of the art of gaze pattern analysis using methods that simultaneously analyze spatial and temporal characteristics of gaze patterns. GazeAppraise enables novel research in visual perception and cognition; for example, using shape features as distinguishing elements to assess individual differences in visual search strategy. Given a set of point-to-point gaze sequences, hereafter referred to as scanpaths, the method constructs multiple descriptive features for each scanpath. Once the scanpath features have been calculated, they are used to form a multidimensional vector representing each scanpath, and cluster analysis is performed on the set of vectors from all scanpaths. An additional benefit of this method is the identification of causal or correlated characteristics of the stimuli, subjects, and visual task through statistical analysis of descriptive metadata distributions within and across clusters.

  20. The CV period minimum

    NASA Astrophysics Data System (ADS)

    Kolb, Ulrich; Baraffe, Isabelle

    Using improved, up-to-date stellar input physics tested against observations of low-mass stars and brown dwarfs, we calculate the secular evolution of low-donor-mass CVs, including those which form with a brown dwarf donor star. Our models confirm the mismatch between the calculated minimum period (P_min ≈ 70 min) and the observed short-period cut-off (≈ 80 min) in the CV period histogram. Theoretical period distributions synthesized from our model sequences always show an accumulation of systems at the minimum period, a feature absent in the observed distribution. We suggest that non-magnetic CVs become unobservable as they are effectively trapped in permanent quiescence before they reach P_min, and that small-number statistics may hide the period spike for magnetic CVs. We calculate the minimum period for high mass transfer rate sequences and discuss the relevance of these for explaining the location of CV secondaries in the orbital-period-spectral-type diagram. We also show that a recently suggested revised mass-radius relation for low-mass main-sequence stars cannot explain the CV period gap.

  1. Sequence periodicity in nucleosomal DNA and intrinsic curvature.

    PubMed

    Nair, T Murlidharan

    2010-05-17

    Most eukaryotic DNA contained in the nucleus is packaged by wrapping DNA around histone octamers. Histones are ubiquitous and bind most regions of chromosomal DNA. In order to achieve smooth wrapping of the DNA around the histone octamer, the DNA duplex should be able to deform and should possess intrinsic curvature. The deformability of DNA is a result of the non-parallelness of base pair stacks. The stacking interaction between base pairs is sequence dependent: the higher the stacking energy, the more rigid the DNA helix. Thus it is natural to expect that sequences involved in wrapping around the histone octamer should be unstacked and possess intrinsic curvature. Intrinsic curvature has been shown to be dictated by the periodic recurrence of certain dinucleotides. Several genome-wide studies directed towards mapping of nucleosome positions have revealed periodicity associated with certain stretches of sequences. In the current study, these sequences have been analyzed with a view to understanding their sequence-dependent structures. Higher order DNA structures and the distribution of molecular bend loci associated with 146-base nucleosome core DNA sequences from C. elegans and chicken have been analyzed using the theoretical model for DNA curvature. The curvature dispersion calculated by cyclically permuting the sequences revealed that the molecular bend loci were delocalized throughout the nucleosome core region and had varying degrees of intrinsic curvature. The higher order structures associated with nucleosomes of C. elegans and chicken calculated from the sequences revealed heterogeneity with respect to the deviation of the DNA axis. The results point to the possibility that context-dependent curvature of varying degrees is associated with nucleosomal DNA.

  2. The problem of hole localization in inner-shell states of N2 and CO2 revisited with complete active space self-consistent field approach.

    PubMed

    Rocha, Alexandre B; de Moura, Carlos E V

    2011-12-14

    Potential energy curves for inner-shell states of the nitrogen and carbon dioxide molecules are calculated by the inner-shell complete active space self-consistent field (CASSCF) method, a recently proposed protocol to obtain specifically converged inner-shell states at the multiconfigurational level. This is possible because collapse of the wave function to a low-lying state is avoided by a sequence of constrained optimizations in the orbital mixing step. The problem of localization of K-shell states is revisited by calculating their energies at the CASSCF level based on both localized and delocalized orbitals. The localized basis gives the best results at this level of calculation. Transition energies are also calculated by perturbation theory, taking the above-mentioned MCSCF function as the zeroth-order wave function. Values for the transition energies are in fairly good agreement with experimental ones. Bond dissociation energies for N(2) are considerably high, which means that these states are strongly bound. Potential curves along the ground state normal modes of CO(2) indicate the occurrence of the Renner-Teller effect in inner-shell states. © 2011 American Institute of Physics.

  3. Spectra library assisted de novo peptide sequencing for HCD and ETD spectra pairs.

    PubMed

    Yan, Yan; Zhang, Kaizhong

    2016-12-23

    De novo peptide sequencing via tandem mass spectrometry (MS/MS) has developed rapidly in recent years. With the use of spectra pairs from the same peptide under different fragmentation modes, the performance of de novo sequencing is greatly improved. Currently, with large amounts of spectra sequenced every day, spectra libraries containing tens of thousands of annotated experimental MS/MS spectra have become available. These libraries provide information on spectra properties and thus have the potential to be used with de novo sequencing to improve its performance. In this study, an improved de novo sequencing method assisted by spectra libraries is proposed. It uses spectra libraries as training datasets and introduces significance scores for the features used in our previous de novo sequencing method for HCD and ETD spectra pairs. Two pairs of HCD and ETD spectral datasets were used to test the performance of the proposed method and our previous method. The results show that the proposed method achieves better sequencing accuracy, with higher-ranked correct sequences and less computational time. This paper proposes an advanced de novo sequencing method for HCD and ETD spectra pairs that uses information from spectra libraries and significantly improves on previous similar methods.

  4. A new method for calculation of water saturation in shale gas reservoirs using V_P-to-V_S ratio and porosity

    NASA Astrophysics Data System (ADS)

    Liu, Kun; Sun, Jianmeng; Zhang, Hongpan; Liu, Haitao; Chen, Xiangyang

    2018-02-01

    Total water saturation is an important parameter for calculating the free gas content of shale gas reservoirs. Owing to the limitations of the Archie formula and its extended solutions in zones rich in organic or conductive minerals, a new method was proposed to estimate total water saturation from the relationship between total water saturation, the V_P-to-V_S ratio and total porosity. Firstly, the ranges of the relevant parameters in the viscoelastic BISQ model in shale gas reservoirs were estimated. Then, the effects of the relevant parameters on the V_P-to-V_S ratio were simulated based on the partially saturated viscoelastic BISQ model. These parameters were total water saturation, total porosity, permeability, characteristic squirt-flow length, fluid viscosity and sonic frequency. The simulation results showed that the main factors influencing the V_P-to-V_S ratio were total porosity and total water saturation. When the permeability and the characteristic squirt-flow length change only slightly for a particular shale gas reservoir, their influences can be neglected. An empirical equation for total water saturation with respect to total porosity and the V_P-to-V_S ratio was then obtained from the experimental data. Finally, the new method was successfully applied to estimate total water saturation in a sequence formation of shale gas reservoirs. Practical applications have shown good agreement with the results calculated by the Archie model.
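
    The final empirical step can be reproduced with any least-squares fit once a functional form is chosen. The sketch below assumes a simple linear form Sw = a + b*(Vp/Vs) + c*phi purely for illustration; the paper's actual equation, coefficients and calibration data are not reproduced here.

        import numpy as np

        # Toy calibration data: (Vp/Vs ratio, total porosity, measured water saturation).
        vpvs = np.array([1.60, 1.65, 1.70, 1.75, 1.80, 1.85])
        phi  = np.array([0.04, 0.05, 0.06, 0.07, 0.08, 0.09])
        sw   = np.array([0.35, 0.42, 0.50, 0.55, 0.63, 0.70])

        # Design matrix for Sw = a + b*(Vp/Vs) + c*phi (assumed form).
        A = np.column_stack([np.ones_like(vpvs), vpvs, phi])
        (a, b, c), *_ = np.linalg.lstsq(A, sw, rcond=None)

        def water_saturation(vpvs_ratio, porosity):
            """Predict total water saturation from log-derived Vp/Vs and porosity."""
            return a + b * vpvs_ratio + c * porosity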

  5. Various methods of determining the natural frequencies and damping of composite cantilever plates. 3. The Ritz method

    NASA Astrophysics Data System (ADS)

    Ekel'chik, V. S.; Ryabov, V. M.

    1997-03-01

    The Ritz method was used to determine the frequencies and modes of free vibration of rectangular cantilever plates made of anisotropic laminated composites. Orthogonal Jacobi and Legendre polynomials were used as coordinate functions. The results of the calculations are in good agreement with the published experimental and calculated data of other authors for plates made of boron- and carbon-fiber-reinforced plastics with different reinforcement angles of the unidirectional layers and different layer stacking sequences, and also for isotropic plates. The dissipative characteristics in vibration were determined on the basis of the concept of complex moduli. The solution of the frequency equation with complex coefficients yields a complex frequency; the loss factors are determined from the ratio of the imaginary component of the complex frequency to the real component. For plates of unidirectionally reinforced carbon-fiber plastic with different relative lengths, a detailed analysis was carried out of the influence of the reinforcement angle on mode interaction, frequency transformation, and the loss factor. The article shows that the loss factor of a plate depends substantially on the type of vibration mode: bending or torsional. It also examines the asymptotics of the loss factors of plates as their length is increased, and notes that the binomial model of deformation leads to a noticeable error in the calculated loss factor of long plates when the reinforcement angle lies in the range 20° < φ < 70°.

  6. The Role of Economic Uncertainty on the Block Economic Value - a New Valuation Approach / Rola Czynnika Niepewności Przy Obliczaniu Wskaźnika Rentowności - Nowe Podejście

    NASA Astrophysics Data System (ADS)

    Dehghani, H.; Ataee-Pour, M.

    2012-12-01

    The block economic value (EV) is one of the most important parameters in mine evaluation. This parameter affects significant factors such as the mining sequence, the final pit limit and the net present value. Nowadays, the aim of open pit mine planning is to define optimum pit limits and an optimum life-of-mine production schedule that maximizes the pit value under technical and operational constraints. It is therefore necessary to calculate the block economic value correctly at the first stage of the mine planning process. Unrealistic block economic value estimation may cause mining project managers to make wrong decisions and thus impose irrecoverable losses on the project. Effective parameters such as metal price, operating cost and grade are always assumed certain in conventional methods of EV calculation, although these parameters are obviously uncertain in nature; consequently, the results of the conventional methods are usually far from reality. To solve this problem, a new technique based on a multivariate binomial tree, developed in this research, is used. This method can calculate the EV and the project PV under economic uncertainty. In this paper, the EV and project PV were first determined using the Whittle formula with deterministic economic parameters, and then with a multivariate binomial tree based on economic uncertainties such as metal price and cost; finally, the results were compared. It is concluded that incorporating metal price and cost uncertainties makes the calculated block economic value and net present value more realistic than those obtained under the assumption of certainty.
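
    A single-factor version of such a binomial valuation is easy to sketch: let the metal price move up or down each period and average the block value over the terminal nodes of the lattice. All parameters below are illustrative, and the paper's multivariate tree additionally branches on cost; this is only a minimal price-only sketch.

        import math

        def block_ev_binomial(price0, grade, tonnage, cost_per_tonne,
                              recovery, volatility, periods, dt=1.0):
            """Expected block economic value when metal price follows a binomial tree.

            EV at each terminal node = revenue - cost; nodes are weighted by their
            binomial probabilities. Illustrative single-factor (price-only) version.
            """
            u = math.exp(volatility * math.sqrt(dt))   # up factor (CRR convention)
            d = 1.0 / u
            p = 0.5                                    # assumed symmetric move probability
            ev = 0.0
            for k in range(periods + 1):               # k down-moves, periods-k up-moves
                price = price0 * (u ** (periods - k)) * (d ** k)
                weight = math.comb(periods, k) * (p ** (periods - k)) * ((1 - p) ** k)
                value = tonnage * (grade * recovery * price - cost_per_tonne)
                ev += weight * value
            return ev

        ev = block_ev_binomial(price0=7000.0, grade=0.01, tonnage=10000.0,
                               cost_per_tonne=25.0, recovery=0.85,
                               volatility=0.25, periods=4)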

  7. Analysis and Visualization Tool for Targeted Amplicon Bisulfite Sequencing on Ion Torrent Sequencers

    PubMed Central

    Pabinger, Stephan; Ernst, Karina; Pulverer, Walter; Kallmeyer, Rainer; Valdes, Ana M.; Metrustry, Sarah; Katic, Denis; Nuzzo, Angelo; Kriegner, Albert; Vierlinger, Klemens; Weinhaeusel, Andreas

    2016-01-01

    Targeted sequencing of PCR amplicons generated from bisulfite deaminated DNA is a flexible, cost-effective way to study methylation of a sample at single CpG resolution and perform subsequent multi-target, multi-sample comparisons. Currently, no platform specific protocol, support, or analysis solution is provided to perform targeted bisulfite sequencing on a Personal Genome Machine (PGM). Here, we present a novel tool, called TABSAT, for analyzing targeted bisulfite sequencing data generated on Ion Torrent sequencers. The workflow starts with raw sequencing data, performs quality assessment, and uses a tailored version of Bismark to map the reads to a reference genome. The pipeline visualizes results as lollipop plots and is able to deduce specific methylation-patterns present in a sample. The obtained profiles are then summarized and compared between samples. In order to assess the performance of the targeted bisulfite sequencing workflow, 48 samples were used to generate 53 different Bisulfite-Sequencing PCR amplicons from each sample, resulting in 2,544 amplicon targets. We obtained a mean coverage of 282X using 1,196,822 aligned reads. Next, we compared the sequencing results of these targets to the methylation level of the corresponding sites on an Illumina 450k methylation chip. The calculated average Pearson correlation coefficient of 0.91 confirms the sequencing results with one of the industry-leading CpG methylation platforms and shows that targeted amplicon bisulfite sequencing provides an accurate and cost-efficient method for DNA methylation studies, e.g., to provide platform-independent confirmation of Illumina Infinium 450k methylation data. TABSAT offers a novel way to analyze data generated by Ion Torrent instruments and can also be used with data from the Illumina MiSeq platform. It can be easily accessed via the Platomics platform, which offers a web-based graphical user interface along with sample and parameter storage. TABSAT is freely available under a GNU General Public License version 3.0 (GPLv3) at https://github.com/tadkeys/tabsat/ and http://demo.platomics.com/. PMID:27467908

  8. Long-range barcode labeling-sequencing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Feng; Zhang, Tao; Singh, Kanwar K.

    Methods for sequencing single large DNA molecules by clonal multiple displacement amplification using barcoded primers. Sequences are binned based on barcode sequences and sequenced using a microdroplet-based method for sequencing large polynucleotide templates to enable assembly of haplotype-resolved complex genomes and metagenomes.

  9. Generalising Ward's Method for Use with Manhattan Distances.

    PubMed

    Strauss, Trudie; von Maltitz, Michael Johan

    2017-01-01

    The claim that Ward's linkage algorithm in hierarchical clustering is limited to use with Euclidean distances is investigated. In this paper, Ward's clustering algorithm is generalised for use with the l1 norm, or Manhattan, distance. We argue that the generalisation of Ward's linkage method to incorporate Manhattan distances is theoretically sound and provide an example where this method outperforms the method using Euclidean distances. As an application, we perform statistical analyses on languages using methods normally applied to biology and genetic classification. We aim to quantify differences in character traits between languages and use a statistical language signature based on relative bi-gram (sequence of two letters) frequencies to calculate a distance matrix between 32 Indo-European languages. We then use Ward's method of hierarchical clustering to classify the languages, using the Euclidean distance and the Manhattan distance. Results obtained from the different distance metrics are compared to show that Ward's algorithm characteristic of minimising intra-cluster variation and maximising inter-cluster variation is not violated when using the Manhattan metric.
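
    Operationally, the generalisation amounts to building the distance matrix with the l1 metric and applying Ward's Lance-Williams update when merging clusters. A compact sketch of both steps, using bi-gram-frequency signatures; the toy strings stand in for language corpora and the implementation is naive, not the authors' code:

        import numpy as np
        from itertools import product

        def bigram_signature(text, alphabet="abcdefghijklmnopqrstuvwxyz"):
            """Relative bigram frequencies as a fixed-length vector."""
            text = "".join(c for c in text.lower() if c in alphabet)
            counts = {bg: 0 for bg in ("".join(p) for p in product(alphabet, repeat=2))}
            for a, b in zip(text, text[1:]):
                counts[a + b] += 1
            total = max(sum(counts.values()), 1)
            return np.array([counts[bg] / total for bg in sorted(counts)])

        def ward_manhattan(vectors):
            """Naive agglomerative Ward clustering on an l1 distance matrix."""
            d = np.array([[np.abs(u - v).sum() for v in vectors] for u in vectors])
            active = {i: 1 for i in range(len(vectors))}   # cluster id -> size
            merges = []
            while len(active) > 1:
                ids = sorted(active)
                i, j = min(((a, b) for a in ids for b in ids if a < b),
                           key=lambda ab: d[ab])
                ni, nj = active.pop(i), active.pop(j)
                for k in active:                           # Lance-Williams Ward update
                    d[i, k] = d[k, i] = ((ni + active[k]) * d[i, k]
                                         + (nj + active[k]) * d[j, k]
                                         - active[k] * d[i, j]) / (ni + nj + active[k])
                merges.append((i, j, d[i, j]))             # record merge and its height
                active[i] = ni + nj                        # reuse id i for the merged cluster
            return merges

        sigs = [bigram_signature(t) for t in ("the cat sat", "the bat sat", "der hund lief")]
        print(ward_manhattan(sigs))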

  10. MGUPGMA: A Fast UPGMA Algorithm With Multiple Graphics Processing Units Using NCCL

    PubMed Central

    Hua, Guan-Jie; Hung, Che-Lun; Lin, Chun-Yuan; Wu, Fu-Che; Chan, Yu-Wei; Tang, Chuan Yi

    2017-01-01

    A phylogenetic tree is a visual diagram of the relationships between a set of biological species, which scientists use to analyze many characteristics of the species. Distance-matrix methods, such as the Unweighted Pair Group Method with Arithmetic Mean and Neighbor Joining, construct a phylogenetic tree by calculating pairwise genetic distances between taxa, but they suffer from computational performance issues. Although several new methods using high-performance hardware and frameworks have been proposed, the issue remains. In this work, a novel parallel Unweighted Pair Group Method with Arithmetic Mean approach on multiple Graphics Processing Units is proposed to construct a phylogenetic tree from an extremely large set of sequences. The experimental results show that the proposed approach on a DGX-1 server with 8 NVIDIA P100 graphics cards achieves approximately 3-fold to 7-fold speedup over implementations of the Unweighted Pair Group Method with Arithmetic Mean on a modern CPU and a single GPU, respectively. PMID:29051701

  12. Preliminary calculations related to the accident at Three Mile Island

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kirchner, W.L.; Stevenson, M.G.

    This report discusses preliminary studies of the Three Mile Island Unit 2 (TMI-2) accident based on available methods and data. The work reported includes: (1) a TRAC base case calculation out to 3 hours into the accident sequence; (2) TRAC parametric calculations, which are the same as the base case except for a single hypothetical change in the system conditions, such as assuming the high pressure injection (HPI) system operated as designed rather than as in the accident; (3) estimates of fuel rod cladding failure, cladding oxidation due to zirconium metal-steam reactions, hydrogen release due to cladding oxidation, cladding ballooning, cladding embrittlement, and subsequent cladding breakup, based on TRAC-calculated cladding temperatures and system pressures. Some conclusions of this work are: the TRAC base case accident calculation agrees very well with known system conditions to nearly 3 hours into the accident; the parametric calculations indicate that loss-of-core cooling was most influenced by the throttling of high-pressure injection (HPI) flows, given the accident initiating events and the pressurizer electromagnetic-operated valve (EMOV) failing to close as designed; failure of nearly all the rods and gaseous fission product release from the failed rods is predicted to have occurred at about 2 hours and 30 minutes; cladding oxidation (zirconium-steam reaction) up to 3 hours resulted in the production of approximately 40 kilograms of hydrogen.

  13. Direct Calculation of the Scattering Amplitude Without Partial Wave Decomposition. III; Inclusion of Correlation Effects

    NASA Technical Reports Server (NTRS)

    Shertzer, Janine; Temkin, Aaron

    2007-01-01

    In the first two papers in this series, we developed a method for studying electron-hydrogen scattering that does not use partial wave analysis. We constructed an ansatz for the wave function in both the static and static-exchange approximations and calculated the full scattering amplitude. Here we go beyond the static-exchange approximation and include correlation in the wave function via a modified polarized orbital. This correlation function provides a significant improvement over the static-exchange approximation: the resultant elastic scattering amplitudes are in very good agreement with fully converged partial wave calculations for electron-hydrogen scattering. A fully variational modification of this approach is discussed in the conclusion of the article. Popular summary: In this paper we continue the development of a new approach to the way researchers have traditionally calculated the scattering cross section of (low-energy) electrons from atoms. The basic mathematical problem is to solve the Schroedinger equation (SE) corresponding to the above physical process. Traditionally, the SE was reduced to a sequence of one-dimensional (ordinary) differential equations - called partial waves - which were solved; "phase shifts" were extracted from the solutions, and from these the scattering cross section was calculated.

  14. Sequence Bundles: a novel method for visualising, discovering and exploring sequence motifs

    PubMed Central

    2014-01-01

    Background We introduce Sequence Bundles, a novel data visualisation method for representing multiple sequence alignments (MSAs). We identify and address key limitations of existing bioinformatics data visualisation methods (i.e. the Sequence Logo) by enabling Sequence Bundles to give salient visual expression to sequence motifs and other data features which would otherwise remain hidden. Methods For the development of Sequence Bundles we employed research-led information design methodologies. Sequences are encoded as uninterrupted, semi-opaque lines plotted on a 2-dimensional reconfigurable grid; each line represents a single sequence. The thickness and opacity of the stack at each residue in each position indicate the level of conservation, and the lines' curved paths expose patterns in correlation and functionality. Several MSAs can be visualised in a composite image. The Sequence Bundles method is designed to favour a tangible, continuous and intuitive display of information. Results We have developed a software demonstration application for generating a Sequence Bundles visualisation of MSAs provided for the BioVis 2013 redesign contest. A subsequent exploration of the visualised line patterns allowed for the discovery of a number of interesting features in the dataset. Reported features include the extreme conservation of sequences displaying a specific residue and bifurcations of the consensus sequence. Conclusions Sequence Bundles is a novel method for the visualisation of MSAs and the discovery of sequence motifs. It can aid in generating new insight and hypotheses. Sequence Bundles is well suited for future implementation as interactive visual analytics software, which can complement existing visualisation tools. PMID:25237395
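
    The core drawing idea (semi-opaque polylines over a residue-position grid, so overlap encodes conservation) can be sketched in a few lines of Python; the toy alignment, residue ordering, and styling below are invented for illustration and are not the authors' demonstration software.

        import matplotlib.pyplot as plt

        # Toy MSA: each sequence becomes one semi-opaque polyline (x = alignment
        # column, y = residue identity). Overlapping lines darken, so conserved
        # columns show up as tight, dark bundles and variants as diverging paths.
        alignment = ["MKVLA", "MKVIA", "MRVLA", "MKVLG"]
        residues = sorted(set("".join(alignment)))     # y-axis ordering (reconfigurable)
        y_of = {aa: i for i, aa in enumerate(residues)}

        for seq in alignment:
            ys = [y_of[aa] for aa in seq]
            plt.plot(range(len(seq)), ys, color="steelblue", alpha=0.3, linewidth=3)

        plt.xticks(range(len(alignment[0])))
        plt.yticks(range(len(residues)), residues)
        plt.xlabel("Alignment position")
        plt.ylabel("Residue")
        plt.title("Sequence Bundles-style rendering of a toy alignment")
        plt.show()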

  15. Lung dynamic MRI deblurring using low-rank decomposition and dictionary learning.

    PubMed

    Gou, Shuiping; Wang, Yueyue; Wu, Jiaolong; Lee, Percy; Sheng, Ke

    2015-04-01

    Lung dynamic MRI (dMRI) has emerged as an appealing tool to quantify lung motion for both planning and treatment guidance purposes. However, this modality can result in blurry images due to the intrinsically low signal-to-noise ratio in the lung and spatial/temporal interpolation. The image blurring can adversely affect image processing that depends on the availability of fine landmarks. The purpose of this study is to reduce dMRI blurring using image postprocessing. To enhance image quality and exploit the spatiotemporal continuity of dMRI sequences, a low-rank decomposition and dictionary learning (LDDL) method was employed to deblur lung dMRI and enhance the conspicuity of lung blood vessels. Fifty continuous 2D coronal dMRI frames acquired with a steady state free precession sequence were obtained from five subjects, including two healthy volunteers and three lung cancer patients. In LDDL, the lung dMRI was decomposed into sparse and low-rank components. Dictionary learning was employed to estimate the blurring kernel based on the whole image, the low-rank component, or the sparse component of the first image in the lung MRI sequence. Deblurring was performed on the whole image sequence using deconvolution based on the estimated blur kernel. The deblurring results were quantified using an automated blood vessel extraction method based on the classification of Hessian-matrix-filtered images, with manual segmentation of the blood vessels as the ground truth. In this pilot study, LDDL based on the blurring kernel estimated from the sparse component outperformed kernel estimation from the whole image or the low-rank component. LDDL consistently improved image contrast and fine feature conspicuity of the original MRI without introducing artifacts, and the accuracy of automated blood vessel extraction was increased by 16% on average. Image blurring in dMRI can thus be effectively reduced using a low-rank decomposition and dictionary learning method with kernels estimated from the sparse component.
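
    As a rough illustration of the decomposition step, the following Python sketch splits a frame stack into a low-rank component and a residual via truncated SVD. This is a simple stand-in, not the robust decomposition or the dictionary learning actually used in LDDL, and the function name and synthetic data are invented.

        import numpy as np

        def low_rank_sparse(frames, rank=2):
            """Split a stack of 2D frames into low-rank background and residual.

            frames: array of shape (n_frames, h, w). The low-rank part captures
            the slowly varying anatomy shared across frames; the residual keeps
            frame-specific structure such as moving vessels.
            """
            n, h, w = frames.shape
            X = frames.reshape(n, h * w)                # one vectorized frame per row
            U, s, Vt = np.linalg.svd(X, full_matrices=False)
            L = (U[:, :rank] * s[:rank]) @ Vt[:rank]    # top singular components
            S = X - L                                   # "sparse" residual
            return L.reshape(n, h, w), S.reshape(n, h, w)

        # Example: 50 synthetic 64x64 frames.
        frames = np.random.rand(50, 64, 64)
        L, S = low_rank_sparse(frames, rank=3)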

  16. Comparison of normalization methods for the analysis of metagenomic gene abundance data.

    PubMed

    Pereira, Mariana Buongermino; Wallroth, Mikael; Jonsson, Viktor; Kristiansson, Erik

    2018-04-20

    In shotgun metagenomics, microbial communities are studied through direct sequencing of DNA without any prior cultivation. By comparing gene abundances estimated from the generated sequencing reads, functional differences between the communities can be identified. However, gene abundance data is affected by high levels of systematic variability, which can greatly reduce statistical power and introduce false positives. Normalization, the process by which systematic variability is identified and removed, is therefore a vital part of the data analysis. A wide range of normalization methods for high-dimensional count data has been proposed, but their performance on the analysis of shotgun metagenomic data has not been evaluated. Here, we present a systematic evaluation of nine normalization methods for gene abundance data. The methods were evaluated through resampling of three comprehensive datasets, creating a realistic setting that preserved the unique characteristics of metagenomic data. Performance was measured in terms of the methods' ability to identify differentially abundant genes (DAGs), correctly calculate unbiased p-values and control the false discovery rate (FDR). Our results showed that the choice of normalization method has a large impact on the end results. When the DAGs were asymmetrically present between the experimental conditions, many normalization methods had a reduced true positive rate (TPR) and a high false positive rate (FPR). The methods trimmed mean of M-values (TMM) and relative log expression (RLE) had the overall highest performance and are therefore recommended for the analysis of gene abundance data. For larger sample sizes, cumulative sum scaling (CSS) also showed satisfactory performance. This study emphasizes the importance of selecting a suitable normalization method in the analysis of data from shotgun metagenomics. Our results also demonstrate that improper methods may result in unacceptably high levels of false positives, which in turn may lead to incorrect or obfuscated biological interpretation.
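
    As an illustration of one of the recommended methods, the following minimal Python sketch computes relative log expression (RLE, i.e. DESeq-style median-of-ratios) size factors for a toy count matrix; the function name and data are invented.

        import numpy as np

        def rle_size_factors(counts):
            """RLE / median-of-ratios size factors.

            counts: (genes, samples) matrix of raw counts. Each sample is scaled
            by the median ratio of its counts to a pseudo-reference (the per-gene
            geometric mean across samples), over genes with no zero counts.
            """
            counts = np.asarray(counts, dtype=float)
            mask = (counts > 0).all(axis=1)                 # genes expressed everywhere
            log_geo_mean = np.log(counts[mask]).mean(axis=1)
            log_ratios = np.log(counts[mask]) - log_geo_mean[:, None]
            return np.exp(np.median(log_ratios, axis=0))    # one factor per sample

        counts = np.array([[10, 20, 30],
                           [100, 210, 290],
                           [5, 9, 16]])
        factors = rle_size_factors(counts)
        normalized = counts / factors           # abundance adjusted for library size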

  17. Method for identifying and quantifying nucleic acid sequence aberrations

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1998-01-01

    A method for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, where the presence of the first and second sequence types on the same nucleic acid sequence indicates the presence of an aberration. The method uses a first hybridization probe, which includes a nucleic acid sequence complementary to the first sequence type and a first complexing agent capable of attaching to a second complexing agent, and a second hybridization probe, which includes a nucleic acid sequence that selectively hybridizes to the second sequence type over the first and a detectable marker for detecting the second hybridization probe.

  18. Method for identifying and quantifying nucleic acid sequence aberrations

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1998-07-21

    A method is disclosed for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, where the presence of the first and second sequence types on the same nucleic acid sequence indicates the presence of an aberration. The method uses a first hybridization probe, which includes a nucleic acid sequence complementary to the first sequence type and a first complexing agent capable of attaching to a second complexing agent, and a second hybridization probe, which includes a nucleic acid sequence that selectively hybridizes to the second sequence type over the first and a detectable marker for detecting the second hybridization probe. 11 figs.

  19. Computational analysis of sequence selection mechanisms.

    PubMed

    Meyerguz, Leonid; Grasso, Catherine; Kleinberg, Jon; Elber, Ron

    2004-04-01

    Mechanisms leading to gene variations are responsible for the diversity of species and are important components of the theory of evolution. One constraint on gene evolution is protein foldability: the three-dimensional shapes of proteins must be thermodynamically stable. We explore the impact of this constraint and calculate properties of foldable sequences using 3660 structures from the Protein Data Bank. We seek a selection function that receives sequences as input and outputs a survival probability based on the fitness of the sequence to the structure. We compute the number of sequences that match a particular protein structure with energy lower than that of the native sequence, the density of the number of sequences, the entropy, and the "selection" temperature. The mechanism of structure selection for sequences longer than 200 amino acids is approximately universal; for shorter sequences, it is not. We speculate on concrete evolutionary mechanisms that show this behavior.
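
    One natural way to make these quantities concrete (a plausible microcanonical reading of the abstract, not necessarily the authors' exact definitions) is to define the sequence entropy from the density of sequences and the selection temperature from its slope at the native energy:

        S(E) = \ln n(E),
        \qquad
        \frac{1}{T_{\mathrm{sel}}} = \left. \frac{\partial S(E)}{\partial E} \right|_{E = E_{\mathrm{native}}},

    where n(E) is the density of sequences with energy E when threaded onto the given structure and E_native is the energy of the native sequence.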

  20. OPAL: prediction of MoRF regions in intrinsically disordered protein sequences.

    PubMed

    Sharma, Ronesh; Raicar, Gaurav; Tsunoda, Tatsuhiko; Patil, Ashwini; Sharma, Alok

    2018-06-01

    Intrinsically disordered proteins lack stable 3-dimensional structure and play a crucial role in performing various biological functions. Key to their biological function are the molecular recognition features (MoRFs) located within long disordered regions. Computationally identifying these MoRFs from disordered protein sequences is a challenging task. In this study, we present a new MoRF predictor, OPAL, to identify MoRFs in disordered protein sequences. OPAL utilizes two independent sources of information computed using different component predictors, and the resulting scores are processed and combined using a common averaging method. The first score is computed using a component MoRF predictor that utilizes the composition and sequence similarity of MoRF and non-MoRF regions to detect MoRFs. The second score is calculated using half-sphere exposure (HSE), solvent accessible surface area (ASA) and backbone angle information of the disordered protein sequence, drawing on the amino acid properties of the flanks surrounding the MoRFs to distinguish MoRF and non-MoRF residues. OPAL is evaluated using test sets that were previously used to evaluate the MoRF predictors MoRFpred, MoRFchibi and MoRFchibi-web. The results demonstrate that OPAL outperforms all available MoRF predictors and is currently the most accurate MoRF predictor available. It is available at http://www.alok-ai-lab.com/tools/opal/. Contact: ashwini@hgc.jp or alok.sharma@griffith.edu.au. Supplementary data are available at Bioinformatics online.
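
    The combination step described here is a simple per-residue average of the two component scores. A minimal Python sketch follows; the helper name, threshold, and toy scores are invented, and OPAL's internal processing may differ in detail.

        import numpy as np

        def combine_scores(score_a, score_b):
            """Per-residue average of two component predictors.

            score_a, score_b: arrays of per-residue MoRF propensities in [0, 1],
            e.g. one from a sequence-similarity component and one from a
            structure-derived (HSE/ASA/backbone-angle) component.
            """
            a = np.asarray(score_a, dtype=float)
            b = np.asarray(score_b, dtype=float)
            return (a + b) / 2.0

        combined = combine_scores([0.2, 0.7, 0.9], [0.4, 0.6, 0.8])
        morf_mask = combined >= 0.5      # illustrative decision threshold for MoRF calls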
