Sample records for local alignment algorithm

  1. DNA motif alignment by evolving a population of Markov chains.

    PubMed

    Bi, Chengpeng

    2009-01-30

    Deciphering cis-regulatory elements or de novo motif-finding in genomes still remains elusive although much algorithmic effort has been expended. The Markov chain Monte Carlo (MCMC) method such as Gibbs motif samplers has been widely employed to solve the de novo motif-finding problem through sequence local alignment. Nonetheless, the MCMC-based motif samplers still suffer from local maxima like EM. Therefore, as a prerequisite for finding good local alignments, these motif algorithms are often independently run a multitude of times, but without information exchange between different chains. Hence it would be worth a new algorithm design enabling such information exchange. This paper presents a novel motif-finding algorithm by evolving a population of Markov chains with information exchange (PMC), each of which is initialized as a random alignment and run by the Metropolis-Hastings sampler (MHS). It is progressively updated through a series of local alignments stochastically sampled. Explicitly, the PMC motif algorithm performs stochastic sampling as specified by a population-based proposal distribution rather than individual ones, and adaptively evolves the population as a whole towards a global maximum. The alignment information exchange is accomplished by taking advantage of the pooled motif site distributions. A distinct method for running multiple independent Markov chains (IMC) without information exchange, or dubbed as the IMC motif algorithm, is also devised to compare with its PMC counterpart. Experimental studies demonstrate that the performance could be improved if pooled information were used to run a population of motif samplers. The new PMC algorithm was able to improve the convergence and outperformed other popular algorithms tested using simulated and biological motif sequences.

  2. AlignNemo: a local network alignment method to integrate homology and topology.

    PubMed

    Ciriello, Giovanni; Mina, Marco; Guzzi, Pietro H; Cannataro, Mario; Guerra, Concettina

    2012-01-01

    Local network alignment is an important component of the analysis of protein-protein interaction networks that may lead to the identification of evolutionary related complexes. We present AlignNemo, a new algorithm that, given the networks of two organisms, uncovers subnetworks of proteins that relate in biological function and topology of interactions. The discovered conserved subnetworks have a general topology and need not to correspond to specific interaction patterns, so that they more closely fit the models of functional complexes proposed in the literature. The algorithm is able to handle sparse interaction data with an expansion process that at each step explores the local topology of the networks beyond the proteins directly interacting with the current solution. To assess the performance of AlignNemo, we ran a series of benchmarks using statistical measures as well as biological knowledge. Based on reference datasets of protein complexes, AlignNemo shows better performance than other methods in terms of both precision and recall. We show our solutions to be biologically sound using the concept of semantic similarity applied to Gene Ontology vocabularies. The binaries of AlignNemo and supplementary details about the algorithms and the experiments are available at: sourceforge.net/p/alignnemo.

  3. FEAST: sensitive local alignment with multiple rates of evolution.

    PubMed

    Hudek, Alexander K; Brown, Daniel G

    2011-01-01

    We present a pairwise local aligner, FEAST, which uses two new techniques: a sensitive extension algorithm for identifying homologous subsequences, and a descriptive probabilistic alignment model. We also present a new procedure for training alignment parameters and apply it to the human and mouse genomes, producing a better parameter set for these sequences. Our extension algorithm identifies homologous subsequences by considering all evolutionary histories. It has higher maximum sensitivity than Viterbi extensions, and better balances specificity. We model alignments with several submodels, each with unique statistical properties, describing strongly similar and weakly similar regions of homologous DNA. Training parameters using two submodels produces superior alignments, even when we align with only the parameters from the weaker submodel. Our extension algorithm combined with our new parameter set achieves sensitivity 0.59 on synthetic tests. In contrast, LASTZ with default settings achieves sensitivity 0.35 with the same false positive rate. Using the weak submodel as parameters for LASTZ increases its sensitivity to 0.59 with high error. FEAST is available at http://monod.uwaterloo.ca/feast/.

  4. Cascaded face alignment via intimacy definition feature

    NASA Astrophysics Data System (ADS)

    Li, Hailiang; Lam, Kin-Man; Chiu, Man-Yau; Wu, Kangheng; Lei, Zhibin

    2017-09-01

    Recent years have witnessed the emerging popularity of regression-based face aligners, which directly learn mappings between facial appearance and shape-increment manifolds. We propose a random-forest based, cascaded regression model for face alignment by using a locally lightweight feature, namely intimacy definition feature. This feature is more discriminative than the pose-indexed feature, more efficient than the histogram of oriented gradients feature and the scale-invariant feature transform feature, and more compact than the local binary feature (LBF). Experimental validation of our algorithm shows that our approach achieves state-of-the-art performance when testing on some challenging datasets. Compared with the LBF-based algorithm, our method achieves about twice the speed, 20% improvement in terms of alignment accuracy and saves an order of magnitude on memory requirement.

  5. STELLAR: fast and exact local alignments

    PubMed Central

    2011-01-01

    Background Large-scale comparison of genomic sequences requires reliable tools for the search of local alignments. Practical local aligners are in general fast, but heuristic, and hence sometimes miss significant matches. Results We present here the local pairwise aligner STELLAR that has full sensitivity for ε-alignments, i.e. guarantees to report all local alignments of a given minimal length and maximal error rate. The aligner is composed of two steps, filtering and verification. We apply the SWIFT algorithm for lossless filtering, and have developed a new verification strategy that we prove to be exact. Our results on simulated and real genomic data confirm and quantify the conjecture that heuristic tools like BLAST or BLAT miss a large percentage of significant local alignments. Conclusions STELLAR is very practical and fast on very long sequences which makes it a suitable new tool for finding local alignments between genomic sequences under the edit distance model. Binaries are freely available for Linux, Windows, and Mac OS X at http://www.seqan.de/projects/stellar. The source code is freely distributed with the SeqAn C++ library version 1.3 and later at http://www.seqan.de. PMID:22151882

  6. Alignment-free detection of horizontal gene transfer between closely related bacterial genomes.

    PubMed

    Domazet-Lošo, Mirjana; Haubold, Bernhard

    2011-09-01

    Bacterial epidemics are often caused by strains that have acquired their increased virulence through horizontal gene transfer. Due to this association with disease, the detection of horizontal gene transfer continues to receive attention from microbiologists and bioinformaticians alike. Most software for detecting transfer events is based on alignments of sets of genes or of entire genomes. But despite great advances in the design of algorithms and computer programs, genome alignment remains computationally challenging. We have therefore developed an alignment-free algorithm for rapidly detecting horizontal gene transfer between closely related bacterial genomes. Our implementation of this algorithm is called alfy for "ALignment Free local homologY" and is freely available from http://guanine.evolbio.mpg.de/alfy/. In this comment we demonstrate the application of alfy to the genomes of Staphylococcus aureus. We also argue that-contrary to popular belief and in spite of increasing computer speed-algorithmic optimization is becoming more, not less, important if genome data continues to accumulate at the present rate.

  7. A greedy, graph-based algorithm for the alignment of multiple homologous gene lists.

    PubMed

    Fostier, Jan; Proost, Sebastian; Dhoedt, Bart; Saeys, Yvan; Demeester, Piet; Van de Peer, Yves; Vandepoele, Klaas

    2011-03-15

    Many comparative genomics studies rely on the correct identification of homologous genomic regions using accurate alignment tools. In such case, the alphabet of the input sequences consists of complete genes, rather than nucleotides or amino acids. As optimal multiple sequence alignment is computationally impractical, a progressive alignment strategy is often employed. However, such an approach is susceptible to the propagation of alignment errors in early pairwise alignment steps, especially when dealing with strongly diverged genomic regions. In this article, we present a novel accurate and efficient greedy, graph-based algorithm for the alignment of multiple homologous genomic segments, represented as ordered gene lists. Based on provable properties of the graph structure, several heuristics are developed to resolve local alignment conflicts that occur due to gene duplication and/or rearrangement events on the different genomic segments. The performance of the algorithm is assessed by comparing the alignment results of homologous genomic segments in Arabidopsis thaliana to those obtained by using both a progressive alignment method and an earlier graph-based implementation. Especially for datasets that contain strongly diverged segments, the proposed method achieves a substantially higher alignment accuracy, and proves to be sufficiently fast for large datasets including a few dozens of eukaryotic genomes. http://bioinformatics.psb.ugent.be/software. The algorithm is implemented as a part of the i-ADHoRe 3.0 package.

  8. PSO-based methods for medical image registration and change assessment of pigmented skin

    NASA Astrophysics Data System (ADS)

    Kacenjar, Steve; Zook, Matthew; Balint, Michael

    2011-03-01

    There are various scientific and technological areas in which it is imperative to rapidly detect and quantify changes in imagery over time. In fields such as earth remote sensing, aerospace systems, and medical imaging, searching for timedependent, regional changes across deformable topographies is complicated by varying camera acquisition geometries, lighting environments, background clutter conditions, and occlusion. Under these constantly-fluctuating conditions, the use of standard, rigid-body registration approaches often fail to provide sufficient fidelity to overlay image scenes together. This is problematic because incorrect assessments of the underlying changes of high-level topography can result in systematic errors in the quantification and classification of interested areas. For example, in the current naked-eye detection strategies of melanoma, a dermatologist often uses static morphological attributes to identify suspicious skin lesions for biopsy. This approach does not incorporate temporal changes which suggest malignant degeneration. By performing the co-registration of time-separated skin imagery, a dermatologist may more effectively detect and identify early morphological changes in pigmented lesions; enabling the physician to detect cancers at an earlier stage resulting in decreased morbidity and mortality. This paper describes an image processing system which will be used to detect changes in the characteristics of skin lesions over time. The proposed system consists of three main functional elements: 1.) coarse alignment of timesequenced imagery, 2.) refined alignment of local skin topographies, and 3.) assessment of local changes in lesion size. During the coarse alignment process, various approaches can be used to obtain a rough alignment, including: 1.) a manual landmark/intensity-based registration method1, and 2.) several flavors of autonomous optical matched filter methods2. These procedures result in the rough alignment of a patient's back topography. Since the skin is a deformable membrane, this process only provides an initial condition for subsequent refinements in aligning the localized topography of the skin. To achieve a refined enhancement, a Particle Swarm Optimizer (PSO) is used to optimally determine the local camera models associated with a generalized geometric transform. Here the optimization process is driven using the minimization of entropy between the multiple time-separated images. Once the camera models are corrected for local skin deformations, the images are compared using both pixel-based and regional-based methods. Limits on the detectability of change are established by the fidelity to which the algorithm corrects for local skin deformation and background alterations. These limits provide essential information in establishing early-warning thresholds for Melanoma detection. Key to this work is the development of a PSO alignment algorithm to perform the refined alignment in local skin topography between the time sequenced imagery (TSI). Test and validation of this alignment process is achieved using a forward model producing known geometric artifacts in the images and afterwards using a PSO algorithm to demonstrate the ability to identify and correct for these artifacts. Specifically, the forward model introduces local translational, rotational, and magnification changes within the image. These geometric modifiers are expected during TSI acquisition because of logistical issues to precisely align the patient to the image recording geometry and is therefore of paramount importance to any viable image registration system. This paper shows that the PSO alignment algorithm is effective in autonomously determining and mitigating these geometric modifiers. The degree of efficacy is measured by several statistically and morphologically based pre-image filtering operations applied to the TSI imagery before applying the PSO alignment algorithm. These trade studies show that global image threshold binarization provides rapid and superior convergence characteristics relative to that of morphologically based methods.

  9. A new algorithm for distorted fingerprints matching based on normalized fuzzy similarity measure.

    PubMed

    Chen, Xinjian; Tian, Jie; Yang, Xin

    2006-03-01

    Coping with nonlinear distortions in fingerprint matching is a challenging task. This paper proposes a novel algorithm, normalized fuzzy similarity measure (NFSM), to deal with the nonlinear distortions. The proposed algorithm has two main steps. First, the template and input fingerprints were aligned. In this process, the local topological structure matching was introduced to improve the robustness of global alignment. Second, the method NFSM was introduced to compute the similarity between the template and input fingerprints. The proposed algorithm was evaluated on fingerprints databases of FVC2004. Experimental results confirm that NFSM is a reliable and effective algorithm for fingerprint matching with nonliner distortions. The algorithm gives considerably higher matching scores compared to conventional matching algorithms for the deformed fingerprints.

  10. Net2Align: An Algorithm For Pairwise Global Alignment of Biological Networks

    PubMed Central

    Wadhwab, Gulshan; Upadhyayaa, K. C.

    2016-01-01

    The amount of data on molecular interactions is growing at an enormous pace, whereas the progress of methods for analysing this data is still lacking behind. Particularly, in the area of comparative analysis of biological networks, where one wishes to explore the similarity between two biological networks, this holds a potential problem. In consideration that the functionality primarily runs at the network level, it advocates the need for robust comparison methods. In this paper, we describe Net2Align, an algorithm for pairwise global alignment that can perform node-to-node correspondences as well as edge-to-edge correspondences into consideration. The uniqueness of our algorithm is in the fact that it is also able to detect the type of interaction, which is essential in case of directed graphs. The existing algorithm is only able to identify the common nodes but not the common edges. Another striking feature of the algorithm is that it is able to remove duplicate entries in case of variable datasets being aligned. This is achieved through creation of a local database which helps exclude duplicate links. In a pervasive computational study on gene regulatory network, we establish that our algorithm surpasses its counterparts in its results. Net2Align has been implemented in Java 7 and the source code is available as supplementary files. PMID:28356678

  11. A multimedia retrieval framework based on semi-supervised ranking and relevance feedback.

    PubMed

    Yang, Yi; Nie, Feiping; Xu, Dong; Luo, Jiebo; Zhuang, Yueting; Pan, Yunhe

    2012-04-01

    We present a new framework for multimedia content analysis and retrieval which consists of two independent algorithms. First, we propose a new semi-supervised algorithm called ranking with Local Regression and Global Alignment (LRGA) to learn a robust Laplacian matrix for data ranking. In LRGA, for each data point, a local linear regression model is used to predict the ranking scores of its neighboring points. A unified objective function is then proposed to globally align the local models from all the data points so that an optimal ranking score can be assigned to each data point. Second, we propose a semi-supervised long-term Relevance Feedback (RF) algorithm to refine the multimedia data representation. The proposed long-term RF algorithm utilizes both the multimedia data distribution in multimedia feature space and the history RF information provided by users. A trace ratio optimization problem is then formulated and solved by an efficient algorithm. The algorithms have been applied to several content-based multimedia retrieval applications, including cross-media retrieval, image retrieval, and 3D motion/pose data retrieval. Comprehensive experiments on four data sets have demonstrated its advantages in precision, robustness, scalability, and computational efficiency.

  12. An efficient algorithm for pairwise local alignment of protein interaction networks

    DOE PAGES

    Chen, Wenbin; Schmidt, Matthew; Tian, Wenhong; ...

    2015-04-01

    Recently, researchers seeking to understand, modify, and create beneficial traits in organisms have looked for evolutionarily conserved patterns of protein interactions. Their conservation likely means that the proteins of these conserved functional modules are important to the trait's expression. In this paper, we formulate the problem of identifying these conserved patterns as a graph optimization problem, and develop a fast heuristic algorithm for this problem. We compare the performance of our network alignment algorithm to that of the MaWISh algorithm [Koyuturk M, Kim Y, Topkara U, Subramaniam S, Szpankowski W, Grama A, Pairwise alignment of protein interaction networks, J Computmore » Biol 13(2): 182-199, 2006.], which bases its search algorithm on a related decision problem formulation. We find that our algorithm discovers conserved modules with a larger number of proteins in an order of magnitude less time. In conclusion, the protein sets found by our algorithm correspond to known conserved functional modules at comparable precision and recall rates as those produced by the MaWISh algorithm.« less

  13. Algorithms for Automatic Alignment of Arrays

    NASA Technical Reports Server (NTRS)

    Chatterjee, Siddhartha; Gilbert, John R.; Oliker, Leonid; Schreiber, Robert; Sheffler, Thomas J.

    1996-01-01

    Aggregate data objects (such as arrays) are distributed across the processor memories when compiling a data-parallel language for a distributed-memory machine. The mapping determines the amount of communication needed to bring operands of parallel operations into alignment with each other. A common approach is to break the mapping into two stages: an alignment that maps all the objects to an abstract template, followed by a distribution that maps the template to the processors. This paper describes algorithms for solving the various facets of the alignment problem: axis and stride alignment, static and mobile offset alignment, and replication labeling. We show that optimal axis and stride alignment is NP-complete for general program graphs, and give a heuristic method that can explore the space of possible solutions in a number of ways. We show that some of these strategies can give better solutions than a simple greedy approach proposed earlier. We also show how local graph contractions can reduce the size of the problem significantly without changing the best solution. This allows more complex and effective heuristics to be used. We show how to model the static offset alignment problem using linear programming, and we show that loop-dependent mobile offset alignment is sometimes necessary for optimum performance. We describe an algorithm with for determining mobile alignments for objects within do loops. We also identify situations in which replicated alignment is either required by the program itself or can be used to improve performance. We describe an algorithm based on network flow that replicates objects so as to minimize the total amount of broadcast communication in replication.

  14. Local alignment of two-base encoded DNA sequence

    PubMed Central

    Homer, Nils; Merriman, Barry; Nelson, Stanley F

    2009-01-01

    Background DNA sequence comparison is based on optimal local alignment of two sequences using a similarity score. However, some new DNA sequencing technologies do not directly measure the base sequence, but rather an encoded form, such as the two-base encoding considered here. In order to compare such data to a reference sequence, the data must be decoded into sequence. The decoding is deterministic, but the possibility of measurement errors requires searching among all possible error modes and resulting alignments to achieve an optimal balance of fewer errors versus greater sequence similarity. Results We present an extension of the standard dynamic programming method for local alignment, which simultaneously decodes the data and performs the alignment, maximizing a similarity score based on a weighted combination of errors and edits, and allowing an affine gap penalty. We also present simulations that demonstrate the performance characteristics of our two base encoded alignment method and contrast those with standard DNA sequence alignment under the same conditions. Conclusion The new local alignment algorithm for two-base encoded data has substantial power to properly detect and correct measurement errors while identifying underlying sequence variants, and facilitating genome re-sequencing efforts based on this form of sequence data. PMID:19508732

  15. SW#db: GPU-Accelerated Exact Sequence Similarity Database Search.

    PubMed

    Korpar, Matija; Šošić, Martin; Blažeka, Dino; Šikić, Mile

    2015-01-01

    In recent years we have witnessed a growth in sequencing yield, the number of samples sequenced, and as a result-the growth of publicly maintained sequence databases. The increase of data present all around has put high requirements on protein similarity search algorithms with two ever-opposite goals: how to keep the running times acceptable while maintaining a high-enough level of sensitivity. The most time consuming step of similarity search are the local alignments between query and database sequences. This step is usually performed using exact local alignment algorithms such as Smith-Waterman. Due to its quadratic time complexity, alignments of a query to the whole database are usually too slow. Therefore, the majority of the protein similarity search methods prior to doing the exact local alignment apply heuristics to reduce the number of possible candidate sequences in the database. However, there is still a need for the alignment of a query sequence to a reduced database. In this paper we present the SW#db tool and a library for fast exact similarity search. Although its running times, as a standalone tool, are comparable to the running times of BLAST, it is primarily intended to be used for exact local alignment phase in which the database of sequences has already been reduced. It uses both GPU and CPU parallelization and was 4-5 times faster than SSEARCH, 6-25 times faster than CUDASW++ and more than 20 times faster than SSW at the time of writing, using multiple queries on Swiss-prot and Uniref90 databases.

  16. Sparse alignment for robust tensor learning.

    PubMed

    Lai, Zhihui; Wong, Wai Keung; Xu, Yong; Zhao, Cairong; Sun, Mingming

    2014-10-01

    Multilinear/tensor extensions of manifold learning based algorithms have been widely used in computer vision and pattern recognition. This paper first provides a systematic analysis of the multilinear extensions for the most popular methods by using alignment techniques, thereby obtaining a general tensor alignment framework. From this framework, it is easy to show that the manifold learning based tensor learning methods are intrinsically different from the alignment techniques. Based on the alignment framework, a robust tensor learning method called sparse tensor alignment (STA) is then proposed for unsupervised tensor feature extraction. Different from the existing tensor learning methods, L1- and L2-norms are introduced to enhance the robustness in the alignment step of the STA. The advantage of the proposed technique is that the difficulty in selecting the size of the local neighborhood can be avoided in the manifold learning based tensor feature extraction algorithms. Although STA is an unsupervised learning method, the sparsity encodes the discriminative information in the alignment step and provides the robustness of STA. Extensive experiments on the well-known image databases as well as action and hand gesture databases by encoding object images as tensors demonstrate that the proposed STA algorithm gives the most competitive performance when compared with the tensor-based unsupervised learning methods.

  17. Sequence analysis of Leukemia DNA

    NASA Astrophysics Data System (ADS)

    Nacong, Nasria; Lusiyanti, Desy; Irawan, Muhammad. Isa

    2018-03-01

    Cancer is a very deadly disease, one of which is leukemia disease or better known as blood cancer. The cancer cell can be detected by taking DNA in laboratory test. This study focused on local alignment of leukemia and non leukemia data resulting from NCBI in the form of DNA sequences by using Smith-Waterman algorithm. SmithWaterman algorithm was invented by TF Smith and MS Waterman in 1981. These algorithms try to find as much as possible similarity of a pair of sequences, by giving a negative value to the unequal base pair (mismatch), and positive values on the same base pair (match). So that will obtain the maximum positive value as the end of the alignment, and the minimum value as the initial alignment. This study will use sequences of leukemia and 3 sequences of non leukemia.

  18. Accelerated Profile HMM Searches

    PubMed Central

    Eddy, Sean R.

    2011-01-01

    Profile hidden Markov models (profile HMMs) and probabilistic inference methods have made important contributions to the theory of sequence database homology search. However, practical use of profile HMM methods has been hindered by the computational expense of existing software implementations. Here I describe an acceleration heuristic for profile HMMs, the “multiple segment Viterbi” (MSV) algorithm. The MSV algorithm computes an optimal sum of multiple ungapped local alignment segments using a striped vector-parallel approach previously described for fast Smith/Waterman alignment. MSV scores follow the same statistical distribution as gapped optimal local alignment scores, allowing rapid evaluation of significance of an MSV score and thus facilitating its use as a heuristic filter. I also describe a 20-fold acceleration of the standard profile HMM Forward/Backward algorithms using a method I call “sparse rescaling”. These methods are assembled in a pipeline in which high-scoring MSV hits are passed on for reanalysis with the full HMM Forward/Backward algorithm. This accelerated pipeline is implemented in the freely available HMMER3 software package. Performance benchmarks show that the use of the heuristic MSV filter sacrifices negligible sensitivity compared to unaccelerated profile HMM searches. HMMER3 is substantially more sensitive and 100- to 1000-fold faster than HMMER2. HMMER3 is now about as fast as BLAST for protein searches. PMID:22039361

  19. LPV Modeling of a Flexible Wing Aircraft Using Modal Alignment and Adaptive Gridding Methods

    NASA Technical Reports Server (NTRS)

    Al-Jiboory, Ali Khudhair; Zhu, Guoming; Swei, Sean Shan-Min; Su, Weihua; Nguyen, Nhan T.

    2017-01-01

    One of the earliest approaches in gain-scheduling control is the gridding based approach, in which a set of local linear time-invariant models are obtained at various gridded points corresponding to the varying parameters within the flight envelop. In order to ensure smooth and effective Linear Parameter-Varying control, aligning all the flexible modes within each local model and maintaining small number of representative local models over the gridded parameter space are crucial. In addition, since the flexible structural models tend to have large dimensions, a tractable model reduction process is necessary. In this paper, the notion of s-shifted H2- and H Infinity-norm are introduced and used as a metric to measure the model mismatch. A new modal alignment algorithm is developed which utilizes the defined metric for aligning all the local models over the entire gridded parameter space. Furthermore, an Adaptive Grid Step Size Determination algorithm is developed to minimize the number of local models required to represent the gridded parameter space. For model reduction, we propose to utilize the concept of Composite Modal Cost Analysis, through which the collective contribution of each flexible mode is computed and ranked. Therefore, a reduced-order model is constructed by retaining only those modes with significant contribution. The NASA Generic Transport Model operating at various flight speeds is studied for verification purpose, and the analysis and simulation results demonstrate the effectiveness of the proposed modeling approach.

  20. A basic analysis toolkit for biological sequences

    PubMed Central

    Giancarlo, Raffaele; Siragusa, Alessandro; Siragusa, Enrico; Utro, Filippo

    2007-01-01

    This paper presents a software library, nicknamed BATS, for some basic sequence analysis tasks. Namely, local alignments, via approximate string matching, and global alignments, via longest common subsequence and alignments with affine and concave gap cost functions. Moreover, it also supports filtering operations to select strings from a set and establish their statistical significance, via z-score computation. None of the algorithms is new, but although they are generally regarded as fundamental for sequence analysis, they have not been implemented in a single and consistent software package, as we do here. Therefore, our main contribution is to fill this gap between algorithmic theory and practice by providing an extensible and easy to use software library that includes algorithms for the mentioned string matching and alignment problems. The library consists of C/C++ library functions as well as Perl library functions. It can be interfaced with Bioperl and can also be used as a stand-alone system with a GUI. The software is available at under the GNU GPL. PMID:17877802

  1. A Demons algorithm for image registration with locally adaptive regularization.

    PubMed

    Cahill, Nathan D; Noble, J Alison; Hawkes, David J

    2009-01-01

    Thirion's Demons is a popular algorithm for nonrigid image registration because of its linear computational complexity and ease of implementation. It approximately solves the diffusion registration problem by successively estimating force vectors that drive the deformation toward alignment and smoothing the force vectors by Gaussian convolution. In this article, we show how the Demons algorithm can be generalized to allow image-driven locally adaptive regularization in a manner that preserves both the linear complexity and ease of implementation of the original Demons algorithm. We show that the proposed algorithm exhibits lower target registration error and requires less computational effort than the original Demons algorithm on the registration of serial chest CT scans of patients with lung nodules.

  2. Memory-efficient dynamic programming backtrace and pairwise local sequence alignment.

    PubMed

    Newberg, Lee A

    2008-08-15

    A backtrace through a dynamic programming algorithm's intermediate results in search of an optimal path, or to sample paths according to an implied probability distribution, or as the second stage of a forward-backward algorithm, is a task of fundamental importance in computational biology. When there is insufficient space to store all intermediate results in high-speed memory (e.g. cache) existing approaches store selected stages of the computation, and recompute missing values from these checkpoints on an as-needed basis. Here we present an optimal checkpointing strategy, and demonstrate its utility with pairwise local sequence alignment of sequences of length 10,000. Sample C++-code for optimal backtrace is available in the Supplementary Materials. Supplementary data is available at Bioinformatics online.

  3. ProBiS-database: precalculated binding site similarities and local pairwise alignments of PDB structures.

    PubMed

    Konc, Janez; Cesnik, Tomo; Konc, Joanna Trykowska; Penca, Matej; Janežič, Dušanka

    2012-02-27

    ProBiS-Database is a searchable repository of precalculated local structural alignments in proteins detected by the ProBiS algorithm in the Protein Data Bank. Identification of functionally important binding regions of the protein is facilitated by structural similarity scores mapped to the query protein structure. PDB structures that have been aligned with a query protein may be rapidly retrieved from the ProBiS-Database, which is thus able to generate hypotheses concerning the roles of uncharacterized proteins. Presented with uncharacterized protein structure, ProBiS-Database can discern relationships between such a query protein and other better known proteins in the PDB. Fast access and a user-friendly graphical interface promote easy exploration of this database of over 420 million local structural alignments. The ProBiS-Database is updated weekly and is freely available online at http://probis.cmm.ki.si/database.

  4. Prostate lesion detection and localization based on locality alignment discriminant analysis

    NASA Astrophysics Data System (ADS)

    Lin, Mingquan; Chen, Weifu; Zhao, Mingbo; Gibson, Eli; Bastian-Jordan, Matthew; Cool, Derek W.; Kassam, Zahra; Chow, Tommy W. S.; Ward, Aaron; Chiu, Bernard

    2017-03-01

    Prostatic adenocarcinoma is one of the most commonly occurring cancers among men in the world, and it also the most curable cancer when it is detected early. Multiparametric MRI (mpMRI) combines anatomic and functional prostate imaging techniques, which have been shown to produce high sensitivity and specificity in cancer localization, which is important in planning biopsies and focal therapies. However, in previous investigations, lesion localization was achieved mainly by manual segmentation, which is time-consuming and prone to observer variability. Here, we developed an algorithm based on locality alignment discriminant analysis (LADA) technique, which can be considered as a version of linear discriminant analysis (LDA) localized to patches in the feature space. Sensitivity, specificity and accuracy generated by the proposed algorithm in five prostates by LADA were 52.2%, 89.1% and 85.1% respectively, compared to 31.3%, 85.3% and 80.9% generated by LDA. The delineation accuracy attainable by this tool has a potential in increasing the cancer detection rate in biopsies and in minimizing collateral damage of surrounding tissues in focal therapies.

  5. A Systolic Array-Based FPGA Parallel Architecture for the BLAST Algorithm

    PubMed Central

    Guo, Xinyu; Wang, Hong; Devabhaktuni, Vijay

    2012-01-01

    A design of systolic array-based Field Programmable Gate Array (FPGA) parallel architecture for Basic Local Alignment Search Tool (BLAST) Algorithm is proposed. BLAST is a heuristic biological sequence alignment algorithm which has been used by bioinformatics experts. In contrast to other designs that detect at most one hit in one-clock-cycle, our design applies a Multiple Hits Detection Module which is a pipelining systolic array to search multiple hits in a single-clock-cycle. Further, we designed a Hits Combination Block which combines overlapping hits from systolic array into one hit. These implementations completed the first and second step of BLAST architecture and achieved significant speedup comparing with previously published architectures. PMID:25969747

  6. Score distributions of gapped multiple sequence alignments down to the low-probability tail

    NASA Astrophysics Data System (ADS)

    Fieth, Pascal; Hartmann, Alexander K.

    2016-08-01

    Assessing the significance of alignment scores of optimally aligned DNA or amino acid sequences can be achieved via the knowledge of the score distribution of random sequences. But this requires obtaining the distribution in the biologically relevant high-scoring region, where the probabilities are exponentially small. For gapless local alignments of infinitely long sequences this distribution is known analytically to follow a Gumbel distribution. Distributions for gapped local alignments and global alignments of finite lengths can only be obtained numerically. To obtain result for the small-probability region, specific statistical mechanics-based rare-event algorithms can be applied. In previous studies, this was achieved for pairwise alignments. They showed that, contrary to results from previous simple sampling studies, strong deviations from the Gumbel distribution occur in case of finite sequence lengths. Here we extend the studies to multiple sequence alignments with gaps, which are much more relevant for practical applications in molecular biology. We study the distributions of scores over a large range of the support, reaching probabilities as small as 10-160, for global and local (sum-of-pair scores) multiple alignments. We find that even after suitable rescaling, eliminating the sequence-length dependence, the distributions for multiple alignment differ from the pairwise alignment case. Furthermore, we also show that the previously discussed Gaussian correction to the Gumbel distribution needs to be refined, also for the case of pairwise alignments.

  7. Initial Alignment for SINS Based on Pseudo-Earth Frame in Polar Regions.

    PubMed

    Gao, Yanbin; Liu, Meng; Li, Guangchun; Guang, Xingxing

    2017-06-16

    An accurate initial alignment must be required for inertial navigation system (INS). The performance of initial alignment directly affects the following navigation accuracy. However, the rapid convergence of meridians and the small horizontalcomponent of rotation of Earth make the traditional alignment methods ineffective in polar regions. In this paper, from the perspective of global inertial navigation, a novel alignment algorithm based on pseudo-Earth frame and backward process is proposed to implement the initial alignment in polar regions. Considering that an accurate coarse alignment of azimuth is difficult to obtain in polar regions, the dynamic error modeling with large azimuth misalignment angle is designed. At the end of alignment phase, the strapdown attitude matrix relative to local geographic frame is obtained without influence of position errors and cumbersome computation. As a result, it would be more convenient to access the following polar navigation system. Then, it is also expected to unify the polar alignment algorithm as much as possible, thereby further unifying the form of external reference information. Finally, semi-physical static simulation and in-motion tests with large azimuth misalignment angle assisted by unscented Kalman filter (UKF) validate the effectiveness of the proposed method.

  8. fRMSDPred: Predicting Local RMSD Between Structural Fragments Using Sequence Information

    DTIC Science & Technology

    2007-04-04

    machine learning approaches for estimating the RMSD value of a pair of protein fragments. These estimated fragment-level RMSD values can be used to construct the alignment, assess the quality of an alignment, and identify high-quality alignment segments. We present algorithms to solve this fragment-level RMSD prediction problem using a supervised learning framework based on support vector regression and classification that incorporates protein profiles, predicted secondary structure, effective information encoding schemes, and novel second-order pairwise exponential kernel

  9. Heuristics for multiobjective multiple sequence alignment.

    PubMed

    Abbasi, Maryam; Paquete, Luís; Pereira, Francisco B

    2016-07-15

    Aligning multiple sequences arises in many tasks in Bioinformatics. However, the alignments produced by the current software packages are highly dependent on the parameters setting, such as the relative importance of opening gaps with respect to the increase of similarity. Choosing only one parameter setting may provide an undesirable bias in further steps of the analysis and give too simplistic interpretations. In this work, we reformulate multiple sequence alignment from a multiobjective point of view. The goal is to generate several sequence alignments that represent a trade-off between maximizing the substitution score and minimizing the number of indels/gaps in the sum-of-pairs score function. This trade-off gives to the practitioner further information about the similarity of the sequences, from which she could analyse and choose the most plausible alignment. We introduce several heuristic approaches, based on local search procedures, that compute a set of sequence alignments, which are representative of the trade-off between the two objectives (substitution score and indels). Several algorithm design options are discussed and analysed, with particular emphasis on the influence of the starting alignment and neighborhood search definitions on the overall performance. A perturbation technique is proposed to improve the local search, which provides a wide range of high-quality alignments. The proposed approach is tested experimentally on a wide range of instances. We performed several experiments with sequences obtained from the benchmark database BAliBASE 3.0. To evaluate the quality of the results, we calculate the hypervolume indicator of the set of score vectors returned by the algorithms. The results obtained allow us to identify reasonably good choices of parameters for our approach. Further, we compared our method in terms of correctly aligned pairs ratio and columns correctly aligned ratio with respect to reference alignments. Experimental results show that our approaches can obtain better results than TCoffee and Clustal Omega in terms of the first ratio.

  10. StralSV: assessment of sequence variability within similar 3D structures and application to polio RNA-dependent RNA polymerase.

    PubMed

    Zemla, Adam T; Lang, Dorothy M; Kostova, Tanya; Andino, Raul; Ecale Zhou, Carol L

    2011-06-02

    Most of the currently used methods for protein function prediction rely on sequence-based comparisons between a query protein and those for which a functional annotation is provided. A serious limitation of sequence similarity-based approaches for identifying residue conservation among proteins is the low confidence in assigning residue-residue correspondences among proteins when the level of sequence identity between the compared proteins is poor. Multiple sequence alignment methods are more satisfactory--still, they cannot provide reliable results at low levels of sequence identity. Our goal in the current work was to develop an algorithm that could help overcome these difficulties by facilitating the identification of structurally (and possibly functionally) relevant residue-residue correspondences between compared protein structures. Here we present StralSV (structure-alignment sequence variability), a new algorithm for detecting closely related structure fragments and quantifying residue frequency from tight local structure alignments. We apply StralSV in a study of the RNA-dependent RNA polymerase of poliovirus, and we demonstrate that the algorithm can be used to determine regions of the protein that are relatively unique, or that share structural similarity with proteins that would be considered distantly related. By quantifying residue frequencies among many residue-residue pairs extracted from local structural alignments, one can infer potential structural or functional importance of specific residues that are determined to be highly conserved or that deviate from a consensus. We further demonstrate that considerable detailed structural and phylogenetic information can be derived from StralSV analyses. StralSV is a new structure-based algorithm for identifying and aligning structure fragments that have similarity to a reference protein. StralSV analysis can be used to quantify residue-residue correspondences and identify residues that may be of particular structural or functional importance, as well as unusual or unexpected residues at a given sequence position. StralSV is provided as a web service at http://proteinmodel.org/AS2TS/STRALSV/.

  11. Heuristic reusable dynamic programming: efficient updates of local sequence alignment.

    PubMed

    Hong, Changjin; Tewfik, Ahmed H

    2009-01-01

    Recomputation of the previously evaluated similarity results between biological sequences becomes inevitable when researchers realize errors in their sequenced data or when the researchers have to compare nearly similar sequences, e.g., in a family of proteins. We present an efficient scheme for updating local sequence alignments with an affine gap model. In principle, using the previous matching result between two amino acid sequences, we perform a forward-backward alignment to generate heuristic searching bands which are bounded by a set of suboptimal paths. Given a correctly updated sequence, we initially predict a new score of the alignment path for each contour to select the best candidates among them. Then, we run the Smith-Waterman algorithm in this confined space. Furthermore, our heuristic alignment for an updated sequence shows that it can be further accelerated by using reusable dynamic programming (rDP), our prior work. In this study, we successfully validate "relative node tolerance bound" (RNTB) in the pruned searching space. Furthermore, we improve the computational performance by quantifying the successful RNTB tolerance probability and switch to rDP on perturbation-resilient columns only. In our searching space derived by a threshold value of 90 percent of the optimal alignment score, we find that 98.3 percent of contours contain correctly updated paths. We also find that our method consumes only 25.36 percent of the runtime cost of sparse dynamic programming (sDP) method, and to only 2.55 percent of that of a normal dynamic programming with the Smith-Waterman algorithm.

  12. Parasail: SIMD C library for global, semi-global, and local pairwise sequence alignments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Daily, Jeffrey A.

    Sequence alignment algorithms are a key component of many bioinformatics applications. Though various fast Smith-Waterman local sequence alignment implementations have been developed for x86 CPUs, most are embedded into larger database search tools. In addition, fast implementations of Needleman-Wunsch global sequence alignment and its semi-global variants are not as widespread. This article presents the first software library for local, global, and semi-global pairwise intra-sequence alignments and improves the performance of previous intra-sequence implementations. As a result, a faster intra-sequence pairwise alignment implementation is described and benchmarked. Using a 375 residue query sequence a speed of 136 billion cell updates permore » second (GCUPS) was achieved on a dual Intel Xeon E5-2670 12-core processor system, the highest reported for an implementation based on Farrar’s ’striped’ approach. When using only a single thread, parasail was 1.7 times faster than Rognes’s SWIPE. For many score matrices, parasail is faster than BLAST. The software library is designed for 64 bit Linux, OS X, or Windows on processors with SSE2, SSE41, or AVX2. Source code is available from https://github.com/jeffdaily/parasail under the Battelle BSD-style license. In conclusion, applications that require optimal alignment scores could benefit from the improved performance. For the first time, SIMD global, semi-global, and local alignments are available in a stand-alone C library.« less

  13. Parasail: SIMD C library for global, semi-global, and local pairwise sequence alignments

    DOE PAGES

    Daily, Jeffrey A.

    2016-02-10

    Sequence alignment algorithms are a key component of many bioinformatics applications. Though various fast Smith-Waterman local sequence alignment implementations have been developed for x86 CPUs, most are embedded into larger database search tools. In addition, fast implementations of Needleman-Wunsch global sequence alignment and its semi-global variants are not as widespread. This article presents the first software library for local, global, and semi-global pairwise intra-sequence alignments and improves the performance of previous intra-sequence implementations. As a result, a faster intra-sequence pairwise alignment implementation is described and benchmarked. Using a 375 residue query sequence a speed of 136 billion cell updates permore » second (GCUPS) was achieved on a dual Intel Xeon E5-2670 12-core processor system, the highest reported for an implementation based on Farrar’s ’striped’ approach. When using only a single thread, parasail was 1.7 times faster than Rognes’s SWIPE. For many score matrices, parasail is faster than BLAST. The software library is designed for 64 bit Linux, OS X, or Windows on processors with SSE2, SSE41, or AVX2. Source code is available from https://github.com/jeffdaily/parasail under the Battelle BSD-style license. In conclusion, applications that require optimal alignment scores could benefit from the improved performance. For the first time, SIMD global, semi-global, and local alignments are available in a stand-alone C library.« less

  14. A rib-specific multimodal registration algorithm for fused unfolded rib visualization using PET/CT

    NASA Astrophysics Data System (ADS)

    Kaftan, Jens N.; Kopaczka, Marcin; Wimmer, Andreas; Platsch, Günther; Declerck, Jérôme

    2014-03-01

    Respiratory motion affects the alignment of PET and CT volumes from PET/CT examinations in a non-rigid manner. This becomes particularly apparent if reviewing fine anatomical structures such as ribs when assessing bone metastases, which frequently occur in many advanced cancers. To make this routine diagnostic task more efficient, a fused unfolded rib visualization for 18F-NaF PET/CT is presented. It allows to review the whole rib cage in a single image. This advanced visualization is enabled by a novel rib-specific registration algorithm that rigidly optimizes the local alignment of each individual rib in both modalities based on a matched filter response function. More specifically, rib centerlines are automatically extracted from CT and subsequently individually aligned to the corresponding bone-specific PET rib uptake pattern. The proposed method has been validated on 20 PET/CT scans acquired at different clinical sites. It has been demonstrated that the presented rib- specific registration method significantly improves the rib alignment without having to run complex deformable registration algorithms. At the same time, it guarantees that rib lesions are not further deformed, which may otherwise affect quantitative measurements such as SUVs. Considering clinically relevant distance thresholds, the centerline portion with good alignment compared to the ground truth improved from 60:6% to 86:7% after registration while approximately 98% can be still considered as acceptably aligned.

  15. Parasail: SIMD C library for global, semi-global, and local pairwise sequence alignments.

    PubMed

    Daily, Jeff

    2016-02-10

    Sequence alignment algorithms are a key component of many bioinformatics applications. Though various fast Smith-Waterman local sequence alignment implementations have been developed for x86 CPUs, most are embedded into larger database search tools. In addition, fast implementations of Needleman-Wunsch global sequence alignment and its semi-global variants are not as widespread. This article presents the first software library for local, global, and semi-global pairwise intra-sequence alignments and improves the performance of previous intra-sequence implementations. A faster intra-sequence local pairwise alignment implementation is described and benchmarked, including new global and semi-global variants. Using a 375 residue query sequence a speed of 136 billion cell updates per second (GCUPS) was achieved on a dual Intel Xeon E5-2670 24-core processor system, the highest reported for an implementation based on Farrar's 'striped' approach. Rognes's SWIPE optimal database search application is still generally the fastest available at 1.2 to at best 2.4 times faster than Parasail for sequences shorter than 500 amino acids. However, Parasail was faster for longer sequences. For global alignments, Parasail's prefix scan implementation is generally the fastest, faster even than Farrar's 'striped' approach, however the opal library is faster for single-threaded applications. The software library is designed for 64 bit Linux, OS X, or Windows on processors with SSE2, SSE41, or AVX2. Source code is available from https://github.com/jeffdaily/parasail under the Battelle BSD-style license. Applications that require optimal alignment scores could benefit from the improved performance. For the first time, SIMD global, semi-global, and local alignments are available in a stand-alone C library.

  16. Accuracy in breast shape alignment with 3D surface fitting algorithms.

    PubMed

    Riboldi, Marco; Gierga, David P; Chen, George T Y; Baroni, Guido

    2009-04-01

    Surface imaging is in use in radiotherapy clinical practice for patient setup optimization and monitoring. Breast alignment is accomplished by searching for a tentative spatial correspondence between the reference and daily surface shape models. In this study, the authors quantify whole breast shape alignment by relying on texture features digitized on 3D surface models. Texture feature localization was validated through repeated measurements in a silicone breast phantom, mounted on a high precision mechanical stage. Clinical investigations on breast shape alignment included 133 fractions in 18 patients treated with accelerated partial breast irradiation. The breast shape was detected with a 3D video based surface imaging system so that breathing was compensated. An in-house algorithm for breast alignment, based on surface fitting constrained by nipple matching (constrained surface fitting), was applied. Results were compared with a commercial software where no constraints are utilized (unconstrained surface fitting). Texture feature localization was validated within 2 mm in each anatomical direction. Clinical data show that unconstrained surface fitting achieves adequate accuracy in most cases, though nipple mismatch is considerably higher than residual surface distances (3.9 mm vs 0.6 mm on average). Outliers beyond 1 cm can be experienced as the result of a degenerate surface fit, where unconstrained surface fitting is not sufficient to establish spatial correspondence. In the constrained surface fitting algorithm, average surface mismatch within 1 mm was obtained when nipple position was forced to match in the [1.5; 5] mm range. In conclusion, optimal results can be obtained by trading off the desired overall surface congruence vs matching of selected landmarks (constraint). Constrained surface fitting is put forward to represent an improvement in setup accuracy for those applications where whole breast positional reproducibility is an issue.

  17. Evaluation of peak picking quality in LC-MS metabolomics data.

    PubMed

    Brodsky, Leonid; Moussaieff, Arieh; Shahaf, Nir; Aharoni, Asaph; Rogachev, Ilana

    2010-11-15

    The output of LC-MS metabolomics experiments consists of mass-peak intensities identified through a peak-picking/alignment procedure. Besides imperfections in biological samples and instrumentation, data accuracy is highly dependent on the applied algorithms and their parameters. Consequently, quality control (QC) is essential for further data analysis. Here, we present a QC approach that is based on discrepancies between replicate samples. First, the quantile normalization of per-sample log-signal distributions is applied to each group of biologically homogeneous samples. Next, the overall quality of each replicate group is characterized by the Z-transformed correlation coefficients between samples. This general QC allows a tuning of the procedure's parameters which minimizes the inter-replicate discrepancies in the generated output. Subsequently, an in-depth QC measure detects local neighborhoods on a template of aligned chromatograms that are enriched by divergences between intensity profiles of replicate samples. These neighborhoods are determined through a segmentation algorithm. The retention time (RT)-m/z positions of the neighborhoods with local divergences are indicative of either: incorrect alignment of chromatographic features, technical problems in the chromatograms, or to a true biological discrepancy between replicates for particular metabolites. We expect this method to aid in the accurate analysis of metabolomics data and in the development of new peak-picking/alignment procedures.

  18. A Polar Initial Alignment Algorithm for Unmanned Underwater Vehicles

    PubMed Central

    Yan, Zheping; Wang, Lu; Wang, Tongda; Zhang, Honghan; Zhang, Xun; Liu, Xiangling

    2017-01-01

    Due to its highly autonomy, the strapdown inertial navigation system (SINS) is widely used in unmanned underwater vehicles (UUV) navigation. Initial alignment is crucial because the initial alignment results will be used as the initial SINS value, which might affect the subsequent SINS results. Due to the rapid convergence of Earth meridians, there is a calculation overflow in conventional initial alignment algorithms, making conventional initial algorithms are invalid for polar UUV navigation. To overcome these problems, a polar initial alignment algorithm for UUV is proposed in this paper, which consists of coarse and fine alignment algorithms. Based on the principle of the conical slow drift of gravity, the coarse alignment algorithm is derived under the grid frame. By choosing the velocity and attitude as the measurement, the fine alignment with the Kalman filter (KF) is derived under the grid frame. Simulation and experiment are realized among polar, conventional and transversal initial alignment algorithms for polar UUV navigation. Results demonstrate that the proposed polar initial alignment algorithm can complete the initial alignment of UUV in the polar region rapidly and accurately. PMID:29168735

  19. Optimized graph-based mosaicking for virtual microscopy

    NASA Astrophysics Data System (ADS)

    Steckhan, Dirk G.; Wittenberg, Thomas

    2009-02-01

    Virtual microscopy has the potential to partially replace traditional microscopy. For virtualization, the slide is scanned once by a fully automatized robotic microscope and saved digitally. Typically, such a scan results in several hundreds to thousands of fields of view. Since robotic stages have positioning errors, these fields of view have to be registered locally and globally in an additional step. In this work we propose a new global mosaicking method for the creation of virtual slides based on sub-pixel exact phase correlation for local alignment in combination with Prim's minimum spanning tree algorithm for global alignment. Our algorithm allows for a robust reproduction of the original slide even in the presence of views with little to no information content. This makes it especially suitable for the mosaicking of cervical smears. These smears often exhibit large empty areas, which do not contain enough information for common stitching approaches.

  20. GIRAF: a method for fast search and flexible alignment of ligand binding interfaces in proteins at atomic resolution

    PubMed Central

    Kinjo, Akira R.; Nakamura, Haruki

    2012-01-01

    Comparison and classification of protein structures are fundamental means to understand protein functions. Due to the computational difficulty and the ever-increasing amount of structural data, however, it is in general not feasible to perform exhaustive all-against-all structure comparisons necessary for comprehensive classifications. To efficiently handle such situations, we have previously proposed a method, now called GIRAF. We herein describe further improvements in the GIRAF protein structure search and alignment method. The GIRAF method achieves extremely efficient search of similar structures of ligand binding sites of proteins by exploiting database indexing of structural features of local coordinate frames. In addition, it produces refined atom-wise alignments by iterative applications of the Hungarian method to the bipartite graph defined for a pair of superimposed structures. By combining the refined alignments based on different local coordinate frames, it is made possible to align structures involving domain movements. We provide detailed accounts for the database design, the search and alignment algorithms as well as some benchmark results. PMID:27493524

  1. Automatic alignment of pre- and post-interventional liver CT images for assessment of radiofrequency ablation

    NASA Astrophysics Data System (ADS)

    Rieder, Christian; Wirtz, Stefan; Strehlow, Jan; Zidowitz, Stephan; Bruners, Philipp; Isfort, Peter; Mahnken, Andreas H.; Peitgen, Heinz-Otto

    2012-02-01

    Image-guided radiofrequency ablation (RFA) is becoming a standard procedure for minimally invasive tumor treatment in clinical practice. To verify the treatment success of the therapy, reliable post-interventional assessment of the ablation zone (coagulation) is essential. Typically, pre- and post-interventional CT images have to be aligned to compare the shape, size, and position of tumor and coagulation zone. In this work, we present an automatic workflow for masking liver tissue, enabling a rigid registration algorithm to perform at least as accurate as experienced medical experts. To minimize the effect of global liver deformations, the registration is computed in a local region of interest around the pre-interventional lesion and post-interventional coagulation necrosis. A registration mask excluding lesions and neighboring organs is calculated to prevent the registration algorithm from matching both lesion shapes instead of the surrounding liver anatomy. As an initial registration step, the centers of gravity from both lesions are aligned automatically. The subsequent rigid registration method is based on the Local Cross Correlation (LCC) similarity measure and Newton-type optimization. To assess the accuracy of our method, 41 RFA cases are registered and compared with the manually aligned cases from four medical experts. Furthermore, the registration results are compared with ground truth transformations based on averaged anatomical landmark pairs. In the evaluation, we show that our method allows to automatic alignment of the data sets with equal accuracy as medical experts, but requiring significancy less time consumption and variability.

  2. Dinucleotide controlled null models for comparative RNA gene prediction.

    PubMed

    Gesell, Tanja; Washietl, Stefan

    2008-05-27

    Comparative prediction of RNA structures can be used to identify functional noncoding RNAs in genomic screens. It was shown recently by Babak et al. [BMC Bioinformatics. 8:33] that RNA gene prediction programs can be biased by the genomic dinucleotide content, in particular those programs using a thermodynamic folding model including stacking energies. As a consequence, there is need for dinucleotide-preserving control strategies to assess the significance of such predictions. While there have been randomization algorithms for single sequences for many years, the problem has remained challenging for multiple alignments and there is currently no algorithm available. We present a program called SISSIz that simulates multiple alignments of a given average dinucleotide content. Meeting additional requirements of an accurate null model, the randomized alignments are on average of the same sequence diversity and preserve local conservation and gap patterns. We make use of a phylogenetic substitution model that includes overlapping dependencies and site-specific rates. Using fast heuristics and a distance based approach, a tree is estimated under this model which is used to guide the simulations. The new algorithm is tested on vertebrate genomic alignments and the effect on RNA structure predictions is studied. In addition, we directly combined the new null model with the RNAalifold consensus folding algorithm giving a new variant of a thermodynamic structure based RNA gene finding program that is not biased by the dinucleotide content. SISSIz implements an efficient algorithm to randomize multiple alignments preserving dinucleotide content. It can be used to get more accurate estimates of false positive rates of existing programs, to produce negative controls for the training of machine learning based programs, or as standalone RNA gene finding program. Other applications in comparative genomics that require randomization of multiple alignments can be considered. SISSIz is available as open source C code that can be compiled for every major platform and downloaded here: http://sourceforge.net/projects/sissiz.

  3. Implementation of a parallel protein structure alignment service on cloud.

    PubMed

    Hung, Che-Lun; Lin, Yaw-Ling

    2013-01-01

    Protein structure alignment has become an important strategy by which to identify evolutionary relationships between protein sequences. Several alignment tools are currently available for online comparison of protein structures. In this paper, we propose a parallel protein structure alignment service based on the Hadoop distribution framework. This service includes a protein structure alignment algorithm, a refinement algorithm, and a MapReduce programming model. The refinement algorithm refines the result of alignment. To process vast numbers of protein structures in parallel, the alignment and refinement algorithms are implemented using MapReduce. We analyzed and compared the structure alignments produced by different methods using a dataset randomly selected from the PDB database. The experimental results verify that the proposed algorithm refines the resulting alignments more accurately than existing algorithms. Meanwhile, the computational performance of the proposed service is proportional to the number of processors used in our cloud platform.

  4. Implementation of a Parallel Protein Structure Alignment Service on Cloud

    PubMed Central

    Hung, Che-Lun; Lin, Yaw-Ling

    2013-01-01

    Protein structure alignment has become an important strategy by which to identify evolutionary relationships between protein sequences. Several alignment tools are currently available for online comparison of protein structures. In this paper, we propose a parallel protein structure alignment service based on the Hadoop distribution framework. This service includes a protein structure alignment algorithm, a refinement algorithm, and a MapReduce programming model. The refinement algorithm refines the result of alignment. To process vast numbers of protein structures in parallel, the alignment and refinement algorithms are implemented using MapReduce. We analyzed and compared the structure alignments produced by different methods using a dataset randomly selected from the PDB database. The experimental results verify that the proposed algorithm refines the resulting alignments more accurately than existing algorithms. Meanwhile, the computational performance of the proposed service is proportional to the number of processors used in our cloud platform. PMID:23671842

  5. DISCO: Distance and Spectrum Correlation Optimization Alignment for Two Dimensional Gas Chromatography Time-of-Flight Mass Spectrometry-based Metabolomics

    PubMed Central

    Wang, Bing; Fang, Aiqin; Heim, John; Bogdanov, Bogdan; Pugh, Scott; Libardoni, Mark; Zhang, Xiang

    2010-01-01

    A novel peak alignment algorithm using a distance and spectrum correlation optimization (DISCO) method has been developed for two-dimensional gas chromatography time-of-flight mass spectrometry (GC×GC/TOF-MS) based metabolomics. This algorithm uses the output of the instrument control software, ChromaTOF, as its input data. It detects and merges multiple peak entries of the same metabolite into one peak entry in each input peak list. After a z-score transformation of metabolite retention times, DISCO selects landmark peaks from all samples based on both two-dimensional retention times and mass spectrum similarity of fragment ions measured by Pearson’s correlation coefficient. A local linear fitting method is employed in the original two-dimensional retention time space to correct retention time shifts. A progressive retention time map searching method is used to align metabolite peaks in all samples together based on optimization of the Euclidean distance and mass spectrum similarity. The effectiveness of the DISCO algorithm is demonstrated using data sets acquired under different experiment conditions and a spiked-in experiment. PMID:20476746

  6. Dynamic programming algorithms for biological sequence comparison.

    PubMed

    Pearson, W R; Miller, W

    1992-01-01

    Efficient dynamic programming algorithms are available for a broad class of protein and DNA sequence comparison problems. These algorithms require computer time proportional to the product of the lengths of the two sequences being compared [O(N2)] but require memory space proportional only to the sum of these lengths [O(N)]. Although the requirement for O(N2) time limits use of the algorithms to the largest computers when searching protein and DNA sequence databases, many other applications of these algorithms, such as calculation of distances for evolutionary trees and comparison of a new sequence to a library of sequence profiles, are well within the capabilities of desktop computers. In particular, the results of library searches with rapid searching programs, such as FASTA or BLAST, should be confirmed by performing a rigorous optimal alignment. Whereas rapid methods do not overlook significant sequence similarities, FASTA limits the number of gaps that can be inserted into an alignment, so that a rigorous alignment may extend the alignment substantially in some cases. BLAST does not allow gaps in the local regions that it reports; a calculation that allows gaps is very likely to extend the alignment substantially. Although a Monte Carlo evaluation of the statistical significance of a similarity score with a rigorous algorithm is much slower than the heuristic approach used by the RDF2 program, the dynamic programming approach should take less than 1 hr on a 386-based PC or desktop Unix workstation. For descriptive purposes, we have limited our discussion to methods for calculating similarity scores and distances that use gap penalties of the form g = rk. Nevertheless, programs for the more general case (g = q+rk) are readily available. Versions of these programs that run either on Unix workstations, IBM-PC class computers, or the Macintosh can be obtained from either of the authors.

  7. Improving oncoplastic breast tumor bed localization for radiotherapy planning using image registration algorithms

    NASA Astrophysics Data System (ADS)

    Wodzinski, Marek; Skalski, Andrzej; Ciepiela, Izabela; Kuszewski, Tomasz; Kedzierawski, Piotr; Gajda, Janusz

    2018-02-01

    Knowledge about tumor bed localization and its shape analysis is a crucial factor for preventing irradiation of healthy tissues during supportive radiotherapy and as a result, cancer recurrence. The localization process is especially hard for tumors placed nearby soft tissues, which undergo complex, nonrigid deformations. Among them, breast cancer can be considered as the most representative example. A natural approach to improving tumor bed localization is the use of image registration algorithms. However, this involves two unusual aspects which are not common in typical medical image registration: the real deformation field is discontinuous, and there is no direct correspondence between the cancer and its bed in the source and the target 3D images respectively. The tumor no longer exists during radiotherapy planning. Therefore, a traditional evaluation approach based on known, smooth deformations and target registration error are not directly applicable. In this work, we propose alternative artificial deformations which model the tumor bed creation process. We perform a comprehensive evaluation of the most commonly used deformable registration algorithms: B-Splines free form deformations (B-Splines FFD), different variants of the Demons and TV-L1 optical flow. The evaluation procedure includes quantitative assessment of the dedicated artificial deformations, target registration error calculation, 3D contour propagation and medical experts visual judgment. The results demonstrate that the currently, practically applied image registration (rigid registration and B-Splines FFD) are not able to correctly reconstruct discontinuous deformation fields. We show that the symmetric Demons provide the most accurate soft tissues alignment in terms of the ability to reconstruct the deformation field, target registration error and relative tumor volume change, while B-Splines FFD and TV-L1 optical flow are not an appropriate choice for the breast tumor bed localization problem, even though the visual alignment seems to be better than for the Demons algorithm. However, no algorithm could recover the deformation field with sufficient accuracy in terms of vector length and rotation angle differences.

  8. StralSV: assessment of sequence variability within similar 3D structures and application to polio RNA-dependent RNA polymerase

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zemla, A; Lang, D; Kostova, T

    2010-11-29

    Most of the currently used methods for protein function prediction rely on sequence-based comparisons between a query protein and those for which a functional annotation is provided. A serious limitation of sequence similarity-based approaches for identifying residue conservation among proteins is the low confidence in assigning residue-residue correspondences among proteins when the level of sequence identity between the compared proteins is poor. Multiple sequence alignment methods are more satisfactory - still, they cannot provide reliable results at low levels of sequence identity. Our goal in the current work was to develop an algorithm that could overcome these difficulties and facilitatemore » the identification of structurally (and possibly functionally) relevant residue-residue correspondences between compared protein structures. Here we present StralSV, a new algorithm for detecting closely related structure fragments and quantifying residue frequency from tight local structure alignments. We apply StralSV in a study of the RNA-dependent RNA polymerase of poliovirus and demonstrate that the algorithm can be used to determine regions of the protein that are relatively unique or that shared structural similarity with structures that are distantly related. By quantifying residue frequencies among many residue-residue pairs extracted from local alignments, one can infer potential structural or functional importance of specific residues that are determined to be highly conserved or that deviate from a consensus. We further demonstrate that considerable detailed structural and phylogenetic information can be derived from StralSV analyses. StralSV is a new structure-based algorithm for identifying and aligning structure fragments that have similarity to a reference protein. StralSV analysis can be used to quantify residue-residue correspondences and identify residues that may be of particular structural or functional importance, as well as unusual or unexpected residues at a given sequence position.« less

  9. A novel fully automatic scheme for fiducial marker-based alignment in electron tomography.

    PubMed

    Han, Renmin; Wang, Liansan; Liu, Zhiyong; Sun, Fei; Zhang, Fa

    2015-12-01

    Although the topic of fiducial marker-based alignment in electron tomography (ET) has been widely discussed for decades, alignment without human intervention remains a difficult problem. Specifically, the emergence of subtomogram averaging has increased the demand for batch processing during tomographic reconstruction; fully automatic fiducial marker-based alignment is the main technique in this process. However, the lack of an accurate method for detecting and tracking fiducial markers precludes fully automatic alignment. In this paper, we present a novel, fully automatic alignment scheme for ET. Our scheme has two main contributions: First, we present a series of algorithms to ensure a high recognition rate and precise localization during the detection of fiducial markers. Our proposed solution reduces fiducial marker detection to a sampling and classification problem and further introduces an algorithm to solve the parameter dependence of marker diameter and marker number. Second, we propose a novel algorithm to solve the tracking of fiducial markers by reducing the tracking problem to an incomplete point set registration problem. Because a global optimization of a point set registration occurs, the result of our tracking is independent of the initial image position in the tilt series, allowing for the robust tracking of fiducial markers without pre-alignment. The experimental results indicate that our method can achieve an accurate tracking, almost identical to the current best one in IMOD with half automatic scheme. Furthermore, our scheme is fully automatic, depends on fewer parameters (only requires a gross value of the marker diameter) and does not require any manual interaction, providing the possibility of automatic batch processing of electron tomographic reconstruction. Copyright © 2015 Elsevier Inc. All rights reserved.

  10. Protein local structure alignment under the discrete Fréchet distance.

    PubMed

    Zhu, Binhai

    2007-12-01

    Protein structure alignment is a fundamental problem in computational and structural biology. While there has been lots of experimental/heuristic methods and empirical results, very few results are known regarding the algorithmic/complexity aspects of the problem, especially on protein local structure alignment. A well-known measure to characterize the similarity of two polygonal chains is the famous Fréchet distance, and with the application of protein-related research, a related discrete Fréchet distance has been used recently. In this paper, following the recent work of Jiang et al. we investigate the protein local structural alignment problem using bounded discrete Fréchet distance. Given m proteins (or protein backbones, which are 3D polygonal chains), each of length O(n), our main results are summarized as follows: * If the number of proteins, m, is not part of the input, then the problem is NP-complete; moreover, under bounded discrete Fréchet distance it is NP-hard to approximate the maximum size common local structure within a factor of n(1-epsilon). These results hold both when all the proteins are static and when translation/rotation are allowed. * If the number of proteins, m, is a constant, then there is a polynomial time solution for the problem.

  11. Experimental Evaluation of a Deformable Registration Algorithm for Motion Correction in PET-CT Guided Biopsy.

    PubMed

    Khare, Rahul; Sala, Guillaume; Kinahan, Paul; Esposito, Giuseppe; Banovac, Filip; Cleary, Kevin; Enquobahrie, Andinet

    2013-01-01

    Positron emission tomography computed tomography (PET-CT) images are increasingly being used for guidance during percutaneous biopsy. However, due to the physics of image acquisition, PET-CT images are susceptible to problems due to respiratory and cardiac motion, leading to inaccurate tumor localization, shape distortion, and attenuation correction. To address these problems, we present a method for motion correction that relies on respiratory gated CT images aligned using a deformable registration algorithm. In this work, we use two deformable registration algorithms and two optimization approaches for registering the CT images obtained over the respiratory cycle. The two algorithms are the BSpline and the symmetric forces Demons registration. In the first optmization approach, CT images at each time point are registered to a single reference time point. In the second approach, deformation maps are obtained to align each CT time point with its adjacent time point. These deformations are then composed to find the deformation with respect to a reference time point. We evaluate these two algorithms and optimization approaches using respiratory gated CT images obtained from 7 patients. Our results show that overall the BSpline registration algorithm with the reference optimization approach gives the best results.

  12. Robust video copy detection approach based on local tangent space alignment

    NASA Astrophysics Data System (ADS)

    Nie, Xiushan; Qiao, Qianping

    2012-04-01

    We propose a robust content-based video copy detection approach based on local tangent space alignment (LTSA), which is an efficient dimensionality reduction algorithm. The idea is motivated by the fact that the content of video becomes richer and the dimension of content becomes higher. It does not give natural tools for video analysis and understanding because of the high dimensionality. The proposed approach reduces the dimensionality of video content using LTSA, and then generates video fingerprints in low dimensional space for video copy detection. Furthermore, a dynamic sliding window is applied to fingerprint matching. Experimental results show that the video copy detection approach has good robustness and discrimination.

  13. Evaluation of GMI and PMI diffeomorphic‐based demons algorithms for aligning PET and CT Images

    PubMed Central

    Yang, Juan; Zhang, You; Yin, Yong

    2015-01-01

    Fusion of anatomic information in computed tomography (CT) and functional information in F18‐FDG positron emission tomography (PET) is crucial for accurate differentiation of tumor from benign masses, designing radiotherapy treatment plan and staging of cancer. Although current PET and CT images can be acquired from combined F18‐FDG PET/CT scanner, the two acquisitions are scanned separately and take a long time, which may induce potential positional errors in global and local caused by respiratory motion or organ peristalsis. So registration (alignment) of whole‐body PET and CT images is a prerequisite for their meaningful fusion. The purpose of this study was to assess the performance of two multimodal registration algorithms for aligning PET and CT images. The proposed gradient of mutual information (GMI)‐based demons algorithm, which incorporated the GMI between two images as an external force to facilitate the alignment, was compared with the point‐wise mutual information (PMI) diffeomorphic‐based demons algorithm whose external force was modified by replacing the image intensity difference in diffeomorphic demons algorithm with the PMI to make it appropriate for multimodal image registration. Eight patients with esophageal cancer(s) were enrolled in this IRB‐approved study. Whole‐body PET and CT images were acquired from a combined F18‐FDG PET/CT scanner for each patient. The modified Hausdorff distance (dMH) was used to evaluate the registration accuracy of the two algorithms. Of all patients, the mean values and standard deviations (SDs) of dMH were 6.65 (± 1.90) voxels and 6.01 (± 1.90) after the GMI‐based demons and the PMI diffeomorphic‐based demons registration algorithms respectively. Preliminary results on oncological patients showed that the respiratory motion and organ peristalsis in PET/CT esophageal images could not be neglected, although a combined F18‐FDG PET/CT scanner was used for image acquisition. The PMI diffeomorphic‐based demons algorithm was more accurate than the GMI‐based demons algorithm in registering PET/CT esophageal images. PACS numbers: 87.57.nj, 87.57. Q‐, 87.57.uk PMID:26218993

  14. Evaluation of GMI and PMI diffeomorphic-based demons algorithms for aligning PET and CT Images.

    PubMed

    Yang, Juan; Wang, Hongjun; Zhang, You; Yin, Yong

    2015-07-08

    Fusion of anatomic information in computed tomography (CT) and functional information in 18F-FDG positron emission tomography (PET) is crucial for accurate differentiation of tumor from benign masses, designing radiotherapy treatment plan and staging of cancer. Although current PET and CT images can be acquired from combined 18F-FDG PET/CT scanner, the two acquisitions are scanned separately and take a long time, which may induce potential positional errors in global and local caused by respiratory motion or organ peristalsis. So registration (alignment) of whole-body PET and CT images is a prerequisite for their meaningful fusion. The purpose of this study was to assess the performance of two multimodal registration algorithms for aligning PET and CT images. The proposed gradient of mutual information (GMI)-based demons algorithm, which incorporated the GMI between two images as an external force to facilitate the alignment, was compared with the point-wise mutual information (PMI) diffeomorphic-based demons algorithm whose external force was modified by replacing the image intensity difference in diffeomorphic demons algorithm with the PMI to make it appropriate for multimodal image registration. Eight patients with esophageal cancer(s) were enrolled in this IRB-approved study. Whole-body PET and CT images were acquired from a combined 18F-FDG PET/CT scanner for each patient. The modified Hausdorff distance (d(MH)) was used to evaluate the registration accuracy of the two algorithms. Of all patients, the mean values and standard deviations (SDs) of d(MH) were 6.65 (± 1.90) voxels and 6.01 (± 1.90) after the GMI-based demons and the PMI diffeomorphic-based demons registration algorithms respectively. Preliminary results on oncological patients showed that the respiratory motion and organ peristalsis in PET/CT esophageal images could not be neglected, although a combined 18F-FDG PET/CT scanner was used for image acquisition. The PMI diffeomorphic-based demons algorithm was more accurate than the GMI-based demons algorithm in registering PET/CT esophageal images.

  15. PROPER: global protein interaction network alignment through percolation matching.

    PubMed

    Kazemi, Ehsan; Hassani, Hamed; Grossglauser, Matthias; Pezeshgi Modarres, Hassan

    2016-12-12

    The alignment of protein-protein interaction (PPI) networks enables us to uncover the relationships between different species, which leads to a deeper understanding of biological systems. Network alignment can be used to transfer biological knowledge between species. Although different PPI-network alignment algorithms were introduced during the last decade, developing an accurate and scalable algorithm that can find alignments with high biological and structural similarities among PPI networks is still challenging. In this paper, we introduce a new global network alignment algorithm for PPI networks called PROPER. Compared to other global network alignment methods, our algorithm shows higher accuracy and speed over real PPI datasets and synthetic networks. We show that the PROPER algorithm can detect large portions of conserved biological pathways between species. Also, using a simple parsimonious evolutionary model, we explain why PROPER performs well based on several different comparison criteria. We highlight that PROPER has high potential in further applications such as detecting biological pathways, finding protein complexes and PPI prediction. The PROPER algorithm is available at http://proper.epfl.ch .

  16. Sensor Network Localization by Eigenvector Synchronization Over the Euclidean Group

    PubMed Central

    CUCURINGU, MIHAI; LIPMAN, YARON; SINGER, AMIT

    2013-01-01

    We present a new approach to localization of sensors from noisy measurements of a subset of their Euclidean distances. Our algorithm starts by finding, embedding, and aligning uniquely realizable subsets of neighboring sensors called patches. In the noise-free case, each patch agrees with its global positioning up to an unknown rigid motion of translation, rotation, and possibly reflection. The reflections and rotations are estimated using the recently developed eigenvector synchronization algorithm, while the translations are estimated by solving an overdetermined linear system. The algorithm is scalable as the number of nodes increases and can be implemented in a distributed fashion. Extensive numerical experiments show that it compares favorably to other existing algorithms in terms of robustness to noise, sparse connectivity, and running time. While our approach is applicable to higher dimensions, in the current article, we focus on the two-dimensional case. PMID:23946700

  17. Structural alignment of protein descriptors - a combinatorial model.

    PubMed

    Antczak, Maciej; Kasprzak, Marta; Lukasiak, Piotr; Blazewicz, Jacek

    2016-09-17

    Structural alignment of proteins is one of the most challenging problems in molecular biology. The tertiary structure of a protein strictly correlates with its function and computationally predicted structures are nowadays a main premise for understanding the latter. However, computationally derived 3D models often exhibit deviations from the native structure. A way to confirm a model is a comparison with other structures. The structural alignment of a pair of proteins can be defined with the use of a concept of protein descriptors. The protein descriptors are local substructures of protein molecules, which allow us to divide the original problem into a set of subproblems and, consequently, to propose a more efficient algorithmic solution. In the literature, one can find many applications of the descriptors concept that prove its usefulness for insight into protein 3D structures, but the proposed approaches are presented rather from the biological perspective than from the computational or algorithmic point of view. Efficient algorithms for identification and structural comparison of descriptors can become crucial components of methods for structural quality assessment as well as tertiary structure prediction. In this paper, we propose a new combinatorial model and new polynomial-time algorithms for the structural alignment of descriptors. The model is based on the maximum-size assignment problem, which we define here and prove that it can be solved in polynomial time. We demonstrate suitability of this approach by comparison with an exact backtracking algorithm. Besides a simplification coming from the combinatorial modeling, both on the conceptual and complexity level, we gain with this approach high quality of obtained results, in terms of 3D alignment accuracy and processing efficiency. All the proposed algorithms were developed and integrated in a computationally efficient tool descs-standalone, which allows the user to identify and structurally compare descriptors of biological molecules, such as proteins and RNAs. Both PDB (Protein Data Bank) and mmCIF (macromolecular Crystallographic Information File) formats are supported. The proposed tool is available as an open source project stored on GitHub ( https://github.com/mantczak/descs-standalone ).

  18. SPHINX--an algorithm for taxonomic binning of metagenomic sequences.

    PubMed

    Mohammed, Monzoorul Haque; Ghosh, Tarini Shankar; Singh, Nitin Kumar; Mande, Sharmila S

    2011-01-01

    Compared with composition-based binning algorithms, the binning accuracy and specificity of alignment-based binning algorithms is significantly higher. However, being alignment-based, the latter class of algorithms require enormous amount of time and computing resources for binning huge metagenomic datasets. The motivation was to develop a binning approach that can analyze metagenomic datasets as rapidly as composition-based approaches, but nevertheless has the accuracy and specificity of alignment-based algorithms. This article describes a hybrid binning approach (SPHINX) that achieves high binning efficiency by utilizing the principles of both 'composition'- and 'alignment'-based binning algorithms. Validation results with simulated sequence datasets indicate that SPHINX is able to analyze metagenomic sequences as rapidly as composition-based algorithms. Furthermore, the binning efficiency (in terms of accuracy and specificity of assignments) of SPHINX is observed to be comparable with results obtained using alignment-based algorithms. A web server for the SPHINX algorithm is available at http://metagenomics.atc.tcs.com/SPHINX/.

  19. Accelerated probabilistic inference of RNA structure evolution

    PubMed Central

    Holmes, Ian

    2005-01-01

    Background Pairwise stochastic context-free grammars (Pair SCFGs) are powerful tools for evolutionary analysis of RNA, including simultaneous RNA sequence alignment and secondary structure prediction, but the associated algorithms are intensive in both CPU and memory usage. The same problem is faced by other RNA alignment-and-folding algorithms based on Sankoff's 1985 algorithm. It is therefore desirable to constrain such algorithms, by pre-processing the sequences and using this first pass to limit the range of structures and/or alignments that can be considered. Results We demonstrate how flexible classes of constraint can be imposed, greatly reducing the computational costs while maintaining a high quality of structural homology prediction. Any score-attributed context-free grammar (e.g. energy-based scoring schemes, or conditionally normalized Pair SCFGs) is amenable to this treatment. It is now possible to combine independent structural and alignment constraints of unprecedented general flexibility in Pair SCFG alignment algorithms. We outline several applications to the bioinformatics of RNA sequence and structure, including Waterman-Eggert N-best alignments and progressive multiple alignment. We evaluate the performance of the algorithm on test examples from the RFAM database. Conclusion A program, Stemloc, that implements these algorithms for efficient RNA sequence alignment and structure prediction is available under the GNU General Public License. PMID:15790387

  20. The practical evaluation of DNA barcode efficacy.

    PubMed

    Spouge, John L; Mariño-Ramírez, Leonardo

    2012-01-01

    This chapter describes a workflow for measuring the efficacy of a barcode in identifying species. First, assemble individual sequence databases corresponding to each barcode marker. A controlled collection of taxonomic data is preferable to GenBank data, because GenBank data can be problematic, particularly when comparing barcodes based on more than one marker. To ensure proper controls when evaluating species identification, specimens not having a sequence in every marker database should be discarded. Second, select a computer algorithm for assigning species to barcode sequences. No algorithm has yet improved notably on assigning a specimen to the species of its nearest neighbor within a barcode database. Because global sequence alignments (e.g., with the Needleman-Wunsch algorithm, or some related algorithm) examine entire barcode sequences, they generally produce better species assignments than local sequence alignments (e.g., with BLAST). No neighboring method (e.g., global sequence similarity, global sequence distance, or evolutionary distance based on a global alignment) has yet shown a notable superiority in identifying species. Finally, "the probability of correct identification" (PCI) provides an appropriate measurement of barcode efficacy. The overall PCI for a data set is the average of the species PCIs, taken over all species in the data set. This chapter states explicitly how to calculate PCI, how to estimate its statistical sampling error, and how to use data on PCR failure to set limits on how much improvements in PCR technology can improve species identification.

  1. An unsupervised video foreground co-localization and segmentation process by incorporating motion cues and frame features

    NASA Astrophysics Data System (ADS)

    Zhang, Chao; Zhang, Qian; Zheng, Chi; Qiu, Guoping

    2018-04-01

    Video foreground segmentation is one of the key problems in video processing. In this paper, we proposed a novel and fully unsupervised approach for foreground object co-localization and segmentation of unconstrained videos. We firstly compute both the actual edges and motion boundaries of the video frames, and then align them by their HOG feature maps. Then, by filling the occlusions generated by the aligned edges, we obtained more precise masks about the foreground object. Such motion-based masks could be derived as the motion-based likelihood. Moreover, the color-base likelihood is adopted for the segmentation process. Experimental Results show that our approach outperforms most of the State-of-the-art algorithms.

  2. An Algorithm to Identify and Localize Suitable Dock Locations from 3-D LiDAR Scans

    DTIC Science & Technology

    2013-05-10

    Locations from 3-D LiDAR Scans 5b. GRANT NUMBER 5c. PROGRAM ELEMENT NUMBER 6. AUTHOR(S) Graves, Mitchell Robert 5d. PROJECT NUMBER...Ranging ( LiDAR ) scans. A LiDAR sensor is a sensor that collects range images from a rotating array of vertically aligned lasers. Our solution leverages...Algorithm, Dock, Locations, Point Clouds, LiDAR , Identify 16. SECURITY CLASSIFICATION OF: 17. LIMITATION OF ABSTRACT 18. NUMBER OF PAGES 19a

  3. Robust algorithm for aligning two-dimensional chromatograms.

    PubMed

    Gros, Jonas; Nabi, Deedar; Dimitriou-Christidis, Petros; Rutler, Rebecca; Arey, J Samuel

    2012-11-06

    Comprehensive two-dimensional gas chromatography (GC × GC) chromatograms typically exhibit run-to-run retention time variability. Chromatogram alignment is often a desirable step prior to further analysis of the data, for example, in studies of environmental forensics or weathering of complex mixtures. We present a new algorithm for aligning whole GC × GC chromatograms. This technique is based on alignment points that have locations indicated by the user both in a target chromatogram and in a reference chromatogram. We applied the algorithm to two sets of samples. First, we aligned the chromatograms of twelve compositionally distinct oil spill samples, all analyzed using the same instrument parameters. Second, we applied the algorithm to two compositionally distinct wastewater extracts analyzed using two different instrument temperature programs, thus involving larger retention time shifts than the first sample set. For both sample sets, the new algorithm performed favorably compared to two other available alignment algorithms: that of Pierce, K. M.; Wood, Lianna F.; Wright, B. W.; Synovec, R. E. Anal. Chem.2005, 77, 7735-7743 and 2-D COW from Zhang, D.; Huang, X.; Regnier, F. E.; Zhang, M. Anal. Chem.2008, 80, 2664-2671. The new algorithm achieves the best matches of retention times for test analytes, avoids some artifacts which result from the other alignment algorithms, and incurs the least modification of quantitative signal information.

  4. Multiple sequence alignment using multi-objective based bacterial foraging optimization algorithm.

    PubMed

    Rani, R Ranjani; Ramyachitra, D

    2016-12-01

    Multiple sequence alignment (MSA) is a widespread approach in computational biology and bioinformatics. MSA deals with how the sequences of nucleotides and amino acids are sequenced with possible alignment and minimum number of gaps between them, which directs to the functional, evolutionary and structural relationships among the sequences. Still the computation of MSA is a challenging task to provide an efficient accuracy and statistically significant results of alignments. In this work, the Bacterial Foraging Optimization Algorithm was employed to align the biological sequences which resulted in a non-dominated optimal solution. It employs Multi-objective, such as: Maximization of Similarity, Non-gap percentage, Conserved blocks and Minimization of gap penalty. BAliBASE 3.0 benchmark database was utilized to examine the proposed algorithm against other methods In this paper, two algorithms have been proposed: Hybrid Genetic Algorithm with Artificial Bee Colony (GA-ABC) and Bacterial Foraging Optimization Algorithm. It was found that Hybrid Genetic Algorithm with Artificial Bee Colony performed better than the existing optimization algorithms. But still the conserved blocks were not obtained using GA-ABC. Then BFO was used for the alignment and the conserved blocks were obtained. The proposed Multi-Objective Bacterial Foraging Optimization Algorithm (MO-BFO) was compared with widely used MSA methods Clustal Omega, Kalign, MUSCLE, MAFFT, Genetic Algorithm (GA), Ant Colony Optimization (ACO), Artificial Bee Colony (ABC), Particle Swarm Optimization (PSO) and Hybrid Genetic Algorithm with Artificial Bee Colony (GA-ABC). The final results show that the proposed MO-BFO algorithm yields better alignment than most widely used methods. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  5. Coarse Alignment Technology on Moving base for SINS Based on the Improved Quaternion Filter Algorithm.

    PubMed

    Zhang, Tao; Zhu, Yongyun; Zhou, Feng; Yan, Yaxiong; Tong, Jinwu

    2017-06-17

    Initial alignment of the strapdown inertial navigation system (SINS) is intended to determine the initial attitude matrix in a short time with certain accuracy. The alignment accuracy of the quaternion filter algorithm is remarkable, but the convergence rate is slow. To solve this problem, this paper proposes an improved quaternion filter algorithm for faster initial alignment based on the error model of the quaternion filter algorithm. The improved quaternion filter algorithm constructs the K matrix based on the principle of optimal quaternion algorithm, and rebuilds the measurement model by containing acceleration and velocity errors to make the convergence rate faster. A doppler velocity log (DVL) provides the reference velocity for the improved quaternion filter alignment algorithm. In order to demonstrate the performance of the improved quaternion filter algorithm in the field, a turntable experiment and a vehicle test are carried out. The results of the experiments show that the convergence rate of the proposed improved quaternion filter is faster than that of the tradition quaternion filter algorithm. In addition, the improved quaternion filter algorithm also demonstrates advantages in terms of correctness, effectiveness, and practicability.

  6. Retention time alignment of LC/MS data by a divide-and-conquer algorithm.

    PubMed

    Zhang, Zhongqi

    2012-04-01

    Liquid chromatography-mass spectrometry (LC/MS) has become the method of choice for characterizing complex mixtures. These analyses often involve quantitative comparison of components in multiple samples. To achieve automated sample comparison, the components of interest must be detected and identified, and their retention times aligned and peak areas calculated. This article describes a simple pairwise iterative retention time alignment algorithm, based on the divide-and-conquer approach, for alignment of ion features detected in LC/MS experiments. In this iterative algorithm, ion features in the sample run are first aligned with features in the reference run by applying a single constant shift of retention time. The sample chromatogram is then divided into two shorter chromatograms, which are aligned to the reference chromatogram the same way. Each shorter chromatogram is further divided into even shorter chromatograms. This process continues until each chromatogram is sufficiently narrow so that ion features within it have a similar retention time shift. In six pairwise LC/MS alignment examples containing a total of 6507 confirmed true corresponding feature pairs with retention time shifts up to five peak widths, the algorithm successfully aligned these features with an error rate of 0.2%. The alignment algorithm is demonstrated to be fast, robust, fully automatic, and superior to other algorithms. After alignment and gap-filling of detected ion features, their abundances can be tabulated for direct comparison between samples.

  7. Optimizing multiple sequence alignments using a genetic algorithm based on three objectives: structural information, non-gaps percentage and totally conserved columns.

    PubMed

    Ortuño, Francisco M; Valenzuela, Olga; Rojas, Fernando; Pomares, Hector; Florido, Javier P; Urquiza, Jose M; Rojas, Ignacio

    2013-09-01

    Multiple sequence alignments (MSAs) are widely used approaches in bioinformatics to carry out other tasks such as structure predictions, biological function analyses or phylogenetic modeling. However, current tools usually provide partially optimal alignments, as each one is focused on specific biological features. Thus, the same set of sequences can produce different alignments, above all when sequences are less similar. Consequently, researchers and biologists do not agree about which is the most suitable way to evaluate MSAs. Recent evaluations tend to use more complex scores including further biological features. Among them, 3D structures are increasingly being used to evaluate alignments. Because structures are more conserved in proteins than sequences, scores with structural information are better suited to evaluate more distant relationships between sequences. The proposed multiobjective algorithm, based on the non-dominated sorting genetic algorithm, aims to jointly optimize three objectives: STRIKE score, non-gaps percentage and totally conserved columns. It was significantly assessed on the BAliBASE benchmark according to the Kruskal-Wallis test (P < 0.01). This algorithm also outperforms other aligners, such as ClustalW, Multiple Sequence Alignment Genetic Algorithm (MSA-GA), PRRP, DIALIGN, Hidden Markov Model Training (HMMT), Pattern-Induced Multi-sequence Alignment (PIMA), MULTIALIGN, Sequence Alignment Genetic Algorithm (SAGA), PILEUP, Rubber Band Technique Genetic Algorithm (RBT-GA) and Vertical Decomposition Genetic Algorithm (VDGA), according to the Wilcoxon signed-rank test (P < 0.05), whereas it shows results not significantly different to 3D-COFFEE (P > 0.05) with the advantage of being able to use less structures. Structural information is included within the objective function to evaluate more accurately the obtained alignments. The source code is available at http://www.ugr.es/~fortuno/MOSAStrE/MO-SAStrE.zip.

  8. Introducing difference recurrence relations for faster semi-global alignment of long sequences.

    PubMed

    Suzuki, Hajime; Kasahara, Masahiro

    2018-02-19

    The read length of single-molecule DNA sequencers is reaching 1 Mb. Popular alignment software tools widely used for analyzing such long reads often take advantage of single-instruction multiple-data (SIMD) operations to accelerate calculation of dynamic programming (DP) matrices in the Smith-Waterman-Gotoh (SWG) algorithm with a fixed alignment start position at the origin. Nonetheless, 16-bit or 32-bit integers are necessary for storing the values in a DP matrix when sequences to be aligned are long; this situation hampers the use of the full SIMD width of modern processors. We proposed a faster semi-global alignment algorithm, "difference recurrence relations," that runs more rapidly than the state-of-the-art algorithm by a factor of 2.1. Instead of calculating and storing all the values in a DP matrix directly, our algorithm computes and stores mainly the differences between the values of adjacent cells in the matrix. Although the SWG algorithm and our algorithm can output exactly the same result, our algorithm mainly involves 8-bit integer operations, enabling us to exploit the full width of SIMD operations (e.g., 32) on modern processors. We also developed a library, libgaba, so that developers can easily integrate our algorithm into alignment programs. Our novel algorithm and optimized library implementation will facilitate accelerating nucleotide long-read analysis algorithms that use pairwise alignment stages. The library is implemented in the C programming language and available at https://github.com/ocxtal/libgaba .

  9. Aligning Greek-English parallel texts

    NASA Astrophysics Data System (ADS)

    Galiotou, Eleni; Koronakis, George; Lazari, Vassiliki

    2015-02-01

    In this paper, we discuss issues concerning the alignment of parallel texts written in languages with different alphabets based on an experiment of aligning texts from the proceedings of the European Parliament in Greek and English. First, we describe our implementation of the k-vec algorithm and its application to the bilingual corpus. Then the output of the algorithm is used as a starting point for an alignment procedure at a sentence level which also takes into account mark-ups of meta-information. The results of the implementation are compared to those of the application of the Church and Gale alignment algorithm on the Europarl corpus. The conclusions of this comparison can give useful insights as for the efficiency of alignment algorithms when applied to the particular bilingual corpus.

  10. A phase and frequency alignment protocol for 1H MRSI data of the prostate.

    PubMed

    Wright, Alan J; Buydens, Lutgarde M C; Heerschap, Arend

    2012-05-01

    (1)H MRSI of the prostate reveals relative metabolite levels that vary according to the presence or absence of tumour, providing a sensitive method for the identification of patients with cancer. Current interpretations of prostate data rely on quantification algorithms that fit model metabolite resonances to individual voxel spectra and calculate relative levels of metabolites, such as choline, creatine, citrate and polyamines. Statistical pattern recognition techniques can potentially improve the detection of prostate cancer, but these analyses are hampered by artefacts and sources of noise in the data, such as variations in phase and frequency of resonances. Phase and frequency variations may arise as a result of spatial field gradients or local physiological conditions affecting the frequency of resonances, in particular those of citrate. Thus, there are unique challenges in developing a peak alignment algorithm for these data. We have developed a frequency and phase correction algorithm for automatic alignment of the resonances in prostate MRSI spectra. We demonstrate, with a simulated dataset, that alignment can be achieved to a phase standard deviation of 0.095  rad and a frequency standard deviation of 0.68  Hz for the citrate resonances. Three parameters were used to assess the improvement in peak alignment in the MRSI data of five patients: the percentage of variance in all MRSI spectra explained by their first principal component; the signal-to-noise ratio of a spectrum formed by taking the median value of the entire set at each spectral point; and the mean cross-correlation between all pairs of spectra. These parameters showed a greater similarity between spectra in all five datasets and the simulated data, demonstrating improved alignment for phase and frequency in these spectra. This peak alignment program is expected to improve pattern recognition significantly, enabling accurate detection and localisation of prostate cancer with MRSI. Copyright © 2011 John Wiley & Sons, Ltd.

  11. Biclustering as a method for RNA local multiple sequence alignment.

    PubMed

    Wang, Shu; Gutell, Robin R; Miranker, Daniel P

    2007-12-15

    Biclustering is a clustering method that simultaneously clusters both the domain and range of a relation. A challenge in multiple sequence alignment (MSA) is that the alignment of sequences is often intended to reveal groups of conserved functional subsequences. Simultaneously, the grouping of the sequences can impact the alignment; precisely the kind of dual situation biclustering is intended to address. We define a representation of the MSA problem enabling the application of biclustering algorithms. We develop a computer program for local MSA, BlockMSA, that combines biclustering with divide-and-conquer. BlockMSA simultaneously finds groups of similar sequences and locally aligns subsequences within them. Further alignment is accomplished by dividing both the set of sequences and their contents. The net result is both a multiple sequence alignment and a hierarchical clustering of the sequences. BlockMSA was tested on the subsets of the BRAliBase 2.1 benchmark suite that display high variability and on an extension to that suite to larger problem sizes. Also, alignments were evaluated of two large datasets of current biological interest, T box sequences and Group IC1 Introns. The results were compared with alignments computed by ClustalW, MAFFT, MUCLE and PROBCONS alignment programs using Sum of Pairs (SPS) and Consensus Count. Results for the benchmark suite are sensitive to problem size. On problems of 15 or greater sequences, BlockMSA is consistently the best. On none of the problems in the test suite are there appreciable differences in scores among BlockMSA, MAFFT and PROBCONS. On the T box sequences, BlockMSA does the most faithful job of reproducing known annotations. MAFFT and PROBCONS do not. On the Intron sequences, BlockMSA, MAFFT and MUSCLE are comparable at identifying conserved regions. BlockMSA is implemented in Java. Source code and supplementary datasets are available at http://aug.csres.utexas.edu/msa/

  12. A generalized global alignment algorithm.

    PubMed

    Huang, Xiaoqiu; Chao, Kun-Mao

    2003-01-22

    Homologous sequences are sometimes similar over some regions but different over other regions. Homologous sequences have a much lower global similarity if the different regions are much longer than the similar regions. We present a generalized global alignment algorithm for comparing sequences with intermittent similarities, an ordered list of similar regions separated by different regions. A generalized global alignment model is defined to handle sequences with intermittent similarities. A dynamic programming algorithm is designed to compute an optimal general alignment in time proportional to the product of sequence lengths and in space proportional to the sum of sequence lengths. The algorithm is implemented as a computer program named GAP3 (Global Alignment Program Version 3). The generalized global alignment model is validated by experimental results produced with GAP3 on both DNA and protein sequences. The GAP3 program extends the ability of standard global alignment programs to recognize homologous sequences of lower similarity. The GAP3 program is freely available for academic use at http://bioinformatics.iastate.edu/aat/align/align.html.

  13. Simulation of enhanced deposition due to magnetic field alignment of ellipsoidal particles in a lung bifurcation.

    PubMed

    Martinez, R C; Roshchenko, A; Minev, P; Finlay, W H

    2013-02-01

    Aerosolized chemotherapy has been recognized as a potential treatment for lung cancer. The challenge of providing sufficient therapeutic effects without reaching dose-limiting toxicity levels hinders the development of aerosolized chemotherapy. This could be mitigated by increasing drug-delivery efficiency with a noninvasive drug-targeting delivery method. The purpose of this study is to use direct numerical simulations to study the resulting local enhancement of deposition due to magnetic field alignment of high aspect ratio particles. High aspect ratio particles were approximated by a rigid ellipsoid with a minor diameter of 0.5 μm and fluid particle density ratio of 1,000. Particle trajectories were calculated by solving the coupled fluid particle equations using an in-house micro-macro grid finite element algorithm based on a previously developed fictitious domain approach. Particle trajectories were simulated in a morphologically realistic geometry modeling a symmetrical terminal bronchiole bifurcation. Flow conditions were steady inspiratory air flow due to typical breathing at 18 L/min. Deposition efficiency was estimated for two different cases: [1] particles aligned with the streamlines and [2] particles with fixed angular orientation simulating the magnetic field alignment of our previous in vitro study. The local enhancement factor defined as the ratio between deposition efficiency of Case [1] and Case [2] was found to be 1.43 and 3.46 for particles with an aspect ratio of 6 and 20, respectively. Results indicate that externally forcing local alignment of high aspect ratio particles can increase local deposition considerably.

  14. Text Extraction from Scene Images by Character Appearance and Structure Modeling

    PubMed Central

    Yi, Chucai; Tian, Yingli

    2012-01-01

    In this paper, we propose a novel algorithm to detect text information from natural scene images. Scene text classification and detection are still open research topics. Our proposed algorithm is able to model both character appearance and structure to generate representative and discriminative text descriptors. The contributions of this paper include three aspects: 1) a new character appearance model by a structure correlation algorithm which extracts discriminative appearance features from detected interest points of character samples; 2) a new text descriptor based on structons and correlatons, which model character structure by structure differences among character samples and structure component co-occurrence; and 3) a new text region localization method by combining color decomposition, character contour refinement, and string line alignment to localize character candidates and refine detected text regions. We perform three groups of experiments to evaluate the effectiveness of our proposed algorithm, including text classification, text detection, and character identification. The evaluation results on benchmark datasets demonstrate that our algorithm achieves the state-of-the-art performance on scene text classification and detection, and significantly outperforms the existing algorithms for character identification. PMID:23316111

  15. Towards a Unified Framework for Pose, Expression, and Occlusion Tolerant Automatic Facial Alignment.

    PubMed

    Seshadri, Keshav; Savvides, Marios

    2016-10-01

    We propose a facial alignment algorithm that is able to jointly deal with the presence of facial pose variation, partial occlusion of the face, and varying illumination and expressions. Our approach proceeds from sparse to dense landmarking steps using a set of specific models trained to best account for the shape and texture variation manifested by facial landmarks and facial shapes across pose and various expressions. We also propose the use of a novel l1-regularized least squares approach that we incorporate into our shape model, which is an improvement over the shape model used by several prior Active Shape Model (ASM) based facial landmark localization algorithms. Our approach is compared against several state-of-the-art methods on many challenging test datasets and exhibits a higher fitting accuracy on all of them.

  16. Real-time driver fatigue detection based on face alignment

    NASA Astrophysics Data System (ADS)

    Tao, Huanhuan; Zhang, Guiying; Zhao, Yong; Zhou, Yi

    2017-07-01

    The performance and robustness of fatigue detection largely decrease if the driver with glasses. To address this issue, this paper proposes a practical driver fatigue detection method based on face alignment at 3000 FPS algorithm. Firstly, the eye regions of the driver are localized by exploiting 6 landmarks surrounding each eye. Secondly, the HOG features of the extracted eye regions are calculated and put into SVM classifier to recognize the eye state. Finally, the value of PERCLOS is calculated to determine whether the driver is drowsy or not. An alarm will be generated if the eye is closed for a specified period of time. The accuracy and real-time on testing videos with different drivers demonstrate that the proposed algorithm is robust and obtain better accuracy for driver fatigue detection compared with some previous method.

  17. Multi-resolution Shape Analysis via Non-Euclidean Wavelets: Applications to Mesh Segmentation and Surface Alignment Problems.

    PubMed

    Kim, Won Hwa; Chung, Moo K; Singh, Vikas

    2013-01-01

    The analysis of 3-D shape meshes is a fundamental problem in computer vision, graphics, and medical imaging. Frequently, the needs of the application require that our analysis take a multi-resolution view of the shape's local and global topology, and that the solution is consistent across multiple scales. Unfortunately, the preferred mathematical construct which offers this behavior in classical image/signal processing, Wavelets, is no longer applicable in this general setting (data with non-uniform topology). In particular, the traditional definition does not allow writing out an expansion for graphs that do not correspond to the uniformly sampled lattice (e.g., images). In this paper, we adapt recent results in harmonic analysis, to derive Non-Euclidean Wavelets based algorithms for a range of shape analysis problems in vision and medical imaging. We show how descriptors derived from the dual domain representation offer native multi-resolution behavior for characterizing local/global topology around vertices. With only minor modifications, the framework yields a method for extracting interest/key points from shapes, a surprisingly simple algorithm for 3-D shape segmentation (competitive with state of the art), and a method for surface alignment (without landmarks). We give an extensive set of comparison results on a large shape segmentation benchmark and derive a uniqueness theorem for the surface alignment problem.

  18. ARYANA: Aligning Reads by Yet Another Approach

    PubMed Central

    2014-01-01

    Motivation Although there are many different algorithms and software tools for aligning sequencing reads, fast gapped sequence search is far from solved. Strong interest in fast alignment is best reflected in the $106 prize for the Innocentive competition on aligning a collection of reads to a given database of reference genomes. In addition, de novo assembly of next-generation sequencing long reads requires fast overlap-layout-concensus algorithms which depend on fast and accurate alignment. Contribution We introduce ARYANA, a fast gapped read aligner, developed on the base of BWA indexing infrastructure with a completely new alignment engine that makes it significantly faster than three other aligners: Bowtie2, BWA and SeqAlto, with comparable generality and accuracy. Instead of the time-consuming backtracking procedures for handling mismatches, ARYANA comes with the seed-and-extend algorithmic framework and a significantly improved efficiency by integrating novel algorithmic techniques including dynamic seed selection, bidirectional seed extension, reset-free hash tables, and gap-filling dynamic programming. As the read length increases ARYANA's superiority in terms of speed and alignment rate becomes more evident. This is in perfect harmony with the read length trend as the sequencing technologies evolve. The algorithmic platform of ARYANA makes it easy to develop mission-specific aligners for other applications using ARYANA engine. Availability ARYANA with complete source code can be obtained from http://github.com/aryana-aligner PMID:25252881

  19. ARYANA: Aligning Reads by Yet Another Approach.

    PubMed

    Gholami, Milad; Arbabi, Aryan; Sharifi-Zarchi, Ali; Chitsaz, Hamidreza; Sadeghi, Mehdi

    2014-01-01

    Although there are many different algorithms and software tools for aligning sequencing reads, fast gapped sequence search is far from solved. Strong interest in fast alignment is best reflected in the $10(6) prize for the Innocentive competition on aligning a collection of reads to a given database of reference genomes. In addition, de novo assembly of next-generation sequencing long reads requires fast overlap-layout-concensus algorithms which depend on fast and accurate alignment. We introduce ARYANA, a fast gapped read aligner, developed on the base of BWA indexing infrastructure with a completely new alignment engine that makes it significantly faster than three other aligners: Bowtie2, BWA and SeqAlto, with comparable generality and accuracy. Instead of the time-consuming backtracking procedures for handling mismatches, ARYANA comes with the seed-and-extend algorithmic framework and a significantly improved efficiency by integrating novel algorithmic techniques including dynamic seed selection, bidirectional seed extension, reset-free hash tables, and gap-filling dynamic programming. As the read length increases ARYANA's superiority in terms of speed and alignment rate becomes more evident. This is in perfect harmony with the read length trend as the sequencing technologies evolve. The algorithmic platform of ARYANA makes it easy to develop mission-specific aligners for other applications using ARYANA engine. ARYANA with complete source code can be obtained from http://github.com/aryana-aligner.

  20. CUDASW++ 3.0: accelerating Smith-Waterman protein database search by coupling CPU and GPU SIMD instructions.

    PubMed

    Liu, Yongchao; Wirawan, Adrianto; Schmidt, Bertil

    2013-04-04

    The maximal sensitivity for local alignments makes the Smith-Waterman algorithm a popular choice for protein sequence database search based on pairwise alignment. However, the algorithm is compute-intensive due to a quadratic time complexity. Corresponding runtimes are further compounded by the rapid growth of sequence databases. We present CUDASW++ 3.0, a fast Smith-Waterman protein database search algorithm, which couples CPU and GPU SIMD instructions and carries out concurrent CPU and GPU computations. For the CPU computation, this algorithm employs SSE-based vector execution units as accelerators. For the GPU computation, we have investigated for the first time a GPU SIMD parallelization, which employs CUDA PTX SIMD video instructions to gain more data parallelism beyond the SIMT execution model. Moreover, sequence alignment workloads are automatically distributed over CPUs and GPUs based on their respective compute capabilities. Evaluation on the Swiss-Prot database shows that CUDASW++ 3.0 gains a performance improvement over CUDASW++ 2.0 up to 2.9 and 3.2, with a maximum performance of 119.0 and 185.6 GCUPS, on a single-GPU GeForce GTX 680 and a dual-GPU GeForce GTX 690 graphics card, respectively. In addition, our algorithm has demonstrated significant speedups over other top-performing tools: SWIPE and BLAST+. CUDASW++ 3.0 is written in CUDA C++ and PTX assembly languages, targeting GPUs based on the Kepler architecture. This algorithm obtains significant speedups over its predecessor: CUDASW++ 2.0, by benefiting from the use of CPU and GPU SIMD instructions as well as the concurrent execution on CPUs and GPUs. The source code and the simulated data are available at http://cudasw.sourceforge.net.

  1. Accurate multiple sequence-structure alignment of RNA sequences using combinatorial optimization.

    PubMed

    Bauer, Markus; Klau, Gunnar W; Reinert, Knut

    2007-07-27

    The discovery of functional non-coding RNA sequences has led to an increasing interest in algorithms related to RNA analysis. Traditional sequence alignment algorithms, however, fail at computing reliable alignments of low-homology RNA sequences. The spatial conformation of RNA sequences largely determines their function, and therefore RNA alignment algorithms have to take structural information into account. We present a graph-based representation for sequence-structure alignments, which we model as an integer linear program (ILP). We sketch how we compute an optimal or near-optimal solution to the ILP using methods from combinatorial optimization, and present results on a recently published benchmark set for RNA alignments. The implementation of our algorithm yields better alignments in terms of two published scores than the other programs that we tested: This is especially the case with an increasing number of input sequences. Our program LARA is freely available for academic purposes from http://www.planet-lisa.net.

  2. Feature-based Alignment of Volumetric Multi-modal Images

    PubMed Central

    Toews, Matthew; Zöllei, Lilla; Wells, William M.

    2014-01-01

    This paper proposes a method for aligning image volumes acquired from different imaging modalities (e.g. MR, CT) based on 3D scale-invariant image features. A novel method for encoding invariant feature geometry and appearance is developed, based on the assumption of locally linear intensity relationships, providing a solution to poor repeatability of feature detection in different image modalities. The encoding method is incorporated into a probabilistic feature-based model for multi-modal image alignment. The model parameters are estimated via a group-wise alignment algorithm, that iteratively alternates between estimating a feature-based model from feature data, then realigning feature data to the model, converging to a stable alignment solution with few pre-processing or pre-alignment requirements. The resulting model can be used to align multi-modal image data with the benefits of invariant feature correspondence: globally optimal solutions, high efficiency and low memory usage. The method is tested on the difficult RIRE data set of CT, T1, T2, PD and MP-RAGE brain images of subjects exhibiting significant inter-subject variability due to pathology. PMID:24683955

  3. Exact calculation of distributions on integers, with application to sequence alignment.

    PubMed

    Newberg, Lee A; Lawrence, Charles E

    2009-01-01

    Computational biology is replete with high-dimensional discrete prediction and inference problems. Dynamic programming recursions can be applied to several of the most important of these, including sequence alignment, RNA secondary-structure prediction, phylogenetic inference, and motif finding. In these problems, attention is frequently focused on some scalar quantity of interest, a score, such as an alignment score or the free energy of an RNA secondary structure. In many cases, score is naturally defined on integers, such as a count of the number of pairing differences between two sequence alignments, or else an integer score has been adopted for computational reasons, such as in the test of significance of motif scores. The probability distribution of the score under an appropriate probabilistic model is of interest, such as in tests of significance of motif scores, or in calculation of Bayesian confidence limits around an alignment. Here we present three algorithms for calculating the exact distribution of a score of this type; then, in the context of pairwise local sequence alignments, we apply the approach so as to find the alignment score distribution and Bayesian confidence limits.

  4. CUDA compatible GPU cards as efficient hardware accelerators for Smith-Waterman sequence alignment

    PubMed Central

    Manavski, Svetlin A; Valle, Giorgio

    2008-01-01

    Background Searching for similarities in protein and DNA databases has become a routine procedure in Molecular Biology. The Smith-Waterman algorithm has been available for more than 25 years. It is based on a dynamic programming approach that explores all the possible alignments between two sequences; as a result it returns the optimal local alignment. Unfortunately, the computational cost is very high, requiring a number of operations proportional to the product of the length of two sequences. Furthermore, the exponential growth of protein and DNA databases makes the Smith-Waterman algorithm unrealistic for searching similarities in large sets of sequences. For these reasons heuristic approaches such as those implemented in FASTA and BLAST tend to be preferred, allowing faster execution times at the cost of reduced sensitivity. The main motivation of our work is to exploit the huge computational power of commonly available graphic cards, to develop high performance solutions for sequence alignment. Results In this paper we present what we believe is the fastest solution of the exact Smith-Waterman algorithm running on commodity hardware. It is implemented in the recently released CUDA programming environment by NVidia. CUDA allows direct access to the hardware primitives of the last-generation Graphics Processing Units (GPU) G80. Speeds of more than 3.5 GCUPS (Giga Cell Updates Per Second) are achieved on a workstation running two GeForce 8800 GTX. Exhaustive tests have been done to compare our implementation to SSEARCH and BLAST, running on a 3 GHz Intel Pentium IV processor. Our solution was also compared to a recently published GPU implementation and to a Single Instruction Multiple Data (SIMD) solution. These tests show that our implementation performs from 2 to 30 times faster than any other previous attempt available on commodity hardware. Conclusions The results show that graphic cards are now sufficiently advanced to be used as efficient hardware accelerators for sequence alignment. Their performance is better than any alternative available on commodity hardware platforms. The solution presented in this paper allows large scale alignments to be performed at low cost, using the exact Smith-Waterman algorithm instead of the largely adopted heuristic approaches. PMID:18387198

  5. A survey and evaluations of histogram-based statistics in alignment-free sequence comparison.

    PubMed

    Luczak, Brian B; James, Benjamin T; Girgis, Hani Z

    2017-12-06

    Since the dawn of the bioinformatics field, sequence alignment scores have been the main method for comparing sequences. However, alignment algorithms are quadratic, requiring long execution time. As alternatives, scientists have developed tens of alignment-free statistics for measuring the similarity between two sequences. We surveyed tens of alignment-free k-mer statistics. Additionally, we evaluated 33 statistics and multiplicative combinations between the statistics and/or their squares. These statistics are calculated on two k-mer histograms representing two sequences. Our evaluations using global alignment scores revealed that the majority of the statistics are sensitive and capable of finding similar sequences to a query sequence. Therefore, any of these statistics can filter out dissimilar sequences quickly. Further, we observed that multiplicative combinations of the statistics are highly correlated with the identity score. Furthermore, combinations involving sequence length difference or Earth Mover's distance, which takes the length difference into account, are always among the highest correlated paired statistics with identity scores. Similarly, paired statistics including length difference or Earth Mover's distance are among the best performers in finding the K-closest sequences. Interestingly, similar performance can be obtained using histograms of shorter words, resulting in reducing the memory requirement and increasing the speed remarkably. Moreover, we found that simple single statistics are sufficient for processing next-generation sequencing reads and for applications relying on local alignment. Finally, we measured the time requirement of each statistic. The survey and the evaluations will help scientists with identifying efficient alternatives to the costly alignment algorithm, saving thousands of computational hours. The source code of the benchmarking tool is available as Supplementary Materials. © The Author 2017. Published by Oxford University Press.

  6. Node fingerprinting: an efficient heuristic for aligning biological networks.

    PubMed

    Radu, Alex; Charleston, Michael

    2014-10-01

    With the continuing increase in availability of biological data and improvements to biological models, biological network analysis has become a promising area of research. An emerging technique for the analysis of biological networks is through network alignment. Network alignment has been used to calculate genetic distance, similarities between regulatory structures, and the effect of external forces on gene expression, and to depict conditional activity of expression modules in cancer. Network alignment is algorithmically complex, and therefore we must rely on heuristics, ideally as efficient and accurate as possible. The majority of current techniques for network alignment rely on precomputed information, such as with protein sequence alignment, or on tunable network alignment parameters, which may introduce an increased computational overhead. Our presented algorithm, which we call Node Fingerprinting (NF), is appropriate for performing global pairwise network alignment without precomputation or tuning, can be fully parallelized, and is able to quickly compute an accurate alignment between two biological networks. It has performed as well as or better than existing algorithms on biological and simulated data, and with fewer computational resources. The algorithmic validation performed demonstrates the low computational resource requirements of NF.

  7. Manifold alignment with Schroedinger eigenmaps

    NASA Astrophysics Data System (ADS)

    Johnson, Juan E.; Bachmann, Charles M.; Cahill, Nathan D.

    2016-05-01

    The sun-target-sensor angle can change during aerial remote sensing. In an attempt to compensate BRDF effects in multi-angular hyperspectral images, the Semi-Supervised Manifold Alignment (SSMA) algorithm pulls data from similar classes together and pushes data from different classes apart. SSMA uses Laplacian Eigenmaps (LE) to preserve the original geometric structure of each local data set independently. In this paper, we replace LE with Spatial-Spectral Schoedinger Eigenmaps (SSSE) which was designed to be a semisupervised enhancement to the to extend the SSMA methodology and improve classification of multi-angular hyperspectral images captured over Hog Island in the Virginia Coast Reserve.

  8. GenomeVista

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Poliakov, Alexander; Couronne, Olivier

    2002-11-04

    Aligning large vertebrate genomes that are structurally complex poses a variety of problems not encountered on smaller scales. Such genomes are rich in repetitive elements and contain multiple segmental duplications, which increases the difficulty of identifying true orthologous SNA segments in alignments. The sizes of the sequences make many alignment algorithms designed for comparing single proteins extremely inefficient when processing large genomic intervals. We integrated both local and global alignment tools and developed a suite of programs for automatically aligning large vertebrate genomes and identifying conserved non-coding regions in the alignments. Our method uses the BLAT local alignment program tomore » find anchors on the base genome to identify regions of possible homology for a query sequence. These regions are postprocessed to find the best candidates which are then globally aligned using the AVID global alignment program. In the last step conserved non-coding segments are identified using VISTA. Our methods are fast and the resulting alignments exhibit a high degree of sensitivity, covering more than 90% of known coding exons in the human genome. The GenomeVISTA software is a suite of Perl programs that is built on a MySQL database platform. The scheduler gets control data from the database, builds a queve of jobs, and dispatches them to a PC cluster for execution. The main program, running on each node of the cluster, processes individual sequences. A Perl library acts as an interface between the database and the above programs. The use of a separate library allows the programs to function independently of the database schema. The library also improves on the standard Perl MySQL database interfere package by providing auto-reconnect functionality and improved error handling.« less

  9. AntiClustal: Multiple Sequence Alignment by antipole clustering and linear approximate 1-median computation.

    PubMed

    Di Pietro, C; Di Pietro, V; Emmanuele, G; Ferro, A; Maugeri, T; Modica, E; Pigola, G; Pulvirenti, A; Purrello, M; Ragusa, M; Scalia, M; Shasha, D; Travali, S; Zimmitti, V

    2003-01-01

    In this paper we present a new Multiple Sequence Alignment (MSA) algorithm called AntiClusAl. The method makes use of the commonly use idea of aligning homologous sequences belonging to classes generated by some clustering algorithm, and then continue the alignment process ina bottom-up way along a suitable tree structure. The final result is then read at the root of the tree. Multiple sequence alignment in each cluster makes use of the progressive alignment with the 1-median (center) of the cluster. The 1-median of set S of sequences is the element of S which minimizes the average distance from any other sequence in S. Its exact computation requires quadratic time. The basic idea of our proposed algorithm is to make use of a simple and natural algorithmic technique based on randomized tournaments which has been successfully applied to large size search problems in general metric spaces. In particular a clustering algorithm called Antipole tree and an approximate linear 1-median computation are used. Our algorithm compared with Clustal W, a widely used tool to MSA, shows a better running time results with fully comparable alignment quality. A successful biological application showing high aminoacid conservation during evolution of Xenopus laevis SOD2 is also cited.

  10. The Application of the Weighted k-Partite Graph Problem to the Multiple Alignment for Metabolic Pathways.

    PubMed

    Chen, Wenbin; Hendrix, William; Samatova, Nagiza F

    2017-12-01

    The problem of aligning multiple metabolic pathways is one of very challenging problems in computational biology. A metabolic pathway consists of three types of entities: reactions, compounds, and enzymes. Based on similarities between enzymes, Tohsato et al. gave an algorithm for aligning multiple metabolic pathways. However, the algorithm given by Tohsato et al. neglects the similarities among reactions, compounds, enzymes, and pathway topology. How to design algorithms for the alignment problem of multiple metabolic pathways based on the similarity of reactions, compounds, and enzymes? It is a difficult computational problem. In this article, we propose an algorithm for the problem of aligning multiple metabolic pathways based on the similarities among reactions, compounds, enzymes, and pathway topology. First, we compute a weight between each pair of like entities in different input pathways based on the entities' similarity score and topological structure using Ay et al.'s methods. We then construct a weighted k-partite graph for the reactions, compounds, and enzymes. We extract a mapping between these entities by solving the maximum-weighted k-partite matching problem by applying a novel heuristic algorithm. By analyzing the alignment results of multiple pathways in different organisms, we show that the alignments found by our algorithm correctly identify common subnetworks among multiple pathways.

  11. Evaluation of mathematical algorithms for automatic patient alignment in radiosurgery.

    PubMed

    Williams, Kenneth M; Schulte, Reinhard W; Schubert, Keith E; Wroe, Andrew J

    2015-06-01

    Image registration techniques based on anatomical features can serve to automate patient alignment for intracranial radiosurgery procedures in an effort to improve the accuracy and efficiency of the alignment process as well as potentially eliminate the need for implanted fiducial markers. To explore this option, four two-dimensional (2D) image registration algorithms were analyzed: the phase correlation technique, mutual information (MI) maximization, enhanced correlation coefficient (ECC) maximization, and the iterative closest point (ICP) algorithm. Digitally reconstructed radiographs from the treatment planning computed tomography scan of a human skull were used as the reference images, while orthogonal digital x-ray images taken in the treatment room were used as the captured images to be aligned. The accuracy of aligning the skull with each algorithm was compared to the alignment of the currently practiced procedure, which is based on a manual process of selecting common landmarks, including implanted fiducials and anatomical skull features. Of the four algorithms, three (phase correlation, MI maximization, and ECC maximization) demonstrated clinically adequate (ie, comparable to the standard alignment technique) translational accuracy and improvements in speed compared to the interactive, user-guided technique; however, the ICP algorithm failed to give clinically acceptable results. The results of this work suggest that a combination of different algorithms may provide the best registration results. This research serves as the initial groundwork for the translation of automated, anatomy-based 2D algorithms into a real-world system for 2D-to-2D image registration and alignment for intracranial radiosurgery. This may obviate the need for invasive implantation of fiducial markers into the skull and may improve treatment room efficiency and accuracy. © The Author(s) 2014.

  12. A Novel Center Star Multiple Sequence Alignment Algorithm Based on Affine Gap Penalty and K-Band

    NASA Astrophysics Data System (ADS)

    Zou, Quan; Shan, Xiao; Jiang, Yi

    Multiple sequence alignment is one of the most important topics in computational biology, but it cannot deal with the large data so far. As the development of copy-number variant(CNV) and Single Nucleotide Polymorphisms(SNP) research, many researchers want to align numbers of similar sequences for detecting CNV and SNP. In this paper, we propose a novel multiple sequence alignment algorithm based on affine gap penalty and k-band. It can align more quickly and accurately, that will be helpful for mining CNV and SNP. Experiments prove the performance of our algorithm.

  13. Consistency functional map propagation for repetitive patterns

    NASA Astrophysics Data System (ADS)

    Wang, Hao

    2017-09-01

    Repetitive patterns appear frequently in both man-made and natural environments. Automatically and robustly detecting such patterns from an image is a challenging problem. We study repetitive pattern alignment by embedding segmentation cue with a functional map model. However, this model cannot tackle the repetitive patterns directly due to the large photometric and geometric variations. Thus, a consistency functional map propagation (CFMP) algorithm that extends the functional map with dynamic propagation is proposed to address this issue. This propagation model is acquired in two steps. The first one aligns the patterns from a local region, transferring segmentation functions among patterns. It can be cast as an L norm optimization problem. The latter step updates the template segmentation for the next round of pattern discovery by merging the transferred segmentation functions. Extensive experiments and comparative analyses have demonstrated an encouraging performance of the proposed algorithm in detection and segmentation of repetitive patterns.

  14. A Local Fast Marching-Based Diffusion Tensor Image Registration Algorithm by Simultaneously Considering Spatial Deformation and Tensor Orientation

    PubMed Central

    Xue, Zhong; Li, Hai; Guo, Lei; Wong, Stephen T.C.

    2010-01-01

    It is a key step to spatially align diffusion tensor images (DTI) to quantitatively compare neural images obtained from different subjects or the same subject at different timepoints. Different from traditional scalar or multi-channel image registration methods, tensor orientation should be considered in DTI registration. Recently, several DTI registration methods have been proposed in the literature, but deformation fields are purely dependent on the tensor features not the whole tensor information. Other methods, such as the piece-wise affine transformation and the diffeomorphic non-linear registration algorithms, use analytical gradients of the registration objective functions by simultaneously considering the reorientation and deformation of tensors during the registration. However, only relatively local tensor information such as voxel-wise tensor-similarity, is utilized. This paper proposes a new DTI image registration algorithm, called local fast marching (FM)-based simultaneous registration. The algorithm not only considers the orientation of tensors during registration but also utilizes the neighborhood tensor information of each voxel to drive the deformation, and such neighborhood tensor information is extracted from a local fast marching algorithm around the voxels of interest. These local fast marching-based tensor features efficiently reflect the diffusion patterns around each voxel within a spherical neighborhood and can capture relatively distinctive features of the anatomical structures. Using simulated and real DTI human brain data the experimental results show that the proposed algorithm is more accurate compared with the FA-based registration and is more efficient than its counterpart, the neighborhood tensor similarity-based registration. PMID:20382233

  15. SU-E-J-218: Novel Validation Paradigm of MRI to CT Deformation of Prostate

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Padgett, K; University of Miami School of Medicine - Radiology, Miami, FL; Pirozzi, S

    2015-06-15

    Purpose: Deformable registration algorithms are inherently difficult to characterize in the multi-modality setting due to a significant differences in the characteristics of the different modalities (CT and MRI) as well as tissue deformations. We present a unique paradigm where this is overcome by utilizing a planning-MRI acquired within an hour of the planning-CT serving as a surrogate for quantifying MRI to CT deformation by eliminating the issues of multi-modality comparisons. Methods: For nine subjects, T2 fast-spin-echo images were acquired at two different time points, the first several weeks prior to planning (diagnostic-MRI) and the second on the same day asmore » the planning-CT (planning-MRI). Significant effort in patient positioning and bowel/bladder preparation was undertaken to minimize distortion of the prostate in all datasets. The diagnostic-MRI was rigidly and deformably aligned to the planning-CT utilizing a commercially available deformable registration algorithm synthesized from local registrations. Additionally, the quality of rigid alignment was ranked by an imaging physicist. The distances between corresponding anatomical landmarks on rigid and deformed registrations (diagnostic-MR to planning-CT) were evaluated. Results: It was discovered that in cases where the rigid registration was of acceptable quality the deformable registration didn’t improve the alignment, this was true of all metrics employed. If the analysis is separated into cases where the rigid alignment was ranked as unacceptable the deformable registration significantly improved the alignment, 4.62mm residual error in landmarks as compared to 5.72mm residual error in rigid alignments with a p-value of 0.0008. Conclusion: This paradigm provides an ideal testing ground for MR to CT deformable registration algorithms by allowing for inter-modality comparisons of multi-modality registrations. Consistent positioning, bowel and bladder preparation may Result in higher quality rigid registrations than typically achieved which limits the impact of deformable registrations. In this study cases where significant differences exist, deformable registrations provide significant value.« less

  16. Development of an embedded instrument for autofocus and polarization alignment of polarization maintaining fiber

    NASA Astrophysics Data System (ADS)

    Feng, Di; Fang, Qimeng; Huang, Huaibo; Zhao, Zhengqi; Song, Ningfang

    2017-12-01

    The development and implementation of a practical instrument based on an embedded technique for autofocus and polarization alignment of polarization maintaining fiber is presented. For focusing efficiency and stability, an image-based focusing algorithm fully considering the image definition evaluation and the focusing search strategy was used to accomplish autofocus. For improving the alignment accuracy, various image-based algorithms of alignment detection were developed with high calculation speed and strong robustness. The instrument can be operated as a standalone device with real-time processing and convenience operations. The hardware construction, software interface, and image-based algorithms of main modules are described. Additionally, several image simulation experiments were also carried out to analyze the accuracy of the above alignment detection algorithms. Both the simulation results and experiment results indicate that the instrument can achieve the accuracy of polarization alignment <±0.1 deg.

  17. A novel approach to multiple sequence alignment using hadoop data grids.

    PubMed

    Sudha Sadasivam, G; Baktavatchalam, G

    2010-01-01

    Multiple alignment of protein sequences helps to determine evolutionary linkage and to predict molecular structures. The factors to be considered while aligning multiple sequences are speed and accuracy of alignment. Although dynamic programming algorithms produce accurate alignments, they are computation intensive. In this paper we propose a time efficient approach to sequence alignment that also produces quality alignment. The dynamic nature of the algorithm coupled with data and computational parallelism of hadoop data grids improves the accuracy and speed of sequence alignment. The principle of block splitting in hadoop coupled with its scalability facilitates alignment of very large sequences.

  18. Vertebra identification using template matching modelmp and K-means clustering.

    PubMed

    Larhmam, Mohamed Amine; Benjelloun, Mohammed; Mahmoudi, Saïd

    2014-03-01

    Accurate vertebra detection and segmentation are essential steps for automating the diagnosis of spinal disorders. This study is dedicated to vertebra alignment measurement, the first step in a computer-aided diagnosis tool for cervical spine trauma. Automated vertebral segment alignment determination is a challenging task due to low contrast imaging and noise. A software tool for segmenting vertebrae and detecting subluxations has clinical significance. A robust method was developed and tested for cervical vertebra identification and segmentation that extracts parameters used for vertebra alignment measurement. Our contribution involves a novel combination of a template matching method and an unsupervised clustering algorithm. In this method, we build a geometric vertebra mean model. To achieve vertebra detection, manual selection of the region of interest is performed initially on the input image. Subsequent preprocessing is done to enhance image contrast and detect edges. Candidate vertebra localization is then carried out by using a modified generalized Hough transform (GHT). Next, an adapted cost function is used to compute local voted centers and filter boundary data. Thereafter, a K-means clustering algorithm is applied to obtain clusters distribution corresponding to the targeted vertebrae. These clusters are combined with the vote parameters to detect vertebra centers. Rigid segmentation is then carried out by using GHT parameters. Finally, cervical spine curves are extracted to measure vertebra alignment. The proposed approach was successfully applied to a set of 66 high-resolution X-ray images. Robust detection was achieved in 97.5 % of the 330 tested cervical vertebrae. An automated vertebral identification method was developed and demonstrated to be robust to noise and occlusion. This work presents a first step toward an automated computer-aided diagnosis system for cervical spine trauma detection.

  19. Automatic classification of protein structures relying on similarities between alignments

    PubMed Central

    2012-01-01

    Background Identification of protein structural cores requires isolation of sets of proteins all sharing a same subset of structural motifs. In the context of an ever growing number of available 3D protein structures, standard and automatic clustering algorithms require adaptations so as to allow for efficient identification of such sets of proteins. Results When considering a pair of 3D structures, they are stated as similar or not according to the local similarities of their matching substructures in a structural alignment. This binary relation can be represented in a graph of similarities where a node represents a 3D protein structure and an edge states that two 3D protein structures are similar. Therefore, classifying proteins into structural families can be viewed as a graph clustering task. Unfortunately, because such a graph encodes only pairwise similarity information, clustering algorithms may include in the same cluster a subset of 3D structures that do not share a common substructure. In order to overcome this drawback we first define a ternary similarity on a triple of 3D structures as a constraint to be satisfied by the graph of similarities. Such a ternary constraint takes into account similarities between pairwise alignments, so as to ensure that the three involved protein structures do have some common substructure. We propose hereunder a modification algorithm that eliminates edges from the original graph of similarities and gives a reduced graph in which no ternary constraints are violated. Our approach is then first to build a graph of similarities, then to reduce the graph according to the modification algorithm, and finally to apply to the reduced graph a standard graph clustering algorithm. Such method was used for classifying ASTRAL-40 non-redundant protein domains, identifying significant pairwise similarities with Yakusa, a program devised for rapid 3D structure alignments. Conclusions We show that filtering similarities prior to standard graph based clustering process by applying ternary similarity constraints i) improves the separation of proteins of different classes and consequently ii) improves the classification quality of standard graph based clustering algorithms according to the reference classification SCOP. PMID:22974051

  20. A New Continuous Rotation IMU Alignment Algorithm Based on Stochastic Modeling for Cost Effective North-Finding Applications

    PubMed Central

    Li, Yun; Wu, Wenqi; Jiang, Qingan; Wang, Jinling

    2016-01-01

    Based on stochastic modeling of Coriolis vibration gyros by the Allan variance technique, this paper discusses Angle Random Walk (ARW), Rate Random Walk (RRW) and Markov process gyroscope noises which have significant impacts on the North-finding accuracy. A new continuous rotation alignment algorithm for a Coriolis vibration gyroscope Inertial Measurement Unit (IMU) is proposed in this paper, in which the extended observation equations are used for the Kalman filter to enhance the estimation of gyro drift errors, thus improving the north-finding accuracy. Theoretical and numerical comparisons between the proposed algorithm and the traditional ones are presented. The experimental results show that the new continuous rotation alignment algorithm using the extended observation equations in the Kalman filter is more efficient than the traditional two-position alignment method. Using Coriolis vibration gyros with bias instability of 0.1°/h, a north-finding accuracy of 0.1° (1σ) is achieved by the new continuous rotation alignment algorithm, compared with 0.6° (1σ) north-finding accuracy for the two-position alignment and 1° (1σ) for the fixed-position alignment. PMID:27983585

  1. Alignment of cryo-EM movies of individual particles by optimization of image translations.

    PubMed

    Rubinstein, John L; Brubaker, Marcus A

    2015-11-01

    Direct detector device (DDD) cameras have revolutionized single particle electron cryomicroscopy (cryo-EM). In addition to an improved camera detective quantum efficiency, acquisition of DDD movies allows for correction of movement of the specimen, due to both instabilities in the microscope specimen stage and electron beam-induced movement. Unlike specimen stage drift, beam-induced movement is not always homogeneous within an image. Local correlation in the trajectories of nearby particles suggests that beam-induced motion is due to deformation of the ice layer. Algorithms have already been described that can correct movement for large regions of frames and for >1 MDa protein particles. Another algorithm allows individual <1 MDa protein particle trajectories to be estimated, but requires rolling averages to be calculated from frames and fits linear trajectories for particles. Here we describe an algorithm that allows for individual <1 MDa particle images to be aligned without frame averaging or linear trajectories. The algorithm maximizes the overall correlation of the shifted frames with the sum of the shifted frames. The optimum in this single objective function is found efficiently by making use of analytically calculated derivatives of the function. To smooth estimates of particle trajectories, rapid changes in particle positions between frames are penalized in the objective function and weighted averaging of nearby trajectories ensures local correlation in trajectories. This individual particle motion correction, in combination with weighting of Fourier components to account for increasing radiation damage in later frames, can be used to improve 3-D maps from single particle cryo-EM. Copyright © 2015 Elsevier Inc. All rights reserved.

  2. Ontology Alignment Repair through Modularization and Confidence-Based Heuristics

    PubMed Central

    Santos, Emanuel; Faria, Daniel; Pesquita, Catia; Couto, Francisco M.

    2015-01-01

    Ontology Matching aims at identifying a set of semantic correspondences, called an alignment, between related ontologies. In recent years, there has been a growing interest in efficient and effective matching methods for large ontologies. However, alignments produced for large ontologies are often logically incoherent. It was only recently that the use of repair techniques to improve the coherence of ontology alignments began to be explored. This paper presents a novel modularization technique for ontology alignment repair which extracts fragments of the input ontologies that only contain the necessary classes and relations to resolve all detectable incoherences. The paper presents also an alignment repair algorithm that uses a global repair strategy to minimize both the degree of incoherence and the number of mappings removed from the alignment, while overcoming the scalability problem by employing the proposed modularization technique. Our evaluation shows that our modularization technique produces significantly small fragments of the ontologies and that our repair algorithm produces more complete alignments than other current alignment repair systems, while obtaining an equivalent degree of incoherence. Additionally, we also present a variant of our repair algorithm that makes use of the confidence values of the mappings to improve alignment repair. Our repair algorithm was implemented as part of AgreementMakerLight, a free and open-source ontology matching system. PMID:26710335

  3. Ontology Alignment Repair through Modularization and Confidence-Based Heuristics.

    PubMed

    Santos, Emanuel; Faria, Daniel; Pesquita, Catia; Couto, Francisco M

    2015-01-01

    Ontology Matching aims at identifying a set of semantic correspondences, called an alignment, between related ontologies. In recent years, there has been a growing interest in efficient and effective matching methods for large ontologies. However, alignments produced for large ontologies are often logically incoherent. It was only recently that the use of repair techniques to improve the coherence of ontology alignments began to be explored. This paper presents a novel modularization technique for ontology alignment repair which extracts fragments of the input ontologies that only contain the necessary classes and relations to resolve all detectable incoherences. The paper presents also an alignment repair algorithm that uses a global repair strategy to minimize both the degree of incoherence and the number of mappings removed from the alignment, while overcoming the scalability problem by employing the proposed modularization technique. Our evaluation shows that our modularization technique produces significantly small fragments of the ontologies and that our repair algorithm produces more complete alignments than other current alignment repair systems, while obtaining an equivalent degree of incoherence. Additionally, we also present a variant of our repair algorithm that makes use of the confidence values of the mappings to improve alignment repair. Our repair algorithm was implemented as part of AgreementMakerLight, a free and open-source ontology matching system.

  4. mTM-align: a server for fast protein structure database search and multiple protein structure alignment.

    PubMed

    Dong, Runze; Pan, Shuo; Peng, Zhenling; Zhang, Yang; Yang, Jianyi

    2018-05-21

    With the rapid increase of the number of protein structures in the Protein Data Bank, it becomes urgent to develop algorithms for efficient protein structure comparisons. In this article, we present the mTM-align server, which consists of two closely related modules: one for structure database search and the other for multiple structure alignment. The database search is speeded up based on a heuristic algorithm and a hierarchical organization of the structures in the database. The multiple structure alignment is performed using the recently developed algorithm mTM-align. Benchmark tests demonstrate that our algorithms outperform other peering methods for both modules, in terms of speed and accuracy. One of the unique features for the server is the interplay between database search and multiple structure alignment. The server provides service not only for performing fast database search, but also for making accurate multiple structure alignment with the structures found by the search. For the database search, it takes about 2-5 min for a structure of a medium size (∼300 residues). For the multiple structure alignment, it takes a few seconds for ∼10 structures of medium sizes. The server is freely available at: http://yanglab.nankai.edu.cn/mTM-align/.

  5. Feature Based Retention Time Alignment for Improved HDX MS Analysis

    NASA Astrophysics Data System (ADS)

    Venable, John D.; Scuba, William; Brock, Ansgar

    2013-04-01

    An algorithm for retention time alignment of mass shifted hydrogen-deuterium exchange (HDX) data based on an iterative distance minimization procedure is described. The algorithm performs pairwise comparisons in an iterative fashion between a list of features from a reference file and a file to be time aligned to calculate a retention time mapping function. Features are characterized by their charge, retention time and mass of the monoisotopic peak. The algorithm is able to align datasets with mass shifted features, which is a prerequisite for aligning hydrogen-deuterium exchange mass spectrometry datasets. Confidence assignments from the fully automated processing of a commercial HDX software package are shown to benefit significantly from retention time alignment prior to extraction of deuterium incorporation values.

  6. Simultaneous phylogeny reconstruction and multiple sequence alignment

    PubMed Central

    Yue, Feng; Shi, Jian; Tang, Jijun

    2009-01-01

    Background A phylogeny is the evolutionary history of a group of organisms. To date, sequence data is still the most used data type for phylogenetic reconstruction. Before any sequences can be used for phylogeny reconstruction, they must be aligned, and the quality of the multiple sequence alignment has been shown to affect the quality of the inferred phylogeny. At the same time, all the current multiple sequence alignment programs use a guide tree to produce the alignment and experiments showed that good guide trees can significantly improve the multiple alignment quality. Results We devise a new algorithm to simultaneously align multiple sequences and search for the phylogenetic tree that leads to the best alignment. We also implemented the algorithm as a C program package, which can handle both DNA and protein data and can take simple cost model as well as complex substitution matrices, such as PAM250 or BLOSUM62. The performance of the new method are compared with those from other popular multiple sequence alignment tools, including the widely used programs such as ClustalW and T-Coffee. Experimental results suggest that this method has good performance in terms of both phylogeny accuracy and alignment quality. Conclusion We present an algorithm to align multiple sequences and reconstruct the phylogenies that minimize the alignment score, which is based on an efficient algorithm to solve the median problems for three sequences. Our extensive experiments suggest that this method is very promising and can produce high quality phylogenies and alignments. PMID:19208110

  7. Matt: local flexibility aids protein multiple structure alignment.

    PubMed

    Menke, Matthew; Berger, Bonnie; Cowen, Lenore

    2008-01-01

    Even when there is agreement on what measure a protein multiple structure alignment should be optimizing, finding the optimal alignment is computationally prohibitive. One approach used by many previous methods is aligned fragment pair chaining, where short structural fragments from all the proteins are aligned against each other optimally, and the final alignment chains these together in geometrically consistent ways. Ye and Godzik have recently suggested that adding geometric flexibility may help better model protein structures in a variety of contexts. We introduce the program Matt (Multiple Alignment with Translations and Twists), an aligned fragment pair chaining algorithm that, in intermediate steps, allows local flexibility between fragments: small translations and rotations are temporarily allowed to bring sets of aligned fragments closer, even if they are physically impossible under rigid body transformations. After a dynamic programming assembly guided by these "bent" alignments, geometric consistency is restored in the final step before the alignment is output. Matt is tested against other recent multiple protein structure alignment programs on the popular Homstrad and SABmark benchmark datasets. Matt's global performance is competitive with the other programs on Homstrad, but outperforms the other programs on SABmark, a benchmark of multiple structure alignments of proteins with more distant homology. On both datasets, Matt demonstrates an ability to better align the ends of alpha-helices and beta-strands, an important characteristic of any structure alignment program intended to help construct a structural template library for threading approaches to the inverse protein-folding problem. The related question of whether Matt alignments can be used to distinguish distantly homologous structure pairs from pairs of proteins that are not homologous is also considered. For this purpose, a p-value score based on the length of the common core and average root mean squared deviation (RMSD) of Matt alignments is shown to largely separate decoys from homologous protein structures in the SABmark benchmark dataset. We postulate that Matt's strong performance comes from its ability to model proteins in different conformational states and, perhaps even more important, its ability to model backbone distortions in more distantly related proteins.

  8. Evaluation of methods to produce an image library for automatic patient model localization for dose mapping during fluoroscopically guided procedures

    NASA Astrophysics Data System (ADS)

    Kilian-Meneghin, Josh; Xiong, Z.; Rudin, S.; Oines, A.; Bednarek, D. R.

    2017-03-01

    The purpose of this work is to evaluate methods for producing a library of 2D-radiographic images to be correlated to clinical images obtained during a fluoroscopically-guided procedure for automated patient-model localization. The localization algorithm will be used to improve the accuracy of the skin-dose map superimposed on the 3D patient- model of the real-time Dose-Tracking-System (DTS). For the library, 2D images were generated from CT datasets of the SK-150 anthropomorphic phantom using two methods: Schmid's 3D-visualization tool and Plastimatch's digitally-reconstructed-radiograph (DRR) code. Those images, as well as a standard 2D-radiographic image, were correlated to a 2D-fluoroscopic image of a phantom, which represented the clinical-fluoroscopic image, using the Corr2 function in Matlab. The Corr2 function takes two images and outputs the relative correlation between them, which is fed into the localization algorithm. Higher correlation means better alignment of the 3D patient-model with the patient image. In this instance, it was determined that the localization algorithm will succeed when Corr2 returns a correlation of at least 50%. The 3D-visualization tool images returned 55-80% correlation relative to the fluoroscopic-image, which was comparable to the correlation for the radiograph. The DRR images returned 61-90% correlation, again comparable to the radiograph. Both methods prove to be sufficient for the localization algorithm and can be produced quickly; however, the DRR method produces more accurate grey-levels. Using the DRR code, a library at varying angles can be produced for the localization algorithm.

  9. Image stack alignment in full-field X-ray absorption spectroscopy using SIFT_PyOCL.

    PubMed

    Paleo, Pierre; Pouyet, Emeline; Kieffer, Jérôme

    2014-03-01

    Full-field X-ray absorption spectroscopy experiments allow the acquisition of millions of spectra within minutes. However, the construction of the hyperspectral image requires an image alignment procedure with sub-pixel precision. While the image correlation algorithm has originally been used for image re-alignment using translations, the Scale Invariant Feature Transform (SIFT) algorithm (which is by design robust versus rotation, illumination change, translation and scaling) presents an additional advantage: the alignment can be limited to a region of interest of any arbitrary shape. In this context, a Python module, named SIFT_PyOCL, has been developed. It implements a parallel version of the SIFT algorithm in OpenCL, providing high-speed image registration and alignment both on processors and graphics cards. The performance of the algorithm allows online processing of large datasets.

  10. Optimal Alignment of Structures for Finite and Periodic Systems.

    PubMed

    Griffiths, Matthew; Niblett, Samuel P; Wales, David J

    2017-10-10

    Finding the optimal alignment between two structures is important for identifying the minimum root-mean-square distance (RMSD) between them and as a starting point for calculating pathways. Most current algorithms for aligning structures are stochastic, scale exponentially with the size of structure, and the performance can be unreliable. We present two complementary methods for aligning structures corresponding to isolated clusters of atoms and to condensed matter described by a periodic cubic supercell. The first method (Go-PERMDIST), a branch and bound algorithm, locates the global minimum RMSD deterministically in polynomial time. The run time increases for larger RMSDs. The second method (FASTOVERLAP) is a heuristic algorithm that aligns structures by finding the global maximum kernel correlation between them using fast Fourier transforms (FFTs) and fast SO(3) transforms (SOFTs). For periodic systems, FASTOVERLAP scales with the square of the number of identical atoms in the system, reliably finds the best alignment between structures that are not too distant, and shows significantly better performance than existing algorithms. The expected run time for Go-PERMDIST is longer than FASTOVERLAP for periodic systems. For finite clusters, the FASTOVERLAP algorithm is competitive with existing algorithms. The expected run time for Go-PERMDIST to find the global RMSD between two structures deterministically is generally longer than for existing stochastic algorithms. However, with an earlier exit condition, Go-PERMDIST exhibits similar or better performance.

  11. Robust 3D face landmark localization based on local coordinate coding.

    PubMed

    Song, Mingli; Tao, Dacheng; Sun, Shengpeng; Chen, Chun; Maybank, Stephen J

    2014-12-01

    In the 3D facial animation and synthesis community, input faces are usually required to be labeled by a set of landmarks for parameterization. Because of the variations in pose, expression and resolution, automatic 3D face landmark localization remains a challenge. In this paper, a novel landmark localization approach is presented. The approach is based on local coordinate coding (LCC) and consists of two stages. In the first stage, we perform nose detection, relying on the fact that the nose shape is usually invariant under the variations in the pose, expression, and resolution. Then, we use the iterative closest points algorithm to find a 3D affine transformation that aligns the input face to a reference face. In the second stage, we perform resampling to build correspondences between the input 3D face and the training faces. Then, an LCC-based localization algorithm is proposed to obtain the positions of the landmarks in the input face. Experimental results show that the proposed method is comparable to state of the art methods in terms of its robustness, flexibility, and accuracy.

  12. Evaluation of Eight Methods for Aligning Orientation of Two Coordinate Systems.

    PubMed

    Mecheri, Hakim; Robert-Lachaine, Xavier; Larue, Christian; Plamondon, André

    2016-08-01

    The aim of this study was to evaluate eight methods for aligning the orientation of two different local coordinate systems. Alignment is very important when combining two different systems of motion analysis. Two of the methods were developed specifically for biomechanical studies, and because there have been at least three decades of algorithm development in robotics, it was decided to include six methods from this field. To compare these methods, an Xsens sensor and two Optotrak clusters were attached to a Plexiglas plate. The first optical marker cluster was fixed on the sensor and 20 trials were recorded. The error of alignment was calculated for each trial, and the mean, the standard deviation, and the maximum values of this error over all trials were reported. One-way repeated measures analysis of variance revealed that the alignment error differed significantly across the eight methods. Post-hoc tests showed that the alignment error from the methods based on angular velocities was significantly lower than for the other methods. The method using angular velocities performed the best, with an average error of 0.17 ± 0.08 deg. We therefore recommend this method, which is easy to perform and provides accurate alignment.

  13. Global Alignment of Pairwise Protein Interaction Networks for Maximal Common Conserved Patterns

    DOE PAGES

    Tian, Wenhong; Samatova, Nagiza F.

    2013-01-01

    A number of tools for the alignment of protein-protein interaction (PPI) networks have laid the foundation for PPI network analysis. Most of alignment tools focus on finding conserved interaction regions across the PPI networks through either local or global mapping of similar sequences. Researchers are still trying to improve the speed, scalability, and accuracy of network alignment. In view of this, we introduce a connected-components based fast algorithm, HopeMap, for network alignment. Observing that the size of true orthologs across species is small comparing to the total number of proteins in all species, we take a different approach based onmore » a precompiled list of homologs identified by KO terms. Applying this approach to S. cerevisiae (yeast) and D. melanogaster (fly), E. coli K12 and S. typhimurium , E. coli K12 and C. crescenttus , we analyze all clusters identified in the alignment. The results are evaluated through up-to-date known gene annotations, gene ontology (GO), and KEGG ortholog groups (KO). Comparing to existing tools, our approach is fast with linear computational cost, highly accurate in terms of KO and GO terms specificity and sensitivity, and can be extended to multiple alignments easily.« less

  14. Quantification of Cardiomyocyte Alignment from Three-Dimensional (3D) Confocal Microscopy of Engineered Tissue.

    PubMed

    Kowalski, William J; Yuan, Fangping; Nakane, Takeichiro; Masumoto, Hidetoshi; Dwenger, Marc; Ye, Fei; Tinney, Joseph P; Keller, Bradley B

    2017-08-01

    Biological tissues have complex, three-dimensional (3D) organizations of cells and matrix factors that provide the architecture necessary to meet morphogenic and functional demands. Disordered cell alignment is associated with congenital heart disease, cardiomyopathy, and neurodegenerative diseases and repairing or replacing these tissues using engineered constructs may improve regenerative capacity. However, optimizing cell alignment within engineered tissues requires quantitative 3D data on cell orientations and both efficient and validated processing algorithms. We developed an automated method to measure local 3D orientations based on structure tensor analysis and incorporated an adaptive subregion size to account for multiple scales. Our method calculates the statistical concentration parameter, κ, to quantify alignment, as well as the traditional orientational order parameter. We validated our method using synthetic images and accurately measured principal axis and concentration. We then applied our method to confocal stacks of cleared, whole-mount engineered cardiac tissues generated from human-induced pluripotent stem cells or embryonic chick cardiac cells and quantified cardiomyocyte alignment. We found significant differences in alignment based on cellular composition and tissue geometry. These results from our synthetic images and confocal data demonstrate the efficiency and accuracy of our method to measure alignment in 3D tissues.

  15. Pairwise Sequence Alignment Library

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jeff Daily, PNNL

    2015-05-20

    Vector extensions, such as SSE, have been part of the x86 CPU since the 1990s, with applications in graphics, signal processing, and scientific applications. Although many algorithms and applications can naturally benefit from automatic vectorization techniques, there are still many that are difficult to vectorize due to their dependence on irregular data structures, dense branch operations, or data dependencies. Sequence alignment, one of the most widely used operations in bioinformatics workflows, has a computational footprint that features complex data dependencies. The trend of widening vector registers adversely affects the state-of-the-art sequence alignment algorithm based on striped data layouts. Therefore, amore » novel SIMD implementation of a parallel scan-based sequence alignment algorithm that can better exploit wider SIMD units was implemented as part of the Parallel Sequence Alignment Library (parasail). Parasail features: Reference implementations of all known vectorized sequence alignment approaches. Implementations of Smith Waterman (SW), semi-global (SG), and Needleman Wunsch (NW) sequence alignment algorithms. Implementations across all modern CPU instruction sets including AVX2 and KNC. Language interfaces for C/C++ and Python.« less

  16. Differential evolution-simulated annealing for multiple sequence alignment

    NASA Astrophysics Data System (ADS)

    Addawe, R. C.; Addawe, J. M.; Sueño, M. R. K.; Magadia, J. C.

    2017-10-01

    Multiple sequence alignments (MSA) are used in the analysis of molecular evolution and sequence structure relationships. In this paper, a hybrid algorithm, Differential Evolution - Simulated Annealing (DESA) is applied in optimizing multiple sequence alignments (MSAs) based on structural information, non-gaps percentage and totally conserved columns. DESA is a robust algorithm characterized by self-organization, mutation, crossover, and SA-like selection scheme of the strategy parameters. Here, the MSA problem is treated as a multi-objective optimization problem of the hybrid evolutionary algorithm, DESA. Thus, we name the algorithm as DESA-MSA. Simulated sequences and alignments were generated to evaluate the accuracy and efficiency of DESA-MSA using different indel sizes, sequence lengths, deletion rates and insertion rates. The proposed hybrid algorithm obtained acceptable solutions particularly for the MSA problem evaluated based on the three objectives.

  17. High-speed peak matching algorithm for retention time alignment of gas chromatographic data for chemometric analysis.

    PubMed

    Johnson, Kevin J; Wright, Bob W; Jarman, Kristin H; Synovec, Robert E

    2003-05-09

    A rapid retention time alignment algorithm was developed as a preprocessing utility to be used prior to chemometric analysis of large datasets of diesel fuel profiles obtained using gas chromatography (GC). Retention time variation from chromatogram-to-chromatogram has been a significant impediment against the use of chemometric techniques in the analysis of chromatographic data due to the inability of current chemometric techniques to correctly model information that shifts from variable to variable within a dataset. The alignment algorithm developed is shown to increase the efficacy of pattern recognition methods applied to diesel fuel chromatograms by retaining chemical selectivity while reducing chromatogram-to-chromatogram retention time variations and to do so on a time scale that makes analysis of large sets of chromatographic data practical. Two sets of diesel fuel gas chromatograms were studied using the novel alignment algorithm followed by principal component analysis (PCA). In the first study, retention times for corresponding chromatographic peaks in 60 chromatograms varied by as much as 300 ms between chromatograms before alignment. In the second study of 42 chromatograms, the retention time shifting exhibited was on the order of 10 s between corresponding chromatographic peaks, and required a coarse retention time correction prior to alignment with the algorithm. In both cases, an increase in retention time precision afforded by the algorithm was clearly visible in plots of overlaid chromatograms before and then after applying the retention time alignment algorithm. Using the alignment algorithm, the standard deviation for corresponding peak retention times following alignment was 17 ms throughout a given chromatogram, corresponding to a relative standard deviation of 0.003% at an average retention time of 8 min. This level of retention time precision is a 5-fold improvement over the retention time precision initially provided by a state-of-the-art GC instrument equipped with electronic pressure control and was critical to the performance of the chemometric analysis. This increase in retention time precision does not come at the expense of chemical selectivity, since the PCA results suggest that essentially all of the chemical selectivity is preserved. Cluster resolution between dissimilar groups of diesel fuel chromatograms in a two-dimensional scores space generated with PCA is shown to substantially increase after alignment. The alignment method is robust against missing or extra peaks relative to a target chromatogram used in the alignment, and operates at high speed, requiring roughly 1 s of computation time per GC chromatogram.

  18. MSuPDA: A Memory Efficient Algorithm for Sequence Alignment.

    PubMed

    Khan, Mohammad Ibrahim; Kamal, Md Sarwar; Chowdhury, Linkon

    2016-03-01

    Space complexity is a million dollar question in DNA sequence alignments. In this regard, memory saving under pushdown automata can help to reduce the occupied spaces in computer memory. Our proposed process is that anchor seed (AS) will be selected from given data set of nucleotide base pairs for local sequence alignment. Quick splitting techniques will separate the AS from all the DNA genome segments. Selected AS will be placed to pushdown automata's (PDA) input unit. Whole DNA genome segments will be placed into PDA's stack. AS from input unit will be matched with the DNA genome segments from stack of PDA. Match, mismatch and indel of nucleotides will be popped from the stack under the control unit of pushdown automata. During the POP operation on stack, it will free the memory cell occupied by the nucleotide base pair.

  19. Improvements on a privacy-protection algorithm for DNA sequences with generalization lattices.

    PubMed

    Li, Guang; Wang, Yadong; Su, Xiaohong

    2012-10-01

    When developing personal DNA databases, there must be an appropriate guarantee of anonymity, which means that the data cannot be related back to individuals. DNA lattice anonymization (DNALA) is a successful method for making personal DNA sequences anonymous. However, it uses time-consuming multiple sequence alignment and a low-accuracy greedy clustering algorithm. Furthermore, DNALA is not an online algorithm, and so it cannot quickly return results when the database is updated. This study improves the DNALA method. Specifically, we replaced the multiple sequence alignment in DNALA with global pairwise sequence alignment to save time, and we designed a hybrid clustering algorithm comprised of a maximum weight matching (MWM)-based algorithm and an online algorithm. The MWM-based algorithm is more accurate than the greedy algorithm in DNALA and has the same time complexity. The online algorithm can process data quickly when the database is updated. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.

  20. A Comprehensive Two-Dimensional Retention Time Alignment Algorithm To Enhance Chemometric Analysis of Comprehensive Two-Dimensional Separation Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pierce, Karisa M.; Wood, Lianna F.; Wright, Bob W.

    2005-12-01

    A comprehensive two-dimensional (2D) retention time alignment algorithm was developed using a novel indexing scheme. The algorithm is termed comprehensive because it functions to correct the entire chromatogram in both dimensions and it preserves the separation information in both dimensions. Although the algorithm is demonstrated by correcting comprehensive two-dimensional gas chromatography (GC x GC) data, the algorithm is designed to correct shifting in all forms of 2D separations, such as LC x LC, LC x CE, CE x CE, and LC x GC. This 2D alignment algorithm was applied to three different data sets composed of replicate GC x GCmore » separations of (1) three 22-component control mixtures, (2) three gasoline samples, and (3) three diesel samples. The three data sets were collected using slightly different temperature or pressure programs to engender significant retention time shifting in the raw data and then demonstrate subsequent corrections of that shifting upon comprehensive 2D alignment of the data sets. Thirty 12-min GC x GC separations from three 22-component control mixtures were used to evaluate the 2D alignment performance (10 runs/mixture). The average standard deviation of the first column retention time improved 5-fold from 0.020 min (before alignment) to 0.004 min (after alignment). Concurrently, the average standard deviation of second column retention time improved 4-fold from 3.5 ms (before alignment) to 0.8 ms (after alignment). Alignment of the 30 control mixture chromatograms took 20 min. The quantitative integrity of the GC x GC data following 2D alignment was also investigated. The mean integrated signal was determined for all components in the three 22-component mixtures for all 30 replicates. The average percent difference in the integrated signal for each component before and after alignment was 2.6%. Singular value decomposition (SVD) was applied to the 22-component control mixture data before and after alignment to show the restoration of trilinearity to the data, since trilinearity benefits chemometric analysis. By applying comprehensive 2D retention time alignment to all three data sets (control mixtures, gasoline samples, and diesel samples), classification by principal component analysis (PCA) substantially improved, resulting in 100% accurate scores clustering.« less

  1. Protein alignment algorithms with an efficient backtracking routine on multiple GPUs.

    PubMed

    Blazewicz, Jacek; Frohmberg, Wojciech; Kierzynka, Michal; Pesch, Erwin; Wojciechowski, Pawel

    2011-05-20

    Pairwise sequence alignment methods are widely used in biological research. The increasing number of sequences is perceived as one of the upcoming challenges for sequence alignment methods in the nearest future. To overcome this challenge several GPU (Graphics Processing Unit) computing approaches have been proposed lately. These solutions show a great potential of a GPU platform but in most cases address the problem of sequence database scanning and computing only the alignment score whereas the alignment itself is omitted. Thus, the need arose to implement the global and semiglobal Needleman-Wunsch, and Smith-Waterman algorithms with a backtracking procedure which is needed to construct the alignment. In this paper we present the solution that performs the alignment of every given sequence pair, which is a required step for progressive multiple sequence alignment methods, as well as for DNA recognition at the DNA assembly stage. Performed tests show that the implementation, with performance up to 6.3 GCUPS on a single GPU for affine gap penalties, is very efficient in comparison to other CPU and GPU-based solutions. Moreover, multiple GPUs support with load balancing makes the application very scalable. The article shows that the backtracking procedure of the sequence alignment algorithms may be designed to fit in with the GPU architecture. Therefore, our algorithm, apart from scores, is able to compute pairwise alignments. This opens a wide range of new possibilities, allowing other methods from the area of molecular biology to take advantage of the new computational architecture. Performed tests show that the efficiency of the implementation is excellent. Moreover, the speed of our GPU-based algorithms can be almost linearly increased when using more than one graphics card.

  2. 160-fold acceleration of the Smith-Waterman algorithm using a field programmable gate array (FPGA)

    PubMed Central

    Li, Isaac TS; Shum, Warren; Truong, Kevin

    2007-01-01

    Background To infer homology and subsequently gene function, the Smith-Waterman (SW) algorithm is used to find the optimal local alignment between two sequences. When searching sequence databases that may contain hundreds of millions of sequences, this algorithm becomes computationally expensive. Results In this paper, we focused on accelerating the Smith-Waterman algorithm by using FPGA-based hardware that implemented a module for computing the score of a single cell of the SW matrix. Then using a grid of this module, the entire SW matrix was computed at the speed of field propagation through the FPGA circuit. These modifications dramatically accelerated the algorithm's computation time by up to 160 folds compared to a pure software implementation running on the same FPGA with an Altera Nios II softprocessor. Conclusion This design of FPGA accelerated hardware offers a new promising direction to seeking computation improvement of genomic database searching. PMID:17555593

  3. 160-fold acceleration of the Smith-Waterman algorithm using a field programmable gate array (FPGA).

    PubMed

    Li, Isaac T S; Shum, Warren; Truong, Kevin

    2007-06-07

    To infer homology and subsequently gene function, the Smith-Waterman (SW) algorithm is used to find the optimal local alignment between two sequences. When searching sequence databases that may contain hundreds of millions of sequences, this algorithm becomes computationally expensive. In this paper, we focused on accelerating the Smith-Waterman algorithm by using FPGA-based hardware that implemented a module for computing the score of a single cell of the SW matrix. Then using a grid of this module, the entire SW matrix was computed at the speed of field propagation through the FPGA circuit. These modifications dramatically accelerated the algorithm's computation time by up to 160 folds compared to a pure software implementation running on the same FPGA with an Altera Nios II softprocessor. This design of FPGA accelerated hardware offers a new promising direction to seeking computation improvement of genomic database searching.

  4. CAMPways: constrained alignment framework for the comparative analysis of a pair of metabolic pathways.

    PubMed

    Abaka, Gamze; Bıyıkoğlu, Türker; Erten, Cesim

    2013-07-01

    Given a pair of metabolic pathways, an alignment of the pathways corresponds to a mapping between similar substructures of the pair. Successful alignments may provide useful applications in phylogenetic tree reconstruction, drug design and overall may enhance our understanding of cellular metabolism. We consider the problem of providing one-to-many alignments of reactions in a pair of metabolic pathways. We first provide a constrained alignment framework applicable to the problem. We show that the constrained alignment problem even in a primitive setting is computationally intractable, which justifies efforts for designing efficient heuristics. We present our Constrained Alignment of Metabolic Pathways (CAMPways) algorithm designed for this purpose. Through extensive experiments involving a large pathway database, we demonstrate that when compared with a state-of-the-art alternative, the CAMPways algorithm provides better alignment results on metabolic networks as far as measures based on same-pathway inclusion and biochemical significance are concerned. The execution speed of our algorithm constitutes yet another important improvement over alternative algorithms. Open source codes, executable binary, useful scripts, all the experimental data and the results are freely available as part of the Supplementary Material at http://code.google.com/p/campways/. Supplementary data are available at Bioinformatics online.

  5. Optimal Parameter Design of Coarse Alignment for Fiber Optic Gyro Inertial Navigation System.

    PubMed

    Lu, Baofeng; Wang, Qiuying; Yu, Chunmei; Gao, Wei

    2015-06-25

    Two different coarse alignment algorithms for Fiber Optic Gyro (FOG) Inertial Navigation System (INS) based on inertial reference frame are discussed in this paper. Both of them are based on gravity vector integration, therefore, the performance of these algorithms is determined by integration time. In previous works, integration time is selected by experience. In order to give a criterion for the selection process, and make the selection of the integration time more accurate, optimal parameter design of these algorithms for FOG INS is performed in this paper. The design process is accomplished based on the analysis of the error characteristics of these two coarse alignment algorithms. Moreover, this analysis and optimal parameter design allow us to make an adequate selection of the most accurate algorithm for FOG INS according to the actual operational conditions. The analysis and simulation results show that the parameter provided by this work is the optimal value, and indicate that in different operational conditions, the coarse alignment algorithms adopted for FOG INS are different in order to achieve better performance. Lastly, the experiment results validate the effectiveness of the proposed algorithm.

  6. Iterative refinement of structure-based sequence alignments by Seed Extension

    PubMed Central

    Kim, Changhoon; Tai, Chin-Hsien; Lee, Byungkook

    2009-01-01

    Background Accurate sequence alignment is required in many bioinformatics applications but, when sequence similarity is low, it is difficult to obtain accurate alignments based on sequence similarity alone. The accuracy improves when the structures are available, but current structure-based sequence alignment procedures still mis-align substantial numbers of residues. In order to correct such errors, we previously explored the possibility of replacing the residue-based dynamic programming algorithm in structure alignment procedures with the Seed Extension algorithm, which does not use a gap penalty. Here, we describe a new procedure called RSE (Refinement with Seed Extension) that iteratively refines a structure-based sequence alignment. Results RSE uses SE (Seed Extension) in its core, which is an algorithm that we reported recently for obtaining a sequence alignment from two superimposed structures. The RSE procedure was evaluated by comparing the correctly aligned fractions of residues before and after the refinement of the structure-based sequence alignments produced by popular programs. CE, DaliLite, FAST, LOCK2, MATRAS, MATT, TM-align, SHEBA and VAST were included in this analysis and the NCBI's CDD root node set was used as the reference alignments. RSE improved the average accuracy of sequence alignments for all programs tested when no shift error was allowed. The amount of improvement varied depending on the program. The average improvements were small for DaliLite and MATRAS but about 5% for CE and VAST. More substantial improvements have been seen in many individual cases. The additional computation times required for the refinements were negligible compared to the times taken by the structure alignment programs. Conclusion RSE is a computationally inexpensive way of improving the accuracy of a structure-based sequence alignment. It can be used as a standalone procedure following a regular structure-based sequence alignment or to replace the traditional iterative refinement procedures based on residue-level dynamic programming algorithm in many structure alignment programs. PMID:19589133

  7. Aligning Biomolecular Networks Using Modular Graph Kernels

    NASA Astrophysics Data System (ADS)

    Towfic, Fadi; Greenlee, M. Heather West; Honavar, Vasant

    Comparative analysis of biomolecular networks constructed using measurements from different conditions, tissues, and organisms offer a powerful approach to understanding the structure, function, dynamics, and evolution of complex biological systems. We explore a class of algorithms for aligning large biomolecular networks by breaking down such networks into subgraphs and computing the alignment of the networks based on the alignment of their subgraphs. The resulting subnetworks are compared using graph kernels as scoring functions. We provide implementations of the resulting algorithms as part of BiNA, an open source biomolecular network alignment toolkit. Our experiments using Drosophila melanogaster, Saccharomyces cerevisiae, Mus musculus and Homo sapiens protein-protein interaction networks extracted from the DIP repository of protein-protein interaction data demonstrate that the performance of the proposed algorithms (as measured by % GO term enrichment of subnetworks identified by the alignment) is competitive with some of the state-of-the-art algorithms for pair-wise alignment of large protein-protein interaction networks. Our results also show that the inter-species similarity scores computed based on graph kernels can be used to cluster the species into a species tree that is consistent with the known phylogenetic relationships among the species.

  8. Fitting-free algorithm for efficient quantification of collagen fiber alignment in SHG imaging applications.

    PubMed

    Hall, Gunnsteinn; Liang, Wenxuan; Li, Xingde

    2017-10-01

    Collagen fiber alignment derived from second harmonic generation (SHG) microscopy images can be important for disease diagnostics. Image processing algorithms are needed to robustly quantify the alignment in images with high sensitivity and reliability. Fourier transform (FT) magnitude, 2D power spectrum, and image autocorrelation have previously been used to extract fiber information from images by assuming a certain mathematical model (e.g. Gaussian distribution of the fiber-related parameters) and fitting. The fitting process is slow and fails to converge when the data is not Gaussian. Herein we present an efficient constant-time deterministic algorithm which characterizes the symmetricity of the FT magnitude image in terms of a single parameter, named the fiber alignment anisotropy R ranging from 0 (randomized fibers) to 1 (perfect alignment). This represents an important improvement of the technology and may bring us one step closer to utilizing the technology for various applications in real time. In addition, we present a digital image phantom-based framework for characterizing and validating the algorithm, as well as assessing the robustness of the algorithm against different perturbations.

  9. Memetic algorithms for de novo motif-finding in biomedical sequences.

    PubMed

    Bi, Chengpeng

    2012-09-01

    The objectives of this study are to design and implement a new memetic algorithm for de novo motif discovery, which is then applied to detect important signals hidden in various biomedical molecular sequences. In this paper, memetic algorithms are developed and tested in de novo motif-finding problems. Several strategies in the algorithm design are employed that are to not only efficiently explore the multiple sequence local alignment space, but also effectively uncover the molecular signals. As a result, there are a number of key features in the implementation of the memetic motif-finding algorithm (MaMotif), including a chromosome replacement operator, a chromosome alteration-aware local search operator, a truncated local search strategy, and a stochastic operation of local search imposed on individual learning. To test the new algorithm, we compare MaMotif with a few of other similar algorithms using simulated and experimental data including genomic DNA, primary microRNA sequences (let-7 family), and transmembrane protein sequences. The new memetic motif-finding algorithm is successfully implemented in C++, and exhaustively tested with various simulated and real biological sequences. In the simulation, it shows that MaMotif is the most time-efficient algorithm compared with others, that is, it runs 2 times faster than the expectation maximization (EM) method and 16 times faster than the genetic algorithm-based EM hybrid. In both simulated and experimental testing, results show that the new algorithm is compared favorably or superior to other algorithms. Notably, MaMotif is able to successfully discover the transcription factors' binding sites in the chromatin immunoprecipitation followed by massively parallel sequencing (ChIP-Seq) data, correctly uncover the RNA splicing signals in gene expression, and precisely find the highly conserved helix motif in the transmembrane protein sequences, as well as rightly detect the palindromic segments in the primary microRNA sequences. The memetic motif-finding algorithm is effectively designed and implemented, and its applications demonstrate it is not only time-efficient, but also exhibits excellent performance while compared with other popular algorithms. Copyright © 2012 Elsevier B.V. All rights reserved.

  10. muBLASTP: database-indexed protein sequence search on multicore CPUs.

    PubMed

    Zhang, Jing; Misra, Sanchit; Wang, Hao; Feng, Wu-Chun

    2016-11-04

    The Basic Local Alignment Search Tool (BLAST) is a fundamental program in the life sciences that searches databases for sequences that are most similar to a query sequence. Currently, the BLAST algorithm utilizes a query-indexed approach. Although many approaches suggest that sequence search with a database index can achieve much higher throughput (e.g., BLAT, SSAHA, and CAFE), they cannot deliver the same level of sensitivity as the query-indexed BLAST, i.e., NCBI BLAST, or they can only support nucleotide sequence search, e.g., MegaBLAST. Due to different challenges and characteristics between query indexing and database indexing, the existing techniques for query-indexed search cannot be used into database indexed search. muBLASTP, a novel database-indexed BLAST for protein sequence search, delivers identical hits returned to NCBI BLAST. On Intel Haswell multicore CPUs, for a single query, the single-threaded muBLASTP achieves up to a 4.41-fold speedup for alignment stages, and up to a 1.75-fold end-to-end speedup over single-threaded NCBI BLAST. For a batch of queries, the multithreaded muBLASTP achieves up to a 5.7-fold speedups for alignment stages, and up to a 4.56-fold end-to-end speedup over multithreaded NCBI BLAST. With a newly designed index structure for protein database and associated optimizations in BLASTP algorithm, we re-factored BLASTP algorithm for modern multicore processors that achieves much higher throughput with acceptable memory footprint for the database index.

  11. Automated and Adaptable Quantification of Cellular Alignment from Microscopic Images for Tissue Engineering Applications

    PubMed Central

    Xu, Feng; Beyazoglu, Turker; Hefner, Evan; Gurkan, Umut Atakan

    2011-01-01

    Cellular alignment plays a critical role in functional, physical, and biological characteristics of many tissue types, such as muscle, tendon, nerve, and cornea. Current efforts toward regeneration of these tissues include replicating the cellular microenvironment by developing biomaterials that facilitate cellular alignment. To assess the functional effectiveness of the engineered microenvironments, one essential criterion is quantification of cellular alignment. Therefore, there is a need for rapid, accurate, and adaptable methodologies to quantify cellular alignment for tissue engineering applications. To address this need, we developed an automated method, binarization-based extraction of alignment score (BEAS), to determine cell orientation distribution in a wide variety of microscopic images. This method combines a sequenced application of median and band-pass filters, locally adaptive thresholding approaches and image processing techniques. Cellular alignment score is obtained by applying a robust scoring algorithm to the orientation distribution. We validated the BEAS method by comparing the results with the existing approaches reported in literature (i.e., manual, radial fast Fourier transform-radial sum, and gradient based approaches). Validation results indicated that the BEAS method resulted in statistically comparable alignment scores with the manual method (coefficient of determination R2=0.92). Therefore, the BEAS method introduced in this study could enable accurate, convenient, and adaptable evaluation of engineered tissue constructs and biomaterials in terms of cellular alignment and organization. PMID:21370940

  12. A study on facial expressions recognition

    NASA Astrophysics Data System (ADS)

    Xu, Jingjing

    2017-09-01

    In terms of communication, postures and facial expressions of such feelings like happiness, anger and sadness play important roles in conveying information. With the development of the technology, recently a number of algorithms dealing with face alignment, face landmark detection, classification, facial landmark localization and pose estimation have been put forward. However, there are a lot of challenges and problems need to be fixed. In this paper, a few technologies have been concluded and analyzed, and they all relate to handling facial expressions recognition and poses like pose-indexed based multi-view method for face alignment, robust facial landmark detection under significant head pose and occlusion, partitioning the input domain for classification, robust statistics face formalization.

  13. A Self-Alignment Algorithm for SINS Based on Gravitational Apparent Motion and Sensor Data Denoising

    PubMed Central

    Liu, Yiting; Xu, Xiaosu; Liu, Xixiang; Yao, Yiqing; Wu, Liang; Sun, Jin

    2015-01-01

    Initial alignment is always a key topic and difficult to achieve in an inertial navigation system (INS). In this paper a novel self-initial alignment algorithm is proposed using gravitational apparent motion vectors at three different moments and vector-operation. Simulation and analysis showed that this method easily suffers from the random noise contained in accelerometer measurements which are used to construct apparent motion directly. Aiming to resolve this problem, an online sensor data denoising method based on a Kalman filter is proposed and a novel reconstruction method for apparent motion is designed to avoid the collinearity among vectors participating in the alignment solution. Simulation, turntable tests and vehicle tests indicate that the proposed alignment algorithm can fulfill initial alignment of strapdown INS (SINS) under both static and swinging conditions. The accuracy can either reach or approach the theoretical values determined by sensor precision under static or swinging conditions. PMID:25923932

  14. Transcript mapping for handwritten English documents

    NASA Astrophysics Data System (ADS)

    Jose, Damien; Bharadwaj, Anurag; Govindaraju, Venu

    2008-01-01

    Transcript mapping or text alignment with handwritten documents is the automatic alignment of words in a text file with word images in a handwritten document. Such a mapping has several applications in fields ranging from machine learning where large quantities of truth data are required for evaluating handwriting recognition algorithms, to data mining where word image indexes are used in ranked retrieval of scanned documents in a digital library. The alignment also aids "writer identity" verification algorithms. Interfaces which display scanned handwritten documents may use this alignment to highlight manuscript tokens when a person examines the corresponding transcript word. We propose an adaptation of the True DTW dynamic programming algorithm for English handwritten documents. The integration of the dissimilarity scores from a word-model word recognizer and Levenshtein distance between the recognized word and lexicon word, as a cost metric in the DTW algorithm leading to a fast and accurate alignment, is our primary contribution. Results provided, confirm the effectiveness of our approach.

  15. Landmark matching based retinal image alignment by enforcing sparsity in correspondence matrix.

    PubMed

    Zheng, Yuanjie; Daniel, Ebenezer; Hunter, Allan A; Xiao, Rui; Gao, Jianbin; Li, Hongsheng; Maguire, Maureen G; Brainard, David H; Gee, James C

    2014-08-01

    Retinal image alignment is fundamental to many applications in diagnosis of eye diseases. In this paper, we address the problem of landmark matching based retinal image alignment. We propose a novel landmark matching formulation by enforcing sparsity in the correspondence matrix and offer its solutions based on linear programming. The proposed formulation not only enables a joint estimation of the landmark correspondences and a predefined transformation model but also combines the benefits of the softassign strategy (Chui and Rangarajan, 2003) and the combinatorial optimization of linear programming. We also introduced a set of reinforced self-similarities descriptors which can better characterize local photometric and geometric properties of the retinal image. Theoretical analysis and experimental results with both fundus color images and angiogram images show the superior performances of our algorithms to several state-of-the-art techniques. Copyright © 2013 Elsevier B.V. All rights reserved.

  16. MSuPDA: A memory efficient algorithm for sequence alignment.

    PubMed

    Khan, Mohammad Ibrahim; Kamal, Md Sarwar; Chowdhury, Linkon

    2015-01-16

    Space complexity is a million dollar question in DNA sequence alignments. In this regards, MSuPDA (Memory Saving under Pushdown Automata) can help to reduce the occupied spaces in computer memory. Our proposed process is that Anchor Seed (AS) will be selected from given data set of Nucleotides base pairs for local sequence alignment. Quick Splitting (QS) techniques will separate the Anchor Seed from all the DNA genome segments. Selected Anchor Seed will be placed to pushdown Automata's (PDA) input unit. Whole DNA genome segments will be placed into PDA's stack. Anchor Seed from input unit will be matched with the DNA genome segments from stack of PDA. Whatever matches, mismatches or Indel, of Nucleotides will be POP from the stack under the control of control unit of Pushdown Automata. During the POP operation on stack it will free the memory cell occupied by the Nucleotide base pair.

  17. Neurient: An Algorithm for Automatic Tracing of Confluent Neuronal Images to Determine Alignment

    PubMed Central

    Mitchel, J.A.; Martin, I.S.

    2013-01-01

    A goal of neural tissue engineering is the development and evaluation of materials that guide neuronal growth and alignment. However, the methods available to quantitatively evaluate the response of neurons to guidance materials are limited and/or expensive, and may require manual tracing to be performed by the researcher. We have developed an open source, automated Matlab-based algorithm, building on previously published methods, to trace and quantify alignment of fluorescent images of neurons in culture. The algorithm is divided into three phases, including computation of a lookup table which contains directional information for each image, location of a set of seed points which may lie along neurite centerlines, and tracing neurites starting with each seed point and indexing into the lookup table. This method was used to obtain quantitative alignment data for complex images of densely cultured neurons. Complete automation of tracing allows for unsupervised processing of large numbers of images. Following image processing with our algorithm, available metrics to quantify neurite alignment include angular histograms, percent of neurite segments in a given direction, and mean neurite angle. The alignment information obtained from traced images can be used to compare the response of neurons to a range of conditions. This tracing algorithm is freely available to the scientific community under the name Neurient, and its implementation in Matlab allows a wide range of researchers to use a standardized, open source method to quantitatively evaluate the alignment of dense neuronal cultures. PMID:23384629

  18. Genetic algorithms for protein threading.

    PubMed

    Yadgari, J; Amir, A; Unger, R

    1998-01-01

    Despite many years of efforts, a direct prediction of protein structure from sequence is still not possible. As a result, in the last few years researchers have started to address the "inverse folding problem": Identifying and aligning a sequence to the fold with which it is most compatible, a process known as "threading". In two meetings in which protein folding predictions were objectively evaluated, it became clear that threading as a concept promises a real breakthrough, but that much improvement is still needed in the technique itself. Threading is a NP-hard problem, and thus no general polynomial solution can be expected. Still a practical approach with demonstrated ability to find optimal solutions in many cases, and acceptable solutions in other cases, is needed. We applied the technique of Genetic Algorithms in order to significantly improve the ability of threading algorithms to find the optimal alignment of a sequence to a structure, i.e. the alignment with the minimum free energy. A major progress reported here is the design of a representation of the threading alignment as a string of fixed length. With this representation validation of alignments and genetic operators are effectively implemented. Appropriate data structure and parameters have been selected. It is shown that Genetic Algorithm threading is effective and is able to find the optimal alignment in a few test cases. Furthermore, the described algorithm is shown to perform well even without pre-definition of core elements. Existing threading methods are dependent on such constraints to make their calculations feasible. But the concept of core elements is inherently arbitrary and should be avoided if possible. While a rigorous proof is hard to submit yet an, we present indications that indeed Genetic Algorithm threading is capable of finding consistently good solutions of full alignments in search spaces of size up to 10(70).

  19. CUFID-query: accurate network querying through random walk based network flow estimation.

    PubMed

    Jeong, Hyundoo; Qian, Xiaoning; Yoon, Byung-Jun

    2017-12-28

    Functional modules in biological networks consist of numerous biomolecules and their complicated interactions. Recent studies have shown that biomolecules in a functional module tend to have similar interaction patterns and that such modules are often conserved across biological networks of different species. As a result, such conserved functional modules can be identified through comparative analysis of biological networks. In this work, we propose a novel network querying algorithm based on the CUFID (Comparative network analysis Using the steady-state network Flow to IDentify orthologous proteins) framework combined with an efficient seed-and-extension approach. The proposed algorithm, CUFID-query, can accurately detect conserved functional modules as small subnetworks in the target network that are expected to perform similar functions to the given query functional module. The CUFID framework was recently developed for probabilistic pairwise global comparison of biological networks, and it has been applied to pairwise global network alignment, where the framework was shown to yield accurate network alignment results. In the proposed CUFID-query algorithm, we adopt the CUFID framework and extend it for local network alignment, specifically to solve network querying problems. First, in the seed selection phase, the proposed method utilizes the CUFID framework to compare the query and the target networks and to predict the probabilistic node-to-node correspondence between the networks. Next, the algorithm selects and greedily extends the seed in the target network by iteratively adding nodes that have frequent interactions with other nodes in the seed network, in a way that the conductance of the extended network is maximally reduced. Finally, CUFID-query removes irrelevant nodes from the querying results based on the personalized PageRank vector for the induced network that includes the fully extended network and its neighboring nodes. Through extensive performance evaluation based on biological networks with known functional modules, we show that CUFID-query outperforms the existing state-of-the-art algorithms in terms of prediction accuracy and biological significance of the predictions.

  20. Experimental image alignment system

    NASA Technical Reports Server (NTRS)

    Moyer, A. L.; Kowel, S. T.; Kornreich, P. G.

    1980-01-01

    A microcomputer-based instrument for image alignment with respect to a reference image is described which uses the DEFT sensor (Direct Electronic Fourier Transform) for image sensing and preprocessing. The instrument alignment algorithm which uses the two-dimensional Fourier transform as input is also described. It generates signals used to steer the stage carrying the test image into the correct orientation. This algorithm has computational advantages over algorithms which use image intensity data as input and is suitable for a microcomputer-based instrument since the two-dimensional Fourier transform is provided by the DEFT sensor.

  1. Optimization of sequence alignment for simple sequence repeat regions.

    PubMed

    Jighly, Abdulqader; Hamwieh, Aladdin; Ogbonnaya, Francis C

    2011-07-20

    Microsatellites, or simple sequence repeats (SSRs), are tandemly repeated DNA sequences, including tandem copies of specific sequences no longer than six bases, that are distributed in the genome. SSR has been used as a molecular marker because it is easy to detect and is used in a range of applications, including genetic diversity, genome mapping, and marker assisted selection. It is also very mutable because of slipping in the DNA polymerase during DNA replication. This unique mutation increases the insertion/deletion (INDELs) mutation frequency to a high ratio - more than other types of molecular markers such as single nucleotide polymorphism (SNPs).SNPs are more frequent than INDELs. Therefore, all designed algorithms for sequence alignment fit the vast majority of the genomic sequence without considering microsatellite regions, as unique sequences that require special consideration. The old algorithm is limited in its application because there are many overlaps between different repeat units which result in false evolutionary relationships. To overcome the limitation of the aligning algorithm when dealing with SSR loci, a new algorithm was developed using PERL script with a Tk graphical interface. This program is based on aligning sequences after determining the repeated units first, and the last SSR nucleotides positions. This results in a shifting process according to the inserted repeated unit type.When studying the phylogenic relations before and after applying the new algorithm, many differences in the trees were obtained by increasing the SSR length and complexity. However, less distance between different linage had been observed after applying the new algorithm. The new algorithm produces better estimates for aligning SSR loci because it reflects more reliable evolutionary relations between different linages. It reduces overlapping during SSR alignment, which results in a more realistic phylogenic relationship.

  2. Iterative Most-Likely Point Registration (IMLP): A Robust Algorithm for Computing Optimal Shape Alignment

    PubMed Central

    Billings, Seth D.; Boctor, Emad M.; Taylor, Russell H.

    2015-01-01

    We present a probabilistic registration algorithm that robustly solves the problem of rigid-body alignment between two shapes with high accuracy, by aptly modeling measurement noise in each shape, whether isotropic or anisotropic. For point-cloud shapes, the probabilistic framework additionally enables modeling locally-linear surface regions in the vicinity of each point to further improve registration accuracy. The proposed Iterative Most-Likely Point (IMLP) algorithm is formed as a variant of the popular Iterative Closest Point (ICP) algorithm, which iterates between point-correspondence and point-registration steps. IMLP’s probabilistic framework is used to incorporate a generalized noise model into both the correspondence and the registration phases of the algorithm, hence its name as a most-likely point method rather than a closest-point method. To efficiently compute the most-likely correspondences, we devise a novel search strategy based on a principal direction (PD)-tree search. We also propose a new approach to solve the generalized total-least-squares (GTLS) sub-problem of the registration phase, wherein the point correspondences are registered under a generalized noise model. Our GTLS approach has improved accuracy, efficiency, and stability compared to prior methods presented for this problem and offers a straightforward implementation using standard least squares. We evaluate the performance of IMLP relative to a large number of prior algorithms including ICP, a robust variant on ICP, Generalized ICP (GICP), and Coherent Point Drift (CPD), as well as drawing close comparison with the prior anisotropic registration methods of GTLS-ICP and A-ICP. The performance of IMLP is shown to be superior with respect to these algorithms over a wide range of noise conditions, outliers, and misalignments using both mesh and point-cloud representations of various shapes. PMID:25748700

  3. ChromAlign: A two-step algorithmic procedure for time alignment of three-dimensional LC-MS chromatographic surfaces.

    PubMed

    Sadygov, Rovshan G; Maroto, Fernando Martin; Hühmer, Andreas F R

    2006-12-15

    We present an algorithmic approach to align three-dimensional chromatographic surfaces of LC-MS data of complex mixture samples. The approach consists of two steps. In the first step, we prealign chromatographic profiles: two-dimensional projections of chromatographic surfaces. This is accomplished by correlation analysis using fast Fourier transforms. In this step, a temporal offset that maximizes the overlap and dot product between two chromatographic profiles is determined. In the second step, the algorithm generates correlation matrix elements between full mass scans of the reference and sample chromatographic surfaces. The temporal offset from the first step indicates a range of the mass scans that are possibly correlated, then the correlation matrix is calculated only for these mass scans. The correlation matrix carries information on highly correlated scans, but it does not itself determine the scan or time alignment. Alignment is determined as a path in the correlation matrix that maximizes the sum of the correlation matrix elements. The computational complexity of the optimal path generation problem is reduced by the use of dynamic programming. The program produces time-aligned surfaces. The use of the temporal offset from the first step in the second step reduces the computation time for generating the correlation matrix and speeds up the process. The algorithm has been implemented in a program, ChromAlign, developed in C++ language for the .NET2 environment in WINDOWS XP. In this work, we demonstrate the applications of ChromAlign to alignment of LC-MS surfaces of several datasets: a mixture of known proteins, samples from digests of surface proteins of T-cells, and samples prepared from digests of cerebrospinal fluid. ChromAlign accurately aligns the LC-MS surfaces we studied. In these examples, we discuss various aspects of the alignment by ChromAlign, such as constant time axis shifts and warping of chromatographic surfaces.

  4. Computational Aerothermodynamic Simulation Issues on Unstructured Grids

    NASA Technical Reports Server (NTRS)

    Gnoffo, Peter A.; White, Jeffery A.

    2004-01-01

    The synthesis of physical models for gas chemistry and turbulence from the structured grid codes LAURA and VULCAN into the unstructured grid code FUN3D is described. A directionally Symmetric, Total Variation Diminishing (STVD) algorithm and an entropy fix (eigenvalue limiter) keyed to local cell Reynolds number are introduced to improve solution quality for hypersonic aeroheating applications. A simple grid-adaptation procedure is incorporated within the flow solver. Simulations of flow over an ellipsoid (perfect gas, inviscid), Shuttle Orbiter (viscous, chemical nonequilibrium) and comparisons to the structured grid solvers LAURA (cylinder, Shuttle Orbiter) and VULCAN (flat plate) are presented to show current capabilities. The quality of heating in 3D stagnation regions is very sensitive to algorithm options in general, high aspect ratio tetrahedral elements complicate the simulation of high Reynolds number, viscous flow as compared to locally structured meshes aligned with the flow.

  5. GCALIGNER 1.0: an alignment program to compute a multiple sample comparison data matrix from large eco-chemical datasets obtained by GC.

    PubMed

    Dellicour, Simon; Lecocq, Thomas

    2013-10-01

    GCALIGNER 1.0 is a computer program designed to perform a preliminary data comparison matrix of chemical data obtained by GC without MS information. The alignment algorithm is based on the comparison between the retention times of each detected compound in a sample. In this paper, we test the GCALIGNER efficiency on three datasets of the chemical secretions of bumble bees. The algorithm performs the alignment with a low error rate (<3%). GCALIGNER 1.0 is a useful, simple and free program based on an algorithm that enables the alignment of table-type data from GC. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  6. Robust prediction of consensus secondary structures using averaged base pairing probability matrices.

    PubMed

    Kiryu, Hisanori; Kin, Taishin; Asai, Kiyoshi

    2007-02-15

    Recent transcriptomic studies have revealed the existence of a considerable number of non-protein-coding RNA transcripts in higher eukaryotic cells. To investigate the functional roles of these transcripts, it is of great interest to find conserved secondary structures from multiple alignments on a genomic scale. Since multiple alignments are often created using alignment programs that neglect the special conservation patterns of RNA secondary structures for computational efficiency, alignment failures can cause potential risks of overlooking conserved stem structures. We investigated the dependence of the accuracy of secondary structure prediction on the quality of alignments. We compared three algorithms that maximize the expected accuracy of secondary structures as well as other frequently used algorithms. We found that one of our algorithms, called McCaskill-MEA, was more robust against alignment failures than others. The McCaskill-MEA method first computes the base pairing probability matrices for all the sequences in the alignment and then obtains the base pairing probability matrix of the alignment by averaging over these matrices. The consensus secondary structure is predicted from this matrix such that the expected accuracy of the prediction is maximized. We show that the McCaskill-MEA method performs better than other methods, particularly when the alignment quality is low and when the alignment consists of many sequences. Our model has a parameter that controls the sensitivity and specificity of predictions. We discussed the uses of that parameter for multi-step screening procedures to search for conserved secondary structures and for assigning confidence values to the predicted base pairs. The C++ source code that implements the McCaskill-MEA algorithm and the test dataset used in this paper are available at http://www.ncrna.org/papers/McCaskillMEA/. Supplementary data are available at Bioinformatics online.

  7. VIP Barcoding: composition vector-based software for rapid species identification based on DNA barcoding.

    PubMed

    Fan, Long; Hui, Jerome H L; Yu, Zu Guo; Chu, Ka Hou

    2014-07-01

    Species identification based on short sequences of DNA markers, that is, DNA barcoding, has emerged as an integral part of modern taxonomy. However, software for the analysis of large and multilocus barcoding data sets is scarce. The Basic Local Alignment Search Tool (BLAST) is currently the fastest tool capable of handling large databases (e.g. >5000 sequences), but its accuracy is a concern and has been criticized for its local optimization. However, current more accurate software requires sequence alignment or complex calculations, which are time-consuming when dealing with large data sets during data preprocessing or during the search stage. Therefore, it is imperative to develop a practical program for both accurate and scalable species identification for DNA barcoding. In this context, we present VIP Barcoding: a user-friendly software in graphical user interface for rapid DNA barcoding. It adopts a hybrid, two-stage algorithm. First, an alignment-free composition vector (CV) method is utilized to reduce searching space by screening a reference database. The alignment-based K2P distance nearest-neighbour method is then employed to analyse the smaller data set generated in the first stage. In comparison with other software, we demonstrate that VIP Barcoding has (i) higher accuracy than Blastn and several alignment-free methods and (ii) higher scalability than alignment-based distance methods and character-based methods. These results suggest that this platform is able to deal with both large-scale and multilocus barcoding data with accuracy and can contribute to DNA barcoding for modern taxonomy. VIP Barcoding is free and available at http://msl.sls.cuhk.edu.hk/vipbarcoding/. © 2014 John Wiley & Sons Ltd.

  8. HBLAST: Parallelised sequence similarity--A Hadoop MapReducable basic local alignment search tool.

    PubMed

    O'Driscoll, Aisling; Belogrudov, Vladislav; Carroll, John; Kropp, Kai; Walsh, Paul; Ghazal, Peter; Sleator, Roy D

    2015-04-01

    The recent exponential growth of genomic databases has resulted in the common task of sequence alignment becoming one of the major bottlenecks in the field of computational biology. It is typical for these large datasets and complex computations to require cost prohibitive High Performance Computing (HPC) to function. As such, parallelised solutions have been proposed but many exhibit scalability limitations and are incapable of effectively processing "Big Data" - the name attributed to datasets that are extremely large, complex and require rapid processing. The Hadoop framework, comprised of distributed storage and a parallelised programming framework known as MapReduce, is specifically designed to work with such datasets but it is not trivial to efficiently redesign and implement bioinformatics algorithms according to this paradigm. The parallelisation strategy of "divide and conquer" for alignment algorithms can be applied to both data sets and input query sequences. However, scalability is still an issue due to memory constraints or large databases, with very large database segmentation leading to additional performance decline. Herein, we present Hadoop Blast (HBlast), a parallelised BLAST algorithm that proposes a flexible method to partition both databases and input query sequences using "virtual partitioning". HBlast presents improved scalability over existing solutions and well balanced computational work load while keeping database segmentation and recompilation to a minimum. Enhanced BLAST search performance on cheap memory constrained hardware has significant implications for in field clinical diagnostic testing; enabling faster and more accurate identification of pathogenic DNA in human blood or tissue samples. Copyright © 2015 Elsevier Inc. All rights reserved.

  9. Accurate quantification of local changes for carotid arteries in 3D ultrasound images using convex optimization-based deformable registration

    NASA Astrophysics Data System (ADS)

    Cheng, Jieyu; Qiu, Wu; Yuan, Jing; Fenster, Aaron; Chiu, Bernard

    2016-03-01

    Registration of longitudinally acquired 3D ultrasound (US) images plays an important role in monitoring and quantifying progression/regression of carotid atherosclerosis. We introduce an image-based non-rigid registration algorithm to align the baseline 3D carotid US with longitudinal images acquired over several follow-up time points. This algorithm minimizes the sum of absolute intensity differences (SAD) under a variational optical-flow perspective within a multi-scale optimization framework to capture local and global deformations. Outer wall and lumen were segmented manually on each image, and the performance of the registration algorithm was quantified by Dice similarity coefficient (DSC) and mean absolute distance (MAD) of the outer wall and lumen surfaces after registration. In this study, images for 5 subjects were registered initially by rigid registration, followed by the proposed algorithm. Mean DSC generated by the proposed algorithm was 79:3+/-3:8% for lumen and 85:9+/-4:0% for outer wall, compared to 73:9+/-3:4% and 84:7+/-3:2% generated by rigid registration. Mean MAD of 0:46+/-0:08mm and 0:52+/-0:13mm were generated for lumen and outer wall respectively by the proposed algorithm, compared to 0:55+/-0:08mm and 0:54+/-0:11mm generated by rigid registration. The mean registration time of our method per image pair was 143+/-23s.

  10. Fast online and index-based algorithms for approximate search of RNA sequence-structure patterns

    PubMed Central

    2013-01-01

    Background It is well known that the search for homologous RNAs is more effective if both sequence and structure information is incorporated into the search. However, current tools for searching with RNA sequence-structure patterns cannot fully handle mutations occurring on both these levels or are simply not fast enough for searching large sequence databases because of the high computational costs of the underlying sequence-structure alignment problem. Results We present new fast index-based and online algorithms for approximate matching of RNA sequence-structure patterns supporting a full set of edit operations on single bases and base pairs. Our methods efficiently compute semi-global alignments of structural RNA patterns and substrings of the target sequence whose costs satisfy a user-defined sequence-structure edit distance threshold. For this purpose, we introduce a new computing scheme to optimally reuse the entries of the required dynamic programming matrices for all substrings and combine it with a technique for avoiding the alignment computation of non-matching substrings. Our new index-based methods exploit suffix arrays preprocessed from the target database and achieve running times that are sublinear in the size of the searched sequences. To support the description of RNA molecules that fold into complex secondary structures with multiple ordered sequence-structure patterns, we use fast algorithms for the local or global chaining of approximate sequence-structure pattern matches. The chaining step removes spurious matches from the set of intermediate results, in particular of patterns with little specificity. In benchmark experiments on the Rfam database, our improved online algorithm is faster than the best previous method by up to factor 45. Our best new index-based algorithm achieves a speedup of factor 560. Conclusions The presented methods achieve considerable speedups compared to the best previous method. This, together with the expected sublinear running time of the presented index-based algorithms, allows for the first time approximate matching of RNA sequence-structure patterns in large sequence databases. Beyond the algorithmic contributions, we provide with RaligNAtor a robust and well documented open-source software package implementing the algorithms presented in this manuscript. The RaligNAtor software is available at http://www.zbh.uni-hamburg.de/ralignator. PMID:23865810

  11. Functional Alignment of Metabolic Networks.

    PubMed

    Mazza, Arnon; Wagner, Allon; Ruppin, Eytan; Sharan, Roded

    2016-05-01

    Network alignment has become a standard tool in comparative biology, allowing the inference of protein function, interaction, and orthology. However, current alignment techniques are based on topological properties of networks and do not take into account their functional implications. Here we propose, for the first time, an algorithm to align two metabolic networks by taking advantage of their coupled metabolic models. These models allow us to assess the functional implications of genes or reactions, captured by the metabolic fluxes that are altered following their deletion from the network. Such implications may spread far beyond the region of the network where the gene or reaction lies. We apply our algorithm to align metabolic networks from various organisms, ranging from bacteria to humans, showing that our alignment can reveal functional orthology relations that are missed by conventional topological alignments.

  12. The kinetic regime of the Vicsek model

    NASA Astrophysics Data System (ADS)

    Chepizhko, A. A.; Kulinskii, V. L.

    2009-12-01

    We consider the dynamics of the system of self-propelling particles modeled via the Vicsek algorithm in continuum time limit. It is shown that the alignment process for the velocities can be subdivided into two regimes: "fast" kinetic and "slow" hydrodynamic ones. In fast kinetic regime the alignment of the particle velocity to the local neighborhood takes place with characteristic relaxation time. So, that the bigger regions arise with the velocity alignment. These regions align their velocities thus giving rise to hydrodynamic regime of the dynamics. We propose the mean-field-like approach in which we take into account the correlations between density and velocity. The comparison of the theoretical predictions with the numerical simulations is given. The relation between Vicsek model in the zero velocity limit and the Kuramoto model is stated. The mean-field approach accounting for the dynamic change of the neighborhood is proposed. The nature of the discontinuity of the dependence of the order parameter in case of vectorial noise revealed in Gregorie and Chaite, Phys. Rev. Lett., 92, 025702 (2004) is discussed and the explanation of it is proposed.

  13. Combining peak- and chromatogram-based retention time alignment algorithms for multiple chromatography-mass spectrometry datasets.

    PubMed

    Hoffmann, Nils; Keck, Matthias; Neuweger, Heiko; Wilhelm, Mathias; Högy, Petra; Niehaus, Karsten; Stoye, Jens

    2012-08-27

    Modern analytical methods in biology and chemistry use separation techniques coupled to sensitive detectors, such as gas chromatography-mass spectrometry (GC-MS) and liquid chromatography-mass spectrometry (LC-MS). These hyphenated methods provide high-dimensional data. Comparing such data manually to find corresponding signals is a laborious task, as each experiment usually consists of thousands of individual scans, each containing hundreds or even thousands of distinct signals. In order to allow for successful identification of metabolites or proteins within such data, especially in the context of metabolomics and proteomics, an accurate alignment and matching of corresponding features between two or more experiments is required. Such a matching algorithm should capture fluctuations in the chromatographic system which lead to non-linear distortions on the time axis, as well as systematic changes in recorded intensities. Many different algorithms for the retention time alignment of GC-MS and LC-MS data have been proposed and published, but all of them focus either on aligning previously extracted peak features or on aligning and comparing the complete raw data containing all available features. In this paper we introduce two algorithms for retention time alignment of multiple GC-MS datasets: multiple alignment by bidirectional best hits peak assignment and cluster extension (BIPACE) and center-star multiple alignment by pairwise partitioned dynamic time warping (CeMAPP-DTW). We show how the similarity-based peak group matching method BIPACE may be used for multiple alignment calculation individually and how it can be used as a preprocessing step for the pairwise alignments performed by CeMAPP-DTW. We evaluate the algorithms individually and in combination on a previously published small GC-MS dataset studying the Leishmania parasite and on a larger GC-MS dataset studying grains of wheat (Triticum aestivum). We have shown that BIPACE achieves very high precision and recall and a very low number of false positive peak assignments on both evaluation datasets. CeMAPP-DTW finds a high number of true positives when executed on its own, but achieves even better results when BIPACE is used to constrain its search space. The source code of both algorithms is included in the OpenSource software framework Maltcms, which is available from http://maltcms.sf.net. The evaluation scripts of the present study are available from the same source.

  14. Combining peak- and chromatogram-based retention time alignment algorithms for multiple chromatography-mass spectrometry datasets

    PubMed Central

    2012-01-01

    Background Modern analytical methods in biology and chemistry use separation techniques coupled to sensitive detectors, such as gas chromatography-mass spectrometry (GC-MS) and liquid chromatography-mass spectrometry (LC-MS). These hyphenated methods provide high-dimensional data. Comparing such data manually to find corresponding signals is a laborious task, as each experiment usually consists of thousands of individual scans, each containing hundreds or even thousands of distinct signals. In order to allow for successful identification of metabolites or proteins within such data, especially in the context of metabolomics and proteomics, an accurate alignment and matching of corresponding features between two or more experiments is required. Such a matching algorithm should capture fluctuations in the chromatographic system which lead to non-linear distortions on the time axis, as well as systematic changes in recorded intensities. Many different algorithms for the retention time alignment of GC-MS and LC-MS data have been proposed and published, but all of them focus either on aligning previously extracted peak features or on aligning and comparing the complete raw data containing all available features. Results In this paper we introduce two algorithms for retention time alignment of multiple GC-MS datasets: multiple alignment by bidirectional best hits peak assignment and cluster extension (BIPACE) and center-star multiple alignment by pairwise partitioned dynamic time warping (CeMAPP-DTW). We show how the similarity-based peak group matching method BIPACE may be used for multiple alignment calculation individually and how it can be used as a preprocessing step for the pairwise alignments performed by CeMAPP-DTW. We evaluate the algorithms individually and in combination on a previously published small GC-MS dataset studying the Leishmania parasite and on a larger GC-MS dataset studying grains of wheat (Triticum aestivum). Conclusions We have shown that BIPACE achieves very high precision and recall and a very low number of false positive peak assignments on both evaluation datasets. CeMAPP-DTW finds a high number of true positives when executed on its own, but achieves even better results when BIPACE is used to constrain its search space. The source code of both algorithms is included in the OpenSource software framework Maltcms, which is available from http://maltcms.sf.net. The evaluation scripts of the present study are available from the same source. PMID:22920415

  15. An Efficient Correction Algorithm for Eliminating Image Misalignment Effects on Co-Phasing Measurement Accuracy for Segmented Active Optics Systems

    PubMed Central

    Yue, Dan; Xu, Shuyan; Nie, Haitao; Wang, Zongyang

    2016-01-01

    The misalignment between recorded in-focus and out-of-focus images using the Phase Diversity (PD) algorithm leads to a dramatic decline in wavefront detection accuracy and image recovery quality for segmented active optics systems. This paper demonstrates the theoretical relationship between the image misalignment and tip-tilt terms in Zernike polynomials of the wavefront phase for the first time, and an efficient two-step alignment correction algorithm is proposed to eliminate these misalignment effects. This algorithm processes a spatial 2-D cross-correlation of the misaligned images, revising the offset to 1 or 2 pixels and narrowing the search range for alignment. Then, it eliminates the need for subpixel fine alignment to achieve adaptive correction by adding additional tip-tilt terms to the Optical Transfer Function (OTF) of the out-of-focus channel. The experimental results demonstrate the feasibility and validity of the proposed correction algorithm to improve the measurement accuracy during the co-phasing of segmented mirrors. With this alignment correction, the reconstructed wavefront is more accurate, and the recovered image is of higher quality. PMID:26934045

  16. A method to align the coordinate system of accelerometers to the axes of a human body: The depitch algorithm.

    PubMed

    Gietzelt, Matthias; Schnabel, Stephan; Wolf, Klaus-Hendrik; Büsching, Felix; Song, Bianying; Rust, Stefan; Marschollek, Michael

    2012-05-01

    One of the key problems in accelerometry based gait analyses is that it may not be possible to attach an accelerometer to the lower trunk so that its axes are perfectly aligned to the axes of the subject. In this paper we will present an algorithm that was designed to virtually align the axes of the accelerometer to the axes of the subject during walking sections. This algorithm is based on a physically reasonable approach and built for measurements in unsupervised settings, where the test persons are applying the sensors by themselves. For evaluation purposes we conducted a study with 6 healthy subjects and measured their gait with a manually aligned and a skewed accelerometer attached to the subject's lower trunk. After applying the algorithm the intra-axis correlation of both sensors was on average 0.89±0.1 with a mean absolute error of 0.05g. We concluded that the algorithm was able to adjust the skewed sensor node virtually to the coordinate system of the subject. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.

  17. Design of multiple sequence alignment algorithms on parallel, distributed memory supercomputers.

    PubMed

    Church, Philip C; Goscinski, Andrzej; Holt, Kathryn; Inouye, Michael; Ghoting, Amol; Makarychev, Konstantin; Reumann, Matthias

    2011-01-01

    The challenge of comparing two or more genomes that have undergone recombination and substantial amounts of segmental loss and gain has recently been addressed for small numbers of genomes. However, datasets of hundreds of genomes are now common and their sizes will only increase in the future. Multiple sequence alignment of hundreds of genomes remains an intractable problem due to quadratic increases in compute time and memory footprint. To date, most alignment algorithms are designed for commodity clusters without parallelism. Hence, we propose the design of a multiple sequence alignment algorithm on massively parallel, distributed memory supercomputers to enable research into comparative genomics on large data sets. Following the methodology of the sequential progressiveMauve algorithm, we design data structures including sequences and sorted k-mer lists on the IBM Blue Gene/P supercomputer (BG/P). Preliminary results show that we can reduce the memory footprint so that we can potentially align over 250 bacterial genomes on a single BG/P compute node. We verify our results on a dataset of E.coli, Shigella and S.pneumoniae genomes. Our implementation returns results matching those of the original algorithm but in 1/2 the time and with 1/4 the memory footprint for scaffold building. In this study, we have laid the basis for multiple sequence alignment of large-scale datasets on a massively parallel, distributed memory supercomputer, thus enabling comparison of hundreds instead of a few genome sequences within reasonable time.

  18. Surface plasmon enhanced cell microscopy with blocked random spatial activation

    NASA Astrophysics Data System (ADS)

    Son, Taehwang; Oh, Youngjin; Lee, Wonju; Yang, Heejin; Kim, Donghyun

    2016-03-01

    We present surface plasmon enhanced fluorescence microscopy with random spatial sampling using patterned block of silver nanoislands. Rigorous coupled wave analysis was performed to confirm near-field localization on nanoislands. Random nanoislands were fabricated in silver by temperature annealing. By analyzing random near-field distribution, average size of localized fields was found to be on the order of 135 nm. Randomly localized near-fields were used to spatially sample F-actin of J774 cells (mouse macrophage cell-line). Image deconvolution algorithm based on linear imaging theory was established for stochastic estimation of fluorescent molecular distribution. The alignment between near-field distribution and raw image was performed by the patterned block. The achieved resolution is dependent upon factors including the size of localized fields and estimated to be 100-150 nm.

  19. Optimizing high performance computing workflow for protein functional annotation.

    PubMed

    Stanberry, Larissa; Rekepalli, Bhanu; Liu, Yuan; Giblock, Paul; Higdon, Roger; Montague, Elizabeth; Broomall, William; Kolker, Natali; Kolker, Eugene

    2014-09-10

    Functional annotation of newly sequenced genomes is one of the major challenges in modern biology. With modern sequencing technologies, the protein sequence universe is rapidly expanding. Newly sequenced bacterial genomes alone contain over 7.5 million proteins. The rate of data generation has far surpassed that of protein annotation. The volume of protein data makes manual curation infeasible, whereas a high compute cost limits the utility of existing automated approaches. In this work, we present an improved and optmized automated workflow to enable large-scale protein annotation. The workflow uses high performance computing architectures and a low complexity classification algorithm to assign proteins into existing clusters of orthologous groups of proteins. On the basis of the Position-Specific Iterative Basic Local Alignment Search Tool the algorithm ensures at least 80% specificity and sensitivity of the resulting classifications. The workflow utilizes highly scalable parallel applications for classification and sequence alignment. Using Extreme Science and Engineering Discovery Environment supercomputers, the workflow processed 1,200,000 newly sequenced bacterial proteins. With the rapid expansion of the protein sequence universe, the proposed workflow will enable scientists to annotate big genome data.

  20. Automated protein structure modeling in CASP9 by I-TASSER pipeline combined with QUARK-based ab initio folding and FG-MD-based structure refinement

    PubMed Central

    Xu, Dong; Zhang, Jian; Roy, Ambrish; Zhang, Yang

    2011-01-01

    I-TASSER is an automated pipeline for protein tertiary structure prediction using multiple threading alignments and iterative structure assembly simulations. In CASP9 experiments, two new algorithms, QUARK and FG-MD, were added to the I-TASSER pipeline for improving the structural modeling accuracy. QUARK is a de novo structure prediction algorithm used for structure modeling of proteins that lack detectable template structures. For distantly homologous targets, QUARK models are found useful as a reference structure for selecting good threading alignments and guiding the I-TASSER structure assembly simulations. FG-MD is an atomic-level structural refinement program that uses structural fragments collected from the PDB structures to guide molecular dynamics simulation and improve the local structure of predicted model, including hydrogen-bonding networks, torsion angles and steric clashes. Despite considerable progress in both the template-based and template-free structure modeling, significant improvements on protein target classification, domain parsing, model selection, and ab initio folding of beta-proteins are still needed to further improve the I-TASSER pipeline. PMID:22069036

  1. Optimizing high performance computing workflow for protein functional annotation

    PubMed Central

    Stanberry, Larissa; Rekepalli, Bhanu; Liu, Yuan; Giblock, Paul; Higdon, Roger; Montague, Elizabeth; Broomall, William; Kolker, Natali; Kolker, Eugene

    2014-01-01

    Functional annotation of newly sequenced genomes is one of the major challenges in modern biology. With modern sequencing technologies, the protein sequence universe is rapidly expanding. Newly sequenced bacterial genomes alone contain over 7.5 million proteins. The rate of data generation has far surpassed that of protein annotation. The volume of protein data makes manual curation infeasible, whereas a high compute cost limits the utility of existing automated approaches. In this work, we present an improved and optmized automated workflow to enable large-scale protein annotation. The workflow uses high performance computing architectures and a low complexity classification algorithm to assign proteins into existing clusters of orthologous groups of proteins. On the basis of the Position-Specific Iterative Basic Local Alignment Search Tool the algorithm ensures at least 80% specificity and sensitivity of the resulting classifications. The workflow utilizes highly scalable parallel applications for classification and sequence alignment. Using Extreme Science and Engineering Discovery Environment supercomputers, the workflow processed 1,200,000 newly sequenced bacterial proteins. With the rapid expansion of the protein sequence universe, the proposed workflow will enable scientists to annotate big genome data. PMID:25313296

  2. Projected power iteration for network alignment

    NASA Astrophysics Data System (ADS)

    Onaran, Efe; Villar, Soledad

    2017-08-01

    The network alignment problem asks for the best correspondence between two given graphs, so that the largest possible number of edges are matched. This problem appears in many scientific problems (like the study of protein-protein interactions) and it is very closely related to the quadratic assignment problem which has graph isomorphism, traveling salesman and minimum bisection problems as particular cases. The graph matching problem is NP-hard in general. However, under some restrictive models for the graphs, algorithms can approximate the alignment efficiently. In that spirit the recent work by Feizi and collaborators introduce EigenAlign, a fast spectral method with convergence guarantees for Erd-s-Renyí graphs. In this work we propose the algorithm Projected Power Alignment, which is a projected power iteration version of EigenAlign. We numerically show it improves the recovery rates of EigenAlign and we describe the theory that may be used to provide performance guarantees for Projected Power Alignment.

  3. On the Impact of Widening Vector Registers on Sequence Alignment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Daily, Jeffrey A.; Kalyanaraman, Anantharaman; Krishnamoorthy, Sriram

    2016-09-22

    Vector extensions, such as SSE, have been part of the x86 since the 1990s, with applications in graphics, signal processing, and scientific applications. Although many algorithms and applications can naturally benefit from automatic vectorization techniques, there are still many that are difficult to vectorize due to their dependence on irregular data structures, dense branch operations, or data dependencies. Sequence alignment, one of the most widely used operations in bioinformatics workflows, has a computational footprint that features complex data dependencies. In this paper, we demonstrate that the trend of widening vector registers adversely affects the state-of-the-art sequence alignment algorithm based onmore » striped data layouts. We present a practically efficient SIMD implementation of a parallel scan based sequence alignment algorithm that can better exploit wider SIMD units. We conduct comprehensive workload and use case analyses to characterize the relative behavior of the striped and scan approaches and identify the best choice of algorithm based on input length and SIMD width.« less

  4. Rapid Transfer Alignment of MEMS SINS Based on Adaptive Incremental Kalman Filter.

    PubMed

    Chu, Hairong; Sun, Tingting; Zhang, Baiqiang; Zhang, Hongwei; Chen, Yang

    2017-01-14

    In airborne MEMS SINS transfer alignment, the error of MEMS IMU is highly environment-dependent and the parameters of the system model are also uncertain, which may lead to large error and bad convergence of the Kalman filter. In order to solve this problem, an improved adaptive incremental Kalman filter (AIKF) algorithm is proposed. First, the model of SINS transfer alignment is defined based on the "Velocity and Attitude" matching method. Then the detailed algorithm progress of AIKF and its recurrence formulas are presented. The performance and calculation amount of AKF and AIKF are also compared. Finally, a simulation test is designed to verify the accuracy and the rapidity of the AIKF algorithm by comparing it with KF and AKF. The results show that the AIKF algorithm has better estimation accuracy and shorter convergence time, especially for the bias of the gyroscope and the accelerometer, which can meet the accuracy and rapidity requirement of transfer alignment.

  5. Rapid Transfer Alignment of MEMS SINS Based on Adaptive Incremental Kalman Filter

    PubMed Central

    Chu, Hairong; Sun, Tingting; Zhang, Baiqiang; Zhang, Hongwei; Chen, Yang

    2017-01-01

    In airborne MEMS SINS transfer alignment, the error of MEMS IMU is highly environment-dependent and the parameters of the system model are also uncertain, which may lead to large error and bad convergence of the Kalman filter. In order to solve this problem, an improved adaptive incremental Kalman filter (AIKF) algorithm is proposed. First, the model of SINS transfer alignment is defined based on the “Velocity and Attitude” matching method. Then the detailed algorithm progress of AIKF and its recurrence formulas are presented. The performance and calculation amount of AKF and AIKF are also compared. Finally, a simulation test is designed to verify the accuracy and the rapidity of the AIKF algorithm by comparing it with KF and AKF. The results show that the AIKF algorithm has better estimation accuracy and shorter convergence time, especially for the bias of the gyroscope and the accelerometer, which can meet the accuracy and rapidity requirement of transfer alignment. PMID:28098829

  6. RNA-TVcurve: a Web server for RNA secondary structure comparison based on a multi-scale similarity of its triple vector curve representation.

    PubMed

    Li, Ying; Shi, Xiaohu; Liang, Yanchun; Xie, Juan; Zhang, Yu; Ma, Qin

    2017-01-21

    RNAs have been found to carry diverse functionalities in nature. Inferring the similarity between two given RNAs is a fundamental step to understand and interpret their functional relationship. The majority of functional RNAs show conserved secondary structures, rather than sequence conservation. Those algorithms relying on sequence-based features usually have limitations in their prediction performance. Hence, integrating RNA structure features is very critical for RNA analysis. Existing algorithms mainly fall into two categories: alignment-based and alignment-free. The alignment-free algorithms of RNA comparison usually have lower time complexity than alignment-based algorithms. An alignment-free RNA comparison algorithm was proposed, in which novel numerical representations RNA-TVcurve (triple vector curve representation) of RNA sequence and corresponding secondary structure features are provided. Then a multi-scale similarity score of two given RNAs was designed based on wavelet decomposition of their numerical representation. In support of RNA mutation and phylogenetic analysis, a web server (RNA-TVcurve) was designed based on this alignment-free RNA comparison algorithm. It provides three functional modules: 1) visualization of numerical representation of RNA secondary structure; 2) detection of single-point mutation based on secondary structure; and 3) comparison of pairwise and multiple RNA secondary structures. The inputs of the web server require RNA primary sequences, while corresponding secondary structures are optional. For the primary sequences alone, the web server can compute the secondary structures using free energy minimization algorithm in terms of RNAfold tool from Vienna RNA package. RNA-TVcurve is the first integrated web server, based on an alignment-free method, to deliver a suite of RNA analysis functions, including visualization, mutation analysis and multiple RNAs structure comparison. The comparison results with two popular RNA comparison tools, RNApdist and RNAdistance, showcased that RNA-TVcurve can efficiently capture subtle relationships among RNAs for mutation detection and non-coding RNA classification. All the relevant results were shown in an intuitive graphical manner, and can be freely downloaded from this server. RNA-TVcurve, along with test examples and detailed documents, are available at: http://ml.jlu.edu.cn/tvcurve/ .

  7. An Automatic Registration Algorithm for 3D Maxillofacial Model

    NASA Astrophysics Data System (ADS)

    Qiu, Luwen; Zhou, Zhongwei; Guo, Jixiang; Lv, Jiancheng

    2016-09-01

    3D image registration aims at aligning two 3D data sets in a common coordinate system, which has been widely used in computer vision, pattern recognition and computer assisted surgery. One challenging problem in 3D registration is that point-wise correspondences between two point sets are often unknown apriori. In this work, we develop an automatic algorithm for 3D maxillofacial models registration including facial surface model and skull model. Our proposed registration algorithm can achieve a good alignment result between partial and whole maxillofacial model in spite of ambiguous matching, which has a potential application in the oral and maxillofacial reparative and reconstructive surgery. The proposed algorithm includes three steps: (1) 3D-SIFT features extraction and FPFH descriptors construction; (2) feature matching using SAC-IA; (3) coarse rigid alignment and refinement by ICP. Experiments on facial surfaces and mandible skull models demonstrate the efficiency and robustness of our algorithm.

  8. ExScalibur: A High-Performance Cloud-Enabled Suite for Whole Exome Germline and Somatic Mutation Identification.

    PubMed

    Bao, Riyue; Hernandez, Kyle; Huang, Lei; Kang, Wenjun; Bartom, Elizabeth; Onel, Kenan; Volchenboum, Samuel; Andrade, Jorge

    2015-01-01

    Whole exome sequencing has facilitated the discovery of causal genetic variants associated with human diseases at deep coverage and low cost. In particular, the detection of somatic mutations from tumor/normal pairs has provided insights into the cancer genome. Although there is an abundance of publicly-available software for the detection of germline and somatic variants, concordance is generally limited among variant callers and alignment algorithms. Successful integration of variants detected by multiple methods requires in-depth knowledge of the software, access to high-performance computing resources, and advanced programming techniques. We present ExScalibur, a set of fully automated, highly scalable and modulated pipelines for whole exome data analysis. The suite integrates multiple alignment and variant calling algorithms for the accurate detection of germline and somatic mutations with close to 99% sensitivity and specificity. ExScalibur implements streamlined execution of analytical modules, real-time monitoring of pipeline progress, robust handling of errors and intuitive documentation that allows for increased reproducibility and sharing of results and workflows. It runs on local computers, high-performance computing clusters and cloud environments. In addition, we provide a data analysis report utility to facilitate visualization of the results that offers interactive exploration of quality control files, read alignment and variant calls, assisting downstream customization of potential disease-causing mutations. ExScalibur is open-source and is also available as a public image on Amazon cloud.

  9. Constructing Aligned Assessments Using Automated Test Construction

    ERIC Educational Resources Information Center

    Porter, Andrew; Polikoff, Morgan S.; Barghaus, Katherine M.; Yang, Rui

    2013-01-01

    We describe an innovative automated test construction algorithm for building aligned achievement tests. By incorporating the algorithm into the test construction process, along with other test construction procedures for building reliable and unbiased assessments, the result is much more valid tests than result from current test construction…

  10. Acceleration of the Smith-Waterman algorithm using single and multiple graphics processors

    NASA Astrophysics Data System (ADS)

    Khajeh-Saeed, Ali; Poole, Stephen; Blair Perot, J.

    2010-06-01

    Finding regions of similarity between two very long data streams is a computationally intensive problem referred to as sequence alignment. Alignment algorithms must allow for imperfect sequence matching with different starting locations and some gaps and errors between the two data sequences. Perhaps the most well known application of sequence matching is the testing of DNA or protein sequences against genome databases. The Smith-Waterman algorithm is a method for precisely characterizing how well two sequences can be aligned and for determining the optimal alignment of those two sequences. Like many applications in computational science, the Smith-Waterman algorithm is constrained by the memory access speed and can be accelerated significantly by using graphics processors (GPUs) as the compute engine. In this work we show that effective use of the GPU requires a novel reformulation of the Smith-Waterman algorithm. The performance of this new version of the algorithm is demonstrated using the SSCA#1 (Bioinformatics) benchmark running on one GPU and on up to four GPUs executing in parallel. The results indicate that for large problems a single GPU is up to 45 times faster than a CPU for this application, and the parallel implementation shows linear speed up on up to 4 GPUs.

  11. FLASHFLOOD: A 3D Field-based similarity search and alignment method for flexible molecules

    NASA Astrophysics Data System (ADS)

    Pitman, Michael C.; Huber, Wolfgang K.; Horn, Hans; Krämer, Andreas; Rice, Julia E.; Swope, William C.

    2001-07-01

    A three-dimensional field-based similarity search and alignment method for flexible molecules is introduced. The conformational space of a flexible molecule is represented in terms of fragments and torsional angles of allowed conformations. A user-definable property field is used to compute features of fragment pairs. Features are generalizations of CoMMA descriptors (Silverman, B.D. and Platt, D.E., J. Med. Chem., 39 (1996) 2129.) that characterize local regions of the property field by its local moments. The features are invariant under coordinate system transformations. Features taken from a query molecule are used to form alignments with fragment pairs in the database. An assembly algorithm is then used to merge the fragment pairs into full structures, aligned to the query. Key to the method is the use of a context adaptive descriptor scaling procedure as the basis for similarity. This allows the user to tune the weights of the various feature components based on examples relevant to the particular context under investigation. The property fields may range from simple, phenomenological fields, to fields derived from quantum mechanical calculations. We apply the method to the dihydrofolate/methotrexate benchmark system, and show that when one injects relevant contextual information into the descriptor scaling procedure, better results are obtained more efficiently. We also show how the method works and include computer times for a query from a database that represents approximately 23 million conformers of seventeen flexible molecules.

  12. Evaluation of Laser Based Alignment Algorithms Under Additive Random and Diffraction Noise

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McClay, W A; Awwal, A; Wilhelmsen, K

    2004-09-30

    The purpose of the automatic alignment algorithm at the National Ignition Facility (NIF) is to determine the position of a laser beam based on the position of beam features from video images. The position information obtained is used to command motors and attenuators to adjust the beam lines to the desired position, which facilitates the alignment of all 192 beams. One of the goals of the algorithm development effort is to ascertain the performance, reliability, and uncertainty of the position measurement. This paper describes a method of evaluating the performance of algorithms using Monte Carlo simulation. In particular we showmore » the application of this technique to the LM1{_}LM3 algorithm, which determines the position of a series of two beam light sources. The performance of the algorithm was evaluated for an ensemble of over 900 simulated images with varying image intensities and noise counts, as well as varying diffraction noise amplitude and frequency. The performance of the algorithm on the image data set had a tolerance well beneath the 0.5-pixel system requirement.« less

  13. MultiSETTER: web server for multiple RNA structure comparison.

    PubMed

    Čech, Petr; Hoksza, David; Svozil, Daniel

    2015-08-12

    Understanding the architecture and function of RNA molecules requires methods for comparing and analyzing their tertiary and quaternary structures. While structural superposition of short RNAs is achievable in a reasonable time, large structures represent much bigger challenge. Therefore, we have developed a fast and accurate algorithm for RNA pairwise structure superposition called SETTER and implemented it in the SETTER web server. However, though biological relationships can be inferred by a pairwise structure alignment, key features preserved by evolution can be identified only from a multiple structure alignment. Thus, we extended the SETTER algorithm to the alignment of multiple RNA structures and developed the MultiSETTER algorithm. In this paper, we present the updated version of the SETTER web server that implements a user friendly interface to the MultiSETTER algorithm. The server accepts RNA structures either as the list of PDB IDs or as user-defined PDB files. After the superposition is computed, structures are visualized in 3D and several reports and statistics are generated. To the best of our knowledge, the MultiSETTER web server is the first publicly available tool for a multiple RNA structure alignment. The MultiSETTER server offers the visual inspection of an alignment in 3D space which may reveal structural and functional relationships not captured by other multiple alignment methods based either on a sequence or on secondary structure motifs.

  14. Tensor voting for image correction by global and local intensity alignment.

    PubMed

    Jia, Jiaya; Tang, Chi-Keung

    2005-01-01

    This paper presents a voting method to perform image correction by global and local intensity alignment. The key to our modeless approach is the estimation of global and local replacement functions by reducing the complex estimation problem to the robust 2D tensor voting in the corresponding voting spaces. No complicated model for replacement function (curve) is assumed. Subject to the monotonic constraint only, we vote for an optimal replacement function by propagating the curve smoothness constraint using a dense tensor field. Our method effectively infers missing curve segments and rejects image outliers. Applications using our tensor voting approach are proposed and described. The first application consists of image mosaicking of static scenes, where the voted replacement functions are used in our iterative registration algorithm for computing the best warping matrix. In the presence of occlusion, our replacement function can be employed to construct a visually acceptable mosaic by detecting occlusion which has large and piecewise constant color. Furthermore, by the simultaneous consideration of color matches and spatial constraints in the voting space, we perform image intensity compensation and high contrast image correction using our voting framework, when only two defective input images are given.

  15. icoshift: A versatile tool for the rapid alignment of 1D NMR spectra

    NASA Astrophysics Data System (ADS)

    Savorani, F.; Tomasi, G.; Engelsen, S. B.

    2010-02-01

    The increasing scientific and industrial interest towards metabonomics takes advantage from the high qualitative and quantitative information level of nuclear magnetic resonance (NMR) spectroscopy. However, several chemical and physical factors can affect the absolute and the relative position of an NMR signal and it is not always possible or desirable to eliminate these effects a priori. To remove misalignment of NMR signals a posteriori, several algorithms have been proposed in the literature. The icoshift program presented here is an open source and highly efficient program designed for solving signal alignment problems in metabonomic NMR data analysis. The icoshift algorithm is based on correlation shifting of spectral intervals and employs an FFT engine that aligns all spectra simultaneously. The algorithm is demonstrated to be faster than similar methods found in the literature making full-resolution alignment of large datasets feasible and thus avoiding down-sampling steps such as binning. The algorithm uses missing values as a filling alternative in order to avoid spectral artifacts at the segment boundaries. The algorithm is made open source and the Matlab code including documentation can be downloaded from www.models.life.ku.dk.

  16. Genetic Algorithm Phase Retrieval for the Systematic Image-Based Optical Alignment Testbed

    NASA Technical Reports Server (NTRS)

    Rakoczy, John; Steincamp, James; Taylor, Jaime

    2003-01-01

    A reduced surrogate, one point crossover genetic algorithm with random rank-based selection was used successfully to estimate the multiple phases of a segmented optical system modeled on the seven-mirror Systematic Image-Based Optical Alignment testbed located at NASA's Marshall Space Flight Center.

  17. Multiple nodes transfer alignment for airborne missiles based on inertial sensor network

    NASA Astrophysics Data System (ADS)

    Si, Fan; Zhao, Yan

    2017-09-01

    Transfer alignment is an important initialization method for airborne missiles because the alignment accuracy largely determines the performance of the missile. However, traditional alignment methods are limited by complicated and unknown flexure angle, and cannot meet the actual requirement when wing flexure deformation occurs. To address this problem, we propose a new method that uses the relative navigation parameters between the weapons and fighter to achieve transfer alignment. First, in the relative inertial navigation algorithm, the relative attitudes and positions are constantly computed in wing flexure deformation situations. Secondly, the alignment results of each weapon are processed using a data fusion algorithm to improve the overall performance. Finally, the feasibility and performance of the proposed method were evaluated under two typical types of deformation, and the simulation results demonstrated that the new transfer alignment method is practical and has high-precision.

  18. Triangular Alignment (TAME). A Tensor-based Approach for Higher-order Network Alignment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mohammadi, Shahin; Gleich, David F.; Kolda, Tamara G.

    2015-11-01

    Network alignment is an important tool with extensive applications in comparative interactomics. Traditional approaches aim to simultaneously maximize the number of conserved edges and the underlying similarity of aligned entities. We propose a novel formulation of the network alignment problem that extends topological similarity to higher-order structures and provide a new objective function that maximizes the number of aligned substructures. This objective function corresponds to an integer programming problem, which is NP-hard. Consequently, we approximate this objective function as a surrogate function whose maximization results in a tensor eigenvalue problem. Based on this formulation, we present an algorithm called Triangularmore » AlignMEnt (TAME), which attempts to maximize the number of aligned triangles across networks. We focus on alignment of triangles because of their enrichment in complex networks; however, our formulation and resulting algorithms can be applied to general motifs. Using a case study on the NAPABench dataset, we show that TAME is capable of producing alignments with up to 99% accuracy in terms of aligned nodes. We further evaluate our method by aligning yeast and human interactomes. Our results indicate that TAME outperforms the state-of-art alignment methods both in terms of biological and topological quality of the alignments.« less

  19. Accelerating large-scale protein structure alignments with graphics processing units

    PubMed Central

    2012-01-01

    Background Large-scale protein structure alignment, an indispensable tool to structural bioinformatics, poses a tremendous challenge on computational resources. To ensure structure alignment accuracy and efficiency, efforts have been made to parallelize traditional alignment algorithms in grid environments. However, these solutions are costly and of limited accessibility. Others trade alignment quality for speedup by using high-level characteristics of structure fragments for structure comparisons. Findings We present ppsAlign, a parallel protein structure Alignment framework designed and optimized to exploit the parallelism of Graphics Processing Units (GPUs). As a general-purpose GPU platform, ppsAlign could take many concurrent methods, such as TM-align and Fr-TM-align, into the parallelized algorithm design. We evaluated ppsAlign on an NVIDIA Tesla C2050 GPU card, and compared it with existing software solutions running on an AMD dual-core CPU. We observed a 36-fold speedup over TM-align, a 65-fold speedup over Fr-TM-align, and a 40-fold speedup over MAMMOTH. Conclusions ppsAlign is a high-performance protein structure alignment tool designed to tackle the computational complexity issues from protein structural data. The solution presented in this paper allows large-scale structure comparisons to be performed using massive parallel computing power of GPU. PMID:22357132

  20. An efficient grid layout algorithm for biological networks utilizing various biological attributes

    PubMed Central

    Kojima, Kaname; Nagasaki, Masao; Jeong, Euna; Kato, Mitsuru; Miyano, Satoru

    2007-01-01

    Background Clearly visualized biopathways provide a great help in understanding biological systems. However, manual drawing of large-scale biopathways is time consuming. We proposed a grid layout algorithm that can handle gene-regulatory networks and signal transduction pathways by considering edge-edge crossing, node-edge crossing, distance measure between nodes, and subcellular localization information from Gene Ontology. Consequently, the layout algorithm succeeded in drastically reducing these crossings in the apoptosis model. However, for larger-scale networks, we encountered three problems: (i) the initial layout is often very far from any local optimum because nodes are initially placed at random, (ii) from a biological viewpoint, human layouts still exceed automatic layouts in understanding because except subcellular localization, it does not fully utilize biological information of pathways, and (iii) it employs a local search strategy in which the neighborhood is obtained by moving one node at each step, and automatic layouts suggest that simultaneous movements of multiple nodes are necessary for better layouts, while such extension may face worsening the time complexity. Results We propose a new grid layout algorithm. To address problem (i), we devised a new force-directed algorithm whose output is suitable as the initial layout. For (ii), we considered that an appropriate alignment of nodes having the same biological attribute is one of the most important factors of the comprehension, and we defined a new score function that gives an advantage to such configurations. For solving problem (iii), we developed a search strategy that considers swapping nodes as well as moving a node, while keeping the order of the time complexity. Though a naïve implementation increases by one order, the time complexity, we solved this difficulty by devising a method that caches differences between scores of a layout and its possible updates. Conclusion Layouts of the new grid layout algorithm are compared with that of the previous algorithm and human layout in an endothelial cell model, three times as large as the apoptosis model. The total cost of the result from the new grid layout algorithm is similar to that of the human layout. In addition, its convergence time is drastically reduced (40% reduction). PMID:17338825

  1. Some aspects of SR beamline alignment

    NASA Astrophysics Data System (ADS)

    Gaponov, Yu. A.; Cerenius, Y.; Nygaard, J.; Ursby, T.; Larsson, K.

    2011-09-01

    Based on the Synchrotron Radiation (SR) beamline optical element-by-element alignment with analysis of the alignment results an optimized beamline alignment algorithm has been designed and developed. The alignment procedures have been designed and developed for the MAX-lab I911-4 fixed energy beamline. It has been shown that the intermediate information received during the monochromator alignment stage can be used for the correction of both monochromator and mirror without the next stages of alignment of mirror, slits, sample holder, etc. Such an optimization of the beamline alignment procedures decreases the time necessary for the alignment and becomes useful and helpful in the case of any instability of the beamline optical elements, storage ring electron orbit or the wiggler insertion device, which could result in the instability of angular and positional parameters of the SR beam. A general purpose software package for manual, semi-automatic and automatic SR beamline alignment has been designed and developed using the developed algorithm. The TANGO control system is used as the middle-ware between the stand-alone beamline control applications BLTools, BPMonitor and the beamline equipment.

  2. Assessing the statistical robustness of inter- and intra-basinal carbon isotope chemostratigraphic correlation

    NASA Astrophysics Data System (ADS)

    Hay, C.; Creveling, J. R.; Huybers, P. J.

    2016-12-01

    Excursions in the stable carbon isotopic composition of carbonate rocks (δ13Ccarb) can facilitate correlation of Precambrian and Phanerozoic sedimentary successions at a higher temporal resolution than radiometric and biostratigraphic frameworks typically afford. Within the bounds of litho- and biostratigraphic constraints, stratigraphers often correlate isotopic patterns between distant stratigraphic sections through visual alignment of local maxima and minima of isotopic values. The reproducibility of this method can prove challenging and, thus, evaluating the statistical robustness of intrabasinal composite carbon isotope curves, and global correlations to these reference curves, remains difficult. To assess the reproducibility of stratigraphic alignment of δ13Ccarb data, and correlations between carbon isotope excursions, we employ a numerical dynamic time warping methodology that stretches and squeezes the time axis of a record to obtain an optimal correlation (in a least-squares sense) between time-uncertain series of data. In particular, we assess various alignments between series of Early Cambrian δ13Ccarb data with respect to plausible matches. We first show that an alignment of these records obtained visually, and published previously, is broadly reproducible using dynamic time warping. Alternative alignments with similar goodness of fits are also obtainable, and their stratigraphic plausibility are discussed. This approach should be generalizable to an algorithm for the purposes of developing a library of plausible alignments between multiple time-uncertain stratigraphic records.

  3. Infrastructure-Free Mapping and Localization for Tunnel-Based Rail Applications Using 2D Lidar

    NASA Astrophysics Data System (ADS)

    Daoust, Tyler

    This thesis presents an infrastructure-free mapping and localization framework for rail vehicles using only a lidar sensor. The method was designed to handle modern underground tunnels: narrow, parallel, and relatively smooth concrete walls. A sliding-window algorithm was developed to estimate the train's motion, using a Renyi's Quadratic Entropy (RQE)-based point-cloud alignment system. The method was tested with datasets gathered on a subway train travelling at high speeds, with 75 km of data across 14 runs, simulating 500 km of localization. The system was capable of mapping with an average error of less than 0.6 % by distance. It was capable of continuously localizing, relative to the map, to within 10 cm in stations and at crossovers, and 2.3 m in pathological sections of tunnel. This work has the potential to improve train localization in a tunnel, which can be used to increase capacity and for automation purposes.

  4. Object recognition and localization from 3D point clouds by maximum-likelihood estimation

    NASA Astrophysics Data System (ADS)

    Dantanarayana, Harshana G.; Huntley, Jonathan M.

    2017-08-01

    We present an algorithm based on maximum-likelihood analysis for the automated recognition of objects, and estimation of their pose, from 3D point clouds. Surfaces segmented from depth images are used as the features, unlike `interest point'-based algorithms which normally discard such data. Compared to the 6D Hough transform, it has negligible memory requirements, and is computationally efficient compared to iterative closest point algorithms. The same method is applicable to both the initial recognition/pose estimation problem as well as subsequent pose refinement through appropriate choice of the dispersion of the probability density functions. This single unified approach therefore avoids the usual requirement for different algorithms for these two tasks. In addition to the theoretical description, a simple 2 degrees of freedom (d.f.) example is given, followed by a full 6 d.f. analysis of 3D point cloud data from a cluttered scene acquired by a projected fringe-based scanner, which demonstrated an RMS alignment error as low as 0.3 mm.

  5. Experience from the in-flight calibration of the Extreme Ultraviolet Explorer (EUVE) and Upper Atmosphere Research Satellite (UARS) fixed head star trackers (FHSTs)

    NASA Technical Reports Server (NTRS)

    Lee, Michael

    1995-01-01

    Since the original post-launch calibration of the FHSTs (Fixed Head Star Trackers) on EUVE (Extreme Ultraviolet Explorer) and UARS (Upper Atmosphere Research Satellite), the Flight Dynamics task has continued to analyze the FHST performance. The algorithm used for inflight alignment of spacecraft sensors is described and the equations for the errors in the relative alignment for the simple 2 star tracker case are shown. Simulated data and real data are used to compute the covariance of the relative alignment errors. Several methods for correcting the alignment are compared and results analyzed. The specific problems seen on orbit with UARS and EUVE are then discussed. UARS has experienced anomalous tracker performance on an FHST resulting in continuous variation in apparent tracker alignment. On EUVE, the FHST residuals from the attitude determination algorithm showed a dependence on the direction of roll during survey mode. This dependence is traced back to time tagging errors and the original post launch alignment is found to be in error due to the impact of the time tagging errors on the alignment algorithm. The methods used by the FDF (Flight Dynamics Facility) to correct for these problems is described.

  6. OPTIMA: sensitive and accurate whole-genome alignment of error-prone genomic maps by combinatorial indexing and technology-agnostic statistical analysis.

    PubMed

    Verzotto, Davide; M Teo, Audrey S; Hillmer, Axel M; Nagarajan, Niranjan

    2016-01-01

    Resolution of complex repeat structures and rearrangements in the assembly and analysis of large eukaryotic genomes is often aided by a combination of high-throughput sequencing and genome-mapping technologies (for example, optical restriction mapping). In particular, mapping technologies can generate sparse maps of large DNA fragments (150 kilo base pairs (kbp) to 2 Mbp) and thus provide a unique source of information for disambiguating complex rearrangements in cancer genomes. Despite their utility, combining high-throughput sequencing and mapping technologies has been challenging because of the lack of efficient and sensitive map-alignment algorithms for robustly aligning error-prone maps to sequences. We introduce a novel seed-and-extend glocal (short for global-local) alignment method, OPTIMA (and a sliding-window extension for overlap alignment, OPTIMA-Overlap), which is the first to create indexes for continuous-valued mapping data while accounting for mapping errors. We also present a novel statistical model, agnostic with respect to technology-dependent error rates, for conservatively evaluating the significance of alignments without relying on expensive permutation-based tests. We show that OPTIMA and OPTIMA-Overlap outperform other state-of-the-art approaches (1.6-2 times more sensitive) and are more efficient (170-200 %) and precise in their alignments (nearly 99 % precision). These advantages are independent of the quality of the data, suggesting that our indexing approach and statistical evaluation are robust, provide improved sensitivity and guarantee high precision.

  7. Attitude algorithm and initial alignment method for SINS applied in short-range aircraft

    NASA Astrophysics Data System (ADS)

    Zhang, Rong-Hui; He, Zhao-Cheng; You, Feng; Chen, Bo

    2017-07-01

    This paper presents an attitude solution algorithm based on the Micro-Electro-Mechanical System and quaternion method. We completed the numerical calculation and engineering practice by adopting fourth-order Runge-Kutta algorithm in the digital signal processor. The state space mathematical model of initial alignment in static base was established, and the initial alignment method based on Kalman filter was proposed. Based on the hardware in the loop simulation platform, the short-range flight simulation test and the actual flight test were carried out. The results show that the error of pitch, yaw and roll angle is fast convergent, and the fitting rate between flight simulation and flight test is more than 85%.

  8. A portable foot-parameter-extracting system

    NASA Astrophysics Data System (ADS)

    Zhang, MingKai; Liang, Jin; Li, Wenpan; Liu, Shifan

    2016-03-01

    In order to solve the problem of automatic foot measurement in garment customization, a new automatic footparameter- extracting system based on stereo vision, photogrammetry and heterodyne multiple frequency phase shift technology is proposed and implemented. The key technologies applied in the system are studied, including calibration of projector, alignment of point clouds, and foot measurement. Firstly, a new projector calibration algorithm based on plane model has been put forward to get the initial calibration parameters and a feature point detection scheme of calibration board image is developed. Then, an almost perfect match of two clouds is achieved by performing a first alignment using the Sampled Consensus - Initial Alignment algorithm (SAC-IA) and refining the alignment using the Iterative Closest Point algorithm (ICP). Finally, the approaches used for foot-parameterextracting and the system scheme are presented in detail. Experimental results show that the RMS error of the calibration result is 0.03 pixel and the foot parameter extracting experiment shows the feasibility of the extracting algorithm. Compared with the traditional measurement method, the system can be more portable, accurate and robust.

  9. An extensive assessment of network alignment algorithms for comparison of brain connectomes.

    PubMed

    Milano, Marianna; Guzzi, Pietro Hiram; Tymofieva, Olga; Xu, Duan; Hess, Christofer; Veltri, Pierangelo; Cannataro, Mario

    2017-06-06

    Recently the study of the complex system of connections in neural systems, i.e. the connectome, has gained a central role in neurosciences. The modeling and analysis of connectomes are therefore a growing area. Here we focus on the representation of connectomes by using graph theory formalisms. Macroscopic human brain connectomes are usually derived from neuroimages; the analyzed brains are co-registered in the image domain and brought to a common anatomical space. An atlas is then applied in order to define anatomically meaningful regions that will serve as the nodes of the network - this process is referred to as parcellation. The atlas-based parcellations present some known limitations in cases of early brain development and abnormal anatomy. Consequently, it has been recently proposed to perform atlas-free random brain parcellation into nodes and align brains in the network space instead of the anatomical image space, as a way to deal with the unknown correspondences of the parcels. Such process requires modeling of the brain using graph theory and the subsequent comparison of the structure of graphs. The latter step may be modeled as a network alignment (NA) problem. In this work, we first define the problem formally, then we test six existing state of the art of network aligners on diffusion MRI-derived brain networks. We compare the performances of algorithms by assessing six topological measures. We also evaluated the robustness of algorithms to alterations of the dataset. The results confirm that NA algorithms may be applied in cases of atlas-free parcellation for a fully network-driven comparison of connectomes. The analysis shows MAGNA++ is the best global alignment algorithm. The paper presented a new analysis methodology that uses network alignment for validating atlas-free parcellation brain connectomes. The methodology has been experimented on several brain datasets.

  10. DR-TAMAS: Diffeomorphic Registration for Tensor Accurate alignMent of Anatomical Structures

    PubMed Central

    Irfanoglu, M. Okan; Nayak, Amritha; Jenkins, Jeffrey; Hutchinson, Elizabeth B.; Sadeghi, Neda; Thomas, Cibu P.; Pierpaoli, Carlo

    2016-01-01

    In this work, we propose DR-TAMAS (Diffeomorphic Registration for Tensor Accurate alignMent of Anatomical Structures), a novel framework for intersubject registration of Diffusion Tensor Imaging (DTI) data sets. This framework is optimized for brain data and its main goal is to achieve an accurate alignment of all brain structures, including white matter (WM), gray matter (GM), and spaces containing cerebrospinal fluid (CSF). Currently most DTI-based spatial normalization algorithms emphasize alignment of anisotropic structures. While some diffusion-derived metrics, such as diffusion anisotropy and tensor eigenvector orientation, are highly informative for proper alignment of WM, other tensor metrics such as the trace or mean diffusivity (MD) are fundamental for a proper alignment of GM and CSF boundaries. Moreover, it is desirable to include information from structural MRI data, e.g., T1-weighted or T2-weighted images, which are usually available together with the diffusion data. The fundamental property of DR-TAMAS is to achieve global anatomical accuracy by incorporating in its cost function the most informative metrics locally. Another important feature of DR-TAMAS is a symmetric time-varying velocity-based transformation model, which enables it to account for potentially large anatomical variability in healthy subjects and patients. The performance of DR-TAMAS is evaluated with several data sets and compared with other widely-used diffeomorphic image registration techniques employing both full tensor information and/or DTI-derived scalar maps. Our results show that the proposed method has excellent overall performance in the entire brain, while being equivalent to the best existing methods in WM. PMID:26931817

  11. Rapid Quantification of 3D Collagen Fiber Alignment and Fiber Intersection Correlations with High Sensitivity

    PubMed Central

    Sun, Meng; Bloom, Alexander B.; Zaman, Muhammad H.

    2015-01-01

    Metastatic cancers aggressively reorganize collagen in their microenvironment. For example, radially orientated collagen fibers have been observed surrounding tumor cell clusters in vivo. The degree of fiber alignment, as a consequence of this remodeling, has often been difficult to quantify. In this paper, we present an easy to implement algorithm for accurate detection of collagen fiber orientation in a rapid pixel-wise manner. This algorithm quantifies the alignment of both computer generated and actual collagen fiber networks of varying degrees of alignment within 5°°. We also present an alternative easy method to calculate the alignment index directly from the standard deviation of fiber orientation. Using this quantitative method for determining collagen alignment, we demonstrate that the number of collagen fiber intersections has a negative correlation with the degree of fiber alignment. This decrease in intersections of aligned fibers could explain why cells move more rapidly along aligned fibers than unaligned fibers, as previously reported. Overall, our paper provides an easier, more quantitative and quicker way to quantify fiber orientation and alignment, and presents a platform in studying effects of matrix and cellular properties on fiber alignment in complex 3D environments. PMID:26158674

  12. Improved alignment evaluation and optimization : final report.

    DOT National Transportation Integrated Search

    2007-09-11

    This report outlines the development of an enhanced highway alignment evaluation and optimization : model. A GIS-based software tool is prepared for alignment optimization that uses genetic algorithms for : optimal search. The software is capable of ...

  13. Phase Retrieval Using a Genetic Algorithm on the Systematic Image-Based Optical Alignment Testbed

    NASA Technical Reports Server (NTRS)

    Taylor, Jaime R.

    2003-01-01

    NASA s Marshall Space Flight Center s Systematic Image-Based Optical Alignment (SIBOA) Testbed was developed to test phase retrieval algorithms and hardware techniques. Individuals working with the facility developed the idea of implementing phase retrieval by breaking the determination of the tip/tilt of each mirror apart from the piston motion (or translation) of each mirror. Presented in this report is an algorithm that determines the optimal phase correction associated only with the piston motion of the mirrors. A description of the Phase Retrieval problem is first presented. The Systematic Image-Based Optical Alignment (SIBOA) Testbeb is then described. A Discrete Fourier Transform (DFT) is necessary to transfer the incoming wavefront (or estimate of phase error) into the spatial frequency domain to compare it with the image. A method for reducing the DFT to seven scalar/matrix multiplications is presented. A genetic algorithm is then used to search for the phase error. The results of this new algorithm on a test problem are presented.

  14. BiPACE 2D--graph-based multiple alignment for comprehensive 2D gas chromatography-mass spectrometry.

    PubMed

    Hoffmann, Nils; Wilhelm, Mathias; Doebbe, Anja; Niehaus, Karsten; Stoye, Jens

    2014-04-01

    Comprehensive 2D gas chromatography-mass spectrometry is an established method for the analysis of complex mixtures in analytical chemistry and metabolomics. It produces large amounts of data that require semiautomatic, but preferably automatic handling. This involves the location of significant signals (peaks) and their matching and alignment across different measurements. To date, there exist only a few openly available algorithms for the retention time alignment of peaks originating from such experiments that scale well with increasing sample and peak numbers, while providing reliable alignment results. We describe BiPACE 2D, an automated algorithm for retention time alignment of peaks from 2D gas chromatography-mass spectrometry experiments and evaluate it on three previously published datasets against the mSPA, SWPA and Guineu algorithms. We also provide a fourth dataset from an experiment studying the H2 production of two different strains of Chlamydomonas reinhardtii that is available from the MetaboLights database together with the experimental protocol, peak-detection results and manually curated multiple peak alignment for future comparability with newly developed algorithms. BiPACE 2D is contained in the freely available Maltcms framework, version 1.3, hosted at http://maltcms.sf.net, under the terms of the L-GPL v3 or Eclipse Open Source licenses. The software used for the evaluation along with the underlying datasets is available at the same location. The C.reinhardtii dataset is freely available at http://www.ebi.ac.uk/metabolights/MTBLS37.

  15. QuickProbs 2: Towards rapid construction of high-quality alignments of large protein families

    PubMed Central

    Gudyś, Adam; Deorowicz, Sebastian

    2017-01-01

    The ever-increasing size of sequence databases caused by the development of high throughput sequencing, poses to multiple alignment algorithms one of the greatest challenges yet. As we show, well-established techniques employed for increasing alignment quality, i.e., refinement and consistency, are ineffective when large protein families are investigated. We present QuickProbs 2, an algorithm for multiple sequence alignment. Based on probabilistic models, equipped with novel column-oriented refinement and selective consistency, it offers outstanding accuracy. When analysing hundreds of sequences, Quick-Probs 2 is noticeably better than ClustalΩ and MAFFT, the previous leaders for processing numerous protein families. In the case of smaller sets, for which consistency-based methods are the best performing, QuickProbs 2 is also superior to the competitors. Due to low computational requirements of selective consistency and utilization of massively parallel architectures, presented algorithm has similar execution times to ClustalΩ, and is orders of magnitude faster than full consistency approaches, like MSAProbs or PicXAA. All these make QuickProbs 2 an excellent tool for aligning families ranging from few, to hundreds of proteins. PMID:28139687

  16. Citation Matching in Sanskrit Corpora Using Local Alignment

    NASA Astrophysics Data System (ADS)

    Prasad, Abhinandan S.; Rao, Shrisha

    Citation matching is the problem of finding which citation occurs in a given textual corpus. Most existing citation matching work is done on scientific literature. The goal of this paper is to present methods for performing citation matching on Sanskrit texts. Exact matching and approximate matching are the two methods for performing citation matching. The exact matching method checks for exact occurrence of the citation with respect to the textual corpus. Approximate matching is a fuzzy string-matching method which computes a similarity score between an individual line of the textual corpus and the citation. The Smith-Waterman-Gotoh algorithm for local alignment, which is generally used in bioinformatics, is used here for calculating the similarity score. This similarity score is a measure of the closeness between the text and the citation. The exact- and approximate-matching methods are evaluated and compared. The methods presented can be easily applied to corpora in other Indic languages like Kannada, Tamil, etc. The approximate-matching method can in particular be used in the compilation of critical editions and plagiarism detection in a literary work.

  17. Modular and configurable optimal sequence alignment software: Cola.

    PubMed

    Zamani, Neda; Sundström, Görel; Höppner, Marc P; Grabherr, Manfred G

    2014-01-01

    The fundamental challenge in optimally aligning homologous sequences is to define a scoring scheme that best reflects the underlying biological processes. Maximising the overall number of matches in the alignment does not always reflect the patterns by which nucleotides mutate. Efficiently implemented algorithms that can be parameterised to accommodate more complex non-linear scoring schemes are thus desirable. We present Cola, alignment software that implements different optimal alignment algorithms, also allowing for scoring contiguous matches of nucleotides in a nonlinear manner. The latter places more emphasis on short, highly conserved motifs, and less on the surrounding nucleotides, which can be more diverged. To illustrate the differences, we report results from aligning 14,100 sequences from 3' untranslated regions of human genes to 25 of their mammalian counterparts, where we found that a nonlinear scoring scheme is more consistent than a linear scheme in detecting short, conserved motifs. Cola is freely available under LPGL from https://github.com/nedaz/cola.

  18. Fast and accurate phylogeny reconstruction using filtered spaced-word matches

    PubMed Central

    Sohrabi-Jahromi, Salma; Morgenstern, Burkhard

    2017-01-01

    Abstract Motivation: Word-based or ‘alignment-free’ algorithms are increasingly used for phylogeny reconstruction and genome comparison, since they are much faster than traditional approaches that are based on full sequence alignments. Existing alignment-free programs, however, are less accurate than alignment-based methods. Results: We propose Filtered Spaced Word Matches (FSWM), a fast alignment-free approach to estimate phylogenetic distances between large genomic sequences. For a pre-defined binary pattern of match and don’t-care positions, FSWM rapidly identifies spaced word-matches between input sequences, i.e. gap-free local alignments with matching nucleotides at the match positions and with mismatches allowed at the don’t-care positions. We then estimate the number of nucleotide substitutions per site by considering the nucleotides aligned at the don’t-care positions of the identified spaced-word matches. To reduce the noise from spurious random matches, we use a filtering procedure where we discard all spaced-word matches for which the overall similarity between the aligned segments is below a threshold. We show that our approach can accurately estimate substitution frequencies even for distantly related sequences that cannot be analyzed with existing alignment-free methods; phylogenetic trees constructed with FSWM distances are of high quality. A program run on a pair of eukaryotic genomes of a few hundred Mb each takes a few minutes. Availability and Implementation: The program source code for FSWM including a documentation, as well as the software that we used to generate artificial genome sequences are freely available at http://fswm.gobics.de/ Contact: chris.leimeister@stud.uni-goettingen.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28073754

  19. Fast and accurate phylogeny reconstruction using filtered spaced-word matches.

    PubMed

    Leimeister, Chris-André; Sohrabi-Jahromi, Salma; Morgenstern, Burkhard

    2017-04-01

    Word-based or 'alignment-free' algorithms are increasingly used for phylogeny reconstruction and genome comparison, since they are much faster than traditional approaches that are based on full sequence alignments. Existing alignment-free programs, however, are less accurate than alignment-based methods. We propose Filtered Spaced Word Matches (FSWM) , a fast alignment-free approach to estimate phylogenetic distances between large genomic sequences. For a pre-defined binary pattern of match and don't-care positions, FSWM rapidly identifies spaced word-matches between input sequences, i.e. gap-free local alignments with matching nucleotides at the match positions and with mismatches allowed at the don't-care positions. We then estimate the number of nucleotide substitutions per site by considering the nucleotides aligned at the don't-care positions of the identified spaced-word matches. To reduce the noise from spurious random matches, we use a filtering procedure where we discard all spaced-word matches for which the overall similarity between the aligned segments is below a threshold. We show that our approach can accurately estimate substitution frequencies even for distantly related sequences that cannot be analyzed with existing alignment-free methods; phylogenetic trees constructed with FSWM distances are of high quality. A program run on a pair of eukaryotic genomes of a few hundred Mb each takes a few minutes. The program source code for FSWM including a documentation, as well as the software that we used to generate artificial genome sequences are freely available at http://fswm.gobics.de/. chris.leimeister@stud.uni-goettingen.de. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.

  20. Application of Improved 5th-Cubature Kalman Filter in Initial Strapdown Inertial Navigation System Alignment for Large Misalignment Angles.

    PubMed

    Wang, Wei; Chen, Xiyuan

    2018-02-23

    In view of the fact the accuracy of the third-degree Cubature Kalman Filter (CKF) used for initial alignment under large misalignment angle conditions is insufficient, an improved fifth-degree CKF algorithm is proposed in this paper. In order to make full use of the innovation on filtering, the innovation covariance matrix is calculated recursively by an innovative sequence with an exponent fading factor. Then a new adaptive error covariance matrix scaling algorithm is proposed. The Singular Value Decomposition (SVD) method is used for improving the numerical stability of the fifth-degree CKF in this paper. In order to avoid the overshoot caused by excessive scaling of error covariance matrix during the convergence stage, the scaling scheme is terminated when the gradient of azimuth reaches the maximum. The experimental results show that the improved algorithm has better alignment accuracy with large misalignment angles than the traditional algorithm.

  1. Automated alignment system for optical wireless communication systems using image recognition.

    PubMed

    Brandl, Paul; Weiss, Alexander; Zimmermann, Horst

    2014-07-01

    In this Letter, we describe the realization of a tracked line-of-sight optical wireless communication system for indoor data distribution. We built a laser-based transmitter with adaptive focus and ray steering by a microelectromechanical systems mirror. To execute the alignment procedure, we used a CMOS image sensor at the transmitter side and developed an algorithm for image recognition to localize the receiver's position. The receiver is based on a self-developed optoelectronic integrated chip with low requirements on the receiver optics to make the system economically attractive. With this system, we were able to set up the communication link automatically without any back channel and to perform error-free (bit error rate <10⁻⁹) data transmission over a distance of 3.5 m with a data rate of 3 Gbit/s.

  2. Alignment and integration of complex networks by hypergraph-based spectral clustering

    NASA Astrophysics Data System (ADS)

    Michoel, Tom; Nachtergaele, Bruno

    2012-11-01

    Complex networks possess a rich, multiscale structure reflecting the dynamical and functional organization of the systems they model. Often there is a need to analyze multiple networks simultaneously, to model a system by more than one type of interaction, or to go beyond simple pairwise interactions, but currently there is a lack of theoretical and computational methods to address these problems. Here we introduce a framework for clustering and community detection in such systems using hypergraph representations. Our main result is a generalization of the Perron-Frobenius theorem from which we derive spectral clustering algorithms for directed and undirected hypergraphs. We illustrate our approach with applications for local and global alignment of protein-protein interaction networks between multiple species, for tripartite community detection in folksonomies, and for detecting clusters of overlapping regulatory pathways in directed networks.

  3. Alignment and integration of complex networks by hypergraph-based spectral clustering.

    PubMed

    Michoel, Tom; Nachtergaele, Bruno

    2012-11-01

    Complex networks possess a rich, multiscale structure reflecting the dynamical and functional organization of the systems they model. Often there is a need to analyze multiple networks simultaneously, to model a system by more than one type of interaction, or to go beyond simple pairwise interactions, but currently there is a lack of theoretical and computational methods to address these problems. Here we introduce a framework for clustering and community detection in such systems using hypergraph representations. Our main result is a generalization of the Perron-Frobenius theorem from which we derive spectral clustering algorithms for directed and undirected hypergraphs. We illustrate our approach with applications for local and global alignment of protein-protein interaction networks between multiple species, for tripartite community detection in folksonomies, and for detecting clusters of overlapping regulatory pathways in directed networks.

  4. An atomistic fingerprint algorithm for learning ab initio molecular force fields

    NASA Astrophysics Data System (ADS)

    Tang, Yu-Hang; Zhang, Dongkun; Karniadakis, George Em

    2018-01-01

    Molecular fingerprints, i.e., feature vectors describing atomistic neighborhood configurations, is an important abstraction and a key ingredient for data-driven modeling of potential energy surface and interatomic force. In this paper, we present the density-encoded canonically aligned fingerprint algorithm, which is robust and efficient, for fitting per-atom scalar and vector quantities. The fingerprint is essentially a continuous density field formed through the superimposition of smoothing kernels centered on the atoms. Rotational invariance of the fingerprint is achieved by aligning, for each fingerprint instance, the neighboring atoms onto a local canonical coordinate frame computed from a kernel minisum optimization procedure. We show that this approach is superior over principal components analysis-based methods especially when the atomistic neighborhood is sparse and/or contains symmetry. We propose that the "distance" between the density fields be measured using a volume integral of their pointwise difference. This can be efficiently computed using optimal quadrature rules, which only require discrete sampling at a small number of grid points. We also experiment on the choice of weight functions for constructing the density fields and characterize their performance for fitting interatomic potentials. The applicability of the fingerprint is demonstrated through a set of benchmark problems.

  5. Depth map occlusion filling and scene reconstruction using modified exemplar-based inpainting

    NASA Astrophysics Data System (ADS)

    Voronin, V. V.; Marchuk, V. I.; Fisunov, A. V.; Tokareva, S. V.; Egiazarian, K. O.

    2015-03-01

    RGB-D sensors are relatively inexpensive and are commercially available off-the-shelf. However, owing to their low complexity, there are several artifacts that one encounters in the depth map like holes, mis-alignment between the depth and color image and lack of sharp object boundaries in the depth map. Depth map generated by Kinect cameras also contain a significant amount of missing pixels and strong noise, limiting their usability in many computer vision applications. In this paper, we present an efficient hole filling and damaged region restoration method that improves the quality of the depth maps obtained with the Microsoft Kinect device. The proposed approach is based on a modified exemplar-based inpainting and LPA-ICI filtering by exploiting the correlation between color and depth values in local image neighborhoods. As a result, edges of the objects are sharpened and aligned with the objects in the color image. Several examples considered in this paper show the effectiveness of the proposed approach for large holes removal as well as recovery of small regions on several test images of depth maps. We perform a comparative study and show that statistically, the proposed algorithm delivers superior quality results compared to existing algorithms.

  6. Clustalnet: the joining of Clustal and CORBA.

    PubMed

    Campagne, F

    2000-07-01

    Performing sequence alignment operations from a different program than the original sequence alignment code, and/or through a network connection, is often required. Interactive alignment editors and large-scale biological data analysis are common examples where such a flexibility is important. Interoperability between the alignment engine and the client should be obtained regardless of the architectures and programming languages of the server and client. Clustalnet, a Clustal alignment CORBA server is described, which was developed on the basis of Clustalw. This server brings the robustness of the algorithms and implementations of Clustal to a new level of reuse. A Clustalnet server object can be accessed from a program, transparently through the network. We present interfaces to perform the alignment operations and to control these operations via immutable contexts. The interfaces that select the contexts do not depend on the nature of the operation to be performed, making the design modular. The IDL interfaces presented here are not specific to Clustal and can be implemented on top of different sequence alignment algorithm implementations.

  7. SPARSE: quadratic time simultaneous alignment and folding of RNAs without sequence-based heuristics.

    PubMed

    Will, Sebastian; Otto, Christina; Miladi, Milad; Möhl, Mathias; Backofen, Rolf

    2015-08-01

    RNA-Seq experiments have revealed a multitude of novel ncRNAs. The gold standard for their analysis based on simultaneous alignment and folding suffers from extreme time complexity of [Formula: see text]. Subsequently, numerous faster 'Sankoff-style' approaches have been suggested. Commonly, the performance of such methods relies on sequence-based heuristics that restrict the search space to optimal or near-optimal sequence alignments; however, the accuracy of sequence-based methods breaks down for RNAs with sequence identities below 60%. Alignment approaches like LocARNA that do not require sequence-based heuristics, have been limited to high complexity ([Formula: see text] quartic time). Breaking this barrier, we introduce the novel Sankoff-style algorithm 'sparsified prediction and alignment of RNAs based on their structure ensembles (SPARSE)', which runs in quadratic time without sequence-based heuristics. To achieve this low complexity, on par with sequence alignment algorithms, SPARSE features strong sparsification based on structural properties of the RNA ensembles. Following PMcomp, SPARSE gains further speed-up from lightweight energy computation. Although all existing lightweight Sankoff-style methods restrict Sankoff's original model by disallowing loop deletions and insertions, SPARSE transfers the Sankoff algorithm to the lightweight energy model completely for the first time. Compared with LocARNA, SPARSE achieves similar alignment and better folding quality in significantly less time (speedup: 3.7). At similar run-time, it aligns low sequence identity instances substantially more accurate than RAF, which uses sequence-based heuristics. © The Author 2015. Published by Oxford University Press.

  8. Tensor Rank Preserving Discriminant Analysis for Facial Recognition.

    PubMed

    Tao, Dapeng; Guo, Yanan; Li, Yaotang; Gao, Xinbo

    2017-10-12

    Facial recognition, one of the basic topics in computer vision and pattern recognition, has received substantial attention in recent years. However, for those traditional facial recognition algorithms, the facial images are reshaped to a long vector, thereby losing part of the original spatial constraints of each pixel. In this paper, a new tensor-based feature extraction algorithm termed tensor rank preserving discriminant analysis (TRPDA) for facial image recognition is proposed; the proposed method involves two stages: in the first stage, the low-dimensional tensor subspace of the original input tensor samples was obtained; in the second stage, discriminative locality alignment was utilized to obtain the ultimate vector feature representation for subsequent facial recognition. On the one hand, the proposed TRPDA algorithm fully utilizes the natural structure of the input samples, and it applies an optimization criterion that can directly handle the tensor spectral analysis problem, thereby decreasing the computation cost compared those traditional tensor-based feature selection algorithms. On the other hand, the proposed TRPDA algorithm extracts feature by finding a tensor subspace that preserves most of the rank order information of the intra-class input samples. Experiments on the three facial databases are performed here to determine the effectiveness of the proposed TRPDA algorithm.

  9. Automatic laser beam alignment using blob detection for an environment monitoring spectroscopy

    NASA Astrophysics Data System (ADS)

    Khidir, Jarjees; Chen, Youhua; Anderson, Gary

    2013-05-01

    This paper describes a fully automated system to align an infra-red laser beam with a small retro-reflector over a wide range of distances. The component development and test were especially used for an open-path spectrometer gas detection system. Using blob detection under OpenCV library, an automatic alignment algorithm was designed to achieve fast and accurate target detection in a complex background environment. Test results are presented to show that the proposed algorithm has been successfully applied to various target distances and environment conditions.

  10. HubAlign: an accurate and efficient method for global alignment of protein-protein interaction networks.

    PubMed

    Hashemifar, Somaye; Xu, Jinbo

    2014-09-01

    High-throughput experimental techniques have produced a large amount of protein-protein interaction (PPI) data. The study of PPI networks, such as comparative analysis, shall benefit the understanding of life process and diseases at the molecular level. One way of comparative analysis is to align PPI networks to identify conserved or species-specific subnetwork motifs. A few methods have been developed for global PPI network alignment, but it still remains challenging in terms of both accuracy and efficiency. This paper presents a novel global network alignment algorithm, denoted as HubAlign, that makes use of both network topology and sequence homology information, based upon the observation that topologically important proteins in a PPI network usually are much more conserved and thus, more likely to be aligned. HubAlign uses a minimum-degree heuristic algorithm to estimate the topological and functional importance of a protein from the global network topology information. Then HubAlign aligns topologically important proteins first and gradually extends the alignment to the whole network. Extensive tests indicate that HubAlign greatly outperforms several popular methods in terms of both accuracy and efficiency, especially in detecting functionally similar proteins. HubAlign is available freely for non-commercial purposes at http://ttic.uchicago.edu/∼hashemifar/software/HubAlign.zip. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.

  11. ExScalibur: A High-Performance Cloud-Enabled Suite for Whole Exome Germline and Somatic Mutation Identification

    PubMed Central

    Huang, Lei; Kang, Wenjun; Bartom, Elizabeth; Onel, Kenan; Volchenboum, Samuel; Andrade, Jorge

    2015-01-01

    Whole exome sequencing has facilitated the discovery of causal genetic variants associated with human diseases at deep coverage and low cost. In particular, the detection of somatic mutations from tumor/normal pairs has provided insights into the cancer genome. Although there is an abundance of publicly-available software for the detection of germline and somatic variants, concordance is generally limited among variant callers and alignment algorithms. Successful integration of variants detected by multiple methods requires in-depth knowledge of the software, access to high-performance computing resources, and advanced programming techniques. We present ExScalibur, a set of fully automated, highly scalable and modulated pipelines for whole exome data analysis. The suite integrates multiple alignment and variant calling algorithms for the accurate detection of germline and somatic mutations with close to 99% sensitivity and specificity. ExScalibur implements streamlined execution of analytical modules, real-time monitoring of pipeline progress, robust handling of errors and intuitive documentation that allows for increased reproducibility and sharing of results and workflows. It runs on local computers, high-performance computing clusters and cloud environments. In addition, we provide a data analysis report utility to facilitate visualization of the results that offers interactive exploration of quality control files, read alignment and variant calls, assisting downstream customization of potential disease-causing mutations. ExScalibur is open-source and is also available as a public image on Amazon cloud. PMID:26271043

  12. SVM-dependent pairwise HMM: an application to protein pairwise alignments.

    PubMed

    Orlando, Gabriele; Raimondi, Daniele; Khan, Taushif; Lenaerts, Tom; Vranken, Wim F

    2017-12-15

    Methods able to provide reliable protein alignments are crucial for many bioinformatics applications. In the last years many different algorithms have been developed and various kinds of information, from sequence conservation to secondary structure, have been used to improve the alignment performances. This is especially relevant for proteins with highly divergent sequences. However, recent works suggest that different features may have different importance in diverse protein classes and it would be an advantage to have more customizable approaches, capable to deal with different alignment definitions. Here we present Rigapollo, a highly flexible pairwise alignment method based on a pairwise HMM-SVM that can use any type of information to build alignments. Rigapollo lets the user decide the optimal features to align their protein class of interest. It outperforms current state of the art methods on two well-known benchmark datasets when aligning highly divergent sequences. A Python implementation of the algorithm is available at http://ibsquare.be/rigapollo. wim.vranken@vub.be. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  13. Bamgineer: Introduction of simulated allele-specific copy number variants into exome and targeted sequence data sets.

    PubMed

    Samadian, Soroush; Bruce, Jeff P; Pugh, Trevor J

    2018-03-01

    Somatic copy number variations (CNVs) play a crucial role in development of many human cancers. The broad availability of next-generation sequencing data has enabled the development of algorithms to computationally infer CNV profiles from a variety of data types including exome and targeted sequence data; currently the most prevalent types of cancer genomics data. However, systemic evaluation and comparison of these tools remains challenging due to a lack of ground truth reference sets. To address this need, we have developed Bamgineer, a tool written in Python to introduce user-defined haplotype-phased allele-specific copy number events into an existing Binary Alignment Mapping (BAM) file, with a focus on targeted and exome sequencing experiments. As input, this tool requires a read alignment file (BAM format), lists of non-overlapping genome coordinates for introduction of gains and losses (bed file), and an optional file defining known haplotypes (vcf format). To improve runtime performance, Bamgineer introduces the desired CNVs in parallel using queuing and parallel processing on a local machine or on a high-performance computing cluster. As proof-of-principle, we applied Bamgineer to a single high-coverage (mean: 220X) exome sequence file from a blood sample to simulate copy number profiles of 3 exemplar tumors from each of 10 tumor types at 5 tumor cellularity levels (20-100%, 150 BAM files in total). To demonstrate feasibility beyond exome data, we introduced read alignments to a targeted 5-gene cell-free DNA sequencing library to simulate EGFR amplifications at frequencies consistent with circulating tumor DNA (10, 1, 0.1 and 0.01%) while retaining the multimodal insert size distribution of the original data. We expect Bamgineer to be of use for development and systematic benchmarking of CNV calling algorithms by users using locally-generated data for a variety of applications. The source code is freely available at http://github.com/pughlab/bamgineer.

  14. Interactive software tool to comprehend the calculation of optimal sequence alignments with dynamic programming.

    PubMed

    Ibarra, Ignacio L; Melo, Francisco

    2010-07-01

    Dynamic programming (DP) is a general optimization strategy that is successfully used across various disciplines of science. In bioinformatics, it is widely applied in calculating the optimal alignment between pairs of protein or DNA sequences. These alignments form the basis of new, verifiable biological hypothesis. Despite its importance, there are no interactive tools available for training and education on understanding the DP algorithm. Here, we introduce an interactive computer application with a graphical interface, for the purpose of educating students about DP. The program displays the DP scoring matrix and the resulting optimal alignment(s), while allowing the user to modify key parameters such as the values in the similarity matrix, the sequence alignment algorithm version and the gap opening/extension penalties. We hope that this software will be useful to teachers and students of bioinformatics courses, as well as researchers who implement the DP algorithm for diverse applications. The software is freely available at: http:/melolab.org/sat. The software is written in the Java computer language, thus it runs on all major platforms and operating systems including Windows, Mac OS X and LINUX. All inquiries or comments about this software should be directed to Francisco Melo at fmelo@bio.puc.cl.

  15. Mobile and replicated alignment of arrays in data-parallel programs

    NASA Technical Reports Server (NTRS)

    Chatterjee, Siddhartha; Gilbert, John R.; Schreiber, Robert

    1993-01-01

    When a data-parallel language like FORTRAN 90 is compiled for a distributed-memory machine, aggregate data objects (such as arrays) are distributed across the processor memories. The mapping determines the amount of residual communication needed to bring operands of parallel operations into alignment with each other. A common approach is to break the mapping into two stages: first, an alignment that maps all the objects to an abstract template, and then a distribution that maps the template to the processors. We solve two facets of the problem of finding alignments that reduce residual communication: we determine alignments that vary in loops, and objects that should have replicated alignments. We show that loop-dependent mobile alignment is sometimes necessary for optimum performance, and we provide algorithms with which a compiler can determine good mobile alignments for objects within do loops. We also identify situations in which replicated alignment is either required by the program itself (via spread operations) or can be used to improve performance. We propose an algorithm based on network flow that determines which objects to replicate so as to minimize the total amount of broadcast communication in replication. This work on mobile and replicated alignment extends our earlier work on determining static alignment.

  16. Addition of Improved Shock-Capturing Schemes to OVERFLOW 2.1

    NASA Technical Reports Server (NTRS)

    Burning, Pieter G.; Nichols, Robert H.; Tramel, Robert W.

    2009-01-01

    Existing approximate Riemann solvers do not perform well when the grid is not aligned with strong shocks in the flow field. Three new approximate Riemann algorithms are investigated to improve solution accuracy and stability in the vicinity of strong shocks. The new algorithms are compared to the existing upwind algorithms in OVERFLOW 2.1. The new algorithms use a multidimensional pressure gradient based switch to transition to a more numerically dissipative algorithm in the vicinity of strong shocks. One new algorithm also attempts to artificially thicken captured shocks in order to alleviate the errors in the solution introduced by "stair-stepping" of the shock resulting from the approximate Riemann solver. This algorithm performed well for all the example cases and produced results that were almost insensitive to the alignment of the grid and the shock.

  17. Assignment of protein sequences to existing domain and family classification systems: Pfam and the PDB.

    PubMed

    Xu, Qifang; Dunbrack, Roland L

    2012-11-01

    Automating the assignment of existing domain and protein family classifications to new sets of sequences is an important task. Current methods often miss assignments because remote relationships fail to achieve statistical significance. Some assignments are not as long as the actual domain definitions because local alignment methods often cut alignments short. Long insertions in query sequences often erroneously result in two copies of the domain assigned to the query. Divergent repeat sequences in proteins are often missed. We have developed a multilevel procedure to produce nearly complete assignments of protein families of an existing classification system to a large set of sequences. We apply this to the task of assigning Pfam domains to sequences and structures in the Protein Data Bank (PDB). We found that HHsearch alignments frequently scored more remotely related Pfams in Pfam clans higher than closely related Pfams, thus, leading to erroneous assignment at the Pfam family level. A greedy algorithm allowing for partial overlaps was, thus, applied first to sequence/HMM alignments, then HMM-HMM alignments and then structure alignments, taking care to join partial alignments split by large insertions into single-domain assignments. Additional assignment of repeat Pfams with weaker E-values was allowed after stronger assignments of the repeat HMM. Our database of assignments, presented in a database called PDBfam, contains Pfams for 99.4% of chains >50 residues. The Pfam assignment data in PDBfam are available at http://dunbrack2.fccc.edu/ProtCid/PDBfam, which can be searched by PDB codes and Pfam identifiers. They will be updated regularly.

  18. DR-TAMAS: Diffeomorphic Registration for Tensor Accurate Alignment of Anatomical Structures.

    PubMed

    Irfanoglu, M Okan; Nayak, Amritha; Jenkins, Jeffrey; Hutchinson, Elizabeth B; Sadeghi, Neda; Thomas, Cibu P; Pierpaoli, Carlo

    2016-05-15

    In this work, we propose DR-TAMAS (Diffeomorphic Registration for Tensor Accurate alignMent of Anatomical Structures), a novel framework for intersubject registration of Diffusion Tensor Imaging (DTI) data sets. This framework is optimized for brain data and its main goal is to achieve an accurate alignment of all brain structures, including white matter (WM), gray matter (GM), and spaces containing cerebrospinal fluid (CSF). Currently most DTI-based spatial normalization algorithms emphasize alignment of anisotropic structures. While some diffusion-derived metrics, such as diffusion anisotropy and tensor eigenvector orientation, are highly informative for proper alignment of WM, other tensor metrics such as the trace or mean diffusivity (MD) are fundamental for a proper alignment of GM and CSF boundaries. Moreover, it is desirable to include information from structural MRI data, e.g., T1-weighted or T2-weighted images, which are usually available together with the diffusion data. The fundamental property of DR-TAMAS is to achieve global anatomical accuracy by incorporating in its cost function the most informative metrics locally. Another important feature of DR-TAMAS is a symmetric time-varying velocity-based transformation model, which enables it to account for potentially large anatomical variability in healthy subjects and patients. The performance of DR-TAMAS is evaluated with several data sets and compared with other widely-used diffeomorphic image registration techniques employing both full tensor information and/or DTI-derived scalar maps. Our results show that the proposed method has excellent overall performance in the entire brain, while being equivalent to the best existing methods in WM. Copyright © 2016 Elsevier Inc. All rights reserved.

  19. Improved treatment of exact exchange in Quantum ESPRESSO

    DOE PAGES

    Barnes, Taylor A.; Kurth, Thorsten; Carrier, Pierre; ...

    2017-01-18

    Here, we present an algorithm and implementation for the parallel computation of exact exchange in Quantum ESPRESSO (QE) that exhibits greatly improved strong scaling. QE is an open-source software package for electronic structure calculations using plane wave density functional theory, and supports the use of local, semi-local, and hybrid DFT functionals. Wider application of hybrid functionals is desirable for the improved simulation of electronic band energy alignments and thermodynamic properties, but the computational complexity of evaluating the exact exchange potential limits the practical application of hybrid functionals to large systems and requires efficient implementations. We demonstrate that existing implementations ofmore » hybrid DFT that utilize a single data structure for both the local and exact exchange regions of the code are significantly limited in the degree of parallelization achievable. We present a band-pair parallelization approach, in which the calculation of exact exchange is parallelized and evaluated independently from the parallelization of the remainder of the calculation, with the wavefunction data being efficiently transformed on-the-fly into a form that is optimal for each part of the calculation. For a 64 water molecule supercell, our new algorithm reduces the overall time to solution by nearly an order of magnitude.« less

  20. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Barnes, Taylor A.; Kurth, Thorsten; Carrier, Pierre

    Here, we present an algorithm and implementation for the parallel computation of exact exchange in Quantum ESPRESSO (QE) that exhibits greatly improved strong scaling. QE is an open-source software package for electronic structure calculations using plane wave density functional theory, and supports the use of local, semi-local, and hybrid DFT functionals. Wider application of hybrid functionals is desirable for the improved simulation of electronic band energy alignments and thermodynamic properties, but the computational complexity of evaluating the exact exchange potential limits the practical application of hybrid functionals to large systems and requires efficient implementations. We demonstrate that existing implementations ofmore » hybrid DFT that utilize a single data structure for both the local and exact exchange regions of the code are significantly limited in the degree of parallelization achievable. We present a band-pair parallelization approach, in which the calculation of exact exchange is parallelized and evaluated independently from the parallelization of the remainder of the calculation, with the wavefunction data being efficiently transformed on-the-fly into a form that is optimal for each part of the calculation. For a 64 water molecule supercell, our new algorithm reduces the overall time to solution by nearly an order of magnitude.« less

  1. Robustly Aligning a Shape Model and Its Application to Car Alignment of Unknown Pose.

    PubMed

    Li, Yan; Gu, Leon; Kanade, Takeo

    2011-09-01

    Precisely localizing in an image a set of feature points that form a shape of an object, such as car or face, is called alignment. Previous shape alignment methods attempted to fit a whole shape model to the observed data, based on the assumption of Gaussian observation noise and the associated regularization process. However, such an approach, though able to deal with Gaussian noise in feature detection, turns out not to be robust or precise because it is vulnerable to gross feature detection errors or outliers resulting from partial occlusions or spurious features from the background or neighboring objects. We address this problem by adopting a randomized hypothesis-and-test approach. First, a Bayesian inference algorithm is developed to generate a shape-and-pose hypothesis of the object from a partial shape or a subset of feature points. For alignment, a large number of hypotheses are generated by randomly sampling subsets of feature points, and then evaluated to find the one that minimizes the shape prediction error. This method of randomized subset-based matching can effectively handle outliers and recover the correct object shape. We apply this approach on a challenging data set of over 5,000 different-posed car images, spanning a wide variety of car types, lighting, background scenes, and partial occlusions. Experimental results demonstrate favorable improvements over previous methods on both accuracy and robustness.

  2. Comparative modeling without implicit sequence alignments.

    PubMed

    Kolinski, Andrzej; Gront, Dominik

    2007-10-01

    The number of known protein sequences is about thousand times larger than the number of experimentally solved 3D structures. For more than half of the protein sequences a close or distant structural analog could be identified. The key starting point in a classical comparative modeling is to generate the best possible sequence alignment with a template or templates. With decreasing sequence similarity, the number of errors in the alignments increases and these errors are the main causes of the decreasing accuracy of the molecular models generated. Here we propose a new approach to comparative modeling, which does not require the implicit alignment - the model building phase explores geometric, evolutionary and physical properties of a template (or templates). The proposed method requires prior identification of a template, although the initial sequence alignment is ignored. The model is built using a very efficient reduced representation search engine CABS to find the best possible superposition of the query protein onto the template represented as a 3D multi-featured scaffold. The criteria used include: sequence similarity, predicted secondary structure consistency, local geometric features and hydrophobicity profile. For more difficult cases, the new method qualitatively outperforms existing schemes of comparative modeling. The algorithm unifies de novo modeling, 3D threading and sequence-based methods. The main idea is general and could be easily combined with other efficient modeling tools as Rosetta, UNRES and others.

  3. SWPS3 - fast multi-threaded vectorized Smith-Waterman for IBM Cell/B.E. and x86/SSE2.

    PubMed

    Szalkowski, Adam; Ledergerber, Christian; Krähenbühl, Philipp; Dessimoz, Christophe

    2008-10-29

    We present swps3, a vectorized implementation of the Smith-Waterman local alignment algorithm optimized for both the Cell/BE and x86 architectures. The paper describes swps3 and compares its performances with several other implementations. Our benchmarking results show that swps3 is currently the fastest implementation of a vectorized Smith-Waterman on the Cell/BE, outperforming the only other known implementation by a factor of at least 4: on a Playstation 3, it achieves up to 8.0 billion cell-updates per second (GCUPS). Using the SSE2 instruction set, a quad-core Intel Pentium can reach 15.7 GCUPS. We also show that swps3 on this CPU is faster than a recent GPU implementation. Finally, we note that under some circumstances, alignments are computed at roughly the same speed as BLAST, a heuristic method. The Cell/BE can be a powerful platform to align biological sequences. Besides, the performance gap between exact and heuristic methods has almost disappeared, especially for long protein sequences.

  4. SWPS3 – fast multi-threaded vectorized Smith-Waterman for IBM Cell/B.E. and ×86/SSE2

    PubMed Central

    Szalkowski, Adam; Ledergerber, Christian; Krähenbühl, Philipp; Dessimoz, Christophe

    2008-01-01

    Background We present swps3, a vectorized implementation of the Smith-Waterman local alignment algorithm optimized for both the Cell/BE and ×86 architectures. The paper describes swps3 and compares its performances with several other implementations. Findings Our benchmarking results show that swps3 is currently the fastest implementation of a vectorized Smith-Waterman on the Cell/BE, outperforming the only other known implementation by a factor of at least 4: on a Playstation 3, it achieves up to 8.0 billion cell-updates per second (GCUPS). Using the SSE2 instruction set, a quad-core Intel Pentium can reach 15.7 GCUPS. We also show that swps3 on this CPU is faster than a recent GPU implementation. Finally, we note that under some circumstances, alignments are computed at roughly the same speed as BLAST, a heuristic method. Conclusion The Cell/BE can be a powerful platform to align biological sequences. Besides, the performance gap between exact and heuristic methods has almost disappeared, especially for long protein sequences. PMID:18959793

  5. A short note on dynamic programming in a band.

    PubMed

    Gibrat, Jean-François

    2018-06-15

    Third generation sequencing technologies generate long reads that exhibit high error rates, in particular for insertions and deletions which are usually the most difficult errors to cope with. The only exact algorithm capable of aligning sequences with insertions and deletions is a dynamic programming algorithm. In this note, for the sake of efficiency, we consider dynamic programming in a band. We show how to choose the band width in function of the long reads' error rates, thus obtaining an [Formula: see text] algorithm in space and time. We also propose a procedure to decide whether this algorithm, when applied to semi-global alignments, provides the optimal score. We suggest that dynamic programming in a band is well suited to the problem of aligning long reads between themselves and can be used as a core component of methods for obtaining a consensus sequence from the long reads alone. The function implementing the dynamic programming algorithm in a band is available, as a standalone program, at: https://forgemia.inra.fr/jean-francois.gibrat/BAND_DYN_PROG.git.

  6. Spacecraft alignment estimation. [for onboard sensors

    NASA Technical Reports Server (NTRS)

    Shuster, Malcolm D.; Bierman, Gerald J.

    1988-01-01

    A numerically well-behaved factorized methodology is developed for estimating spacecraft sensor alignments from prelaunch and inflight data without the need to compute the spacecraft attitude or angular velocity. Such a methodology permits the estimation of sensor alignments (or other biases) in a framework free of unknown dynamical variables. In actual mission implementation such an algorithm is usually better behaved than one that must compute sensor alignments simultaneously with the spacecraft attitude, for example by means of a Kalman filter. In particular, such a methodology is less sensitive to data dropouts of long duration, and the derived measurement used in the attitude-independent algorithm usually makes data checking and editing of outliers much simpler than would be the case in the filter.

  7. Unsupervised parameter optimization for automated retention time alignment of severely shifted gas chromatographic data using the piecework alignment algorithm.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pierce, Karisa M.; Wright, Bob W.; Synovec, Robert E.

    2007-02-02

    First, simulated chromatographic separations with declining retention time precision were used to study the performance of the piecewise retention time alignment algorithm and to demonstrate an unsupervised parameter optimization method. The average correlation coefficient between the first chromatogram and every other chromatogram in the data set was used to optimize the alignment parameters. This correlation method does not require a training set, so it is unsupervised and automated. This frees the user from needing to provide class information and makes the alignment algorithm more generally applicable to classifying completely unknown data sets. For a data set of simulated chromatograms wheremore » the average chromatographic peak was shifted past two neighboring peaks between runs, the average correlation coefficient of the raw data was 0.46 ± 0.25. After automated, optimized piecewise alignment, the average correlation coefficient was 0.93 ± 0.02. Additionally, a relative shift metric and principal component analysis (PCA) were used to independently quantify and categorize the alignment performance, respectively. The relative shift metric was defined as four times the standard deviation of a given peak’s retention time in all of the chromatograms, divided by the peak-width-at-base. The raw simulated data sets that were studied contained peaks with average relative shifts ranging between 0.3 and 3.0. Second, a “real” data set of gasoline separations was gathered using three different GC methods to induce severe retention time shifting. In these gasoline separations, retention time precision improved ~8 fold following alignment. Finally, piecewise alignment and the unsupervised correlation optimization method were applied to severely shifted GC separations of reformate distillation fractions. The effect of piecewise alignment on peak heights and peak areas is also reported. Piecewise alignment either did not change the peak height, or caused it to slightly decrease. The average relative difference in peak height after piecewise alignment was –0.20%. Piecewise alignment caused the peak areas to either stay the same, slightly increase, or slightly decrease. The average absolute relative difference in area after piecewise alignment was 0.15%.« less

  8. Development and application of a modified dynamic time warping algorithm (DTW-S) to analyses of primate brain expression time series

    PubMed Central

    2011-01-01

    Background Comparing biological time series data across different conditions, or different specimens, is a common but still challenging task. Algorithms aligning two time series represent a valuable tool for such comparisons. While many powerful computation tools for time series alignment have been developed, they do not provide significance estimates for time shift measurements. Results Here, we present an extended version of the original DTW algorithm that allows us to determine the significance of time shift estimates in time series alignments, the DTW-Significance (DTW-S) algorithm. The DTW-S combines important properties of the original algorithm and other published time series alignment tools: DTW-S calculates the optimal alignment for each time point of each gene, it uses interpolated time points for time shift estimation, and it does not require alignment of the time-series end points. As a new feature, we implement a simulation procedure based on parameters estimated from real time series data, on a series-by-series basis, allowing us to determine the false positive rate (FPR) and the significance of the estimated time shift values. We assess the performance of our method using simulation data and real expression time series from two published primate brain expression datasets. Our results show that this method can provide accurate and robust time shift estimates for each time point on a gene-by-gene basis. Using these estimates, we are able to uncover novel features of the biological processes underlying human brain development and maturation. Conclusions The DTW-S provides a convenient tool for calculating accurate and robust time shift estimates at each time point for each gene, based on time series data. The estimates can be used to uncover novel biological features of the system being studied. The DTW-S is freely available as an R package TimeShift at http://www.picb.ac.cn/Comparative/data.html. PMID:21851598

  9. Development and application of a modified dynamic time warping algorithm (DTW-S) to analyses of primate brain expression time series.

    PubMed

    Yuan, Yuan; Chen, Yi-Ping Phoebe; Ni, Shengyu; Xu, Augix Guohua; Tang, Lin; Vingron, Martin; Somel, Mehmet; Khaitovich, Philipp

    2011-08-18

    Comparing biological time series data across different conditions, or different specimens, is a common but still challenging task. Algorithms aligning two time series represent a valuable tool for such comparisons. While many powerful computation tools for time series alignment have been developed, they do not provide significance estimates for time shift measurements. Here, we present an extended version of the original DTW algorithm that allows us to determine the significance of time shift estimates in time series alignments, the DTW-Significance (DTW-S) algorithm. The DTW-S combines important properties of the original algorithm and other published time series alignment tools: DTW-S calculates the optimal alignment for each time point of each gene, it uses interpolated time points for time shift estimation, and it does not require alignment of the time-series end points. As a new feature, we implement a simulation procedure based on parameters estimated from real time series data, on a series-by-series basis, allowing us to determine the false positive rate (FPR) and the significance of the estimated time shift values. We assess the performance of our method using simulation data and real expression time series from two published primate brain expression datasets. Our results show that this method can provide accurate and robust time shift estimates for each time point on a gene-by-gene basis. Using these estimates, we are able to uncover novel features of the biological processes underlying human brain development and maturation. The DTW-S provides a convenient tool for calculating accurate and robust time shift estimates at each time point for each gene, based on time series data. The estimates can be used to uncover novel biological features of the system being studied. The DTW-S is freely available as an R package TimeShift at http://www.picb.ac.cn/Comparative/data.html.

  10. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sato, M.; Kamide, Y.; Richmond, A.D.

    A new technique is presented to estimate electric fields and currents in a localized region of the high-latitude ionosphere by combining two magnetogram-inversion algorithms. This paper describes the concept and practical procedures of the method as well as the first results of our efforts in which this new scheme is applied to northern Scandinavia, computing the ionospheric parameters on a small scale. Examining latitudinal profiles of these parameters and precipitating particles, it is found that the region of the most intense precipitation in the morning sector is located equatorward of the region of the strongest electric field. To evaluate themore » relative importance of ionospheric and magnetospheric effects, the field-aligned current is divided into two components: (del Sigma) dot E and Sigma del dot E. These two components give often the opposite directions in the resultant field-aligned currents. The relative strength of the two components appears to vary considerably with latitude.« less

  11. SATCHMO-JS: a webserver for simultaneous protein multiple sequence alignment and phylogenetic tree construction.

    PubMed

    Hagopian, Raffi; Davidson, John R; Datta, Ruchira S; Samad, Bushra; Jarvis, Glen R; Sjölander, Kimmen

    2010-07-01

    We present the jump-start simultaneous alignment and tree construction using hidden Markov models (SATCHMO-JS) web server for simultaneous estimation of protein multiple sequence alignments (MSAs) and phylogenetic trees. The server takes as input a set of sequences in FASTA format, and outputs a phylogenetic tree and MSA; these can be viewed online or downloaded from the website. SATCHMO-JS is an extension of the SATCHMO algorithm, and employs a divide-and-conquer strategy to jump-start SATCHMO at a higher point in the phylogenetic tree, reducing the computational complexity of the progressive all-versus-all HMM-HMM scoring and alignment. Results on a benchmark dataset of 983 structurally aligned pairs from the PREFAB benchmark dataset show that SATCHMO-JS provides a statistically significant improvement in alignment accuracy over MUSCLE, Multiple Alignment using Fast Fourier Transform (MAFFT), ClustalW and the original SATCHMO algorithm. The SATCHMO-JS webserver is available at http://phylogenomics.berkeley.edu/satchmo-js. The datasets used in these experiments are available for download at http://phylogenomics.berkeley.edu/satchmo-js/supplementary/.

  12. Application of Improved 5th-Cubature Kalman Filter in Initial Strapdown Inertial Navigation System Alignment for Large Misalignment Angles

    PubMed Central

    Wang, Wei; Chen, Xiyuan

    2018-01-01

    In view of the fact the accuracy of the third-degree Cubature Kalman Filter (CKF) used for initial alignment under large misalignment angle conditions is insufficient, an improved fifth-degree CKF algorithm is proposed in this paper. In order to make full use of the innovation on filtering, the innovation covariance matrix is calculated recursively by an innovative sequence with an exponent fading factor. Then a new adaptive error covariance matrix scaling algorithm is proposed. The Singular Value Decomposition (SVD) method is used for improving the numerical stability of the fifth-degree CKF in this paper. In order to avoid the overshoot caused by excessive scaling of error covariance matrix during the convergence stage, the scaling scheme is terminated when the gradient of azimuth reaches the maximum. The experimental results show that the improved algorithm has better alignment accuracy with large misalignment angles than the traditional algorithm. PMID:29473912

  13. Point cloud registration from local feature correspondences-Evaluation on challenging datasets.

    PubMed

    Petricek, Tomas; Svoboda, Tomas

    2017-01-01

    Registration of laser scans, or point clouds in general, is a crucial step of localization and mapping with mobile robots or in object modeling pipelines. A coarse alignment of the point clouds is generally needed before applying local methods such as the Iterative Closest Point (ICP) algorithm. We propose a feature-based approach to point cloud registration and evaluate the proposed method and its individual components on challenging real-world datasets. For a moderate overlap between the laser scans, the method provides a superior registration accuracy compared to state-of-the-art methods including Generalized ICP, 3D Normal-Distribution Transform, Fast Point-Feature Histograms, and 4-Points Congruent Sets. Compared to the surface normals, the points as the underlying features yield higher performance in both keypoint detection and establishing local reference frames. Moreover, sign disambiguation of the basis vectors proves to be an important aspect in creating repeatable local reference frames. A novel method for sign disambiguation is proposed which yields highly repeatable reference frames.

  14. High-speed all-optical DNA local sequence alignment based on a three-dimensional artificial neural network.

    PubMed

    Maleki, Ehsan; Babashah, Hossein; Koohi, Somayyeh; Kavehvash, Zahra

    2017-07-01

    This paper presents an optical processing approach for exploring a large number of genome sequences. Specifically, we propose an optical correlator for global alignment and an extended moiré matching technique for local analysis of spatially coded DNA, whose output is fed to a novel three-dimensional artificial neural network for local DNA alignment. All-optical implementation of the proposed 3D artificial neural network is developed and its accuracy is verified in Zemax. Thanks to its parallel processing capability, the proposed structure performs local alignment of 4 million sequences of 150 base pairs in a few seconds, which is much faster than its electrical counterparts, such as the basic local alignment search tool.

  15. Profile-Based LC-MS Data Alignment—A Bayesian Approach

    PubMed Central

    Tsai, Tsung-Heng; Tadesse, Mahlet G.; Wang, Yue; Ressom, Habtom W.

    2014-01-01

    A Bayesian alignment model (BAM) is proposed for alignment of liquid chromatography-mass spectrometry (LC-MS) data. BAM belongs to the category of profile-based approaches, which are composed of two major components: a prototype function and a set of mapping functions. Appropriate estimation of these functions is crucial for good alignment results. BAM uses Markov chain Monte Carlo (MCMC) methods to draw inference on the model parameters and improves on existing MCMC-based alignment methods through 1) the implementation of an efficient MCMC sampler and 2) an adaptive selection of knots. A block Metropolis-Hastings algorithm that mitigates the problem of the MCMC sampler getting stuck at local modes of the posterior distribution is used for the update of the mapping function coefficients. In addition, a stochastic search variable selection (SSVS) methodology is used to determine the number and positions of knots. We applied BAM to a simulated data set, an LC-MS proteomic data set, and two LC-MS metabolomic data sets, and compared its performance with the Bayesian hierarchical curve registration (BHCR) model, the dynamic time-warping (DTW) model, and the continuous profile model (CPM). The advantage of applying appropriate profile-based retention time correction prior to performing a feature-based approach is also demonstrated through the metabolomic data sets. PMID:23929872

  16. H-BLAST: a fast protein sequence alignment toolkit on heterogeneous computers with GPUs.

    PubMed

    Ye, Weicai; Chen, Ying; Zhang, Yongdong; Xu, Yuesheng

    2017-04-15

    The sequence alignment is a fundamental problem in bioinformatics. BLAST is a routinely used tool for this purpose with over 118 000 citations in the past two decades. As the size of bio-sequence databases grows exponentially, the computational speed of alignment softwares must be improved. We develop the heterogeneous BLAST (H-BLAST), a fast parallel search tool for a heterogeneous computer that couples CPUs and GPUs, to accelerate BLASTX and BLASTP-basic tools of NCBI-BLAST. H-BLAST employs a locally decoupled seed-extension algorithm for better performance on GPUs, and offers a performance tuning mechanism for better efficiency among various CPUs and GPUs combinations. H-BLAST produces identical alignment results as NCBI-BLAST and its computational speed is much faster than that of NCBI-BLAST. Speedups achieved by H-BLAST over sequential NCBI-BLASTP (resp. NCBI-BLASTX) range mostly from 4 to 10 (resp. 5 to 7.2). With 2 CPU threads and 2 GPUs, H-BLAST can be faster than 16-threaded NCBI-BLASTX. Furthermore, H-BLAST is 1.5-4 times faster than GPU-BLAST. https://github.com/Yeyke/H-BLAST.git. yux06@syr.edu. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

  17. Multiscale registration algorithm for alignment of meshes

    NASA Astrophysics Data System (ADS)

    Vadde, Srikanth; Kamarthi, Sagar V.; Gupta, Surendra M.

    2004-03-01

    Taking a multi-resolution approach, this research work proposes an effective algorithm for aligning a pair of scans obtained by scanning an object's surface from two adjacent views. This algorithm first encases each scan in the pair with an array of cubes of equal and fixed size. For each scan in the pair a surrogate scan is created by the centroids of the cubes that encase the scan. The Gaussian curvatures of points across the surrogate scan pair are compared to find the surrogate corresponding points. If the difference between the Gaussian curvatures of any two points on the surrogate scan pair is less than a predetermined threshold, then those two points are accepted as a pair of surrogate corresponding points. The rotation and translation values between the surrogate scan pair are determined by using a set of surrogate corresponding points. Using the same rotation and translation values the original scan pairs are aligned. The resulting registration (or alignment) error is computed to check the accuracy of the scan alignment. When the registration error becomes acceptably small, the algorithm is terminated. Otherwise the above process is continued with cubes of smaller and smaller sizes until the algorithm is terminated. However at each finer resolution the search space for finding the surrogate corresponding points is restricted to the regions in the neighborhood of the surrogate points that were at found at the preceding coarser level. The surrogate corresponding points, as the resolution becomes finer and finer, converge to the true corresponding points on the original scans. This approach offers three main benefits: it improves the chances of finding the true corresponding points on the scans, minimize the adverse effects of noise in the scans, and reduce the computational load for finding the corresponding points.

  18. NetCoffee: a fast and accurate global alignment approach to identify functionally conserved proteins in multiple networks.

    PubMed

    Hu, Jialu; Kehr, Birte; Reinert, Knut

    2014-02-15

    Owing to recent advancements in high-throughput technologies, protein-protein interaction networks of more and more species become available in public databases. The question of how to identify functionally conserved proteins across species attracts a lot of attention in computational biology. Network alignments provide a systematic way to solve this problem. However, most existing alignment tools encounter limitations in tackling this problem. Therefore, the demand for faster and more efficient alignment tools is growing. We present a fast and accurate algorithm, NetCoffee, which allows to find a global alignment of multiple protein-protein interaction networks. NetCoffee searches for a global alignment by maximizing a target function using simulated annealing on a set of weighted bipartite graphs that are constructed using a triplet approach similar to T-Coffee. To assess its performance, NetCoffee was applied to four real datasets. Our results suggest that NetCoffee remedies several limitations of previous algorithms, outperforms all existing alignment tools in terms of speed and nevertheless identifies biologically meaningful alignments. The source code and data are freely available for download under the GNU GPL v3 license at https://code.google.com/p/netcoffee/.

  19. Face Alignment via Regressing Local Binary Features.

    PubMed

    Ren, Shaoqing; Cao, Xudong; Wei, Yichen; Sun, Jian

    2016-03-01

    This paper presents a highly efficient and accurate regression approach for face alignment. Our approach has two novel components: 1) a set of local binary features and 2) a locality principle for learning those features. The locality principle guides us to learn a set of highly discriminative local binary features for each facial landmark independently. The obtained local binary features are used to jointly learn a linear regression for the final output. This approach achieves the state-of-the-art results when tested on the most challenging benchmarks to date. Furthermore, because extracting and regressing local binary features are computationally very cheap, our system is much faster than previous methods. It achieves over 3000 frames per second (FPS) on a desktop or 300 FPS on a mobile phone for locating a few dozens of landmarks. We also study a key issue that is important but has received little attention in the previous research, which is the face detector used to initialize alignment. We investigate several face detectors and perform quantitative evaluation on how they affect alignment accuracy. We find that an alignment friendly detector can further greatly boost the accuracy of our alignment method, reducing the error up to 16% relatively. To facilitate practical usage of face detection/alignment methods, we also propose a convenient metric to measure how good a detector is for alignment initialization.

  20. Approximate matching of regular expressions.

    PubMed

    Myers, E W; Miller, W

    1989-01-01

    Given a sequence A and regular expression R, the approximate regular expression matching problem is to find a sequence matching R whose optimal alignment with A is the highest scoring of all such sequences. This paper develops an algorithm to solve the problem in time O(MN), where M and N are the lengths of A and R. Thus, the time requirement is asymptotically no worse than for the simpler problem of aligning two fixed sequences. Our method is superior to an earlier algorithm by Wagner and Seiferas in several ways. First, it treats real-valued costs, in addition to integer costs, with no loss of asymptotic efficiency. Second, it requires only O(N) space to deliver just the score of the best alignment. Finally, its structure permits implementation techniques that make it extremely fast in practice. We extend the method to accommodate gap penalties, as required for typical applications in molecular biology, and further refine it to search for sub-strings of A that strongly align with a sequence in R, as required for typical data base searches. We also show how to deliver an optimal alignment between A and R in only O(N + log M) space using O(MN log M) time. Finally, an O(MN(M + N) + N2log N) time algorithm is presented for alignment scoring schemes where the cost of a gap is an arbitrary increasing function of its length.

  1. GAMUT: GPU accelerated microRNA analysis to uncover target genes through CUDA-miRanda

    PubMed Central

    2014-01-01

    Background Non-coding sequences such as microRNAs have important roles in disease processes. Computational microRNA target identification (CMTI) is becoming increasingly important since traditional experimental methods for target identification pose many difficulties. These methods are time-consuming, costly, and often need guidance from computational methods to narrow down candidate genes anyway. However, most CMTI methods are computationally demanding, since they need to handle not only several million query microRNA and reference RNA pairs, but also several million nucleotide comparisons within each given pair. Thus, the need to perform microRNA identification at such large scale has increased the demand for parallel computing. Methods Although most CMTI programs (e.g., the miRanda algorithm) are based on a modified Smith-Waterman (SW) algorithm, the existing parallel SW implementations (e.g., CUDASW++ 2.0/3.0, SWIPE) are unable to meet this demand in CMTI tasks. We present CUDA-miRanda, a fast microRNA target identification algorithm that takes advantage of massively parallel computing on Graphics Processing Units (GPU) using NVIDIA's Compute Unified Device Architecture (CUDA). CUDA-miRanda specifically focuses on the local alignment of short (i.e., ≤ 32 nucleotides) sequences against longer reference sequences (e.g., 20K nucleotides). Moreover, the proposed algorithm is able to report multiple alignments (up to 191 top scores) and the corresponding traceback sequences for any given (query sequence, reference sequence) pair. Results Speeds over 5.36 Giga Cell Updates Per Second (GCUPs) are achieved on a server with 4 NVIDIA Tesla M2090 GPUs. Compared to the original miRanda algorithm, which is evaluated on an Intel Xeon E5620@2.4 GHz CPU, the experimental results show up to 166 times performance gains in terms of execution time. In addition, we have verified that the exact same targets were predicted in both CUDA-miRanda and the original miRanda implementations through multiple test datasets. Conclusions We offer a GPU-based alternative to high performance compute (HPC) that can be developed locally at a relatively small cost. The community of GPU developers in the biomedical research community, particularly for genome analysis, is still growing. With increasing shared resources, this community will be able to advance CMTI in a very significant manner. Our source code is available at https://sourceforge.net/projects/cudamiranda/. PMID:25077821

  2. HUGO: Hierarchical mUlti-reference Genome cOmpression for aligned reads

    PubMed Central

    Li, Pinghao; Jiang, Xiaoqian; Wang, Shuang; Kim, Jihoon; Xiong, Hongkai; Ohno-Machado, Lucila

    2014-01-01

    Background and objective Short-read sequencing is becoming the standard of practice for the study of structural variants associated with disease. However, with the growth of sequence data largely surpassing reasonable storage capability, the biomedical community is challenged with the management, transfer, archiving, and storage of sequence data. Methods We developed Hierarchical mUlti-reference Genome cOmpression (HUGO), a novel compression algorithm for aligned reads in the sorted Sequence Alignment/Map (SAM) format. We first aligned short reads against a reference genome and stored exactly mapped reads for compression. For the inexact mapped or unmapped reads, we realigned them against different reference genomes using an adaptive scheme by gradually shortening the read length. Regarding the base quality value, we offer lossy and lossless compression mechanisms. The lossy compression mechanism for the base quality values uses k-means clustering, where a user can adjust the balance between decompression quality and compression rate. The lossless compression can be produced by setting k (the number of clusters) to the number of different quality values. Results The proposed method produced a compression ratio in the range 0.5–0.65, which corresponds to 35–50% storage savings based on experimental datasets. The proposed approach achieved 15% more storage savings over CRAM and comparable compression ratio with Samcomp (CRAM and Samcomp are two of the state-of-the-art genome compression algorithms). The software is freely available at https://sourceforge.net/projects/hierachicaldnac/with a General Public License (GPL) license. Limitation Our method requires having different reference genomes and prolongs the execution time for additional alignments. Conclusions The proposed multi-reference-based compression algorithm for aligned reads outperforms existing single-reference based algorithms. PMID:24368726

  3. TU-D-209-03: Alignment of the Patient Graphic Model Using Fluoroscopic Images for Skin Dose Mapping

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Oines, A; Oines, A; Kilian-Meneghin, J

    2016-06-15

    Purpose: The Dose Tracking System (DTS) was developed to provide realtime feedback of skin dose and dose rate during interventional fluoroscopic procedures. A color map on a 3D graphic of the patient represents the cumulative dose distribution on the skin. Automated image correlation algorithms are described which use the fluoroscopic procedure images to align and scale the patient graphic for more accurate dose mapping. Methods: Currently, the DTS employs manual patient graphic selection and alignment. To improve the accuracy of dose mapping and automate the software, various methods are explored to extract information about the beam location and patient morphologymore » from the procedure images. To match patient anatomy with a reference projection image, preprocessing is first used, including edge enhancement, edge detection, and contour detection. Template matching algorithms from OpenCV are then employed to find the location of the beam. Once a match is found, the reference graphic is scaled and rotated to fit the patient, using image registration correlation functions in Matlab. The algorithm runs correlation functions for all points and maps all correlation confidences to a surface map. The highest point of correlation is used for alignment and scaling. The transformation data is saved for later model scaling. Results: Anatomic recognition is used to find matching features between model and image and image registration correlation provides for alignment and scaling at any rotation angle with less than onesecond runtime, and at noise levels in excess of 150% of those found in normal procedures. Conclusion: The algorithm provides the necessary scaling and alignment tools to improve the accuracy of dose distribution mapping on the patient graphic with the DTS. Partial support from NIH Grant R01-EB002873 and Toshiba Medical Systems Corp.« less

  4. A Fast Approximate Algorithm for Mapping Long Reads to Large Reference Databases.

    PubMed

    Jain, Chirag; Dilthey, Alexander; Koren, Sergey; Aluru, Srinivas; Phillippy, Adam M

    2018-04-30

    Emerging single-molecule sequencing technologies from Pacific Biosciences and Oxford Nanopore have revived interest in long-read mapping algorithms. Alignment-based seed-and-extend methods demonstrate good accuracy, but face limited scalability, while faster alignment-free methods typically trade decreased precision for efficiency. In this article, we combine a fast approximate read mapping algorithm based on minimizers with a novel MinHash identity estimation technique to achieve both scalability and precision. In contrast to prior methods, we develop a mathematical framework that defines the types of mapping targets we uncover, establish probabilistic estimates of p-value and sensitivity, and demonstrate tolerance for alignment error rates up to 20%. With this framework, our algorithm automatically adapts to different minimum length and identity requirements and provides both positional and identity estimates for each mapping reported. For mapping human PacBio reads to the hg38 reference, our method is 290 × faster than Burrows-Wheeler Aligner-MEM with a lower memory footprint and recall rate of 96%. We further demonstrate the scalability of our method by mapping noisy PacBio reads (each ≥5 kbp in length) to the complete NCBI RefSeq database containing 838 Gbp of sequence and >60,000 genomes.

  5. Chunk Alignment for Corpus-Based Machine Translation

    ERIC Educational Resources Information Center

    Kim, Jae Dong

    2011-01-01

    Since sub-sentential alignment is critically important to the translation quality of an Example-Based Machine Translation (EBMT) system, which operates by finding and combining phrase-level matches against the training examples, we developed a new alignment algorithm for the purpose of improving the EBMT system's performance. This new…

  6. SPARSE: quadratic time simultaneous alignment and folding of RNAs without sequence-based heuristics

    PubMed Central

    Will, Sebastian; Otto, Christina; Miladi, Milad; Möhl, Mathias; Backofen, Rolf

    2015-01-01

    Motivation: RNA-Seq experiments have revealed a multitude of novel ncRNAs. The gold standard for their analysis based on simultaneous alignment and folding suffers from extreme time complexity of O(n6). Subsequently, numerous faster ‘Sankoff-style’ approaches have been suggested. Commonly, the performance of such methods relies on sequence-based heuristics that restrict the search space to optimal or near-optimal sequence alignments; however, the accuracy of sequence-based methods breaks down for RNAs with sequence identities below 60%. Alignment approaches like LocARNA that do not require sequence-based heuristics, have been limited to high complexity (≥ quartic time). Results: Breaking this barrier, we introduce the novel Sankoff-style algorithm ‘sparsified prediction and alignment of RNAs based on their structure ensembles (SPARSE)’, which runs in quadratic time without sequence-based heuristics. To achieve this low complexity, on par with sequence alignment algorithms, SPARSE features strong sparsification based on structural properties of the RNA ensembles. Following PMcomp, SPARSE gains further speed-up from lightweight energy computation. Although all existing lightweight Sankoff-style methods restrict Sankoff’s original model by disallowing loop deletions and insertions, SPARSE transfers the Sankoff algorithm to the lightweight energy model completely for the first time. Compared with LocARNA, SPARSE achieves similar alignment and better folding quality in significantly less time (speedup: 3.7). At similar run-time, it aligns low sequence identity instances substantially more accurate than RAF, which uses sequence-based heuristics. Availability and implementation: SPARSE is freely available at http://www.bioinf.uni-freiburg.de/Software/SPARSE. Contact: backofen@informatik.uni-freiburg.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25838465

  7. Aligning a Receiving Antenna Array to Reduce Interference

    NASA Technical Reports Server (NTRS)

    Jongeling, Andre P.; Rogstad, David H.

    2009-01-01

    A digital signal-processing algorithm has been devised as a means of aligning (as defined below) the outputs of multiple receiving radio antennas in a large array for the purpose of receiving a desired weak signal transmitted by a single distant source in the presence of an interfering signal that (1) originates at another source lying within the antenna beam and (2) occupies a frequency band significantly wider than that of the desired signal. In the original intended application of the algorithm, the desired weak signal is a spacecraft telemetry signal, the antennas are spacecraft-tracking antennas in NASA s Deep Space Network, and the source of the wide-band interfering signal is typically a radio galaxy or a planet that lies along or near the line of sight to the spacecraft. The algorithm could also afford the ability to discriminate between desired narrow-band and nearby undesired wide-band sources in related applications that include satellite and terrestrial radio communications and radio astronomy. The development of the present algorithm involved modification of a prior algorithm called SUMPLE and a predecessor called SIMPLE. SUMPLE was described in Algorithm for Aligning an Array of Receiving Radio Antennas (NPO-40574), NASA Tech Briefs Vol. 30, No. 4 (April 2006), page 54. To recapitulate: As used here, aligning signifies adjusting the delays and phases of the outputs from the various antennas so that their relatively weak replicas of the desired signal can be added coherently to increase the signal-to-noise ratio (SNR) for improved reception, as though one had a single larger antenna. Prior to the development of SUMPLE, it was common practice to effect alignment by means of a process that involves correlation of signals in pairs. SIMPLE is an example of an algorithm that effects such a process. SUMPLE also involves correlations, but the correlations are not performed in pairs. Instead, in a partly iterative process, each signal is appropriately weighted and then correlated with a composite signal equal to the sum of the other signals.

  8. Assignment of protein sequences to existing domain and family classification systems: Pfam and the PDB

    PubMed Central

    Dunbrack, Roland L.

    2012-01-01

    Motivation: Automating the assignment of existing domain and protein family classifications to new sets of sequences is an important task. Current methods often miss assignments because remote relationships fail to achieve statistical significance. Some assignments are not as long as the actual domain definitions because local alignment methods often cut alignments short. Long insertions in query sequences often erroneously result in two copies of the domain assigned to the query. Divergent repeat sequences in proteins are often missed. Results: We have developed a multilevel procedure to produce nearly complete assignments of protein families of an existing classification system to a large set of sequences. We apply this to the task of assigning Pfam domains to sequences and structures in the Protein Data Bank (PDB). We found that HHsearch alignments frequently scored more remotely related Pfams in Pfam clans higher than closely related Pfams, thus, leading to erroneous assignment at the Pfam family level. A greedy algorithm allowing for partial overlaps was, thus, applied first to sequence/HMM alignments, then HMM–HMM alignments and then structure alignments, taking care to join partial alignments split by large insertions into single-domain assignments. Additional assignment of repeat Pfams with weaker E-values was allowed after stronger assignments of the repeat HMM. Our database of assignments, presented in a database called PDBfam, contains Pfams for 99.4% of chains >50 residues. Availability: The Pfam assignment data in PDBfam are available at http://dunbrack2.fccc.edu/ProtCid/PDBfam, which can be searched by PDB codes and Pfam identifiers. They will be updated regularly. Contact: Roland.Dunbracks@fccc.edu PMID:22942020

  9. Registration of PET and CT images based on multiresolution gradient of mutual information demons algorithm for positioning esophageal cancer patients.

    PubMed

    Jin, Shuo; Li, Dengwang; Wang, Hongjun; Yin, Yong

    2013-01-07

    Accurate registration of 18F-FDG PET (positron emission tomography) and CT (computed tomography) images has important clinical significance in radiation oncology. PET and CT images are acquired from (18)F-FDG PET/CT scanner, but the two acquisition processes are separate and take a long time. As a result, there are position errors in global and deformable errors in local caused by respiratory movement or organ peristalsis. The purpose of this work was to implement and validate a deformable CT to PET image registration method in esophageal cancer to eventually facilitate accurate positioning the tumor target on CT, and improve the accuracy of radiation therapy. Global registration was firstly utilized to preprocess position errors between PET and CT images, achieving the purpose of aligning these two images on the whole. Demons algorithm, based on optical flow field, has the features of fast process speed and high accuracy, and the gradient of mutual information-based demons (GMI demons) algorithm adds an additional external force based on the gradient of mutual information (GMI) between two images, which is suitable for multimodality images registration. In this paper, GMI demons algorithm was used to achieve local deformable registration of PET and CT images, which can effectively reduce errors between internal organs. In addition, to speed up the registration process, maintain its robustness, and avoid the local extremum, multiresolution image pyramid structure was used before deformable registration. By quantitatively and qualitatively analyzing cases with esophageal cancer, the registration scheme proposed in this paper can improve registration accuracy and speed, which is helpful for precisely positioning tumor target and developing the radiation treatment planning in clinical radiation therapy application.

  10. Registration of PET and CT images based on multiresolution gradient of mutual information demons algorithm for positioning esophageal cancer patients

    PubMed Central

    Jin, Shuo; Li, Dengwang; Yin, Yong

    2013-01-01

    Accurate registration of  18F−FDG PET (positron emission tomography) and CT (computed tomography) images has important clinical significance in radiation oncology. PET and CT images are acquired from  18F−FDG PET/CT scanner, but the two acquisition processes are separate and take a long time. As a result, there are position errors in global and deformable errors in local caused by respiratory movement or organ peristalsis. The purpose of this work was to implement and validate a deformable CT to PET image registration method in esophageal cancer to eventually facilitate accurate positioning the tumor target on CT, and improve the accuracy of radiation therapy. Global registration was firstly utilized to preprocess position errors between PET and CT images, achieving the purpose of aligning these two images on the whole. Demons algorithm, based on optical flow field, has the features of fast process speed and high accuracy, and the gradient of mutual information‐based demons (GMI demons) algorithm adds an additional external force based on the gradient of mutual information (GMI) between two images, which is suitable for multimodality images registration. In this paper, GMI demons algorithm was used to achieve local deformable registration of PET and CT images, which can effectively reduce errors between internal organs. In addition, to speed up the registration process, maintain its robustness, and avoid the local extremum, multiresolution image pyramid structure was used before deformable registration. By quantitatively and qualitatively analyzing cases with esophageal cancer, the registration scheme proposed in this paper can improve registration accuracy and speed, which is helpful for precisely positioning tumor target and developing the radiation treatment planning in clinical radiation therapy application. PACS numbers: 87.57.nj, 87.57.Q‐, 87.57.uk PMID:23318381

  11. Accurate and robust brain image alignment using boundary-based registration.

    PubMed

    Greve, Douglas N; Fischl, Bruce

    2009-10-15

    The fine spatial scales of the structures in the human brain represent an enormous challenge to the successful integration of information from different images for both within- and between-subject analysis. While many algorithms to register image pairs from the same subject exist, visual inspection shows that their accuracy and robustness to be suspect, particularly when there are strong intensity gradients and/or only part of the brain is imaged. This paper introduces a new algorithm called Boundary-Based Registration, or BBR. The novelty of BBR is that it treats the two images very differently. The reference image must be of sufficient resolution and quality to extract surfaces that separate tissue types. The input image is then aligned to the reference by maximizing the intensity gradient across tissue boundaries. Several lower quality images can be aligned through their alignment with the reference. Visual inspection and fMRI results show that BBR is more accurate than correlation ratio or normalized mutual information and is considerably more robust to even strong intensity inhomogeneities. BBR also excels at aligning partial-brain images to whole-brain images, a domain in which existing registration algorithms frequently fail. Even in the limit of registering a single slice, we show the BBR results to be robust and accurate.

  12. Alignment and Calibration of Optical and Inertial Sensors Using Stellar Observations

    DTIC Science & Technology

    2007-01-01

    Force, Department of Defense, or the U.S Government. References [1] R. G. Brown and P. Y. Hwang . Introduction to Ran- dom Signals and Applied Kalman ...and stellar observations using an extended Kalman filter algorithm. The approach is verified using simulation and experimental data, and con- clusions...an extended Kalman filter (EKF) algorithm (see [10], [11]) to recur- sively estimate camera alignment and calibration param- eters by measuring the

  13. An additional reference axis improves femoral rotation alignment in image-free computer navigation assisted total knee arthroplasty.

    PubMed

    Inui, Hiroshi; Taketomi, Shuji; Nakamura, Kensuke; Sanada, Takaki; Tanaka, Sakae; Nakagawa, Takumi

    2013-05-01

    Few studies have demonstrated improvement in accuracy of rotational alignment using image-free navigation systems mainly due to the inconsistent registration of anatomical landmarks. We have used an image-free navigation for total knee arthroplasty, which adopts the average algorithm between two reference axes (transepicondylar axis and axis perpendicular to the Whiteside axis) for femoral component rotation control. We hypothesized that addition of another axis (condylar twisting axis measured on a preoperative radiograph) would improve the accuracy. One group using the average algorithm (double-axis group) was compared with the other group using another axis to confirm the accuracy of the average algorithm (triple-axis group). Femoral components were more accurately implanted for rotational alignment in the triple-axis group (ideal: triple-axis group 100%, double-axis group 82%, P<0.05). Copyright © 2013 Elsevier Inc. All rights reserved.

  14. Vehicle Localization by LIDAR Point Correlation Improved by Change Detection

    NASA Astrophysics Data System (ADS)

    Schlichting, A.; Brenner, C.

    2016-06-01

    LiDAR sensors are proven sensors for accurate vehicle localization. Instead of detecting and matching features in the LiDAR data, we want to use the entire information provided by the scanners. As dynamic objects, like cars, pedestrians or even construction sites could lead to wrong localization results, we use a change detection algorithm to detect these objects in the reference data. If an object occurs in a certain number of measurements at the same position, we mark it and every containing point as static. In the next step, we merge the data of the single measurement epochs to one reference dataset, whereby we only use static points. Further, we also use a classification algorithm to detect trees. For the online localization of the vehicle, we use simulated data of a vertical aligned automotive LiDAR sensor. As we only want to use static objects in this case as well, we use a random forest classifier to detect dynamic scan points online. Since the automotive data is derived from the LiDAR Mobile Mapping System, we are able to use the labelled objects from the reference data generation step to create the training data and further to detect dynamic objects online. The localization then can be done by a point to image correlation method using only static objects. We achieved a localization standard deviation of about 5 cm (position) and 0.06° (heading), and were able to successfully localize the vehicle in about 93 % of the cases along a trajectory of 13 km in Hannover, Germany.

  15. SU-E-J-24: Image-Guidance Using Cone-Beam CT for Stereotactic Body Radiotherapy (SBRT) of Lung Cancer Patients: Bony Alignment or Soft Tissue Alignment?

    PubMed

    Wang, L; Turaka, A; Meyer, J; Spoka, D; Jin, L; Fan, J; Ma, C

    2012-06-01

    To assess the reliability of soft tissue alignment by comparing pre- and post-treatment cone-beam CT (CBCT) for image guidance in stereotactic body radiotherapy (SBRT) of lung cancers. Our lung SBRT procedures require all patients undergo 4D CT scan in order to obtain patient-specific target motion information through reconstructed 4D data using the maximum-intensity projection (MIP) algorithm. The internal target volume (ITV) was outlined directly from the MIP images and a 3-5 mm margin expansion was then applied to the ITV to create the PTV. Conformal treatment planning was performed on the helical images, to which the MIP images were fused. Prior to each treatment, CBCT was used for image guidance by comparing with the simulation CT and for patient relocalization based on the bony anatomy. Any displacement of the patient bony structure would be considered as setup errors and would be corrected by couch shifts. Theoretically, as the PTV definition included target internal motion, no further shifts other than setup corrections should be made. However, it is our practice to have treating physicians further check target localization within the PTV. Whenever the shifts based on the soft-tissue alignment (that is, target alignment) exceeded a certain value (e.g. 5 mm), a post-treatment CBCT was carried out to ensure that the tissue alignment is reliable by comparing between pre- and post-treatment CBCT. Pre- and post-CBCT has been performed for 7 patients so far who had shifts beyond 5 mm despite bony alignment. For all patients, post CBCT confirmed that the visualized target position was kept in the same position as before treatment after adjusting for soft-tissue alignment. For the patient population studied, it is shown that soft-tissue alignment is necessary and reliable in the lung SBRT for individual cases. © 2012 American Association of Physicists in Medicine.

  16. High-Throughput Image Analysis of Fibrillar Materials: A Case Study on Polymer Nanofiber Packing, Alignment, and Defects in Organic Field Effect Transistors.

    PubMed

    Persson, Nils E; Rafshoon, Joshua; Naghshpour, Kaylie; Fast, Tony; Chu, Ping-Hsun; McBride, Michael; Risteen, Bailey; Grover, Martha; Reichmanis, Elsa

    2017-10-18

    High-throughput discovery of process-structure-property relationships in materials through an informatics-enabled empirical approach is an increasingly utilized technique in materials research due to the rapidly expanding availability of data. Here, process-structure-property relationships are extracted for the nucleation, growth, and deposition of semiconducting poly(3-hexylthiophene) (P3HT) nanofibers used in organic field effect transistors, via high-throughput image analysis. This study is performed using an automated image analysis pipeline combining existing open-source software and new algorithms, enabling the rapid evaluation of structural metrics for images of fibrillar materials, including local orientational order, fiber length density, and fiber length distributions. We observe that microfluidic processing leads to fibers that pack with unusually high density, while sonication yields fibers that pack sparsely with low alignment. This is attributed to differences in their crystallization mechanisms. P3HT nanofiber packing during thin film deposition exhibits behavior suggesting that fibers are confined to packing in two-dimensional layers. We find that fiber alignment, a feature correlated with charge carrier mobility, is driven by increasing fiber length, and that shorter fibers tend to segregate to the buried dielectric interface during deposition, creating potentially performance-limiting defects in alignment. Another barrier to perfect alignment is the curvature of P3HT fibers; we propose a mechanistic simulation of fiber growth that reconciles both this curvature and the log-normal distribution of fiber lengths inherent to the fiber populations under consideration.

  17. Tolerancing the alignment of large-core optical fibers, fiber bundles and light guides using a Fourier approach.

    PubMed

    Sawyer, Travis W; Petersburg, Ryan; Bohndiek, Sarah E

    2017-04-20

    Optical fiber technology is found in a wide variety of applications to flexibly relay light between two points, enabling information transfer across long distances and allowing access to hard-to-reach areas. Large-core optical fibers and light guides find frequent use in illumination and spectroscopic applications, for example, endoscopy and high-resolution astronomical spectroscopy. Proper alignment is critical for maximizing throughput in optical fiber coupling systems; however, there currently are no formal approaches to tolerancing the alignment of a light-guide coupling system. Here, we propose a Fourier alignment sensitivity (FAS) algorithm to determine the optimal tolerances on the alignment of a light guide by computing the alignment sensitivity. The algorithm shows excellent agreement with both simulated and experimentally measured values and improves on the computation time of equivalent ray-tracing simulations by two orders of magnitude. We then apply FAS to tolerance and fabricate a coupling system, which is shown to meet specifications, thus validating FAS as a tolerancing technique. These results indicate that FAS is a flexible and rapid means to quantify the alignment sensitivity of a light guide, widely informing the design and tolerancing of coupling systems.

  18. A new graph-based method for pairwise global network alignment

    PubMed Central

    Klau, Gunnar W

    2009-01-01

    Background In addition to component-based comparative approaches, network alignments provide the means to study conserved network topology such as common pathways and more complex network motifs. Yet, unlike in classical sequence alignment, the comparison of networks becomes computationally more challenging, as most meaningful assumptions instantly lead to NP-hard problems. Most previous algorithmic work on network alignments is heuristic in nature. Results We introduce the graph-based maximum structural matching formulation for pairwise global network alignment. We relate the formulation to previous work and prove NP-hardness of the problem. Based on the new formulation we build upon recent results in computational structural biology and present a novel Lagrangian relaxation approach that, in combination with a branch-and-bound method, computes provably optimal network alignments. The Lagrangian algorithm alone is a powerful heuristic method, which produces solutions that are often near-optimal and – unlike those computed by pure heuristics – come with a quality guarantee. Conclusion Computational experiments on the alignment of protein-protein interaction networks and on the classification of metabolic subnetworks demonstrate that the new method is reasonably fast and has advantages over pure heuristics. Our software tool is freely available as part of the LISA library. PMID:19208162

  19. Tolerancing the alignment of large-core optical fibers, fiber bundles and light guides using a Fourier approach

    PubMed Central

    Sawyer, Travis W.; Petersburg, Ryan; Bohndiek, Sarah E.

    2017-01-01

    Optical fiber technology is found in a wide variety of applications to flexibly relay light between two points, enabling information transfer across long distances and allowing access to hard-to-reach areas. Large-core optical fibers and light guides find frequent use in illumination and spectroscopic applications; for example, endoscopy and high-resolution astronomical spectroscopy. Proper alignment is critical for maximizing throughput in optical fiber coupling systems, however, there currently are no formal approaches to tolerancing the alignment of a light guide coupling system. Here, we propose a Fourier Alignment Sensitivity (FAS) algorithm to determine the optimal tolerances on the alignment of a light guide by computing the alignment sensitivity. The algorithm shows excellent agreement with both simulated and experimentally measured values and improves on the computation time of equivalent ray tracing simulations by two orders of magnitude. We then apply FAS to tolerance and fabricate a coupling system, which is shown to meet specifications, thus validating FAS as a tolerancing technique. These results indicate that FAS is a flexible and rapid means to quantify the alignment sensitivity of a light guide, widely informing the design and tolerancing of coupling systems. PMID:28430250

  20. What's in your next-generation sequence data? An exploration of unmapped DNA and RNA sequence reads from the bovine reference individual

    USDA-ARS?s Scientific Manuscript database

    BACKGROUND: Next-generation sequencing projects commonly commence by aligning reads to a reference genome assembly. While improvements in alignment algorithms and computational hardware have greatly enhanced the efficiency and accuracy of alignments, a significant percentage of reads often remain u...

  1. Detecting Local Ligand-Binding Site Similarity in Non-Homologous Proteins by Surface Patch Comparison

    PubMed Central

    Sael, Lee; Kihara, Daisuke

    2012-01-01

    Functional elucidation of proteins is one of the essential tasks in biology. Function of a protein, specifically, small ligand molecules that bind to a protein, can be predicted by finding similar local surface regions in binding sites of known proteins. Here, we developed an alignment free local surface comparison method for predicting a ligand molecule which binds to a query protein. The algorithm, named Patch-Surfer, represents a binding pocket as a combination of segmented surface patches, each of which is characterized by its geometrical shape, the electrostatic potential, the hydrophobicity, and the concaveness. Representing a pocket by a set of patches is effective to absorb difference of global pocket shape while capturing local similarity of pockets. The shape and the physicochemical properties of surface patches are represented using the 3D Zernike descriptor, which is a series expansion of mathematical 3D function. Two pockets are compared using a modified weighted bipartite matching algorithm, which matches similar patches from the two pockets. Patch-Surfer was benchmarked on three datasets, which consist in total of 390 proteins that bind to one of 21 ligands. Patch-Surfer showed superior performance to existing methods including a global pocket comparison method, Pocket-Surfer, which we have previously introduced. Particularly, as intended, the accuracy showed large improvement for flexible ligand molecules, which bind to pockets in different conformations. PMID:22275074

  2. Detecting local ligand-binding site similarity in nonhomologous proteins by surface patch comparison.

    PubMed

    Sael, Lee; Kihara, Daisuke

    2012-04-01

    Functional elucidation of proteins is one of the essential tasks in biology. Function of a protein, specifically, small ligand molecules that bind to a protein, can be predicted by finding similar local surface regions in binding sites of known proteins. Here, we developed an alignment free local surface comparison method for predicting a ligand molecule which binds to a query protein. The algorithm, named Patch-Surfer, represents a binding pocket as a combination of segmented surface patches, each of which is characterized by its geometrical shape, the electrostatic potential, the hydrophobicity, and the concaveness. Representing a pocket by a set of patches is effective to absorb difference of global pocket shape while capturing local similarity of pockets. The shape and the physicochemical properties of surface patches are represented using the 3D Zernike descriptor, which is a series expansion of mathematical 3D function. Two pockets are compared using a modified weighted bipartite matching algorithm, which matches similar patches from the two pockets. Patch-Surfer was benchmarked on three datasets, which consist in total of 390 proteins that bind to one of 21 ligands. Patch-Surfer showed superior performance to existing methods including a global pocket comparison method, Pocket-Surfer, which we have previously introduced. Particularly, as intended, the accuracy showed large improvement for flexible ligand molecules, which bind to pockets in different conformations. Copyright © 2011 Wiley Periodicals, Inc.

  3. Incremental Ontology-Based Extraction and Alignment in Semi-structured Documents

    NASA Astrophysics Data System (ADS)

    Thiam, Mouhamadou; Bennacer, Nacéra; Pernelle, Nathalie; Lô, Moussa

    SHIRIis an ontology-based system for integration of semi-structured documents related to a specific domain. The system’s purpose is to allow users to access to relevant parts of documents as answers to their queries. SHIRI uses RDF/OWL for representation of resources and SPARQL for their querying. It relies on an automatic, unsupervised and ontology-driven approach for extraction, alignment and semantic annotation of tagged elements of documents. In this paper, we focus on the Extract-Align algorithm which exploits a set of named entity and term patterns to extract term candidates to be aligned with the ontology. It proceeds in an incremental manner in order to populate the ontology with terms describing instances of the domain and to reduce the access to extern resources such as Web. We experiment it on a HTML corpus related to call for papers in computer science and the results that we obtain are very promising. These results show how the incremental behaviour of Extract-Align algorithm enriches the ontology and the number of terms (or named entities) aligned directly with the ontology increases.

  4. Joint Multi-Leaf Segmentation, Alignment, and Tracking for Fluorescence Plant Videos.

    PubMed

    Yin, Xi; Liu, Xiaoming; Chen, Jin; Kramer, David M

    2018-06-01

    This paper proposes a novel framework for fluorescence plant video processing. The plant research community is interested in the leaf-level photosynthetic analysis within a plant. A prerequisite for such analysis is to segment all leaves, estimate their structures, and track them over time. We identify this as a joint multi-leaf segmentation, alignment, and tracking problem. First, leaf segmentation and alignment are applied on the last frame of a plant video to find a number of well-aligned leaf candidates. Second, leaf tracking is applied on the remaining frames with leaf candidate transformation from the previous frame. We form two optimization problems with shared terms in their objective functions for leaf alignment and tracking respectively. A quantitative evaluation framework is formulated to evaluate the performance of our algorithm with four metrics. Two models are learned to predict the alignment accuracy and detect tracking failure respectively in order to provide guidance for subsequent plant biology analysis. The limitation of our algorithm is also studied. Experimental results show the effectiveness, efficiency, and robustness of the proposed method.

  5. Transcription Factor Map Alignment of Promoter Regions

    PubMed Central

    Blanco, Enrique; Messeguer, Xavier; Smith, Temple F; Guigó, Roderic

    2006-01-01

    We address the problem of comparing and characterizing the promoter regions of genes with similar expression patterns. This remains a challenging problem in sequence analysis, because often the promoter regions of co-expressed genes do not show discernible sequence conservation. In our approach, thus, we have not directly compared the nucleotide sequence of promoters. Instead, we have obtained predictions of transcription factor binding sites, annotated the predicted sites with the labels of the corresponding binding factors, and aligned the resulting sequences of labels—to which we refer here as transcription factor maps (TF-maps). To obtain the global pairwise alignment of two TF-maps, we have adapted an algorithm initially developed to align restriction enzyme maps. We have optimized the parameters of the algorithm in a small, but well-curated, collection of human–mouse orthologous gene pairs. Results in this dataset, as well as in an independent much larger dataset from the CISRED database, indicate that TF-map alignments are able to uncover conserved regulatory elements, which cannot be detected by the typical sequence alignments. PMID:16733547

  6. Image Registration for Stability Testing of MEMS

    NASA Technical Reports Server (NTRS)

    Memarsadeghi, Nargess; LeMoigne, Jacqueline; Blake, Peter N.; Morey, Peter A.; Landsman, Wayne B.; Chambers, Victor J.; Moseley, Samuel H.

    2011-01-01

    Image registration, or alignment of two or more images covering the same scenes or objects, is of great interest in many disciplines such as remote sensing, medical imaging. astronomy, and computer vision. In this paper, we introduce a new application of image registration algorithms. We demonstrate how through a wavelet based image registration algorithm, engineers can evaluate stability of Micro-Electro-Mechanical Systems (MEMS). In particular, we applied image registration algorithms to assess alignment stability of the MicroShutters Subsystem (MSS) of the Near Infrared Spectrograph (NIRSpec) instrument of the James Webb Space Telescope (JWST). This work introduces a new methodology for evaluating stability of MEMS devices to engineers as well as a new application of image registration algorithms to computer scientists.

  7. A Probabilistic Model of Local Sequence Alignment That Simplifies Statistical Significance Estimation

    PubMed Central

    Eddy, Sean R.

    2008-01-01

    Sequence database searches require accurate estimation of the statistical significance of scores. Optimal local sequence alignment scores follow Gumbel distributions, but determining an important parameter of the distribution (λ) requires time-consuming computational simulation. Moreover, optimal alignment scores are less powerful than probabilistic scores that integrate over alignment uncertainty (“Forward” scores), but the expected distribution of Forward scores remains unknown. Here, I conjecture that both expected score distributions have simple, predictable forms when full probabilistic modeling methods are used. For a probabilistic model of local sequence alignment, optimal alignment bit scores (“Viterbi” scores) are Gumbel-distributed with constant λ = log 2, and the high scoring tail of Forward scores is exponential with the same constant λ. Simulation studies support these conjectures over a wide range of profile/sequence comparisons, using 9,318 profile-hidden Markov models from the Pfam database. This enables efficient and accurate determination of expectation values (E-values) for both Viterbi and Forward scores for probabilistic local alignments. PMID:18516236

  8. Vertical decomposition with Genetic Algorithm for Multiple Sequence Alignment

    PubMed Central

    2011-01-01

    Background Many Bioinformatics studies begin with a multiple sequence alignment as the foundation for their research. This is because multiple sequence alignment can be a useful technique for studying molecular evolution and analyzing sequence structure relationships. Results In this paper, we have proposed a Vertical Decomposition with Genetic Algorithm (VDGA) for Multiple Sequence Alignment (MSA). In VDGA, we divide the sequences vertically into two or more subsequences, and then solve them individually using a guide tree approach. Finally, we combine all the subsequences to generate a new multiple sequence alignment. This technique is applied on the solutions of the initial generation and of each child generation within VDGA. We have used two mechanisms to generate an initial population in this research: the first mechanism is to generate guide trees with randomly selected sequences and the second is shuffling the sequences inside such trees. Two different genetic operators have been implemented with VDGA. To test the performance of our algorithm, we have compared it with existing well-known methods, namely PRRP, CLUSTALX, DIALIGN, HMMT, SB_PIMA, ML_PIMA, MULTALIGN, and PILEUP8, and also other methods, based on Genetic Algorithms (GA), such as SAGA, MSA-GA and RBT-GA, by solving a number of benchmark datasets from BAliBase 2.0. Conclusions The experimental results showed that the VDGA with three vertical divisions was the most successful variant for most of the test cases in comparison to other divisions considered with VDGA. The experimental results also confirmed that VDGA outperformed the other methods considered in this research. PMID:21867510

  9. Simulator for beam-based LHC collimator alignment

    NASA Astrophysics Data System (ADS)

    Valentino, Gianluca; Aßmann, Ralph; Redaelli, Stefano; Sammut, Nicholas

    2014-02-01

    In the CERN Large Hadron Collider, collimators need to be set up to form a multistage hierarchy to ensure efficient multiturn cleaning of halo particles. Automatic algorithms were introduced during the first run to reduce the beam time required for beam-based setup, improve the alignment accuracy, and reduce the risk of human errors. Simulating the alignment procedure would allow for off-line tests of alignment policies and algorithms. A simulator was developed based on a diffusion beam model to generate the characteristic beam loss signal spike and decay produced when a collimator jaw touches the beam, which is observed in a beam loss monitor (BLM). Empirical models derived from the available measurement data are used to simulate the steady-state beam loss and crosstalk between multiple BLMs. The simulator design is presented, together with simulation results and comparison to measurement data.

  10. MANGO: a new approach to multiple sequence alignment.

    PubMed

    Zhang, Zefeng; Lin, Hao; Li, Ming

    2007-01-01

    Multiple sequence alignment is a classical and challenging task for biological sequence analysis. The problem is NP-hard. The full dynamic programming takes too much time. The progressive alignment heuristics adopted by most state of the art multiple sequence alignment programs suffer from the 'once a gap, always a gap' phenomenon. Is there a radically new way to do multiple sequence alignment? This paper introduces a novel and orthogonal multiple sequence alignment method, using multiple optimized spaced seeds and new algorithms to handle these seeds efficiently. Our new algorithm processes information of all sequences as a whole, avoiding problems caused by the popular progressive approaches. Because the optimized spaced seeds are provably significantly more sensitive than the consecutive k-mers, the new approach promises to be more accurate and reliable. To validate our new approach, we have implemented MANGO: Multiple Alignment with N Gapped Oligos. Experiments were carried out on large 16S RNA benchmarks showing that MANGO compares favorably, in both accuracy and speed, against state-of-art multiple sequence alignment methods, including ClustalW 1.83, MUSCLE 3.6, MAFFT 5.861, Prob-ConsRNA 1.11, Dialign 2.2.1, DIALIGN-T 0.2.1, T-Coffee 4.85, POA 2.0 and Kalign 2.0.

  11. How effective are DNA barcodes in the identification of African rainforest trees?

    PubMed

    Parmentier, Ingrid; Duminil, Jérôme; Kuzmina, Maria; Philippe, Morgane; Thomas, Duncan W; Kenfack, David; Chuyong, George B; Cruaud, Corinne; Hardy, Olivier J

    2013-01-01

    DNA barcoding of rain forest trees could potentially help biologists identify species and discover new ones. However, DNA barcodes cannot always distinguish between closely related species, and the size and completeness of barcode databases are key parameters for their successful application. We test the ability of rbcL, matK and trnH-psbA plastid DNA markers to identify rain forest trees at two sites in Atlantic central Africa under the assumption that a database is exhaustive in terms of species content, but not necessarily in terms of haplotype diversity within species. We assess the accuracy of identification to species or genus using a genetic distance matrix between samples either based on a global multiple sequence alignment (GD) or on a basic local alignment search tool (BLAST). Where a local database is available (within a 50 ha plot), barcoding was generally reliable for genus identification (95-100% success), but less for species identification (71-88%). Using a single marker, best results for species identification were obtained with trnH-psbA. There was a significant decrease of barcoding success in species-rich clades. When the local database was used to identify the genus of trees from another region and did include all genera from the query individuals but not all species, genus identification success decreased to 84-90%. The GD method performed best but a global multiple sequence alignment is not applicable on trnH-psbA. Barcoding is a useful tool to assign unidentified African rain forest trees to a genus, but identification to a species is less reliable, especially in species-rich clades, even using an exhaustive local database. Combining two markers improves the accuracy of species identification but it would only marginally improve genus identification. Finally, we highlight some limitations of the BLAST algorithm as currently implemented and suggest possible improvements for barcoding applications.

  12. How Effective Are DNA Barcodes in the Identification of African Rainforest Trees?

    PubMed Central

    Parmentier, Ingrid; Duminil, Jérôme; Kuzmina, Maria; Philippe, Morgane; Thomas, Duncan W.; Kenfack, David; Chuyong, George B.; Cruaud, Corinne; Hardy, Olivier J.

    2013-01-01

    Background DNA barcoding of rain forest trees could potentially help biologists identify species and discover new ones. However, DNA barcodes cannot always distinguish between closely related species, and the size and completeness of barcode databases are key parameters for their successful application. We test the ability of rbcL, matK and trnH-psbA plastid DNA markers to identify rain forest trees at two sites in Atlantic central Africa under the assumption that a database is exhaustive in terms of species content, but not necessarily in terms of haplotype diversity within species. Methodology/Principal Findings We assess the accuracy of identification to species or genus using a genetic distance matrix between samples either based on a global multiple sequence alignment (GD) or on a basic local alignment search tool (BLAST). Where a local database is available (within a 50 ha plot), barcoding was generally reliable for genus identification (95–100% success), but less for species identification (71–88%). Using a single marker, best results for species identification were obtained with trnH-psbA. There was a significant decrease of barcoding success in species-rich clades. When the local database was used to identify the genus of trees from another region and did include all genera from the query individuals but not all species, genus identification success decreased to 84–90%. The GD method performed best but a global multiple sequence alignment is not applicable on trnH-psbA. Conclusions/Significance Barcoding is a useful tool to assign unidentified African rain forest trees to a genus, but identification to a species is less reliable, especially in species-rich clades, even using an exhaustive local database. Combining two markers improves the accuracy of species identification but it would only marginally improve genus identification. Finally, we highlight some limitations of the BLAST algorithm as currently implemented and suggest possible improvements for barcoding applications. PMID:23565134

  13. Multiple alignment analysis on phylogenetic tree of the spread of SARS epidemic using distance method

    NASA Astrophysics Data System (ADS)

    Amiroch, S.; Pradana, M. S.; Irawan, M. I.; Mukhlash, I.

    2017-09-01

    Multiple Alignment (MA) is a particularly important tool for studying the viral genome and determine the evolutionary process of the specific virus. Application of MA in the case of the spread of the Severe acute respiratory syndrome (SARS) epidemic is an interesting thing because this virus epidemic a few years ago spread so quickly that medical attention in many countries. Although there has been a lot of software to process multiple sequences, but the use of pairwise alignment to process MA is very important to consider. In previous research, the alignment between the sequences to process MA algorithm, Super Pairwise Alignment, but in this study used a dynamic programming algorithm Needleman wunchs simulated in Matlab. From the analysis of MA obtained and stable region and unstable which indicates the position where the mutation occurs, the system network topology that produced the phylogenetic tree of the SARS epidemic distance method, and system area networks mutation.

  14. A Lossy Compression Technique Enabling Duplication-Aware Sequence Alignment

    PubMed Central

    Freschi, Valerio; Bogliolo, Alessandro

    2012-01-01

    In spite of the recognized importance of tandem duplications in genome evolution, commonly adopted sequence comparison algorithms do not take into account complex mutation events involving more than one residue at the time, since they are not compliant with the underlying assumption of statistical independence of adjacent residues. As a consequence, the presence of tandem repeats in sequences under comparison may impair the biological significance of the resulting alignment. Although solutions have been proposed, repeat-aware sequence alignment is still considered to be an open problem and new efficient and effective methods have been advocated. The present paper describes an alternative lossy compression scheme for genomic sequences which iteratively collapses repeats of increasing length. The resulting approximate representations do not contain tandem duplications, while retaining enough information for making their comparison even more significant than the edit distance between the original sequences. This allows us to exploit traditional alignment algorithms directly on the compressed sequences. Results confirm the validity of the proposed approach for the problem of duplication-aware sequence alignment. PMID:22518086

  15. Discovering Sequence Motifs with Arbitrary Insertions and Deletions

    PubMed Central

    Frith, Martin C.; Saunders, Neil F. W.; Kobe, Bostjan; Bailey, Timothy L.

    2008-01-01

    Biology is encoded in molecular sequences: deciphering this encoding remains a grand scientific challenge. Functional regions of DNA, RNA, and protein sequences often exhibit characteristic but subtle motifs; thus, computational discovery of motifs in sequences is a fundamental and much-studied problem. However, most current algorithms do not allow for insertions or deletions (indels) within motifs, and the few that do have other limitations. We present a method, GLAM2 (Gapped Local Alignment of Motifs), for discovering motifs allowing indels in a fully general manner, and a companion method GLAM2SCAN for searching sequence databases using such motifs. glam2 is a generalization of the gapless Gibbs sampling algorithm. It re-discovers variable-width protein motifs from the PROSITE database significantly more accurately than the alternative methods PRATT and SAM-T2K. Furthermore, it usefully refines protein motifs from the ELM database: in some cases, the refined motifs make orders of magnitude fewer overpredictions than the original ELM regular expressions. GLAM2 performs respectably on the BAliBASE multiple alignment benchmark, and may be superior to leading multiple alignment methods for “motif-like” alignments with N- and C-terminal extensions. Finally, we demonstrate the use of GLAM2 to discover protein kinase substrate motifs and a gapped DNA motif for the LIM-only transcriptional regulatory complex: using GLAM2SCAN, we identify promising targets for the latter. GLAM2 is especially promising for short protein motifs, and it should improve our ability to identify the protein cleavage sites, interaction sites, post-translational modification attachment sites, etc., that underlie much of biology. It may be equally useful for arbitrarily gapped motifs in DNA and RNA, although fewer examples of such motifs are known at present. GLAM2 is public domain software, available for download at http://bioinformatics.org.au/glam2. PMID:18437229

  16. Validation of Splicing Events in Transcriptome Sequencing Data

    PubMed Central

    Kaisers, Wolfgang; Ptok, Johannes; Schwender, Holger; Schaal, Heiner

    2017-01-01

    Genomic alignments of sequenced cellular messenger RNA contain gapped alignments which are interpreted as consequence of intron removal. The resulting gap-sites, genomic locations of alignment gaps, are landmarks representing potential splice-sites. As alignment algorithms report gap-sites with a considerable false discovery rate, validations are required. We describe two quality scores, gap quality score (gqs) and weighted gap information score (wgis), developed for validation of putative splicing events: While gqs solely relies on alignment data wgis additionally considers information from the genomic sequence. FASTQ files obtained from 54 human dermal fibroblast samples were aligned against the human genome (GRCh38) using TopHat and STAR aligner. Statistical properties of gap-sites validated by gqs and wgis were evaluated by their sequence similarity to known exon-intron borders. Within the 54 samples, TopHat identifies 1,000,380 and STAR reports 6,487,577 gap-sites. Due to the lack of strand information, however, the percentage of identified GT-AG gap-sites is rather low. While gap-sites from TopHat contain ≈89% GT-AG, gap-sites from STAR only contain ≈42% GT-AG dinucleotide pairs in merged data from 54 fibroblast samples. Validation with gqs yields 156,251 gap-sites from TopHat alignments and 166,294 from STAR alignments. Validation with wgis yields 770,327 gap-sites from TopHat alignments and 1,065,596 from STAR alignments. Both alignment algorithms, TopHat and STAR, report gap-sites with considerable false discovery rate, which can drastically be reduced by validation with gqs and wgis. PMID:28545234

  17. Easy and Fast Reconstruction of a 3D Avatar with an RGB-D Sensor.

    PubMed

    Mao, Aihua; Zhang, Hong; Liu, Yuxin; Zheng, Yinglong; Li, Guiqing; Han, Guoqiang

    2017-05-12

    This paper proposes a new easy and fast 3D avatar reconstruction method using an RGB-D sensor. Users can easily implement human body scanning and modeling just with a personal computer and a single RGB-D sensor such as a Microsoft Kinect within a small workspace in their home or office. To make the reconstruction of 3D avatars easy and fast, a new data capture strategy is proposed for efficient human body scanning, which captures only 18 frames from six views with a close scanning distance to fully cover the body; meanwhile, efficient alignment algorithms are presented to locally align the data frames in the single view and then globally align them in multi-views based on pairwise correspondence. In this method, we do not adopt shape priors or subdivision tools to synthesize the model, which helps to reduce modeling complexity. Experimental results indicate that this method can obtain accurate reconstructed 3D avatar models, and the running performance is faster than that of similar work. This research offers a useful tool for the manufacturers to quickly and economically create 3D avatars for products design, entertainment and online shopping.

  18. Adaptive Local Realignment of Protein Sequences.

    PubMed

    DeBlasio, Dan; Kececioglu, John

    2018-06-11

    While mutation rates can vary markedly over the residues of a protein, multiple sequence alignment tools typically use the same values for their scoring-function parameters across a protein's entire length. We present a new approach, called adaptive local realignment, that in contrast automatically adapts to the diversity of mutation rates along protein sequences. This builds upon a recent technique known as parameter advising, which finds global parameter settings for an aligner, to now adaptively find local settings. Our approach in essence identifies local regions with low estimated accuracy, constructs a set of candidate realignments using a carefully-chosen collection of parameter settings, and replaces the region if a realignment has higher estimated accuracy. This new method of local parameter advising, when combined with prior methods for global advising, boosts alignment accuracy as much as 26% over the best default setting on hard-to-align protein benchmarks, and by 6.4% over global advising alone. Adaptive local realignment has been implemented within the Opal aligner using the Facet accuracy estimator.

  19. Initial Navigation Alignment of Optical Instruments on GOES-R

    NASA Astrophysics Data System (ADS)

    Isaacson, P.; DeLuccia, F.; Reth, A. D.; Igli, D. A.; Carter, D.

    2016-12-01

    The GOES-R satellite is the first in NOAA's next-generation series of geostationary weather satellites. In addition to a number of space weather sensors, it will carry two principal optical earth-observing instruments, the Advanced Baseline Imager (ABI) and the Geostationary Lightning Mapper (GLM). During launch, currently scheduled for November of 2016, the alignment of these optical instruments is anticipated to shift from that measured during pre-launch characterization. While both instruments have image navigation and registration (INR) processing algorithms to enable automated geolocation of the collected data, the launch-derived misalignment may be too large for these approaches to function without an initial adjustment to calibration parameters. The parameters that may require adjustment are for Line of Sight Motion Compensation (LMC), and the adjustments will be estimated on orbit during the post-launch test (PLT) phase. We have developed approaches to estimate the initial alignment errors for both ABI and GLM image products. Our approaches involve comparison of ABI and GLM images collected during PLT to a set of reference ("truth") images using custom image processing tools and other software (the INR Performance Assessment Tool Set, or "IPATS") being developed for other INR assessments of ABI and GLM data. IPATS is based on image correlation approaches to determine offsets between input and reference images, and these offsets are the fundamental input to our estimate of the initial alignment errors. Initial testing of our alignment algorithms on proxy datasets lends high confidence that their application will determine the initial alignment errors to within sufficient accuracy to enable the operational INR processing approaches to proceed in a nominal fashion. We will report on the algorithms, implementation approach, and status of these initial alignment tools being developed for the GOES-R ABI and GLM instruments.

  20. Automatic detection of left and right ventricles from CTA enables efficient alignment of anatomy with myocardial perfusion data.

    PubMed

    Piccinelli, Marina; Faber, Tracy L; Arepalli, Chesnal D; Appia, Vikram; Vinten-Johansen, Jakob; Schmarkey, Susan L; Folks, Russell D; Garcia, Ernest V; Yezzi, Anthony

    2014-02-01

    Accurate alignment between cardiac CT angiographic studies (CTA) and nuclear perfusion images is crucial for improved diagnosis of coronary artery disease. This study evaluated in an animal model the accuracy of a CTA fully automated biventricular segmentation algorithm, a necessary step for automatic and thus efficient PET/CT alignment. Twelve pigs with acute infarcts were imaged using Rb-82 PET and 64-slice CTA. Post-mortem myocardium mass measurements were obtained. Endocardial and epicardial myocardial boundaries were manually and automatically detected on the CTA and both segmentations used to perform PET/CT alignment. To assess the segmentation performance, image-based myocardial masses were compared to experimental data; the hand-traced profiles were used as a reference standard to assess the global and slice-by-slice robustness of the automated algorithm in extracting myocardium, LV, and RV. Mean distances between the automated and the manual 3D segmented surfaces were computed. Finally, differences in rotations and translations between the manual and automatic surfaces were estimated post-PET/CT alignment. The largest, smallest, and median distances between interactive and automatic surfaces averaged 1.2 ± 2.1, 0.2 ± 1.6, and 0.7 ± 1.9 mm. The average angular and translational differences in CT/PET alignments were 0.4°, -0.6°, and -2.3° about x, y, and z axes, and 1.8, -2.1, and 2.0 mm in x, y, and z directions. Our automatic myocardial boundary detection algorithm creates surfaces from CTA that are similar in accuracy and provide similar alignments with PET as those obtained from interactive tracing. Specific difficulties in a reliable segmentation of the apex and base regions will require further improvements in the automated technique.

  1. Multiple DNA and protein sequence alignment on a workstation and a supercomputer.

    PubMed

    Tajima, K

    1988-11-01

    This paper describes a multiple alignment method using a workstation and supercomputer. The method is based on the alignment of a set of aligned sequences with the new sequence, and uses a recursive procedure of such alignment. The alignment is executed in a reasonable computation time on diverse levels from a workstation to a supercomputer, from the viewpoint of alignment results and computational speed by parallel processing. The application of the algorithm is illustrated by several examples of multiple alignment of 12 amino acid and DNA sequences of HIV (human immunodeficiency virus) env genes. Colour graphic programs on a workstation and parallel processing on a supercomputer are discussed.

  2. MC64-ClustalWP2: A Highly-Parallel Hybrid Strategy to Align Multiple Sequences in Many-Core Architectures

    PubMed Central

    Díaz, David; Esteban, Francisco J.; Hernández, Pilar; Caballero, Juan Antonio; Guevara, Antonio

    2014-01-01

    We have developed the MC64-ClustalWP2 as a new implementation of the Clustal W algorithm, integrating a novel parallelization strategy and significantly increasing the performance when aligning long sequences in architectures with many cores. It must be stressed that in such a process, the detailed analysis of both the software and hardware features and peculiarities is of paramount importance to reveal key points to exploit and optimize the full potential of parallelism in many-core CPU systems. The new parallelization approach has focused into the most time-consuming stages of this algorithm. In particular, the so-called progressive alignment has drastically improved the performance, due to a fine-grained approach where the forward and backward loops were unrolled and parallelized. Another key approach has been the implementation of the new algorithm in a hybrid-computing system, integrating both an Intel Xeon multi-core CPU and a Tilera Tile64 many-core card. A comparison with other Clustal W implementations reveals the high-performance of the new algorithm and strategy in many-core CPU architectures, in a scenario where the sequences to align are relatively long (more than 10 kb) and, hence, a many-core GPU hardware cannot be used. Thus, the MC64-ClustalWP2 runs multiple alignments more than 18x than the original Clustal W algorithm, and more than 7x than the best x86 parallel implementation to date, being publicly available through a web service. Besides, these developments have been deployed in cost-effective personal computers and should be useful for life-science researchers, including the identification of identities and differences for mutation/polymorphism analyses, biodiversity and evolutionary studies and for the development of molecular markers for paternity testing, germplasm management and protection, to assist breeding, illegal traffic control, fraud prevention and for the protection of the intellectual property (identification/traceability), including the protected designation of origin, among other applications. PMID:24710354

  3. Active control for stabilization of neoclassical tearing modesa)

    NASA Astrophysics Data System (ADS)

    Humphreys, D. A.; Ferron, J. R.; La Haye, R. J.; Luce, T. C.; Petty, C. C.; Prater, R.; Welander, A. S.

    2006-05-01

    This work describes active control algorithms used by DIII-D [J. L. Luxon, Nucl. Fusion 42, 614 (2002)] to stabilize and maintain suppression of 3/2 or 2/1 neoclassical tearing modes (NTMs) by application of electron cyclotron current drive (ECCD) at the rational q surface. The DIII-D NTM control system can determine the correct q-surface/ECCD alignment and stabilize existing modes within 100-500ms of activation, or prevent mode growth with preemptive application of ECCD, in both cases enabling stable operation at normalized beta values above 3.5. Because NTMs can limit performance or cause plasma-terminating disruptions in tokamaks, their stabilization is essential to the high performance operation of ITER [R. Aymar et al., ITER Joint Central Team, ITER Home Teams, Nucl. Fusion 41, 1301 (2001)]. The DIII-D NTM control system has demonstrated many elements of an eventual ITER solution, including general algorithms for robust detection of q-surface/ECCD alignment and for real-time maintenance of alignment following the disappearance of the mode. This latter capability, unique to DIII-D, is based on real-time reconstruction of q-surface geometry by a Grad-Shafranov solver using external magnetics and internal motional Stark effect measurements. Alignment is achieved by varying either the plasma major radius (and the rational q surface) or the toroidal field (and the deposition location). The requirement to achieve and maintain q-surface/ECCD alignment with accuracy on the order of 1cm is routinely met by the DIII-D Plasma Control System and these algorithms. We discuss the integrated plasma control design process used for developing these and other general control algorithms, which includes physics-based modeling and testing of the algorithm implementation against simulations of actuator and plasma responses. This systematic design/test method and modeling environment enabled successful mode suppression by the NTM control system upon first-time use in an experimental discharge.

  4. A Hough Transform Global Probabilistic Approach to Multiple-Subject Diffusion MRI Tractography

    PubMed Central

    Aganj, Iman; Lenglet, Christophe; Jahanshad, Neda; Yacoub, Essa; Harel, Noam; Thompson, Paul M.; Sapiro, Guillermo

    2011-01-01

    A global probabilistic fiber tracking approach based on the voting process provided by the Hough transform is introduced in this work. The proposed framework tests candidate 3D curves in the volume, assigning to each one a score computed from the diffusion images, and then selects the curves with the highest scores as the potential anatomical connections. The algorithm avoids local minima by performing an exhaustive search at the desired resolution. The technique is easily extended to multiple subjects, considering a single representative volume where the registered high-angular resolution diffusion images (HARDI) from all the subjects are non-linearly combined, thereby obtaining population-representative tracts. The tractography algorithm is run only once for the multiple subjects, and no tract alignment is necessary. We present experimental results on HARDI volumes, ranging from simulated and 1.5T physical phantoms to 7T and 4T human brain and 7T monkey brain datasets. PMID:21376655

  5. Prefiltering Model for Homology Detection Algorithms on GPU.

    PubMed

    Retamosa, Germán; de Pedro, Luis; González, Ivan; Tamames, Javier

    2016-01-01

    Homology detection has evolved over the time from heavy algorithms based on dynamic programming approaches to lightweight alternatives based on different heuristic models. However, the main problem with these algorithms is that they use complex statistical models, which makes it difficult to achieve a relevant speedup and find exact matches with the original results. Thus, their acceleration is essential. The aim of this article was to prefilter a sequence database. To make this work, we have implemented a groundbreaking heuristic model based on NVIDIA's graphics processing units (GPUs) and multicore processors. Depending on the sensitivity settings, this makes it possible to quickly reduce the sequence database by factors between 50% and 95%, while rejecting no significant sequences. Furthermore, this prefiltering application can be used together with multiple homology detection algorithms as a part of a next-generation sequencing system. Extensive performance and accuracy tests have been carried out in the Spanish National Centre for Biotechnology (NCB). The results show that GPU hardware can accelerate the execution times of former homology detection applications, such as National Centre for Biotechnology Information (NCBI), Basic Local Alignment Search Tool for Proteins (BLASTP), up to a factor of 4.

  6. An efficient direct method for image registration of flat objects

    NASA Astrophysics Data System (ADS)

    Nikolaev, Dmitry; Tihonkih, Dmitrii; Makovetskii, Artyom; Voronin, Sergei

    2017-09-01

    Image alignment of rigid surfaces is a rapidly developing area of research and has many practical applications. Alignment methods can be roughly divided into two types: feature-based methods and direct methods. Known SURF and SIFT algorithms are examples of the feature-based methods. Direct methods refer to those that exploit the pixel intensities without resorting to image features and image-based deformations are general direct method to align images of deformable objects in 3D space. Nevertheless, it is not good for the registration of images of 3D rigid objects since the underlying structure cannot be directly evaluated. In the article, we propose a model that is suitable for image alignment of rigid flat objects under various illumination models. The brightness consistency assumptions used for reconstruction of optimal geometrical transformation. Computer simulation results are provided to illustrate the performance of the proposed algorithm for computing of an accordance between pixels of two images.

  7. Basecalling with LifeTrace

    PubMed Central

    Walther, Dirk; Bartha, Gábor; Morris, Macdonald

    2001-01-01

    A pivotal step in electrophoresis sequencing is the conversion of the raw, continuous chromatogram data into the actual sequence of discrete nucleotides, a process referred to as basecalling. We describe a novel algorithm for basecalling implemented in the program LifeTrace. Like Phred, currently the most widely used basecalling software program, LifeTrace takes processed trace data as input. It was designed to be tolerant to variable peak spacing by means of an improved peak-detection algorithm that emphasizes local chromatogram information over global properties. LifeTrace is shown to generate high-quality basecalls and reliable quality scores. It proved particularly effective when applied to MegaBACE capillary sequencing machines. In a benchmark test of 8372 dye-primer MegaBACE chromatograms, LifeTrace generated 17% fewer substitution errors, 16% fewer insertion/deletion errors, and 2.4% more aligned bases to the finished sequence than did Phred. For two sets totaling 6624 dye-terminator chromatograms, the performance improvement was 15% fewer substitution errors, 10% fewer insertion/deletion errors, and 2.1% more aligned bases. The processing time required by LifeTrace is comparable to that of Phred. The predicted quality scores were in line with observed quality scores, permitting direct use for quality clipping and in silico single nucleotide polymorphism (SNP) detection. Furthermore, we introduce a new type of quality score associated with every basecall: the gap-quality. It estimates the probability of a deletion error between the current and the following basecall. This additional quality score improves detection of single basepair deletions when used for locating potential basecalling errors during the alignment. We also describe a new protocol for benchmarking that we believe better discerns basecaller performance differences than methods previously published. PMID:11337481

  8. On decomposing stimulus and response waveforms in event-related potentials recordings.

    PubMed

    Yin, Gang; Zhang, Jun

    2011-06-01

    Event-related potentials (ERPs) reflect the brain activities related to specific behavioral events, and are obtained by averaging across many trial repetitions with individual trials aligned to the onset of a specific event, e.g., the onset of stimulus (s-aligned) or the onset of the behavioral response (r-aligned). However, the s-aligned and r-aligned ERP waveforms do not purely reflect, respectively, underlying stimulus (S-) or response (R-) component waveform, due to their cross-contaminations in the recorded ERP waveforms. Zhang [J. Neurosci. Methods, 80, pp. 49-63, 1998] proposed an algorithm to recover the pure S-component waveform and the pure R-component waveform from the s-aligned and r-aligned ERP average waveforms-however, due to the nature of this inverse problem, a direct solution is sensitive to noise that disproportionally affects low-frequency components, hindering the practical implementation of this algorithm. Here, we apply the Wiener deconvolution technique to deal with noise in input data, and investigate a Tikhonov regularization approach to obtain a stable solution that is robust against variances in the sampling of reaction-time distribution (when number of trials is low). Our method is demonstrated using data from a Go/NoGo experiment about image classification and recognition.

  9. B-cell Ligand Processing Pathways Detected by Large-scale Comparative Analysis

    PubMed Central

    Towfic, Fadi; Gupta, Shakti; Honavar, Vasant; Subramaniam, Shankar

    2012-01-01

    The initiation of B-cell ligand recognition is a critical step for the generation of an immune response against foreign bodies. We sought to identify the biochemical pathways involved in the B-cell ligand recognition cascade and sets of ligands that trigger similar immunological responses. We utilized several comparative approaches to analyze the gene coexpression networks generated from a set of microarray experiments spanning 33 different ligands. First, we compared the degree distributions of the generated networks. Second, we utilized a pairwise network alignment algorithm, BiNA, to align the networks based on the hubs in the networks. Third, we aligned the networks based on a set of KEGG pathways. We summarized our results by constructing a consensus hierarchy of pathways that are involved in B cell ligand recognition. The resulting pathways were further validated through literature for their common physiological responses. Collectively, the results based on our comparative analyses of degree distributions, alignment of hubs, and alignment based on KEGG pathways provide a basis for molecular characterization of the immune response states of B-cells and demonstrate the power of comparative approaches (e.g., gene coexpression network alignment algorithms) in elucidating biochemical pathways involved in complex signaling events in cells. PMID:22917187

  10. Spatial-Spectral Approaches to Edge Detection in Hyperspectral Remote Sensing

    NASA Astrophysics Data System (ADS)

    Cox, Cary M.

    This dissertation advances geoinformation science at the intersection of hyperspectral remote sensing and edge detection methods. A relatively new phenomenology among its remote sensing peers, hyperspectral imagery (HSI) comprises only about 7% of all remote sensing research - there are five times as many radar-focused peer reviewed journal articles than hyperspectral-focused peer reviewed journal articles. Similarly, edge detection studies comprise only about 8% of image processing research, most of which is dedicated to image processing techniques most closely associated with end results, such as image classification and feature extraction. Given the centrality of edge detection to mapping, that most important of geographic functions, improving the collective understanding of hyperspectral imagery edge detection methods constitutes a research objective aligned to the heart of geoinformation sciences. Consequently, this dissertation endeavors to narrow the HSI edge detection research gap by advancing three HSI edge detection methods designed to leverage HSI's unique chemical identification capabilities in pursuit of generating accurate, high-quality edge planes. The Di Zenzo-based gradient edge detection algorithm, an innovative version of the Resmini HySPADE edge detection algorithm and a level set-based edge detection algorithm are tested against 15 traditional and non-traditional HSI datasets spanning a range of HSI data configurations, spectral resolutions, spatial resolutions, bandpasses and applications. This study empirically measures algorithm performance against Dr. John Canny's six criteria for a good edge operator: false positives, false negatives, localization, single-point response, robustness to noise and unbroken edges. The end state is a suite of spatial-spectral edge detection algorithms that produce satisfactory edge results against a range of hyperspectral data types applicable to a diverse set of earth remote sensing applications. This work also explores the concept of an edge within hyperspectral space, the relative importance of spatial and spectral resolutions as they pertain to HSI edge detection and how effectively compressed HSI data improves edge detection results. The HSI edge detection experiments yielded valuable insights into the algorithms' strengths, weaknesses and optimal alignment to remote sensing applications. The gradient-based edge operator produced strong edge planes across a range of evaluation measures and applications, particularly with respect to false negatives, unbroken edges, urban mapping, vegetation mapping and oil spill mapping applications. False positives and uncompressed HSI data presented occasional challenges to the algorithm. The HySPADE edge operator produced satisfactory results with respect to localization, single-point response, oil spill mapping and trace chemical detection, and was challenged by false positives, declining spectral resolution and vegetation mapping applications. The level set edge detector produced high-quality edge planes for most tests and demonstrated strong performance with respect to false positives, single-point response, oil spill mapping and mineral mapping. False negatives were a regular challenge for the level set edge detection algorithm. Finally, HSI data optimized for spectral information compression and noise was shown to improve edge detection performance across all three algorithms, while the gradient-based algorithm and HySPADE demonstrated significant robustness to declining spectral and spatial resolutions.

  11. Incorporating evolution of transcription factor binding sites into annotated alignments.

    PubMed

    Bais, Abha S; Grossmann, Stefen; Vingron, Martin

    2007-08-01

    Identifying transcription factor binding sites (TFBSs) is essential to elucidate putative regulatory mechanisms. A common strategy is to combine cross-species conservation with single sequence TFBS annotation to yield "conserved TFBSs". Most current methods in this field adopt a multi-step approach that segregates the two aspects. Again, it is widely accepted that the evolutionary dynamics of binding sites differ from those of the surrounding sequence. Hence, it is desirable to have an approach that explicitly takes this factor into account. Although a plethora of approaches have been proposed for the prediction of conserved TFBSs, very few explicitly model TFBS evolutionary properties, while additionally being multi-step. Recently, we introduced a novel approach to simultaneously align and annotate conserved TFBSs in a pair of sequences. Building upon the standard Smith-Waterman algorithm for local alignments, SimAnn introduces additional states for profiles to output extended alignments or annotated alignments. That is, alignments with parts annotated as gaplessly aligned TFBSs (pair-profile hits)are generated. Moreover,the pair- profile related parameters are derived in a sound statistical framework. In this article, we extend this approach to explicitly incorporate evolution of binding sites in the SimAnn framework. We demonstrate the extension in the theoretical derivations through two position-specific evolutionary models, previously used for modelling TFBS evolution. In a simulated setting, we provide a proof of concept that the approach works given the underlying assumptions,as compared to the original work. Finally, using a real dataset of experimentally verified binding sites in human-mouse sequence pairs,we compare the new approach (eSimAnn) to an existing multi-step tool that also considers TFBS evolution. Although it is widely accepted that binding sites evolve differently from the surrounding sequences, most comparative TFBS identification methods do not explicitly consider this.Additionally, prediction of conserved binding sites is carried out in a multi-step approach that segregates alignment from TFBS annotation. In this paper, we demonstrate how the simultaneous alignment and annotation approach of SimAnn can be further extended to incorporate TFBS evolutionary relationships. We study how alignments and binding site predictions interplay at varying evolutionary distances and for various profile qualities.

  12. Application of fast Fourier transform cross-correlation and mass spectrometry data for accurate alignment of chromatograms.

    PubMed

    Zheng, Yi-Bao; Zhang, Zhi-Min; Liang, Yi-Zeng; Zhan, De-Jian; Huang, Jian-Hua; Yun, Yong-Huan; Xie, Hua-Lin

    2013-04-19

    Chromatography has been established as one of the most important analytical methods in the modern analytical laboratory. However, preprocessing of the chromatograms, especially peak alignment, is usually a time-consuming task prior to extracting useful information from the datasets because of the small unavoidable differences in the experimental conditions caused by minor changes and drift. Most of the alignment algorithms are performed on reduced datasets using only the detected peaks in the chromatograms, which means a loss of data and introduces the problem of extraction of peak data from the chromatographic profiles. These disadvantages can be overcome by using the full chromatographic information that is generated from hyphenated chromatographic instruments. A new alignment algorithm called CAMS (Chromatogram Alignment via Mass Spectra) is present here to correct the retention time shifts among chromatograms accurately and rapidly. In this report, peaks of each chromatogram were detected based on Continuous Wavelet Transform (CWT) with Haar wavelet and were aligned against the reference chromatogram via the correlation of mass spectra. The aligning procedure was accelerated by Fast Fourier Transform cross correlation (FFT cross correlation). This approach has been compared with several well-known alignment methods on real chromatographic datasets, which demonstrates that CAMS can preserve the shape of peaks and achieve a high quality alignment result. Furthermore, the CAMS method was implemented in the Matlab language and available as an open source package at http://www.github.com/matchcoder/CAMS. Copyright © 2013. Published by Elsevier B.V.

  13. A novel surface registration algorithm with biomedical modeling applications.

    PubMed

    Huang, Heng; Shen, Li; Zhang, Rong; Makedon, Fillia; Saykin, Andrew; Pearlman, Justin

    2007-07-01

    In this paper, we propose a novel surface matching algorithm for arbitrarily shaped but simply connected 3-D objects. The spherical harmonic (SPHARM) method is used to describe these 3-D objects, and a novel surface registration approach is presented. The proposed technique is applied to various applications of medical image analysis. The results are compared with those using the traditional method, in which the first-order ellipsoid is used for establishing surface correspondence and aligning objects. In these applications, our surface alignment method is demonstrated to be more accurate and flexible than the traditional approach. This is due in large part to the fact that a new surface parameterization is generated by a shortcut that employs a useful rotational property of spherical harmonic basis functions for a fast implementation. In order to achieve a suitable computational speed for practical applications, we propose a fast alignment algorithm that improves computational complexity of the new surface registration method from O(n3) to O(n2).

  14. Initial Navigation Alignment of Optical Instruments on GOES-R

    NASA Technical Reports Server (NTRS)

    Isaacson, Peter J.; DeLuccia, Frank J.; Reth, Alan D.; Igli, David A.; Carter, Delano R.

    2016-01-01

    Post-launch alignment errors for the Advanced Baseline Imager (ABI) and Geospatial Lightning Mapper (GLM) on GOES-R may be too large for the image navigation and registration (INR) processing algorithms to function without an initial adjustment to calibration parameters. We present an approach that leverages a combination of user-selected image-to-image tie points and image correlation algorithms to estimate this initial launch-induced offset and calculate adjustments to the Line of Sight Motion Compensation (LMC) parameters. We also present an approach to generate synthetic test images, to which shifts and rotations of known magnitude are applied. Results of applying the initial alignment tools to a subset of these synthetic test images are presented. The results for both ABI and GLM are within the specifications established for these tools, and indicate that application of these tools during the post-launch test (PLT) phase of GOES-R operations will enable the automated INR algorithms for both instruments to function as intended.

  15. Design and implementation of a hybrid MPI-CUDA model for the Smith-Waterman algorithm.

    PubMed

    Khaled, Heba; Faheem, Hossam El Deen Mostafa; El Gohary, Rania

    2015-01-01

    This paper provides a novel hybrid model for solving the multiple pair-wise sequence alignment problem combining message passing interface and CUDA, the parallel computing platform and programming model invented by NVIDIA. The proposed model targets homogeneous cluster nodes equipped with similar Graphical Processing Unit (GPU) cards. The model consists of the Master Node Dispatcher (MND) and the Worker GPU Nodes (WGN). The MND distributes the workload among the cluster working nodes and then aggregates the results. The WGN performs the multiple pair-wise sequence alignments using the Smith-Waterman algorithm. We also propose a modified implementation to the Smith-Waterman algorithm based on computing the alignment matrices row-wise. The experimental results demonstrate a considerable reduction in the running time by increasing the number of the working GPU nodes. The proposed model achieved a performance of about 12 Giga cell updates per second when we tested against the SWISS-PROT protein knowledge base running on four nodes.

  16. Depth inpainting by tensor voting.

    PubMed

    Kulkarni, Mandar; Rajagopalan, Ambasamudram N

    2013-06-01

    Depth maps captured by range scanning devices or by using optical cameras often suffer from missing regions due to occlusions, reflectivity, limited scanning area, sensor imperfections, etc. In this paper, we propose a fast and reliable algorithm for depth map inpainting using the tensor voting (TV) framework. For less complex missing regions, local edge and depth information is utilized for synthesizing missing values. The depth variations are modeled by local planes using 3D TV, and missing values are estimated using plane equations. For large and complex missing regions, we collect and evaluate depth estimates from self-similar (training) datasets. We align the depth maps of the training set with the target (defective) depth map and evaluate the goodness of depth estimates among candidate values using 3D TV. We demonstrate the effectiveness of the proposed approaches on real as well as synthetic data.

  17. Inverting Image Data For Optical Testing And Alignment

    NASA Technical Reports Server (NTRS)

    Shao, Michael; Redding, David; Yu, Jeffrey W.; Dumont, Philip J.

    1993-01-01

    Data from images produced by slightly incorrectly figured concave primary mirror in telescope processed into estimate of spherical aberration of mirror, by use of algorithm finding nonlinear least-squares best fit between actual images and synthetic images produced by multiparameter mathematical model of telescope optical system. Estimated spherical aberration, in turn, converted into estimate of deviation of reflector surface from nominal precise shape. Algorithm devised as part of effort to determine error in surface figure of primary mirror of Hubble space telescope, so corrective lens designed. Modified versions of algorithm also used to find optical errors in other components of telescope or of other optical systems, for purposes of testing, alignment, and/or correction.

  18. Advanced computer graphic techniques for laser range finder (LRF) simulation

    NASA Astrophysics Data System (ADS)

    Bedkowski, Janusz; Jankowski, Stanislaw

    2008-11-01

    This paper show an advanced computer graphic techniques for laser range finder (LRF) simulation. The LRF is the common sensor for unmanned ground vehicle, autonomous mobile robot and security applications. The cost of the measurement system is extremely high, therefore the simulation tool is designed. The simulation gives an opportunity to execute algorithm such as the obstacle avoidance[1], slam for robot localization[2], detection of vegetation and water obstacles in surroundings of the robot chassis[3], LRF measurement in crowd of people[1]. The Axis Aligned Bounding Box (AABB) and alternative technique based on CUDA (NVIDIA Compute Unified Device Architecture) is presented.

  19. LS-align: an atom-level, flexible ligand structural alignment algorithm for high-throughput virtual screening.

    PubMed

    Hu, Jun; Liu, Zi; Yu, Dong-Jun; Zhang, Yang

    2018-02-15

    Sequence-order independent structural comparison, also called structural alignment, of small ligand molecules is often needed for computer-aided virtual drug screening. Although many ligand structure alignment programs are proposed, most of them build the alignments based on rigid-body shape comparison which cannot provide atom-specific alignment information nor allow structural variation; both abilities are critical to efficient high-throughput virtual screening. We propose a novel ligand comparison algorithm, LS-align, to generate fast and accurate atom-level structural alignments of ligand molecules, through an iterative heuristic search of the target function that combines inter-atom distance with mass and chemical bond comparisons. LS-align contains two modules of Rigid-LS-align and Flexi-LS-align, designed for rigid-body and flexible alignments, respectively, where a ligand-size independent, statistics-based scoring function is developed to evaluate the similarity of ligand molecules relative to random ligand pairs. Large-scale benchmark tests are performed on prioritizing chemical ligands of 102 protein targets involving 1,415,871 candidate compounds from the DUD-E (Database of Useful Decoys: Enhanced) database, where LS-align achieves an average enrichment factor (EF) of 22.0 at the 1% cutoff and the AUC score of 0.75, which are significantly higher than other state-of-the-art methods. Detailed data analyses show that the advanced performance is mainly attributed to the design of the target function that combines structural and chemical information to enhance the sensitivity of recognizing subtle difference of ligand molecules and the introduces of structural flexibility that help capture the conformational changes induced by the ligand-receptor binding interactions. These data demonstrate a new avenue to improve the virtual screening efficiency through the development of sensitive ligand structural alignments. http://zhanglab.ccmb.med.umich.edu/LS-align/. njyudj@njust.edu.cn or zhng@umich.edu. Supplementary data are available at Bioinformatics online.

  20. KDE Bioscience: platform for bioinformatics analysis workflows.

    PubMed

    Lu, Qiang; Hao, Pei; Curcin, Vasa; He, Weizhong; Li, Yuan-Yuan; Luo, Qing-Ming; Guo, Yi-Ke; Li, Yi-Xue

    2006-08-01

    Bioinformatics is a dynamic research area in which a large number of algorithms and programs have been developed rapidly and independently without much consideration so far of the need for standardization. The lack of such common standards combined with unfriendly interfaces make it difficult for biologists to learn how to use these tools and to translate the data formats from one to another. Consequently, the construction of an integrative bioinformatics platform to facilitate biologists' research is an urgent and challenging task. KDE Bioscience is a java-based software platform that collects a variety of bioinformatics tools and provides a workflow mechanism to integrate them. Nucleotide and protein sequences from local flat files, web sites, and relational databases can be entered, annotated, and aligned. Several home-made or 3rd-party viewers are built-in to provide visualization of annotations or alignments. KDE Bioscience can also be deployed in client-server mode where simultaneous execution of the same workflow is supported for multiple users. Moreover, workflows can be published as web pages that can be executed from a web browser. The power of KDE Bioscience comes from the integrated algorithms and data sources. With its generic workflow mechanism other novel calculations and simulations can be integrated to augment the current sequence analysis functions. Because of this flexible and extensible architecture, KDE Bioscience makes an ideal integrated informatics environment for future bioinformatics or systems biology research.

  1. Multi-exposure high dynamic range image synthesis with camera shake correction

    NASA Astrophysics Data System (ADS)

    Li, Xudong; Chen, Yongfu; Jiang, Hongzhi; Zhao, Huijie

    2017-10-01

    Machine vision plays an important part in industrial online inspection. Owing to the nonuniform illuminance conditions and variable working distances, the captured image tends to be over-exposed or under-exposed. As a result, when processing the image such as crack inspection, the algorithm complexity and computing time increase. Multiexposure high dynamic range (HDR) image synthesis is used to improve the quality of the captured image, whose dynamic range is limited. Inevitably, camera shake will result in ghost effect, which blurs the synthesis image to some extent. However, existed exposure fusion algorithms assume that the input images are either perfectly aligned or captured in the same scene. These assumptions limit the application. At present, widely used registration based on Scale Invariant Feature Transform (SIFT) is usually time consuming. In order to rapidly obtain a high quality HDR image without ghost effect, we come up with an efficient Low Dynamic Range (LDR) images capturing approach and propose a registration method based on ORiented Brief (ORB) and histogram equalization which can eliminate the illumination differences between the LDR images. The fusion is performed after alignment. The experiment results demonstrate that the proposed method is robust to illumination changes and local geometric distortion. Comparing with other exposure fusion methods, our method is more efficient and can produce HDR images without ghost effect by registering and fusing four multi-exposure images.

  2. FASMA: a service to format and analyze sequences in multiple alignments.

    PubMed

    Costantini, Susan; Colonna, Giovanni; Facchiano, Angelo M

    2007-12-01

    Multiple sequence alignments are successfully applied in many studies for under- standing the structural and functional relations among single nucleic acids and protein sequences as well as whole families. Because of the rapid growth of sequence databases, multiple sequence alignments can often be very large and difficult to visualize and analyze. We offer a new service aimed to visualize and analyze the multiple alignments obtained with different external algorithms, with new features useful for the comparison of the aligned sequences as well as for the creation of a final image of the alignment. The service is named FASMA and is available at http://bioinformatica.isa.cnr.it/FASMA/.

  3. PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences.

    PubMed

    Mirarab, Siavash; Nguyen, Nam; Guo, Sheng; Wang, Li-San; Kim, Junhyong; Warnow, Tandy

    2015-05-01

    We introduce PASTA, a new multiple sequence alignment algorithm. PASTA uses a new technique to produce an alignment given a guide tree that enables it to be both highly scalable and very accurate. We present a study on biological and simulated data with up to 200,000 sequences, showing that PASTA produces highly accurate alignments, improving on the accuracy and scalability of the leading alignment methods (including SATé). We also show that trees estimated on PASTA alignments are highly accurate--slightly better than SATé trees, but with substantial improvements relative to other methods. Finally, PASTA is faster than SATé, highly parallelizable, and requires relatively little memory.

  4. Local alignment vectors reveal cancer cell-induced ECM fiber remodeling dynamics

    PubMed Central

    Lee, Byoungkoo; Konen, Jessica; Wilkinson, Scott; Marcus, Adam I.; Jiang, Yi

    2017-01-01

    Invasive cancer cells interact with the surrounding extracellular matrix (ECM), remodeling ECM fiber network structure by condensing, degrading, and aligning these fibers. We developed a novel local alignment vector analysis method to quantitatively measure collagen fiber alignment as a vector field using Circular Statistics. This method was applied to human non-small cell lung carcinoma (NSCLC) cell lines, embedded as spheroids in a collagen gel. Collagen remodeling was monitored using second harmonic generation imaging under normal conditions and when the LKB1-MARK1 pathway was disrupted through RNAi-based approaches. The results showed that inhibiting LKB1 or MARK1 in NSCLC increases the collagen fiber alignment and captures outward alignment vectors from the tumor spheroid, corresponding to high invasiveness of LKB1 mutant cancer cells. With time-lapse imaging of ECM micro-fiber morphology, the local alignment vector can measure the dynamic signature of invasive cancer cell activity and cell-migration-induced ECM and collagen remodeling and realigning dynamics. PMID:28045069

  5. Use artificial neural network to align biological ontologies.

    PubMed

    Huang, Jingshan; Dang, Jiangbo; Huhns, Michael N; Zheng, W Jim

    2008-09-16

    Being formal, declarative knowledge representation models, ontologies help to address the problem of imprecise terminologies in biological and biomedical research. However, ontologies constructed under the auspices of the Open Biomedical Ontologies (OBO) group have exhibited a great deal of variety, because different parties can design ontologies according to their own conceptual views of the world. It is therefore becoming critical to align ontologies from different parties. During automated/semi-automated alignment across biological ontologies, different semantic aspects, i.e., concept name, concept properties, and concept relationships, contribute in different degrees to alignment results. Therefore, a vector of weights must be assigned to these semantic aspects. It is not trivial to determine what those weights should be, and current methodologies depend a lot on human heuristics. In this paper, we take an artificial neural network approach to learn and adjust these weights, and thereby support a new ontology alignment algorithm, customized for biological ontologies, with the purpose of avoiding some disadvantages in both rule-based and learning-based aligning algorithms. This approach has been evaluated by aligning two real-world biological ontologies, whose features include huge file size, very few instances, concept names in numerical strings, and others. The promising experiment results verify our proposed hypothesis, i.e., three weights for semantic aspects learned from a subset of concepts are representative of all concepts in the same ontology. Therefore, our method represents a large leap forward towards automating biological ontology alignment.

  6. Mango: multiple alignment with N gapped oligos.

    PubMed

    Zhang, Zefeng; Lin, Hao; Li, Ming

    2008-06-01

    Multiple sequence alignment is a classical and challenging task. The problem is NP-hard. The full dynamic programming takes too much time. The progressive alignment heuristics adopted by most state-of-the-art works suffer from the "once a gap, always a gap" phenomenon. Is there a radically new way to do multiple sequence alignment? In this paper, we introduce a novel and orthogonal multiple sequence alignment method, using both multiple optimized spaced seeds and new algorithms to handle these seeds efficiently. Our new algorithm processes information of all sequences as a whole and tries to build the alignment vertically, avoiding problems caused by the popular progressive approaches. Because the optimized spaced seeds have proved significantly more sensitive than the consecutive k-mers, the new approach promises to be more accurate and reliable. To validate our new approach, we have implemented MANGO: Multiple Alignment with N Gapped Oligos. Experiments were carried out on large 16S RNA benchmarks, showing that MANGO compares favorably, in both accuracy and speed, against state-of-the-art multiple sequence alignment methods, including ClustalW 1.83, MUSCLE 3.6, MAFFT 5.861, ProbConsRNA 1.11, Dialign 2.2.1, DIALIGN-T 0.2.1, T-Coffee 4.85, POA 2.0, and Kalign 2.0. We have further demonstrated the scalability of MANGO on very large datasets of repeat elements. MANGO can be downloaded at http://www.bioinfo.org.cn/mango/ and is free for academic usage.

  7. A Novel Partial Sequence Alignment Tool for Finding Large Deletions

    PubMed Central

    Aruk, Taner; Ustek, Duran; Kursun, Olcay

    2012-01-01

    Finding large deletions in genome sequences has become increasingly more useful in bioinformatics, such as in clinical research and diagnosis. Although there are a number of publically available next generation sequencing mapping and sequence alignment programs, these software packages do not correctly align fragments containing deletions larger than one kb. We present a fast alignment software package, BinaryPartialAlign, that can be used by wet lab scientists to find long structural variations in their experiments. For BinaryPartialAlign, we make use of the Smith-Waterman (SW) algorithm with a binary-search-based approach for alignment with large gaps that we called partial alignment. BinaryPartialAlign implementation is compared with other straight-forward applications of SW. Simulation results on mtDNA fragments demonstrate the effectiveness (runtime and accuracy) of the proposed method. PMID:22566777

  8. Long Read Alignment with Parallel MapReduce Cloud Platform

    PubMed Central

    Al-Absi, Ahmed Abdulhakim; Kang, Dae-Ki

    2015-01-01

    Genomic sequence alignment is an important technique to decode genome sequences in bioinformatics. Next-Generation Sequencing technologies produce genomic data of longer reads. Cloud platforms are adopted to address the problems arising from storage and analysis of large genomic data. Existing genes sequencing tools for cloud platforms predominantly consider short read gene sequences and adopt the Hadoop MapReduce framework for computation. However, serial execution of map and reduce phases is a problem in such systems. Therefore, in this paper, we introduce Burrows-Wheeler Aligner's Smith-Waterman Alignment on Parallel MapReduce (BWASW-PMR) cloud platform for long sequence alignment. The proposed cloud platform adopts a widely accepted and accurate BWA-SW algorithm for long sequence alignment. A custom MapReduce platform is developed to overcome the drawbacks of the Hadoop framework. A parallel execution strategy of the MapReduce phases and optimization of Smith-Waterman algorithm are considered. Performance evaluation results exhibit an average speed-up of 6.7 considering BWASW-PMR compared with the state-of-the-art Bwasw-Cloud. An average reduction of 30% in the map phase makespan is reported across all experiments comparing BWASW-PMR with Bwasw-Cloud. Optimization of Smith-Waterman results in reducing the execution time by 91.8%. The experimental study proves the efficiency of BWASW-PMR for aligning long genomic sequences on cloud platforms. PMID:26839887

  9. Long Read Alignment with Parallel MapReduce Cloud Platform.

    PubMed

    Al-Absi, Ahmed Abdulhakim; Kang, Dae-Ki

    2015-01-01

    Genomic sequence alignment is an important technique to decode genome sequences in bioinformatics. Next-Generation Sequencing technologies produce genomic data of longer reads. Cloud platforms are adopted to address the problems arising from storage and analysis of large genomic data. Existing genes sequencing tools for cloud platforms predominantly consider short read gene sequences and adopt the Hadoop MapReduce framework for computation. However, serial execution of map and reduce phases is a problem in such systems. Therefore, in this paper, we introduce Burrows-Wheeler Aligner's Smith-Waterman Alignment on Parallel MapReduce (BWASW-PMR) cloud platform for long sequence alignment. The proposed cloud platform adopts a widely accepted and accurate BWA-SW algorithm for long sequence alignment. A custom MapReduce platform is developed to overcome the drawbacks of the Hadoop framework. A parallel execution strategy of the MapReduce phases and optimization of Smith-Waterman algorithm are considered. Performance evaluation results exhibit an average speed-up of 6.7 considering BWASW-PMR compared with the state-of-the-art Bwasw-Cloud. An average reduction of 30% in the map phase makespan is reported across all experiments comparing BWASW-PMR with Bwasw-Cloud. Optimization of Smith-Waterman results in reducing the execution time by 91.8%. The experimental study proves the efficiency of BWASW-PMR for aligning long genomic sequences on cloud platforms.

  10. Localization Algorithms of Underwater Wireless Sensor Networks: A Survey

    PubMed Central

    Han, Guangjie; Jiang, Jinfang; Shu, Lei; Xu, Yongjun; Wang, Feng

    2012-01-01

    In Underwater Wireless Sensor Networks (UWSNs), localization is one of most important technologies since it plays a critical role in many applications. Motivated by widespread adoption of localization, in this paper, we present a comprehensive survey of localization algorithms. First, we classify localization algorithms into three categories based on sensor nodes’ mobility: stationary localization algorithms, mobile localization algorithms and hybrid localization algorithms. Moreover, we compare the localization algorithms in detail and analyze future research directions of localization algorithms in UWSNs. PMID:22438752

  11. Computational complexity of algorithms for sequence comparison, short-read assembly and genome alignment.

    PubMed

    Baichoo, Shakuntala; Ouzounis, Christos A

    A multitude of algorithms for sequence comparison, short-read assembly and whole-genome alignment have been developed in the general context of molecular biology, to support technology development for high-throughput sequencing, numerous applications in genome biology and fundamental research on comparative genomics. The computational complexity of these algorithms has been previously reported in original research papers, yet this often neglected property has not been reviewed previously in a systematic manner and for a wider audience. We provide a review of space and time complexity of key sequence analysis algorithms and highlight their properties in a comprehensive manner, in order to identify potential opportunities for further research in algorithm or data structure optimization. The complexity aspect is poised to become pivotal as we will be facing challenges related to the continuous increase of genomic data on unprecedented scales and complexity in the foreseeable future, when robust biological simulation at the cell level and above becomes a reality. Copyright © 2017 Elsevier B.V. All rights reserved.

  12. Iterative non-sequential protein structural alignment.

    PubMed

    Salem, Saeed; Zaki, Mohammed J; Bystroff, Christopher

    2009-06-01

    Structural similarity between proteins gives us insights into their evolutionary relationships when there is low sequence similarity. In this paper, we present a novel approach called SNAP for non-sequential pair-wise structural alignment. Starting from an initial alignment, our approach iterates over a two-step process consisting of a superposition step and an alignment step, until convergence. We propose a novel greedy algorithm to construct both sequential and non-sequential alignments. The quality of SNAP alignments were assessed by comparing against the manually curated reference alignments in the challenging SISY and RIPC datasets. Moreover, when applied to a dataset of 4410 protein pairs selected from the CATH database, SNAP produced longer alignments with lower rmsd than several state-of-the-art alignment methods. Classification of folds using SNAP alignments was both highly sensitive and highly selective. The SNAP software along with the datasets are available online at http://www.cs.rpi.edu/~zaki/software/SNAP.

  13. An efficient multi-resolution GA approach to dental image alignment

    NASA Astrophysics Data System (ADS)

    Nassar, Diaa Eldin; Ogirala, Mythili; Adjeroh, Donald; Ammar, Hany

    2006-02-01

    Automating the process of postmortem identification of individuals using dental records is receiving an increased attention in forensic science, especially with the large volume of victims encountered in mass disasters. Dental radiograph alignment is a key step required for automating the dental identification process. In this paper, we address the problem of dental radiograph alignment using a Multi-Resolution Genetic Algorithm (MR-GA) approach. We use location and orientation information of edge points as features; we assume that affine transformations suffice to restore geometric discrepancies between two images of a tooth, we efficiently search the 6D space of affine parameters using GA progressively across multi-resolution image versions, and we use a Hausdorff distance measure to compute the similarity between a reference tooth and a query tooth subject to a possible alignment transform. Testing results based on 52 teeth-pair images suggest that our algorithm converges to reasonable solutions in more than 85% of the test cases, with most of the error in the remaining cases due to excessive misalignments.

  14. The ConSurf-DB: pre-calculated evolutionary conservation profiles of protein structures.

    PubMed

    Goldenberg, Ofir; Erez, Elana; Nimrod, Guy; Ben-Tal, Nir

    2009-01-01

    ConSurf-DB is a repository for evolutionary conservation analysis of the proteins of known structures in the Protein Data Bank (PDB). Sequence homologues of each of the PDB entries were collected and aligned using standard methods. The evolutionary conservation of each amino acid position in the alignment was calculated using the Rate4Site algorithm, implemented in the ConSurf web server. The algorithm takes into account the phylogenetic relations between the aligned proteins and the stochastic nature of the evolutionary process explicitly. Rate4Site assigns a conservation level for each position in the multiple sequence alignment using an empirical Bayesian inference. Visual inspection of the conservation patterns on the 3D structure often enables the identification of key residues that comprise the functionally important regions of the protein. The repository is updated with the latest PDB entries on a monthly basis and will be rebuilt annually. ConSurf-DB is available online at http://consurfdb.tau.ac.il/

  15. The ConSurf-DB: pre-calculated evolutionary conservation profiles of protein structures

    PubMed Central

    Goldenberg, Ofir; Erez, Elana; Nimrod, Guy; Ben-Tal, Nir

    2009-01-01

    ConSurf-DB is a repository for evolutionary conservation analysis of the proteins of known structures in the Protein Data Bank (PDB). Sequence homologues of each of the PDB entries were collected and aligned using standard methods. The evolutionary conservation of each amino acid position in the alignment was calculated using the Rate4Site algorithm, implemented in the ConSurf web server. The algorithm takes into account the phylogenetic relations between the aligned proteins and the stochastic nature of the evolutionary process explicitly. Rate4Site assigns a conservation level for each position in the multiple sequence alignment using an empirical Bayesian inference. Visual inspection of the conservation patterns on the 3D structure often enables the identification of key residues that comprise the functionally important regions of the protein. The repository is updated with the latest PDB entries on a monthly basis and will be rebuilt annually. ConSurf-DB is available online at http://consurfdb.tau.ac.il/ PMID:18971256

  16. cuBLASTP: Fine-Grained Parallelization of Protein Sequence Search on CPU+GPU.

    PubMed

    Zhang, Jing; Wang, Hao; Feng, Wu-Chun

    2017-01-01

    BLAST, short for Basic Local Alignment Search Tool, is a ubiquitous tool used in the life sciences for pairwise sequence search. However, with the advent of next-generation sequencing (NGS), whether at the outset or downstream from NGS, the exponential growth of sequence databases is outstripping our ability to analyze the data. While recent studies have utilized the graphics processing unit (GPU) to speedup the BLAST algorithm for searching protein sequences (i.e., BLASTP), these studies use coarse-grained parallelism, where one sequence alignment is mapped to only one thread. Such an approach does not efficiently utilize the capabilities of a GPU, particularly due to the irregularity of BLASTP in both execution paths and memory-access patterns. To address the above shortcomings, we present a fine-grained approach to parallelize BLASTP, where each individual phase of sequence search is mapped to many threads on a GPU. This approach, which we refer to as cuBLASTP, reorders data-access patterns and reduces divergent branches of the most time-consuming phases (i.e., hit detection and ungapped extension). In addition, cuBLASTP optimizes the remaining phases (i.e., gapped extension and alignment with trace back) on a multicore CPU and overlaps their execution with the phases running on the GPU.

  17. Alignments of the galaxies in and around the Virgo cluster with the local velocity shear

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lee, Jounghun; Rey, Soo Chang; Kim, Suk, E-mail: jounghun@astro.snu.ac.kr

    2014-08-10

    Observational evidence is presented for the alignment between the cosmic sheet and the principal axis of the velocity shear field at the position of the Virgo cluster. The galaxies in and around the Virgo cluster from the Extended Virgo Cluster Catalog that was recently constructed by Kim et al. are used to determine the direction of the local sheet. The peculiar velocity field reconstructed from the Sloan Digital Sky Survey Data Release 7 is analyzed to estimate the local velocity shear tensor at the Virgo center. Showing first that the minor principal axis of the local velocity shear tensor ismore » almost parallel to the direction of the line of sight, we detect a clear signal of alignment between the positions of the Virgo satellites and the intermediate principal axis of the local velocity shear projected onto the plane of the sky. Furthermore, the dwarf satellites are found to appear more strongly aligned than their normal counterparts, which is interpreted as an indication of the following. (1) The normal satellites and the dwarf satellites fall in the Virgo cluster preferentially along the local filament and the local sheet, respectively. (2) The local filament is aligned with the minor principal axis of the local velocity shear while the local sheet is parallel to the plane spanned by the minor and intermediate principal axes. Our result is consistent with the recent numerical claim that the velocity shear is a good tracer of the cosmic web.« less

  18. DNA Barcode Sequence Identification Incorporating Taxonomic Hierarchy and within Taxon Variability

    PubMed Central

    Little, Damon P.

    2011-01-01

    For DNA barcoding to succeed as a scientific endeavor an accurate and expeditious query sequence identification method is needed. Although a global multiple–sequence alignment can be generated for some barcoding markers (e.g. COI, rbcL), not all barcoding markers are as structurally conserved (e.g. matK). Thus, algorithms that depend on global multiple–sequence alignments are not universally applicable. Some sequence identification methods that use local pairwise alignments (e.g. BLAST) are unable to accurately differentiate between highly similar sequences and are not designed to cope with hierarchic phylogenetic relationships or within taxon variability. Here, I present a novel alignment–free sequence identification algorithm–BRONX–that accounts for observed within taxon variability and hierarchic relationships among taxa. BRONX identifies short variable segments and corresponding invariant flanking regions in reference sequences. These flanking regions are used to score variable regions in the query sequence without the production of a global multiple–sequence alignment. By incorporating observed within taxon variability into the scoring procedure, misidentifications arising from shared alleles/haplotypes are minimized. An explicit treatment of more inclusive terminals allows for separate identifications to be made for each taxonomic level and/or for user–defined terminals. BRONX performs better than all other methods when there is imperfect overlap between query and reference sequences (e.g. mini–barcode queries against a full–length barcode database). BRONX consistently produced better identifications at the genus–level for all query types. PMID:21857897

  19. Linking GPS and travel diary data using sequence alignment in a study of children's independent mobility

    PubMed Central

    2011-01-01

    Background Global positioning systems (GPS) are increasingly being used in health research to determine the location of study participants. Combining GPS data with data collected via travel/activity diaries allows researchers to assess where people travel in conjunction with data about trip purpose and accompaniment. However, linking GPS and diary data is problematic and to date the only method has been to match the two datasets manually, which is time consuming and unlikely to be practical for larger data sets. This paper assesses the feasibility of a new sequence alignment method of linking GPS and travel diary data in comparison with the manual matching method. Methods GPS and travel diary data obtained from a study of children's independent mobility were linked using sequence alignment algorithms to test the proof of concept. Travel diaries were assessed for quality by counting the number of errors and inconsistencies in each participant's set of diaries. The success of the sequence alignment method was compared for higher versus lower quality travel diaries, and for accompanied versus unaccompanied trips. Time taken and percentage of trips matched were compared for the sequence alignment method and the manual method. Results The sequence alignment method matched 61.9% of all trips. Higher quality travel diaries were associated with higher match rates in both the sequence alignment and manual matching methods. The sequence alignment method performed almost as well as the manual method and was an order of magnitude faster. However, the sequence alignment method was less successful at fully matching trips and at matching unaccompanied trips. Conclusions Sequence alignment is a promising method of linking GPS and travel diary data in large population datasets, especially if limitations in the trip detection algorithm are addressed. PMID:22142322

  20. Chromaligner: a web server for chromatogram alignment.

    PubMed

    Wang, San-Yuan; Ho, Tsung-Jung; Kuo, Ching-Hua; Tseng, Yufeng J

    2010-09-15

    Chromaligner is a tool for chromatogram alignment to align retention time for chromatographic methods coupled to spectrophotometers such as high performance liquid chromatography and capillary electrophoresis for metabolomics works. Chromaligner resolves peak shifts by a constrained chromatogram alignment. For a collection of chromatograms and a set of defined peaks, Chromaligner aligns the chromatograms on defined peaks using correlation warping (COW). Chromaligner is faster than the original COW algorithm by k(2) times, where k is the number of defined peaks in a chromatogram. It also provides alignments based on known component peaks to reach the best results for further chemometric analysis. Chromaligner is freely accessible at http://cmdd.csie.ntu.edu.tw/~chromaligner.

  1. PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences

    PubMed Central

    Mirarab, Siavash; Nguyen, Nam; Guo, Sheng; Wang, Li-San; Kim, Junhyong

    2015-01-01

    Abstract We introduce PASTA, a new multiple sequence alignment algorithm. PASTA uses a new technique to produce an alignment given a guide tree that enables it to be both highly scalable and very accurate. We present a study on biological and simulated data with up to 200,000 sequences, showing that PASTA produces highly accurate alignments, improving on the accuracy and scalability of the leading alignment methods (including SATé). We also show that trees estimated on PASTA alignments are highly accurate—slightly better than SATé trees, but with substantial improvements relative to other methods. Finally, PASTA is faster than SATé, highly parallelizable, and requires relatively little memory. PMID:25549288

  2. A 2D range Hausdorff approach to 3D facial recognition.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Koch, Mark William; Russ, Trina Denise; Little, Charles Quentin

    2004-11-01

    This paper presents a 3D facial recognition algorithm based on the Hausdorff distance metric. The standard 3D formulation of the Hausdorff matching algorithm has been modified to operate on a 2D range image, enabling a reduction in computation from O(N2) to O(N) without large storage requirements. The Hausdorff distance is known for its robustness to data outliers and inconsistent data between two data sets, making it a suitable choice for dealing with the inherent problems in many 3D datasets due to sensor noise and object self-occlusion. For optimal performance, the algorithm assumes a good initial alignment between probe and templatemore » datasets. However, to minimize the error between two faces, the alignment can be iteratively refined. Results from the algorithm are presented using 3D face images from the Face Recognition Grand Challenge database version 1.0.« less

  3. Quadrupole Alignment and Trajectory Correction for Future Linear Colliders: SLC Tests of a Dispersion-Free Steering Algorithm

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Assmann, R

    2004-06-08

    The feasibility of future linear colliders depends on achieving very tight alignment and steering tolerances. All proposals (NLC, JLC, CLIC, TESLA and S-BAND) currently require a total emittance growth in the main linac of less than 30-100% [1]. This should be compared with a 100% emittance growth in the much smaller SLC linac [2]. Major advances in alignment and beam steering techniques beyond those used in the SLC are necessary for the next generation of linear colliders. In this paper, we present an experimental study of quadrupole alignment with a dispersion-free steering algorithm. A closely related method (wakefield-free steering) takesmore » into account wakefield effects [3]. However, this method can not be studied at the SLC. The requirements for future linear colliders lead to new and unconventional ideas about alignment and beam steering. For example, no dipole correctors are foreseen for the standard trajectory correction in the NLC [4]; beam steering will be done by moving the quadrupole positions with magnet movers. This illustrates the close symbiosis between alignment, beam steering and beam dynamics that will emerge. It is no longer possible to consider the accelerator alignment as static with only a few surveys and realignments per year. The alignment in future linear colliders will be a dynamic process in which the whole linac, with thousands of beam-line elements, is aligned in a few hours or minutes, while the required accuracy of about 5 pm for the NLC quadrupole alignment [4] is a factor of 20 higher than in existing accelerators. The major task in alignment and steering is the accurate determination of the optimum beam-line position. Ideally one would like all elements to be aligned along a straight line. However, this is not practical. Instead a ''smooth curve'' is acceptable as long as its wavelength is much longer than the betatron wavelength of the accelerated beam. Conventional alignment methods are limited in accuracy by errors in the survey and the fiducials. Beam-based alignment methods ideally only depend upon the BPM resolution and generally provide much better precision. Many of those techniques are described in other contributions to this workshop. In this paper we describe our experiences with a dispersion-free steering algorithm for linacs. This algorithm was first suggested by Raubenheimer and Ruth in 1990 [5]. It h as been studied in simulations for NLC [5], TESLA [6], the S-BAND proposal [7] and CLIC [8]. The dispersion-free steering technique can be applied to the whole linac at once and returns the alignment (or trajectory) that minimizes the dispersive emittance growth of the beam. Thus it allows an extremely fast alignment of the beam-line. As we will show dispersion-free steering is only sensitive to quadrupole misalignments. Wakefield-free steering [3] as mentioned before is a closely related technique that minimizes the emittance growth caused by both dispersion and wakefields. Due to hardware limitations (i.e. insufficient relative range of power supplies) we could not study this method experimentally in the SLC. However, its systematics are very similar to those of dispersion-free steering. The studies of dispersion-free steering which are presented made extensive use of the unique potential of the SLC as the only operating linear collider. We used it to study the performance and problems of advanced beam-based optimization tools in a real beam-line environment and on a large scale. We should mention that the SLC has utilized beam-based alignment for years [9], using the difference of electron and positron trajectories. This method, however, cannot be used in future linear colliders. The goal of our work is to demonstrate the performance of advanced beam-based alignment techniques in linear colliders and to anticipate possible reality-related problems. Those can then be solved in the design state for the next generation of linear colliders.« less

  4. Magnetic Field Strengths and Grain Alignment Variations in the Local Bubble Wall

    NASA Astrophysics Data System (ADS)

    Medan, Ilija; Andersson, B.-G.

    2018-01-01

    Optical and infrared continuum polarization is known to be due to irregular dust grains aligned with the magnetic field. This provides an important tool to probe the geometry and strength of those fields, particularly if the variations in the grain alignment efficiencies can be understood. Here, we examine polarization variations observed throughout the Local Bubble for b>30○, using a large polarization survey of the North Galactic cap from Berdyugin et al. (2014). These data are supported by archival photometric and spectroscopic data along with the mapping of the Local Bubble by Lallement et al. (2003). We can accurately model the observational data assuming that the grain alignment variations are due to the radiation from the OB associations within 1 kpc of the sun. This strongly supports radiatively driven grain alignment. We also probe the relative strength of the magnetic field in the wall of the Local Bubble using the Davis-Chandrasekhar-Fermi method. We find evidence for a bimodal field strength distribution, where the variations in the field are correlated with the variations in grain alignment efficiency, indicating that the higher strength regions might represent a compression of the wall by the interaction of the outflow in the Local Bubble and the opposing flows by the surrounding OB associations.

  5. Development and evaluation of an articulated registration algorithm for human skeleton registration

    NASA Astrophysics Data System (ADS)

    Yip, Stephen; Perk, Timothy; Jeraj, Robert

    2014-03-01

    Accurate registration over multiple scans is necessary to assess treatment response of bone diseases (e.g. metastatic bone lesions). This study aimed to develop and evaluate an articulated registration algorithm for the whole-body skeleton registration in human patients. In articulated registration, whole-body skeletons are registered by auto-segmenting into individual bones using atlas-based segmentation, and then rigidly aligning them. Sixteen patients (weight = 80-117 kg, height = 168-191 cm) with advanced prostate cancer underwent the pre- and mid-treatment PET/CT scans over a course of cancer therapy. Skeletons were extracted from the CT images by thresholding (HU>150). Skeletons were registered using the articulated, rigid, and deformable registration algorithms to account for position and postural variability between scans. The inter-observers agreement in the atlas creation, the agreement between the manually and atlas-based segmented bones, and the registration performances of all three registration algorithms were all assessed using the Dice similarity index—DSIobserved, DSIatlas, and DSIregister. Hausdorff distance (dHausdorff) of the registered skeletons was also used for registration evaluation. Nearly negligible inter-observers variability was found in the bone atlases creation as the DSIobserver was 96 ± 2%. Atlas-based and manual segmented bones were in excellent agreement with DSIatlas of 90 ± 3%. Articulated (DSIregsiter = 75 ± 2%, dHausdorff = 0.37 ± 0.08 cm) and deformable registration algorithms (DSIregister = 77 ± 3%, dHausdorff = 0.34 ± 0.08 cm) considerably outperformed the rigid registration algorithm (DSIregsiter = 59 ± 9%, dHausdorff = 0.69 ± 0.20 cm) in the skeleton registration as the rigid registration algorithm failed to capture the skeleton flexibility in the joints. Despite superior skeleton registration performance, deformable registration algorithm failed to preserve the local rigidity of bones as over 60% of the skeletons were deformed. Articulated registration is superior to rigid and deformable registrations by capturing global flexibility while preserving local rigidity inherent in skeleton registration. Therefore, articulated registration can be employed to accurately register the whole-body human skeletons, and it enables the treatment response assessment of various bone diseases.

  6. Cache-Oblivious parallel SIMD Viterbi decoding for sequence search in HMMER.

    PubMed

    Ferreira, Miguel; Roma, Nuno; Russo, Luis M S

    2014-05-30

    HMMER is a commonly used bioinformatics tool based on Hidden Markov Models (HMMs) to analyze and process biological sequences. One of its main homology engines is based on the Viterbi decoding algorithm, which was already highly parallelized and optimized using Farrar's striped processing pattern with Intel SSE2 instruction set extension. A new SIMD vectorization of the Viterbi decoding algorithm is proposed, based on an SSE2 inter-task parallelization approach similar to the DNA alignment algorithm proposed by Rognes. Besides this alternative vectorization scheme, the proposed implementation also introduces a new partitioning of the Markov model that allows a significantly more efficient exploitation of the cache locality. Such optimization, together with an improved loading of the emission scores, allows the achievement of a constant processing throughput, regardless of the innermost-cache size and of the dimension of the considered model. The proposed optimized vectorization of the Viterbi decoding algorithm was extensively evaluated and compared with the HMMER3 decoder to process DNA and protein datasets, proving to be a rather competitive alternative implementation. Being always faster than the already highly optimized ViterbiFilter implementation of HMMER3, the proposed Cache-Oblivious Parallel SIMD Viterbi (COPS) implementation provides a constant throughput and offers a processing speedup as high as two times faster, depending on the model's size.

  7. Extracting DNA words based on the sequence features: non-uniform distribution and integrity.

    PubMed

    Li, Zhi; Cao, Hongyan; Cui, Yuehua; Zhang, Yanbo

    2016-01-25

    DNA sequence can be viewed as an unknown language with words as its functional units. Given that most sequence alignment algorithms such as the motif discovery algorithms depend on the quality of background information about sequences, it is necessary to develop an ab initio algorithm for extracting the "words" based only on the DNA sequences. We considered that non-uniform distribution and integrity were two important features of a word, based on which we developed an ab initio algorithm to extract "DNA words" that have potential functional meaning. A Kolmogorov-Smirnov test was used for consistency test of uniform distribution of DNA sequences, and the integrity was judged by the sequence and position alignment. Two random base sequences were adopted as negative control, and an English book was used as positive control to verify our algorithm. We applied our algorithm to the genomes of Saccharomyces cerevisiae and 10 strains of Escherichia coli to show the utility of the methods. The results provide strong evidences that the algorithm is a promising tool for ab initio building a DNA dictionary. Our method provides a fast way for large scale screening of important DNA elements and offers potential insights into the understanding of a genome.

  8. Measuring the visual salience of alignments by their non-accidentalness.

    PubMed

    Blusseau, S; Carboni, A; Maiche, A; Morel, J M; Grompone von Gioi, R

    2016-09-01

    Quantitative approaches are part of the understanding of contour integration and the Gestalt law of good continuation. The present study introduces a new quantitative approach based on the a contrario theory, which formalizes the non-accidentalness principle for good continuation. This model yields an ideal observer algorithm, able to detect non-accidental alignments in Gabor patterns. More precisely, this parameterless algorithm associates with each candidate percept a measure, the Number of False Alarms (NFA), quantifying its degree of masking. To evaluate the approach, we compared this ideal observer with the human attentive performance on three experiments of straight contours detection in arrays of Gabor patches. The experiments showed a strong correlation between the detectability of the target stimuli and their degree of non-accidentalness, as measured by our model. What is more, the algorithm's detection curves were very similar to the ones of human subjects. This fact seems to validate our proposed measurement method as a convenient way to predict the visibility of alignments. This framework could be generalized to other Gestalts. Copyright © 2015 Elsevier Ltd. All rights reserved.

  9. GUIDANCE2: accurate detection of unreliable alignment regions accounting for the uncertainty of multiple parameters.

    PubMed

    Sela, Itamar; Ashkenazy, Haim; Katoh, Kazutaka; Pupko, Tal

    2015-07-01

    Inference of multiple sequence alignments (MSAs) is a critical part of phylogenetic and comparative genomics studies. However, from the same set of sequences different MSAs are often inferred, depending on the methodologies used and the assumed parameters. Much effort has recently been devoted to improving the ability to identify unreliable alignment regions. Detecting such unreliable regions was previously shown to be important for downstream analyses relying on MSAs, such as the detection of positive selection. Here we developed GUIDANCE2, a new integrative methodology that accounts for: (i) uncertainty in the process of indel formation, (ii) uncertainty in the assumed guide tree and (iii) co-optimal solutions in the pairwise alignments, used as building blocks in progressive alignment algorithms. We compared GUIDANCE2 with seven methodologies to detect unreliable MSA regions using extensive simulations and empirical benchmarks. We show that GUIDANCE2 outperforms all previously developed methodologies. Furthermore, GUIDANCE2 also provides a set of alternative MSAs which can be useful for downstream analyses. The novel algorithm is implemented as a web-server, available at: http://guidance.tau.ac.il. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  10. Fluorescein thiocarbamyl amino acids as internal standards for migration time correction in capillary sieving electrophoresis

    PubMed Central

    Pugsley, Haley R.; Swearingen, Kristian E.; Dovichi, Norman J.

    2009-01-01

    A number of algorithms have been developed to correct for migration time drift in capillary electrophoresis. Those algorithms require identification of common components in each run. However, not all components may be present or resolved in separations of complex samples, which can confound attempts for alignment. This paper reports the use of fluorescein thiocarbamyl derivatives of amino acids as internal standards for alignment of 3-(2-furoyl)quinoline-2-carboxaldehyde (FQ)-labeled proteins in capillary sieving electrophoresis. The fluorescein thiocarbamyl derivative of aspartic acid migrates before FQ-labeled proteins and the fluorescein thiocarbamyl derivative of arginine migrates after the FQ-labeled proteins. These compounds were used as internal standards to correct for variations in migration time over a two-week period in the separation of a cellular homogenate. The experimental conditions were deliberately manipulated by varying electric field and sample preparation conditions. Three components of the homogenate were used to evaluate the alignment efficiency. Before alignment, the average relative standard deviation in migration time for these components was 13.3%. After alignment, the average relative standard deviation in migration time for these components was reduced to 0.5%. PMID:19249052

  11. Ligand Binding Site Detection by Local Structure Alignment and Its Performance Complementarity

    PubMed Central

    Lee, Hui Sun; Im, Wonpil

    2013-01-01

    Accurate determination of potential ligand binding sites (BS) is a key step for protein function characterization and structure-based drug design. Despite promising results of template-based BS prediction methods using global structure alignment (GSA), there is a room to improve the performance by properly incorporating local structure alignment (LSA) because BS are local structures and often similar for proteins with dissimilar global folds. We present a template-based ligand BS prediction method using G-LoSA, our LSA tool. A large benchmark set validation shows that G-LoSA predicts drug-like ligands’ positions in single-chain protein targets more precisely than TM-align, a GSA-based method, while the overall success rate of TM-align is better. G-LoSA is particularly efficient for accurate detection of local structures conserved across proteins with diverse global topologies. Recognizing the performance complementarity of G-LoSA to TM-align and a non-template geometry-based method, fpocket, a robust consensus scoring method, CMCS-BSP (Complementary Methods and Consensus Scoring for ligand Binding Site Prediction), is developed and shows improvement on prediction accuracy. The G-LoSA source code is freely available at http://im.bioinformatics.ku.edu/GLoSA. PMID:23957286

  12. Second Language Writing Classification System Based on Word-Alignment Distribution

    ERIC Educational Resources Information Center

    Kotani, Katsunori; Yoshimi, Takehiko

    2010-01-01

    The present paper introduces an automatic classification system for assisting second language (L2) writing evaluation. This system, which classifies sentences written by L2 learners as either native speaker-like or learner-like sentences, is constructed by machine learning algorithms using word-alignment distributions as classification features…

  13. Approximate static condensation algorithm for solving multi-material diffusion problems on meshes non-aligned with material interfaces

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kikinzon, Evgeny; Kuznetsov, Yuri; Lipnikov, Konstatin

    In this study, we describe a new algorithm for solving multi-material diffusion problem when material interfaces are not aligned with the mesh. In this case interface reconstruction methods are used to construct approximate representation of interfaces between materials. They produce so-called multi-material cells, in which materials are represented by material polygons that contain only one material. The reconstructed interface is not continuous between cells. Finally, we suggest the new method for solving multi-material diffusion problems on such meshes and compare its performance with known homogenization methods.

  14. Active angular alignment of gauge blocks in double-ended interferometers.

    PubMed

    Buchta, Zdeněk; Reřucha, Simon; Hucl, Václav; Cížek, Martin; Sarbort, Martin; Lazar, Josef; Cíp, Ondřej

    2013-09-27

    This paper presents a method implemented in a system for automatic contactless calibration of gauge blocks designed at ISI ASCR. The system combines low-coherence interferometry and laser interferometry, where the first identifies the gauge block sides position and the second one measures the gauge block length itself. A crucial part of the system is the algorithm for gauge block alignment to the measuring beam which is able to compensate the gauge block lateral and longitudinal tilt up to 0.141 mrad. The algorithm is also important for the gauge block position monitoring during its length measurement.

  15. Active Angular Alignment of Gauge Blocks in Double-Ended Interferometers

    PubMed Central

    Buchta, Zdeněk; Řeřucha, Šimon; Hucl, Václav; Čížek, Martin; Šarbort, Martin; Lazar, Josef; Číp, Ondřej

    2013-01-01

    This paper presents a method implemented in a system for automatic contactless calibration of gauge blocks designed at ISI ASCR. The system combines low-coherence interferometry and laser interferometry, where the first identifies the gauge block sides position and the second one measures the gauge block length itself. A crucial part of the system is the algorithm for gauge block alignment to the measuring beam which is able to compensate the gauge block lateral and longitudinal tilt up to 0.141 mrad. The algorithm is also important for the gauge block position monitoring during its length measurement. PMID:24084107

  16. Approximate static condensation algorithm for solving multi-material diffusion problems on meshes non-aligned with material interfaces

    DOE PAGES

    Kikinzon, Evgeny; Kuznetsov, Yuri; Lipnikov, Konstatin; ...

    2017-07-08

    In this study, we describe a new algorithm for solving multi-material diffusion problem when material interfaces are not aligned with the mesh. In this case interface reconstruction methods are used to construct approximate representation of interfaces between materials. They produce so-called multi-material cells, in which materials are represented by material polygons that contain only one material. The reconstructed interface is not continuous between cells. Finally, we suggest the new method for solving multi-material diffusion problems on such meshes and compare its performance with known homogenization methods.

  17. Eigenvector synchronization, graph rigidity and the molecule problemR

    PubMed Central

    Cucuringu, Mihai; Singer, Amit; Cowburn, David

    2013-01-01

    The graph realization problem has received a great deal of attention in recent years, due to its importance in applications such as wireless sensor networks and structural biology. In this paper, we extend the previous work and propose the 3D-As-Synchronized-As-Possible (3D-ASAP) algorithm, for the graph realization problem in ℝ3, given a sparse and noisy set of distance measurements. 3D-ASAP is a divide and conquer, non-incremental and non-iterative algorithm, which integrates local distance information into a global structure determination. Our approach starts with identifying, for every node, a subgraph of its 1-hop neighborhood graph, which can be accurately embedded in its own coordinate system. In the noise-free case, the computed coordinates of the sensors in each patch must agree with their global positioning up to some unknown rigid motion, that is, up to translation, rotation and possibly reflection. In other words, to every patch, there corresponds an element of the Euclidean group, Euc(3), of rigid transformations in ℝ3, and the goal was to estimate the group elements that will properly align all the patches in a globally consistent way. Furthermore, 3D-ASAP successfully incorporates information specific to the molecule problem in structural biology, in particular information on known substructures and their orientation. In addition, we also propose 3D-spectral-partitioning (SP)-ASAP, a faster version of 3D-ASAP, which uses a spectral partitioning algorithm as a pre-processing step for dividing the initial graph into smaller subgraphs. Our extensive numerical simulations show that 3D-ASAP and 3D-SP-ASAP are very robust to high levels of noise in the measured distances and to sparse connectivity in the measurement graph, and compare favorably with similar state-of-the-art localization algorithms. PMID:24432187

  18. Electron tomography simulator with realistic 3D phantom for evaluation of acquisition, alignment and reconstruction methods.

    PubMed

    Wan, Xiaohua; Katchalski, Tsvi; Churas, Christopher; Ghosh, Sreya; Phan, Sebastien; Lawrence, Albert; Hao, Yu; Zhou, Ziying; Chen, Ruijuan; Chen, Yu; Zhang, Fa; Ellisman, Mark H

    2017-05-01

    Because of the significance of electron microscope tomography in the investigation of biological structure at nanometer scales, ongoing improvement efforts have been continuous over recent years. This is particularly true in the case of software developments. Nevertheless, verification of improvements delivered by new algorithms and software remains difficult. Current analysis tools do not provide adaptable and consistent methods for quality assessment. This is particularly true with images of biological samples, due to image complexity, variability, low contrast and noise. We report an electron tomography (ET) simulator with accurate ray optics modeling of image formation that includes curvilinear trajectories through the sample, warping of the sample and noise. As a demonstration of the utility of our approach, we have concentrated on providing verification of the class of reconstruction methods applicable to wide field images of stained plastic-embedded samples. Accordingly, we have also constructed digital phantoms derived from serial block face scanning electron microscope images. These phantoms are also easily modified to include alignment features to test alignment algorithms. The combination of more realistic phantoms with more faithful simulations facilitates objective comparison of acquisition parameters, alignment and reconstruction algorithms and their range of applicability. With proper phantoms, this approach can also be modified to include more complex optical models, including distance-dependent blurring and phase contrast functions, such as may occur in cryotomography. Copyright © 2017 Elsevier Inc. All rights reserved.

  19. A Hough transform global probabilistic approach to multiple-subject diffusion MRI tractography.

    PubMed

    Aganj, Iman; Lenglet, Christophe; Jahanshad, Neda; Yacoub, Essa; Harel, Noam; Thompson, Paul M; Sapiro, Guillermo

    2011-08-01

    A global probabilistic fiber tracking approach based on the voting process provided by the Hough transform is introduced in this work. The proposed framework tests candidate 3D curves in the volume, assigning to each one a score computed from the diffusion images, and then selects the curves with the highest scores as the potential anatomical connections. The algorithm avoids local minima by performing an exhaustive search at the desired resolution. The technique is easily extended to multiple subjects, considering a single representative volume where the registered high-angular resolution diffusion images (HARDI) from all the subjects are non-linearly combined, thereby obtaining population-representative tracts. The tractography algorithm is run only once for the multiple subjects, and no tract alignment is necessary. We present experimental results on HARDI volumes, ranging from simulated and 1.5T physical phantoms to 7T and 4T human brain and 7T monkey brain datasets. Copyright © 2011 Elsevier B.V. All rights reserved.

  20. Development and application of an algorithm to compute weighted multiple glycan alignments.

    PubMed

    Hosoda, Masae; Akune, Yukie; Aoki-Kinoshita, Kiyoko F

    2017-05-01

    A glycan consists of monosaccharides linked by glycosidic bonds, has branches and forms complex molecular structures. Databases have been developed to store large amounts of glycan-binding experiments, including glycan arrays with glycan-binding proteins. However, there are few bioinformatics techniques to analyze large amounts of data for glycans because there are few tools that can handle the complexity of glycan structures. Thus, we have developed the MCAW (Multiple Carbohydrate Alignment with Weights) tool that can align multiple glycan structures, to aid in the understanding of their function as binding recognition molecules. We have described in detail the first algorithm to perform multiple glycan alignments by modeling glycans as trees. To test our tool, we prepared several data sets, and as a result, we found that the glycan motif could be successfully aligned without any prior knowledge applied to the tool, and the known recognition binding sites of glycans could be aligned at a high rate amongst all our datasets tested. We thus claim that our tool is able to find meaningful glycan recognition and binding patterns using data obtained by glycan-binding experiments. The development and availability of an effective multiple glycan alignment tool opens possibilities for many other glycoinformatics analysis, making this work a big step towards furthering glycomics analysis. http://www.rings.t.soka.ac.jp. kkiyoko@soka.ac.jp. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.

  1. Surveying alignment-free features for Ortholog detection in related yeast proteomes by using supervised big data classifiers.

    PubMed

    Galpert, Deborah; Fernández, Alberto; Herrera, Francisco; Antunes, Agostinho; Molina-Ruiz, Reinaldo; Agüero-Chapin, Guillermin

    2018-05-03

    The development of new ortholog detection algorithms and the improvement of existing ones are of major importance in functional genomics. We have previously introduced a successful supervised pairwise ortholog classification approach implemented in a big data platform that considered several pairwise protein features and the low ortholog pair ratios found between two annotated proteomes (Galpert, D et al., BioMed Research International, 2015). The supervised models were built and tested using a Saccharomycete yeast benchmark dataset proposed by Salichos and Rokas (2011). Despite several pairwise protein features being combined in a supervised big data approach; they all, to some extent were alignment-based features and the proposed algorithms were evaluated on a unique test set. Here, we aim to evaluate the impact of alignment-free features on the performance of supervised models implemented in the Spark big data platform for pairwise ortholog detection in several related yeast proteomes. The Spark Random Forest and Decision Trees with oversampling and undersampling techniques, and built with only alignment-based similarity measures or combined with several alignment-free pairwise protein features showed the highest classification performance for ortholog detection in three yeast proteome pairs. Although such supervised approaches outperformed traditional methods, there were no significant differences between the exclusive use of alignment-based similarity measures and their combination with alignment-free features, even within the twilight zone of the studied proteomes. Just when alignment-based and alignment-free features were combined in Spark Decision Trees with imbalance management, a higher success rate (98.71%) within the twilight zone could be achieved for a yeast proteome pair that underwent a whole genome duplication. The feature selection study showed that alignment-based features were top-ranked for the best classifiers while the runners-up were alignment-free features related to amino acid composition. The incorporation of alignment-free features in supervised big data models did not significantly improve ortholog detection in yeast proteomes regarding the classification qualities achieved with just alignment-based similarity measures. However, the similarity of their classification performance to that of traditional ortholog detection methods encourages the evaluation of other alignment-free protein pair descriptors in future research.

  2. Integration of retinal image sequences

    NASA Astrophysics Data System (ADS)

    Ballerini, Lucia

    1998-10-01

    In this paper a method for noise reduction in ocular fundus image sequences is described. The eye is the only part of the human body where the capillary network can be observed along with the arterial and venous circulation using a non invasive technique. The study of the retinal vessels is very important both for the study of the local pathology (retinal disease) and for the large amount of information it offers on systematic haemodynamics, such as hypertension, arteriosclerosis, and diabetes. In this paper a method for image integration of ocular fundus image sequences is described. The procedure can be divided in two step: registration and fusion. First we describe an automatic alignment algorithm for registration of ocular fundus images. In order to enhance vessel structures, we used a spatially oriented bank of filters designed to match the properties of the objects of interest. To evaluate interframe misalignment we adopted a fast cross-correlation algorithm. The performances of the alignment method have been estimated by simulating shifts between image pairs and by using a cross-validation approach. Then we propose a temporal integration technique of image sequences so as to compute enhanced pictures of the overall capillary network. Image registration is combined with image enhancement by fusing subsequent frames of a same region. To evaluate the attainable results, the signal-to-noise ratio was estimated before and after integration. Experimental results on synthetic images of vessel-like structures with different kind of Gaussian additive noise as well as on real fundus images are reported.

  3. A novel standardized algorithm using SPECT/CT evaluating unhappy patients after unicondylar knee arthroplasty--a combined analysis of tracer uptake distribution and component position.

    PubMed

    Suter, Basil; Testa, Enrique; Stämpfli, Patrick; Konala, Praveen; Rasch, Helmut; Friederich, Niklaus F; Hirschmann, Michael T

    2015-03-20

    The introduction of a standardized SPECT/CT algorithm including a localization scheme, which allows accurate identification of specific patterns and thresholds of SPECT/CT tracer uptake, could lead to a better understanding of the bone remodeling and specific failure modes of unicondylar knee arthroplasty (UKA). The purpose of the present study was to introduce a novel standardized SPECT/CT algorithm for patients after UKA and evaluate its clinical applicability, usefulness and inter- and intra-observer reliability. Tc-HDP-SPECT/CT images of consecutive patients (median age 65, range 48-84 years) with 21 knees after UKA were prospectively evaluated. The tracer activity on SPECT/CT was localized using a specific standardized UKA localization scheme. For tracer uptake analysis (intensity and anatomical distribution pattern) a 3D volumetric quantification method was used. The maximum intensity values were recorded for each anatomical area. In addition, ratios between the respective value in the measured area and the background tracer activity were calculated. The femoral and tibial component position (varus-valgus, flexion-extension, internal and external rotation) was determined in 3D-CT. The inter- and intraobserver reliability of the localization scheme, grading of the tracer activity and component measurements were determined by calculating the intraclass correlation coefficients (ICC). The localization scheme, grading of the tracer activity and component measurements showed high inter- and intra-observer reliabilities for all regions (tibia, femur and patella). For measurement of component position there was strong agreement between the readings of the two observers; the ICC for the orientation of the femoral component was 0.73-1.00 (intra-observer reliability) and 0.91-1.00 (inter-observer reliability). The ICC for the orientation of the tibial component was 0.75-1.00 (intra-observer reliability) and 0.77-1.00 (inter-observer reliability). The SPECT/CT algorithm presented combining the mechanical information on UKA component position, alignment and metabolic data is highly reliable and proved to be a valuable, consistent and useful tool for analysing postoperative knees after UKA. Using this standardized approach in clinical studies might be helpful in establishing the diagnosis in patients with pain after UKA.

  4. Survey of local and global biological network alignment: the need to reconcile the two sides of the same coin.

    PubMed

    Guzzi, Pietro Hiram; Milenkovic, Tijana

    2018-05-01

    Analogous to genomic sequence alignment that allows for across-species transfer of biological knowledge between conserved sequence regions, biological network alignment can be used to guide the knowledge transfer between conserved regions of molecular networks of different species. Hence, biological network alignment can be used to redefine the traditional notion of a sequence-based homology to a new notion of network-based homology. Analogous to genomic sequence alignment, there exist local and global biological network alignments. Here, we survey prominent and recent computational approaches of each network alignment type and discuss their (dis)advantages. Then, as it was recently shown that the two approach types are complementary, in the sense that they capture different slices of cellular functioning, we discuss the need to reconcile the two network alignment types and present a recent first step in this direction. We conclude with some open research problems on this topic and comment on the usefulness of network alignment in other domains besides computational biology.

  5. Spatio-temporal alignment of multiple sensors

    NASA Astrophysics Data System (ADS)

    Zhang, Tinghua; Ni, Guoqiang; Fan, Guihua; Sun, Huayan; Yang, Biao

    2018-01-01

    Aiming to achieve the spatio-temporal alignment of multi sensor on the same platform for space target observation, a joint spatio-temporal alignment method is proposed. To calibrate the parameters and measure the attitude of cameras, an astronomical calibration method is proposed based on star chart simulation and collinear invariant features of quadrilateral diagonal between the observed star chart. In order to satisfy a temporal correspondence and spatial alignment similarity simultaneously, the method based on the astronomical calibration and attitude measurement in this paper formulates the video alignment to fold the spatial and temporal alignment into a joint alignment framework. The advantage of this method is reinforced by exploiting the similarities and prior knowledge of velocity vector field between adjacent frames, which is calculated by the SIFT Flow algorithm. The proposed method provides the highest spatio-temporal alignment accuracy compared to the state-of-the-art methods on sequences recorded from multi sensor at different times.

  6. Searching Remote Homology with Spectral Clustering with Symmetry in Neighborhood Cluster Kernels

    PubMed Central

    Maulik, Ujjwal; Sarkar, Anasua

    2013-01-01

    Remote homology detection among proteins utilizing only the unlabelled sequences is a central problem in comparative genomics. The existing cluster kernel methods based on neighborhoods and profiles and the Markov clustering algorithms are currently the most popular methods for protein family recognition. The deviation from random walks with inflation or dependency on hard threshold in similarity measure in those methods requires an enhancement for homology detection among multi-domain proteins. We propose to combine spectral clustering with neighborhood kernels in Markov similarity for enhancing sensitivity in detecting homology independent of “recent” paralogs. The spectral clustering approach with new combined local alignment kernels more effectively exploits the unsupervised protein sequences globally reducing inter-cluster walks. When combined with the corrections based on modified symmetry based proximity norm deemphasizing outliers, the technique proposed in this article outperforms other state-of-the-art cluster kernels among all twelve implemented kernels. The comparison with the state-of-the-art string and mismatch kernels also show the superior performance scores provided by the proposed kernels. Similar performance improvement also is found over an existing large dataset. Therefore the proposed spectral clustering framework over combined local alignment kernels with modified symmetry based correction achieves superior performance for unsupervised remote homolog detection even in multi-domain and promiscuous domain proteins from Genolevures database families with better biological relevance. Source code available upon request. Contact: sarkar@labri.fr. PMID:23457439

  7. Searching remote homology with spectral clustering with symmetry in neighborhood cluster kernels.

    PubMed

    Maulik, Ujjwal; Sarkar, Anasua

    2013-01-01

    Remote homology detection among proteins utilizing only the unlabelled sequences is a central problem in comparative genomics. The existing cluster kernel methods based on neighborhoods and profiles and the Markov clustering algorithms are currently the most popular methods for protein family recognition. The deviation from random walks with inflation or dependency on hard threshold in similarity measure in those methods requires an enhancement for homology detection among multi-domain proteins. We propose to combine spectral clustering with neighborhood kernels in Markov similarity for enhancing sensitivity in detecting homology independent of "recent" paralogs. The spectral clustering approach with new combined local alignment kernels more effectively exploits the unsupervised protein sequences globally reducing inter-cluster walks. When combined with the corrections based on modified symmetry based proximity norm deemphasizing outliers, the technique proposed in this article outperforms other state-of-the-art cluster kernels among all twelve implemented kernels. The comparison with the state-of-the-art string and mismatch kernels also show the superior performance scores provided by the proposed kernels. Similar performance improvement also is found over an existing large dataset. Therefore the proposed spectral clustering framework over combined local alignment kernels with modified symmetry based correction achieves superior performance for unsupervised remote homolog detection even in multi-domain and promiscuous domain proteins from Genolevures database families with better biological relevance. Source code available upon request. sarkar@labri.fr.

  8. A Rapid Convergent Low Complexity Interference Alignment Algorithm for Wireless Sensor Networks.

    PubMed

    Jiang, Lihui; Wu, Zhilu; Ren, Guanghui; Wang, Gangyi; Zhao, Nan

    2015-07-29

    Interference alignment (IA) is a novel technique that can effectively eliminate the interference and approach the sum capacity of wireless sensor networks (WSNs) when the signal-to-noise ratio (SNR) is high, by casting the desired signal and interference into different signal subspaces. The traditional alternating minimization interference leakage (AMIL) algorithm for IA shows good performance in high SNR regimes, however, the complexity of the AMIL algorithm increases dramatically as the number of users and antennas increases, posing limits to its applications in the practical systems. In this paper, a novel IA algorithm, called directional quartic optimal (DQO) algorithm, is proposed to minimize the interference leakage with rapid convergence and low complexity. The properties of the AMIL algorithm are investigated, and it is discovered that the difference between the two consecutive iteration results of the AMIL algorithm will approximately point to the convergence solution when the precoding and decoding matrices obtained from the intermediate iterations are sufficiently close to their convergence values. Based on this important property, the proposed DQO algorithm employs the line search procedure so that it can converge to the destination directly. In addition, the optimal step size can be determined analytically by optimizing a quartic function. Numerical results show that the proposed DQO algorithm can suppress the interference leakage more rapidly than the traditional AMIL algorithm, and can achieve the same level of sum rate as that of AMIL algorithm with far less iterations and execution time.

  9. A flexible motif search technique based on generalized profiles.

    PubMed

    Bucher, P; Karplus, K; Moeri, N; Hofmann, K

    1996-03-01

    A flexible motif search technique is presented which has two major components: (1) a generalized profile syntax serving as a motif definition language; and (2) a motif search method specifically adapted to the problem of finding multiple instances of a motif in the same sequence. The new profile structure, which is the core of the generalized profile syntax, combines the functions of a variety of motif descriptors implemented in other methods, including regular expression-like patterns, weight matrices, previously used profiles, and certain types of hidden Markov models (HMMs). The relationship between generalized profiles and other biomolecular motif descriptors is analyzed in detail, with special attention to HMMs. Generalized profiles are shown to be equivalent to a particular class of HMMs, and conversion procedures in both directions are given. The conversion procedures provide an interpretation for local alignment in the framework of stochastic models, allowing for clear, simple significance tests. A mathematical statement of the motif search problem defines the new method exactly without linking it to a specific algorithmic solution. Part of the definition includes a new definition of disjointness of alignments.

  10. biobambam: tools for read pair collation based algorithms on BAM files

    PubMed Central

    2014-01-01

    Background Sequence alignment data is often ordered by coordinate (id of the reference sequence plus position on the sequence where the fragment was mapped) when stored in BAM files, as this simplifies the extraction of variants between the mapped data and the reference or of variants within the mapped data. In this order paired reads are usually separated in the file, which complicates some other applications like duplicate marking or conversion to the FastQ format which require to access the full information of the pairs. Results In this paper we introduce biobambam, a set of tools based on the efficient collation of alignments in BAM files by read name. The employed collation algorithm avoids time and space consuming sorting of alignments by read name where this is possible without using more than a specified amount of main memory. Using this algorithm tasks like duplicate marking in BAM files and conversion of BAM files to the FastQ format can be performed very efficiently with limited resources. We also make the collation algorithm available in the form of an API for other projects. This API is part of the libmaus package. Conclusions In comparison with previous approaches to problems involving the collation of alignments by read name like the BAM to FastQ or duplication marking utilities our approach can often perform an equivalent task more efficiently in terms of the required main memory and run-time. Our BAM to FastQ conversion is faster than all widely known alternatives including Picard and bamUtil. Our duplicate marking is about as fast as the closest competitor bamUtil for small data sets and faster than all known alternatives on large and complex data sets.

  11. Flocking algorithm for autonomous flying robots.

    PubMed

    Virágh, Csaba; Vásárhelyi, Gábor; Tarcai, Norbert; Szörényi, Tamás; Somorjai, Gergő; Nepusz, Tamás; Vicsek, Tamás

    2014-06-01

    Animal swarms displaying a variety of typical flocking patterns would not exist without the underlying safe, optimal and stable dynamics of the individuals. The emergence of these universal patterns can be efficiently reconstructed with agent-based models. If we want to reproduce these patterns with artificial systems, such as autonomous aerial robots, agent-based models can also be used in their control algorithms. However, finding the proper algorithms and thus understanding the essential characteristics of the emergent collective behaviour requires thorough and realistic modeling of the robot and also the environment. In this paper, we first present an abstract mathematical model of an autonomous flying robot. The model takes into account several realistic features, such as time delay and locality of communication, inaccuracy of the on-board sensors and inertial effects. We present two decentralized control algorithms. One is based on a simple self-propelled flocking model of animal collective motion, the other is a collective target tracking algorithm. Both algorithms contain a viscous friction-like term, which aligns the velocities of neighbouring agents parallel to each other. We show that this term can be essential for reducing the inherent instabilities of such a noisy and delayed realistic system. We discuss simulation results on the stability of the control algorithms, and perform real experiments to show the applicability of the algorithms on a group of autonomous quadcopters. In our case, bio-inspiration works in two ways. On the one hand, the whole idea of trying to build and control a swarm of robots comes from the observation that birds tend to flock to optimize their behaviour as a group. On the other hand, by using a realistic simulation framework and studying the group behaviour of autonomous robots we can learn about the major factors influencing the flight of bird flocks.

  12. Lord-Wingersky Algorithm Version 2.0 for Hierarchical Item Factor Models with Applications in Test Scoring, Scale Alignment, and Model Fit Testing. CRESST Report 830

    ERIC Educational Resources Information Center

    Cai, Li

    2013-01-01

    Lord and Wingersky's (1984) recursive algorithm for creating summed score based likelihoods and posteriors has a proven track record in unidimensional item response theory (IRT) applications. Extending the recursive algorithm to handle multidimensionality is relatively simple, especially with fixed quadrature because the recursions can be defined…

  13. Desktop aligner for fabrication of multilayer microfluidic devices.

    PubMed

    Li, Xiang; Yu, Zeta Tak For; Geraldo, Dalton; Weng, Shinuo; Alve, Nitesh; Dun, Wu; Kini, Akshay; Patel, Karan; Shu, Roberto; Zhang, Feng; Li, Gang; Jin, Qinghui; Fu, Jianping

    2015-07-01

    Multilayer assembly is a commonly used technique to construct multilayer polydimethylsiloxane (PDMS)-based microfluidic devices with complex 3D architecture and connectivity for large-scale microfluidic integration. Accurate alignment of structure features on different PDMS layers before their permanent bonding is critical in determining the yield and quality of assembled multilayer microfluidic devices. Herein, we report a custom-built desktop aligner capable of both local and global alignments of PDMS layers covering a broad size range. Two digital microscopes were incorporated into the aligner design to allow accurate global alignment of PDMS structures up to 4 in. in diameter. Both local and global alignment accuracies of the desktop aligner were determined to be about 20 μm cm(-1). To demonstrate its utility for fabrication of integrated multilayer PDMS microfluidic devices, we applied the desktop aligner to achieve accurate alignment of different functional PDMS layers in multilayer microfluidics including an organs-on-chips device as well as a microfluidic device integrated with vertical passages connecting channels located in different PDMS layers. Owing to its convenient operation, high accuracy, low cost, light weight, and portability, the desktop aligner is useful for microfluidic researchers to achieve rapid and accurate alignment for generating multilayer PDMS microfluidic devices.

  14. Desktop aligner for fabrication of multilayer microfluidic devices

    PubMed Central

    Li, Xiang; Yu, Zeta Tak For; Geraldo, Dalton; Weng, Shinuo; Alve, Nitesh; Dun, Wu; Kini, Akshay; Patel, Karan; Shu, Roberto; Zhang, Feng; Li, Gang; Jin, Qinghui; Fu, Jianping

    2015-01-01

    Multilayer assembly is a commonly used technique to construct multilayer polydimethylsiloxane (PDMS)-based microfluidic devices with complex 3D architecture and connectivity for large-scale microfluidic integration. Accurate alignment of structure features on different PDMS layers before their permanent bonding is critical in determining the yield and quality of assembled multilayer microfluidic devices. Herein, we report a custom-built desktop aligner capable of both local and global alignments of PDMS layers covering a broad size range. Two digital microscopes were incorporated into the aligner design to allow accurate global alignment of PDMS structures up to 4 in. in diameter. Both local and global alignment accuracies of the desktop aligner were determined to be about 20 μm cm−1. To demonstrate its utility for fabrication of integrated multilayer PDMS microfluidic devices, we applied the desktop aligner to achieve accurate alignment of different functional PDMS layers in multilayer microfluidics including an organs-on-chips device as well as a microfluidic device integrated with vertical passages connecting channels located in different PDMS layers. Owing to its convenient operation, high accuracy, low cost, light weight, and portability, the desktop aligner is useful for microfluidic researchers to achieve rapid and accurate alignment for generating multilayer PDMS microfluidic devices. PMID:26233409

  15. SEAN: SNP prediction and display program utilizing EST sequence clusters.

    PubMed

    Huntley, Derek; Baldo, Angela; Johri, Saurabh; Sergot, Marek

    2006-02-15

    SEAN is an application that predicts single nucleotide polymorphisms (SNPs) using multiple sequence alignments produced from expressed sequence tag (EST) clusters. The algorithm uses rules of sequence identity and SNP abundance to determine the quality of the prediction. A Java viewer is provided to display the EST alignments and predicted SNPs.

  16. Optical derotator alignment using image-processing algorithm for tracking laser vibrometer measurements of rotating objects.

    PubMed

    Khalil, Hossam; Kim, Dongkyu; Jo, Youngjoon; Park, Kyihwan

    2017-06-01

    An optical component called a Dove prism is used to rotate the laser beam of a laser-scanning vibrometer (LSV). This is called a derotator and is used for measuring the vibration of rotating objects. The main advantage of a derotator is that it works independently from an LSV. However, this device requires very specific alignment, in which the axis of the Dove prism must coincide with the rotational axis of the object. If the derotator is misaligned with the rotating object, the results of the vibration measurement are imprecise, owing to the alteration of the laser beam on the surface of the rotating object. In this study, a method is proposed for aligning a derotator with a rotating object through an image-processing algorithm that obtains the trajectory of a landmark attached to the object. After the trajectory of the landmark is mathematically modeled, the amount of derotator misalignment with respect to the object is calculated. The accuracy of the proposed method for aligning the derotator with the rotating object is experimentally tested.

  17. KISS for STRAP: user extensions for a protein alignment editor.

    PubMed

    Gille, Christoph; Lorenzen, Stephan; Michalsky, Elke; Frömmel, Cornelius

    2003-12-12

    The Structural Alignment Program STRAP is a comfortable comprehensive editor and analyzing tool for protein alignments. A wide range of functions related to protein sequences and protein structures are accessible with an intuitive graphical interface. Recent features include mapping of mutations and polymorphisms onto structures and production of high quality figures for publication. Here we address the general problem of multi-purpose program packages to keep up with the rapid development of bioinformatical methods and the demand for specific program functions. STRAP was remade implementing a novel design which aims at Keeping Interfaces in STRAP Simple (KISS). KISS renders STRAP extendable to bio-scientists as well as to bio-informaticians. Scientists with basic computer skills are capable of implementing statistical methods or embedding existing bioinformatical tools in STRAP themselves. For bio-informaticians STRAP may serve as an environment for rapid prototyping and testing of complex algorithms such as automatic alignment algorithms or phylogenetic methods. Further, STRAP can be applied as an interactive web applet to present data related to a particular protein family and as a teaching tool. JAVA-1.4 or higher. http://www.charite.de/bioinf/strap/

  18. SU-F-P-30: Clinical Assessment of Auto Beam-Hold Triggered by Fiducial Localization During Prostate RapidArc Delivery

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Atkinson, P; Chen, Q

    2016-06-15

    Purpose: To assess the clinical efficacy of auto beam hold during prostate RapidArc delivery, triggered by fiducial localization on kV imaging with a Varian True Beam. Methods: Prostate patients with four gold fiducials were candidates in this study. Daily setup was accomplished by aligning to fiducials using orthogonal kV imaging. During RapidArc delivery, a kV image was automatically acquired with a momentary beam hold every 60 degrees of gantry rotation. The position of each fiducial was identified by a search algorithm and compared to a predetermined 1.4 cm diameter target area. Treatment continued if all the fiducials were within themore » target area. If any fiducial was outside the target area the beam hold was not released, and the operators determined if the patient needed re-alignment using the daily setup method. Results: Four patients were initially selected. For three patients, the auto beam hold performed seamlessly. In one instance, the system correctly identified misaligned fiducials, stopped treatment, and the patient was re-positioned. The fourth patient had a prosthetic hip which sometimes blocked the fiducials and caused the fiducial search algorithm to fail. The auto beam hold was disabled for this patient and the therapists manually monitored the fiducial positions during treatment. Average delivery time for a 2-arc fraction was increased by 59 seconds. Phantom studies indicated the dose discrepancy related to multiple beam holds is <0.1%. For a plan with 43 fractions, the additional imaging increased dose by an estimated 68 cGy. Conclusion: Automated intrafraction kV imaging can effectively perform auto beam holds due to patient movement, with the exception of prosthetic hip patients. The additional imaging dose and delivery time are clinically acceptable. It may be a cost-effective alternative to Calypso in RapidArc prostate patient delivery. Further study is warranted to explore its feasibility under various clinical conditions.« less

  19. Darwin v. 2.0: an interpreted computer language for the biosciences.

    PubMed

    Gonnet, G H; Hallett, M T; Korostensky, C; Bernardin, L

    2000-02-01

    We announce the availability of the second release of Darwin v. 2.0, an interpreted computer language especially tailored to researchers in the biosciences. The system is a general tool applicable to a wide range of problems. This second release improves Darwin version 1.6 in several ways: it now contains (1) a larger set of libraries touching most of the classical problems from computational biology (pairwise alignment, all versus all alignments, tree construction, multiple sequence alignment), (2) an expanded set of general purpose algorithms (search algorithms for discrete problems, matrix decomposition routines, complex/long integer arithmetic operations), (3) an improved language with a cleaner syntax, (4) better on-line help, and (5) a number of fixes to user-reported bugs. Darwin is made available for most operating systems free of char ge from the Computational Biochemistry Research Group (CBRG), reachable at http://chrg.inf.ethz.ch. darwin@inf.ethz.ch

  20. Theory of short-scale field-aligned density striations due to ionospheric heating

    NASA Technical Reports Server (NTRS)

    Lee, M.-C.; Fejer, J. A.

    1978-01-01

    The theoretical saturation spectrum of parametrically excited Langmuir waves in a locally uniform ionosphere is shown by the present calculations to produce, by ohmic dissipation, short-scale field-aligned density striations. The spectrum of the calculated striations is not inconsistent with observations of field-aligned scatter of VHF and UHF waves in ionospheric modification experiments if local increases of the pump field due to focusing are invoked.

  1. Automation of the targeting and reflective alignment concept

    NASA Technical Reports Server (NTRS)

    Redfield, Robin C.

    1992-01-01

    The automated alignment system, described herein, employs a reflective, passive (requiring no power) target and includes a PC-based imaging system and one camera mounted on a six degree of freedom robot manipulator. The system detects and corrects for manipulator misalignment in three translational and three rotational directions by employing the Targeting and Reflective Alignment Concept (TRAC), which simplifies alignment by decoupling translational and rotational alignment control. The concept uses information on the camera and the target's relative position based on video feedback from the camera. These relative positions are converted into alignment errors and minimized by motions of the robot. The system is robust to exogenous lighting by virtue of a subtraction algorithm which enables the camera to only see the target. These capabilities are realized with relatively minimal complexity and expense.

  2. A statistical physics perspective on alignment-independent protein sequence comparison.

    PubMed

    Chattopadhyay, Amit K; Nasiev, Diar; Flower, Darren R

    2015-08-01

    Within bioinformatics, the textual alignment of amino acid sequences has long dominated the determination of similarity between proteins, with all that implies for shared structure, function and evolutionary descent. Despite the relative success of modern-day sequence alignment algorithms, so-called alignment-free approaches offer a complementary means of determining and expressing similarity, with potential benefits in certain key applications, such as regression analysis of protein structure-function studies, where alignment-base similarity has performed poorly. Here, we offer a fresh, statistical physics-based perspective focusing on the question of alignment-free comparison, in the process adapting results from 'first passage probability distribution' to summarize statistics of ensemble averaged amino acid propensity values. In this article, we introduce and elaborate this approach. © The Author 2015. Published by Oxford University Press.

  3. High-speed peak matching algorithm for retention time alignment of gas chromatographic data for chemometric analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Johnson, Kevin J.; Wright, Bob W.; Jarman, Kristin H.

    2003-05-09

    A rapid retention time alignment algorithm was developed as a preprocessing utility to be used prior to chemometric analysis of large datasets of diesel fuel gas chromatographic profiles. Retention time variation from chromatogram-to-chromatogram has been a significant impediment against the use of chemometric techniques in the analysis of chromatographic data due to the inability of current multivariate techniques to correctly model information that shifts from variable to variable within a dataset. The algorithm developed is shown to increase the efficacy of pattern recognition methods applied to a set of diesel fuel chromatograms by retaining chemical selectivity while reducing chromatogram-to-chromatogram retentionmore » time variations and to do so on a time scale that makes analysis of large sets of chromatographic data practical.« less

  4. Polynomial interpretation of multipole vectors

    NASA Astrophysics Data System (ADS)

    Katz, Gabriel; Weeks, Jeff

    2004-09-01

    Copi, Huterer, Starkman, and Schwarz introduced multipole vectors in a tensor context and used them to demonstrate that the first-year Wilkinson microwave anisotropy probe (WMAP) quadrupole and octopole planes align at roughly the 99.9% confidence level. In the present article, the language of polynomials provides a new and independent derivation of the multipole vector concept. Bézout’s theorem supports an elementary proof that the multipole vectors exist and are unique (up to rescaling). The constructive nature of the proof leads to a fast, practical algorithm for computing multipole vectors. We illustrate the algorithm by finding exact solutions for some simple toy examples and numerical solutions for the first-year WMAP quadrupole and octopole. We then apply our algorithm to Monte Carlo skies to independently reconfirm the estimate that the WMAP quadrupole and octopole planes align at the 99.9% level.

  5. TotalReCaller: improved accuracy and performance via integrated alignment and base-calling.

    PubMed

    Menges, Fabian; Narzisi, Giuseppe; Mishra, Bud

    2011-09-01

    Currently, re-sequencing approaches use multiple modules serially to interpret raw sequencing data from next-generation sequencing platforms, while remaining oblivious to the genomic information until the final alignment step. Such approaches fail to exploit the full information from both raw sequencing data and the reference genome that can yield better quality sequence reads, SNP-calls, variant detection, as well as an alignment at the best possible location in the reference genome. Thus, there is a need for novel reference-guided bioinformatics algorithms for interpreting analog signals representing sequences of the bases ({A, C, G, T}), while simultaneously aligning possible sequence reads to a source reference genome whenever available. Here, we propose a new base-calling algorithm, TotalReCaller, to achieve improved performance. A linear error model for the raw intensity data and Burrows-Wheeler transform (BWT) based alignment are combined utilizing a Bayesian score function, which is then globally optimized over all possible genomic locations using an efficient branch-and-bound approach. The algorithm has been implemented in soft- and hardware [field-programmable gate array (FPGA)] to achieve real-time performance. Empirical results on real high-throughput Illumina data were used to evaluate TotalReCaller's performance relative to its peers-Bustard, BayesCall, Ibis and Rolexa-based on several criteria, particularly those important in clinical and scientific applications. Namely, it was evaluated for (i) its base-calling speed and throughput, (ii) its read accuracy and (iii) its specificity and sensitivity in variant calling. A software implementation of TotalReCaller as well as additional information, is available at: http://bioinformatics.nyu.edu/wordpress/projects/totalrecaller/ fabian.menges@nyu.edu.

  6. GASP: Gapped Ancestral Sequence Prediction for proteins

    PubMed Central

    Edwards, Richard J; Shields, Denis C

    2004-01-01

    Background The prediction of ancestral protein sequences from multiple sequence alignments is useful for many bioinformatics analyses. Predicting ancestral sequences is not a simple procedure and relies on accurate alignments and phylogenies. Several algorithms exist based on Maximum Parsimony or Maximum Likelihood methods but many current implementations are unable to process residues with gaps, which may represent insertion/deletion (indel) events or sequence fragments. Results Here we present a new algorithm, GASP (Gapped Ancestral Sequence Prediction), for predicting ancestral sequences from phylogenetic trees and the corresponding multiple sequence alignments. Alignments may be of any size and contain gaps. GASP first assigns the positions of gaps in the phylogeny before using a likelihood-based approach centred on amino acid substitution matrices to assign ancestral amino acids. Important outgroup information is used by first working down from the tips of the tree to the root, using descendant data only to assign probabilities, and then working back up from the root to the tips using descendant and outgroup data to make predictions. GASP was tested on a number of simulated datasets based on real phylogenies. Prediction accuracy for ungapped data was similar to three alternative algorithms tested, with GASP performing better in some cases and worse in others. Adding simple insertions and deletions to the simulated data did not have a detrimental effect on GASP accuracy. Conclusions GASP (Gapped Ancestral Sequence Prediction) will predict ancestral sequences from multiple protein alignments of any size. Although not as accurate in all cases as some of the more sophisticated maximum likelihood approaches, it can process a wide range of input phylogenies and will predict ancestral sequences for gapped and ungapped residues alike. PMID:15350199

  7. K-mer Content, Correlation, and Position Analysis of Genome DNA Sequences for the Identification of Function and Evolutionary Features

    PubMed Central

    Sievers, Aaron; Bosiek, Katharina; Bisch, Marc; Dreessen, Chris; Riedel, Jascha; Froß, Patrick; Hausmann, Michael; Hildenbrand, Georg

    2017-01-01

    In genome analysis, k-mer-based comparison methods have become standard tools. However, even though they are able to deliver reliable results, other algorithms seem to work better in some cases. To improve k-mer-based DNA sequence analysis and comparison, we successfully checked whether adding positional resolution is beneficial for finding and/or comparing interesting organizational structures. A simple but efficient algorithm for extracting and saving local k-mer spectra (frequency distribution of k-mers) was developed and used. The results were analyzed by including positional information based on visualizations as genomic maps and by applying basic vector correlation methods. This analysis was concentrated on small word lengths (1 ≤ k ≤ 4) on relatively small viral genomes of Papillomaviridae and Herpesviridae, while also checking its usability for larger sequences, namely human chromosome 2 and the homologous chromosomes (2A, 2B) of a chimpanzee. Using this alignment-free analysis, several regions with specific characteristics in Papillomaviridae and Herpesviridae formerly identified by independent, mostly alignment-based methods, were confirmed. Correlations between the k-mer content and several genes in these genomes have been found, showing similarities between classified and unclassified viruses, which may be potentially useful for further taxonomic research. Furthermore, unknown k-mer correlations in the genomes of Human Herpesviruses (HHVs), which are probably of major biological function, are found and described. Using the chromosomes of a chimpanzee and human that are currently known, identities between the species on every analyzed chromosome were reproduced. This demonstrates the feasibility of our approach for large data sets of complex genomes. Based on these results, we suggest k-mer analysis with positional resolution as a method for closing a gap between the effectiveness of alignment-based methods (like NCBI BLAST) and the high pace of standard k-mer analysis. PMID:28422050

  8. Coherent Detection of High-Rate Optical PPM Signals

    NASA Technical Reports Server (NTRS)

    Vilnrotter, Victor; Fernandez, Michela Munoz

    2006-01-01

    A method of coherent detection of high-rate pulse-position modulation (PPM) on a received laser beam has been conceived as a means of reducing the deleterious effects of noise and atmospheric turbulence in free-space optical communication using focal-plane detector array technologies. In comparison with a receiver based on direct detection of the intensity modulation of a PPM signal, a receiver based on the present method of coherent detection performs well at much higher background levels. In principle, the coherent-detection receiver can exhibit quantum-limited performance despite atmospheric turbulence. The key components of such a receiver include standard receiver optics, a laser that serves as a local oscillator, a focal-plane array of photodetectors, and a signal-processing and data-acquisition assembly needed to sample the focal-plane fields and reconstruct the pulsed signal prior to detection. The received PPM-modulated laser beam and the local-oscillator beam are focused onto the photodetector array, where they are mixed in the detection process. The two lasers are of the same or nearly the same frequency. If the two lasers are of different frequencies, then the coherent detection process is characterized as heterodyne and, using traditional heterodyne-detection terminology, the difference between the two laser frequencies is denoted the intermediate frequency (IF). If the two laser beams are of the same frequency and remain aligned in phase, then the coherent detection process is characterized as homodyne (essentially, heterodyne detection at zero IF). As a result of the inherent squaring operation of each photodetector, the output current includes an IF component that contains the signal modulation. The amplitude of the IF component is proportional to the product of the local-oscillator signal amplitude and the PPM signal amplitude. Hence, by using a sufficiently strong local-oscillator signal, one can make the PPM-modulated IF signal strong enough to overcome thermal noise in the receiver circuits: this is what makes it possible to achieve near-quantum-limited detection in the presence of strong background. Following quantum-limited coherent detection, the outputs of the individual photodetectors are automatically aligned in phase by use of one or more adaptive array compensation algorithms [e.g., the least-mean-square (LMS) algorithm]. Then the outputs are combined and the resulting signal is processed to extract the high-rate information, as though the PPM signal were received by a single photodetector. In a continuing series of experiments to test this method (see Fig. 1), the local oscillator has a wavelength of 1,064 nm, and another laser is used as a signal transmitter at a slightly different wavelength to establish an IF of about 6 MHz. There are 16 photodetectors in a 4 4 focal-plane array; the detector outputs are digitized at a sampling rate of 25 MHz, and the signals in digital form are combined by use of the LMS algorithm. Convergence of the adaptive combining algorithm in the presence of simulated atmospheric turbulence for optical PPM signals has already been demonstrated in the laboratory; the combined output is shown in Fig. 2(a), and Fig. 2(b) shows the behavior of the phase of the combining weights as a function of time (or samples). We observe that the phase of the weights has a sawtooth shape due to the continuously changing phase in the down-converted output, which is not exactly at zero frequency. Detailed performance analysis of this coherent free-space optical communication system in the presence of simulated atmospheric turbulence is currently under way.

  9. GateKeeper: a new hardware architecture for accelerating pre-alignment in DNA short read mapping.

    PubMed

    Alser, Mohammed; Hassan, Hasan; Xin, Hongyi; Ergin, Oguz; Mutlu, Onur; Alkan, Can

    2017-11-01

    High throughput DNA sequencing (HTS) technologies generate an excessive number of small DNA segments -called short reads- that cause significant computational burden. To analyze the entire genome, each of the billions of short reads must be mapped to a reference genome based on the similarity between a read and 'candidate' locations in that reference genome. The similarity measurement, called alignment, formulated as an approximate string matching problem, is the computational bottleneck because: (i) it is implemented using quadratic-time dynamic programming algorithms and (ii) the majority of candidate locations in the reference genome do not align with a given read due to high dissimilarity. Calculating the alignment of such incorrect candidate locations consumes an overwhelming majority of a modern read mapper's execution time. Therefore, it is crucial to develop a fast and effective filter that can detect incorrect candidate locations and eliminate them before invoking computationally costly alignment algorithms. We propose GateKeeper, a new hardware accelerator that functions as a pre-alignment step that quickly filters out most incorrect candidate locations. GateKeeper is the first design to accelerate pre-alignment using Field-Programmable Gate Arrays (FPGAs), which can perform pre-alignment much faster than software. When implemented on a single FPGA chip, GateKeeper maintains high accuracy (on average >96%) while providing, on average, 90-fold and 130-fold speedup over the state-of-the-art software pre-alignment techniques, Adjacency Filter and Shifted Hamming Distance (SHD), respectively. The addition of GateKeeper as a pre-alignment step can reduce the verification time of the mrFAST mapper by a factor of 10. https://github.com/BilkentCompGen/GateKeeper. mohammedalser@bilkent.edu.tr or onur.mutlu@inf.ethz.ch or calkan@cs.bilkent.edu.tr. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  10. Hidden Markov models of biological primary sequence information.

    PubMed Central

    Baldi, P; Chauvin, Y; Hunkapiller, T; McClure, M A

    1994-01-01

    Hidden Markov model (HMM) techniques are used to model families of biological sequences. A smooth and convergent algorithm is introduced to iteratively adapt the transition and emission parameters of the models from the examples in a given family. The HMM approach is applied to three protein families: globins, immunoglobulins, and kinases. In all cases, the models derived capture the important statistical characteristics of the family and can be used for a number of tasks, including multiple alignments, motif detection, and classification. For K sequences of average length N, this approach yields an effective multiple-alignment algorithm which requires O(KN2) operations, linear in the number of sequences. PMID:8302831

  11. Reconstruction software of the silicon tracker of DAMPE mission

    NASA Astrophysics Data System (ADS)

    Tykhonov, A.; Gallo, V.; Wu, X.; Zimmer, S.

    2017-10-01

    DAMPE is a satellite-borne experiment aimed to probe astroparticle physics in the GeV-TeV energy range. The Silicon tracker (STK) is one of the key components of DAMPE, which allows the reconstruction of trajectories (tracks) of detected particles. The non-negligible amount of material in the tracker poses a challenge to its reconstruction and alignment. In this paper we describe methods to address this challenge. We present the track reconstruction algorithm and give insight into the alignment algorithm. We also present our CAD-to-GDML converter, an in-house tool for implementing detector geometry in the software from the CAD drawings of the detector.

  12. Automatic segmentation of mammogram and tomosynthesis images

    NASA Astrophysics Data System (ADS)

    Sargent, Dusty; Park, Sun Young

    2016-03-01

    Breast cancer is a one of the most common forms of cancer in terms of new cases and deaths both in the United States and worldwide. However, the survival rate with breast cancer is high if it is detected and treated before it spreads to other parts of the body. The most common screening methods for breast cancer are mammography and digital tomosynthesis, which involve acquiring X-ray images of the breasts that are interpreted by radiologists. The work described in this paper is aimed at optimizing the presentation of mammography and tomosynthesis images to the radiologist, thereby improving the early detection rate of breast cancer and the resulting patient outcomes. Breast cancer tissue has greater density than normal breast tissue, and appears as dense white image regions that are asymmetrical between the breasts. These irregularities are easily seen if the breast images are aligned and viewed side-by-side. However, since the breasts are imaged separately during mammography, the images may be poorly centered and aligned relative to each other, and may not properly focus on the tissue area. Similarly, although a full three dimensional reconstruction can be created from digital tomosynthesis images, the same centering and alignment issues can occur for digital tomosynthesis. Thus, a preprocessing algorithm that aligns the breasts for easy side-by-side comparison has the potential to greatly increase the speed and accuracy of mammogram reading. Likewise, the same preprocessing can improve the results of automatic tissue classification algorithms for mammography. In this paper, we present an automated segmentation algorithm for mammogram and tomosynthesis images that aims to improve the speed and accuracy of breast cancer screening by mitigating the above mentioned problems. Our algorithm uses information in the DICOM header to facilitate preprocessing, and incorporates anatomical region segmentation and contour analysis, along with a hidden Markov model (HMM) for processing the multi-frame tomosynthesis images. The output of the algorithm is a new set of images that have been processed to show only the diagnostically relevant region and align the breasts so that they can be easily compared side-by-side. Our method has been tested on approximately 750 images, including various examples of mammogram, tomosynthesis, and scanned images, and has correctly segmented the diagnostically relevant image region in 97% of cases.

  13. Alignment and assembly process for primary mirror subsystem of a spaceborne telescope

    NASA Astrophysics Data System (ADS)

    Lin, Wei-Cheng; Chang, Shenq-Tsong; Chang, Sheng-Hsiung; Chang, Chen-Peng; Lin, Yu-Chuan; Chin, Chi-Chieh; Pan, Hsu-Pin; Huang, Ting-Ming

    2015-11-01

    In this study, a multispectral spaceborne Cassegrain telescope was developed. The telescope was equipped with a primary mirror with a 450-mm clear aperture composed of Zerodur and lightweighted at a ratio of approximately 50% to meet both thermal and mass requirements. Reducing the astigmatism was critical for this mirror. The astigmatism is caused by gravity effects, the bonding process, and deformation from mounting the main structure of the telescope (main plate). This article presents the primary mirror alignment, mechanical ground-supported equipment (MGSE), assembly process, and optical performance test used to assemble the primary mirror. A mechanical compensated shim is used as the interface between the bipod flexure and main plate. The shim was used to compensate for manufacturer errors found in components and differences between local coplanarity errors to prevent stress while the bipod flexure was screwed to the main plate. After primary mirror assembly, an optical performance test method called a bench test with an algorithm was used to analyze the astigmatism caused by the gravity effect and deformation from the mounting or supporter. The tolerance conditions for the primary mirror assembly require the astigmatism caused by gravity and mounting force deformation to be less than P-V 0.02 λ at 632.8 nm. The results demonstrated that the designed MGSE used in the alignment and assembly processes met the critical requirements for the primary mirror assembly of the telescope.

  14. A Palmprint Recognition Algorithm Using Phase-Only Correlation

    NASA Astrophysics Data System (ADS)

    Ito, Koichi; Aoki, Takafumi; Nakajima, Hiroshi; Kobayashi, Koji; Higuchi, Tatsuo

    This paper presents a palmprint recognition algorithm using Phase-Only Correlation (POC). The use of phase components in 2D (two-dimensional) discrete Fourier transforms of palmprint images makes it possible to achieve highly robust image registration and matching. In the proposed algorithm, POC is used to align scaling, rotation and translation between two palmprint images, and evaluate similarity between them. Experimental evaluation using a palmprint image database clearly demonstrates efficient matching performance of the proposed algorithm.

  15. Clustering algorithms for identifying core atom sets and for assessing the precision of protein structure ensembles.

    PubMed

    Snyder, David A; Montelione, Gaetano T

    2005-06-01

    An important open question in the field of NMR-based biomolecular structure determination is how best to characterize the precision of the resulting ensemble of structures. Typically, the RMSD, as minimized in superimposing the ensemble of structures, is the preferred measure of precision. However, the presence of poorly determined atomic coordinates and multiple "RMSD-stable domains"--locally well-defined regions that are not aligned in global superimpositions--complicate RMSD calculations. In this paper, we present a method, based on a novel, structurally defined order parameter, for identifying a set of core atoms to use in determining superimpositions for RMSD calculations. In addition we present a method for deciding whether to partition that core atom set into "RMSD-stable domains" and, if so, how to determine partitioning of the core atom set. We demonstrate our algorithm and its application in calculating statistically sound RMSD values by applying it to a set of NMR-derived structural ensembles, superimposing each RMSD-stable domain (or the entire core atom set, where appropriate) found in each protein structure under consideration. A parameter calculated by our algorithm using a novel, kurtosis-based criterion, the epsilon-value, is a measure of precision of the superimposition that complements the RMSD. In addition, we compare our algorithm with previously described algorithms for determining core atom sets. The methods presented in this paper for biomolecular structure superimposition are quite general, and have application in many areas of structural bioinformatics and structural biology.

  16. GRIM-Filter: Fast seed location filtering in DNA read mapping using processing-in-memory technologies.

    PubMed

    Kim, Jeremie S; Senol Cali, Damla; Xin, Hongyi; Lee, Donghyuk; Ghose, Saugata; Alser, Mohammed; Hassan, Hasan; Ergin, Oguz; Alkan, Can; Mutlu, Onur

    2018-05-09

    Seed location filtering is critical in DNA read mapping, a process where billions of DNA fragments (reads) sampled from a donor are mapped onto a reference genome to identify genomic variants of the donor. State-of-the-art read mappers 1) quickly generate possible mapping locations for seeds (i.e., smaller segments) within each read, 2) extract reference sequences at each of the mapping locations, and 3) check similarity between each read and its associated reference sequences with a computationally-expensive algorithm (i.e., sequence alignment) to determine the origin of the read. A seed location filter comes into play before alignment, discarding seed locations that alignment would deem a poor match. The ideal seed location filter would discard all poor match locations prior to alignment such that there is no wasted computation on unnecessary alignments. We propose a novel seed location filtering algorithm, GRIM-Filter, optimized to exploit 3D-stacked memory systems that integrate computation within a logic layer stacked under memory layers, to perform processing-in-memory (PIM). GRIM-Filter quickly filters seed locations by 1) introducing a new representation of coarse-grained segments of the reference genome, and 2) using massively-parallel in-memory operations to identify read presence within each coarse-grained segment. Our evaluations show that for a sequence alignment error tolerance of 0.05, GRIM-Filter 1) reduces the false negative rate of filtering by 5.59x-6.41x, and 2) provides an end-to-end read mapper speedup of 1.81x-3.65x, compared to a state-of-the-art read mapper employing the best previous seed location filtering algorithm. GRIM-Filter exploits 3D-stacked memory, which enables the efficient use of processing-in-memory, to overcome the memory bandwidth bottleneck in seed location filtering. We show that GRIM-Filter significantly improves the performance of a state-of-the-art read mapper. GRIM-Filter is a universal seed location filter that can be applied to any read mapper. We hope that our results provide inspiration for new works to design other bioinformatics algorithms that take advantage of emerging technologies and new processing paradigms, such as processing-in-memory using 3D-stacked memory devices.

  17. SFESA: a web server for pairwise alignment refinement by secondary structure shifts.

    PubMed

    Tong, Jing; Pei, Jimin; Grishin, Nick V

    2015-09-03

    Protein sequence alignment is essential for a variety of tasks such as homology modeling and active site prediction. Alignment errors remain the main cause of low-quality structure models. A bioinformatics tool to refine alignments is needed to make protein alignments more accurate. We developed the SFESA web server to refine pairwise protein sequence alignments. Compared to the previous version of SFESA, which required a set of 3D coordinates for a protein, the new server will search a sequence database for the closest homolog with an available 3D structure to be used as a template. For each alignment block defined by secondary structure elements in the template, SFESA evaluates alignment variants generated by local shifts and selects the best-scoring alignment variant. A scoring function that combines the sequence score of profile-profile comparison and the structure score of template-derived contact energy is used for evaluation of alignments. PROMALS pairwise alignments refined by SFESA are more accurate than those produced by current advanced alignment methods such as HHpred and CNFpred. In addition, SFESA also improves alignments generated by other software. SFESA is a web-based tool for alignment refinement, designed for researchers to compute, refine, and evaluate pairwise alignments with a combined sequence and structure scoring of alignment blocks. To our knowledge, the SFESA web server is the only tool that refines alignments by evaluating local shifts of secondary structure elements. The SFESA web server is available at http://prodata.swmed.edu/sfesa.

  18. In-flight alignment using H ∞ filter for strapdown INS on aircraft.

    PubMed

    Pei, Fu-Jun; Liu, Xuan; Zhu, Li

    2014-01-01

    In-flight alignment is an effective way to improve the accuracy and speed of initial alignment for strapdown inertial navigation system (INS). During the aircraft flight, strapdown INS alignment was disturbed by lineal and angular movements of the aircraft. To deal with the disturbances in dynamic initial alignment, a novel alignment method for SINS is investigated in this paper. In this method, an initial alignment error model of SINS in the inertial frame is established. The observability of the system is discussed by piece-wise constant system (PWCS) theory and observable degree is computed by the singular value decomposition (SVD) theory. It is demonstrated that the system is completely observable, and all the system state parameters can be estimated by optimal filter. Then a H ∞ filter was designed to resolve the uncertainty of measurement noise. The simulation results demonstrate that the proposed algorithm can reach a better accuracy under the dynamic disturbance condition.

  19. Using structure to explore the sequence alignment space of remote homologs.

    PubMed

    Kuziemko, Andrew; Honig, Barry; Petrey, Donald

    2011-10-01

    Protein structure modeling by homology requires an accurate sequence alignment between the query protein and its structural template. However, sequence alignment methods based on dynamic programming (DP) are typically unable to generate accurate alignments for remote sequence homologs, thus limiting the applicability of modeling methods. A central problem is that the alignment that is "optimal" in terms of the DP score does not necessarily correspond to the alignment that produces the most accurate structural model. That is, the correct alignment based on structural superposition will generally have a lower score than the optimal alignment obtained from sequence. Variations of the DP algorithm have been developed that generate alternative alignments that are "suboptimal" in terms of the DP score, but these still encounter difficulties in detecting the correct structural alignment. We present here a new alternative sequence alignment method that relies heavily on the structure of the template. By initially aligning the query sequence to individual fragments in secondary structure elements and combining high-scoring fragments that pass basic tests for "modelability", we can generate accurate alignments within a small ensemble. Our results suggest that the set of sequences that can currently be modeled by homology can be greatly extended.

  20. The review and results of different methods for facial recognition

    NASA Astrophysics Data System (ADS)

    Le, Yifan

    2017-09-01

    In recent years, facial recognition draws much attention due to its wide potential applications. As a unique technology in Biometric Identification, facial recognition represents a significant improvement since it could be operated without cooperation of people under detection. Hence, facial recognition will be taken into defense system, medical detection, human behavior understanding, etc. Several theories and methods have been established to make progress in facial recognition: (1) A novel two-stage facial landmark localization method is proposed which has more accurate facial localization effect under specific database; (2) A statistical face frontalization method is proposed which outperforms state-of-the-art methods for face landmark localization; (3) It proposes a general facial landmark detection algorithm to handle images with severe occlusion and images with large head poses; (4) There are three methods proposed on Face Alignment including shape augmented regression method, pose-indexed based multi-view method and a learning based method via regressing local binary features. The aim of this paper is to analyze previous work of different aspects in facial recognition, focusing on concrete method and performance under various databases. In addition, some improvement measures and suggestions in potential applications will be put forward.

  1. Efficient and robust model-to-image alignment using 3D scale-invariant features.

    PubMed

    Toews, Matthew; Wells, William M

    2013-04-01

    This paper presents feature-based alignment (FBA), a general method for efficient and robust model-to-image alignment. Volumetric images, e.g. CT scans of the human body, are modeled probabilistically as a collage of 3D scale-invariant image features within a normalized reference space. Features are incorporated as a latent random variable and marginalized out in computing a maximum a posteriori alignment solution. The model is learned from features extracted in pre-aligned training images, then fit to features extracted from a new image to identify a globally optimal locally linear alignment solution. Novel techniques are presented for determining local feature orientation and efficiently encoding feature intensity in 3D. Experiments involving difficult magnetic resonance (MR) images of the human brain demonstrate FBA achieves alignment accuracy similar to widely-used registration methods, while requiring a fraction of the memory and computation resources and offering a more robust, globally optimal solution. Experiments on CT human body scans demonstrate FBA as an effective system for automatic human body alignment where other alignment methods break down. Copyright © 2012 Elsevier B.V. All rights reserved.

  2. Efficient and Robust Model-to-Image Alignment using 3D Scale-Invariant Features

    PubMed Central

    Toews, Matthew; Wells, William M.

    2013-01-01

    This paper presents feature-based alignment (FBA), a general method for efficient and robust model-to-image alignment. Volumetric images, e.g. CT scans of the human body, are modeled probabilistically as a collage of 3D scale-invariant image features within a normalized reference space. Features are incorporated as a latent random variable and marginalized out in computing a maximum a-posteriori alignment solution. The model is learned from features extracted in pre-aligned training images, then fit to features extracted from a new image to identify a globally optimal locally linear alignment solution. Novel techniques are presented for determining local feature orientation and efficiently encoding feature intensity in 3D. Experiments involving difficult magnetic resonance (MR) images of the human brain demonstrate FBA achieves alignment accuracy similar to widely-used registration methods, while requiring a fraction of the memory and computation resources and offering a more robust, globally optimal solution. Experiments on CT human body scans demonstrate FBA as an effective system for automatic human body alignment where other alignment methods break down. PMID:23265799

  3. Cache-Oblivious parallel SIMD Viterbi decoding for sequence search in HMMER

    PubMed Central

    2014-01-01

    Background HMMER is a commonly used bioinformatics tool based on Hidden Markov Models (HMMs) to analyze and process biological sequences. One of its main homology engines is based on the Viterbi decoding algorithm, which was already highly parallelized and optimized using Farrar’s striped processing pattern with Intel SSE2 instruction set extension. Results A new SIMD vectorization of the Viterbi decoding algorithm is proposed, based on an SSE2 inter-task parallelization approach similar to the DNA alignment algorithm proposed by Rognes. Besides this alternative vectorization scheme, the proposed implementation also introduces a new partitioning of the Markov model that allows a significantly more efficient exploitation of the cache locality. Such optimization, together with an improved loading of the emission scores, allows the achievement of a constant processing throughput, regardless of the innermost-cache size and of the dimension of the considered model. Conclusions The proposed optimized vectorization of the Viterbi decoding algorithm was extensively evaluated and compared with the HMMER3 decoder to process DNA and protein datasets, proving to be a rather competitive alternative implementation. Being always faster than the already highly optimized ViterbiFilter implementation of HMMER3, the proposed Cache-Oblivious Parallel SIMD Viterbi (COPS) implementation provides a constant throughput and offers a processing speedup as high as two times faster, depending on the model’s size. PMID:24884826

  4. Identification of Conserved Water Sites in Protein Structures for Drug Design.

    PubMed

    Jukič, Marko; Konc, Janez; Gobec, Stanislav; Janežič, Dušanka

    2017-12-26

    Identification of conserved waters in protein structures is a challenging task with applications in molecular docking and protein stability prediction. As an alternative to computationally demanding simulations of proteins in water, experimental cocrystallized waters in the Protein Data Bank (PDB) in combination with a local structure alignment algorithm can be used for reliable prediction of conserved water sites. We developed the ProBiS H2O approach based on the previously developed ProBiS algorithm, which enables identification of conserved water sites in proteins using experimental protein structures from the PDB or a set of custom protein structures available to the user. With a protein structure, a binding site, or an individual water molecule as a query, ProBiS H2O collects similar proteins from the PDB and performs local or binding site-specific superimpositions of the query structure with similar proteins using the ProBiS algorithm. It collects the experimental water molecules from the similar proteins and transposes them to the query protein. Transposed waters are clustered by their mutual proximity, which enables identification of discrete sites in the query protein with high water conservation. ProBiS H2O is a robust and fast new approach that uses existing experimental structural data to identify conserved water sites on the interfaces of protein complexes, for example protein-small molecule interfaces, and elsewhere on the protein structures. It has been successfully validated in several reported proteins in which conserved water molecules were found to play an important role in ligand binding with applications in drug design.

  5. Two Simple and Efficient Algorithms to Compute the SP-Score Objective Function of a Multiple Sequence Alignment.

    PubMed

    Ranwez, Vincent

    2016-01-01

    Multiple sequence alignment (MSA) is a crucial step in many molecular analyses and many MSA tools have been developed. Most of them use a greedy approach to construct a first alignment that is then refined by optimizing the sum of pair score (SP-score). The SP-score estimation is thus a bottleneck for most MSA tools since it is repeatedly required and is time consuming. Given an alignment of n sequences and L sites, I introduce here optimized solutions reaching O(nL) time complexity for affine gap cost, instead of O(n2L), which are easy to implement.

  6. An efficient and portable SIMD algorithm for charge/current deposition in Particle-In-Cell codes

    NASA Astrophysics Data System (ADS)

    Vincenti, H.; Lobet, M.; Lehe, R.; Sasanka, R.; Vay, J.-L.

    2017-01-01

    In current computer architectures, data movement (from die to network) is by far the most energy consuming part of an algorithm (≈ 20 pJ/word on-die to ≈10,000 pJ/word on the network). To increase memory locality at the hardware level and reduce energy consumption related to data movement, future exascale computers tend to use many-core processors on each compute nodes that will have a reduced clock speed to allow for efficient cooling. To compensate for frequency decrease, machine vendors are making use of long SIMD instruction registers that are able to process multiple data with one arithmetic operator in one clock cycle. SIMD register length is expected to double every four years. As a consequence, Particle-In-Cell (PIC) codes will have to achieve good vectorization to fully take advantage of these upcoming architectures. In this paper, we present a new algorithm that allows for efficient and portable SIMD vectorization of current/charge deposition routines that are, along with the field gathering routines, among the most time consuming parts of the PIC algorithm. Our new algorithm uses a particular data structure that takes into account memory alignment constraints and avoids gather/scatter instructions that can significantly affect vectorization performances on current CPUs. The new algorithm was successfully implemented in the 3D skeleton PIC code PICSAR and tested on Haswell Xeon processors (AVX2-256 bits wide data registers). Results show a factor of × 2 to × 2.5 speed-up in double precision for particle shape factor of orders 1- 3. The new algorithm can be applied as is on future KNL (Knights Landing) architectures that will include AVX-512 instruction sets with 512 bits register lengths (8 doubles/16 singles).

  7. Concurrent and Accurate Short Read Mapping on Multicore Processors.

    PubMed

    Martínez, Héctor; Tárraga, Joaquín; Medina, Ignacio; Barrachina, Sergio; Castillo, Maribel; Dopazo, Joaquín; Quintana-Ortí, Enrique S

    2015-01-01

    We introduce a parallel aligner with a work-flow organization for fast and accurate mapping of RNA sequences on servers equipped with multicore processors. Our software, HPG Aligner SA (HPG Aligner SA is an open-source application. The software is available at http://www.opencb.org, exploits a suffix array to rapidly map a large fraction of the RNA fragments (reads), as well as leverages the accuracy of the Smith-Waterman algorithm to deal with conflictive reads. The aligner is enhanced with a careful strategy to detect splice junctions based on an adaptive division of RNA reads into small segments (or seeds), which are then mapped onto a number of candidate alignment locations, providing crucial information for the successful alignment of the complete reads. The experimental results on a platform with Intel multicore technology report the parallel performance of HPG Aligner SA, on RNA reads of 100-400 nucleotides, which excels in execution time/sensitivity to state-of-the-art aligners such as TopHat 2+Bowtie 2, MapSplice, and STAR.

  8. An efficient and portable SIMD algorithm for charge/current deposition in Particle-In-Cell codes

    DOE PAGES

    Vincenti, H.; Lobet, M.; Lehe, R.; ...

    2016-09-19

    In current computer architectures, data movement (from die to network) is by far the most energy consuming part of an algorithm (≈20pJ/word on-die to ≈10,000 pJ/word on the network). To increase memory locality at the hardware level and reduce energy consumption related to data movement, future exascale computers tend to use many-core processors on each compute nodes that will have a reduced clock speed to allow for efficient cooling. To compensate for frequency decrease, machine vendors are making use of long SIMD instruction registers that are able to process multiple data with one arithmetic operator in one clock cycle. SIMD registermore » length is expected to double every four years. As a consequence, Particle-In-Cell (PIC) codes will have to achieve good vectorization to fully take advantage of these upcoming architectures. In this paper, we present a new algorithm that allows for efficient and portable SIMD vectorization of current/charge deposition routines that are, along with the field gathering routines, among the most time consuming parts of the PIC algorithm. Our new algorithm uses a particular data structure that takes into account memory alignment constraints and avoids gather/scat;ter instructions that can significantly affect vectorization performances on current CPUs. The new algorithm was successfully implemented in the 3D skeleton PIC code PICSAR and tested on Haswell Xeon processors (AVX2-256 bits wide data registers). Results show a factor of ×2 to ×2.5 speed-up in double precision for particle shape factor of orders 1–3. The new algorithm can be applied as is on future KNL (Knights Landing) architectures that will include AVX-512 instruction sets with 512 bits register lengths (8 doubles/16 singles). Program summary Program Title: vec_deposition Program Files doi:http://dx.doi.org/10.17632/nh77fv9k8c.1 Licensing provisions: BSD 3-Clause Programming language: Fortran 90 External routines/libraries:  OpenMP > 4.0 Nature of problem: Exascale architectures will have many-core processors per node with long vector data registers capable of performing one single instruction on multiple data during one clock cycle. Data register lengths are expected to double every four years and this pushes for new portable solutions for efficiently vectorizing Particle-In-Cell codes on these future many-core architectures. One of the main hotspot routines of the PIC algorithm is the current/charge deposition for which there is no efficient and portable vector algorithm. Solution method: Here we provide an efficient and portable vector algorithm of current/charge deposition routines that uses a new data structure, which significantly reduces gather/scatter operations. Vectorization is controlled using OpenMP 4.0 compiler directives for vectorization which ensures portability across different architectures. Restrictions: Here we do not provide the full PIC algorithm with an executable but only vector routines for current/charge deposition. These scalar/vector routines can be used as library routines in your 3D Particle-In-Cell code. However, to get the best performances out of vector routines you have to satisfy the two following requirements: (1) Your code should implement particle tiling (as explained in the manuscript) to allow for maximized cache reuse and reduce memory accesses that can hinder vector performances. The routines can be used directly on each particle tile. (2) You should compile your code with a Fortran 90 compiler (e.g Intel, gnu or cray) and provide proper alignment flags and compiler alignment directives (more details in README file).« less

  9. An efficient and portable SIMD algorithm for charge/current deposition in Particle-In-Cell codes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vincenti, H.; Lobet, M.; Lehe, R.

    In current computer architectures, data movement (from die to network) is by far the most energy consuming part of an algorithm (≈20pJ/word on-die to ≈10,000 pJ/word on the network). To increase memory locality at the hardware level and reduce energy consumption related to data movement, future exascale computers tend to use many-core processors on each compute nodes that will have a reduced clock speed to allow for efficient cooling. To compensate for frequency decrease, machine vendors are making use of long SIMD instruction registers that are able to process multiple data with one arithmetic operator in one clock cycle. SIMD registermore » length is expected to double every four years. As a consequence, Particle-In-Cell (PIC) codes will have to achieve good vectorization to fully take advantage of these upcoming architectures. In this paper, we present a new algorithm that allows for efficient and portable SIMD vectorization of current/charge deposition routines that are, along with the field gathering routines, among the most time consuming parts of the PIC algorithm. Our new algorithm uses a particular data structure that takes into account memory alignment constraints and avoids gather/scat;ter instructions that can significantly affect vectorization performances on current CPUs. The new algorithm was successfully implemented in the 3D skeleton PIC code PICSAR and tested on Haswell Xeon processors (AVX2-256 bits wide data registers). Results show a factor of ×2 to ×2.5 speed-up in double precision for particle shape factor of orders 1–3. The new algorithm can be applied as is on future KNL (Knights Landing) architectures that will include AVX-512 instruction sets with 512 bits register lengths (8 doubles/16 singles). Program summary Program Title: vec_deposition Program Files doi:http://dx.doi.org/10.17632/nh77fv9k8c.1 Licensing provisions: BSD 3-Clause Programming language: Fortran 90 External routines/libraries:  OpenMP > 4.0 Nature of problem: Exascale architectures will have many-core processors per node with long vector data registers capable of performing one single instruction on multiple data during one clock cycle. Data register lengths are expected to double every four years and this pushes for new portable solutions for efficiently vectorizing Particle-In-Cell codes on these future many-core architectures. One of the main hotspot routines of the PIC algorithm is the current/charge deposition for which there is no efficient and portable vector algorithm. Solution method: Here we provide an efficient and portable vector algorithm of current/charge deposition routines that uses a new data structure, which significantly reduces gather/scatter operations. Vectorization is controlled using OpenMP 4.0 compiler directives for vectorization which ensures portability across different architectures. Restrictions: Here we do not provide the full PIC algorithm with an executable but only vector routines for current/charge deposition. These scalar/vector routines can be used as library routines in your 3D Particle-In-Cell code. However, to get the best performances out of vector routines you have to satisfy the two following requirements: (1) Your code should implement particle tiling (as explained in the manuscript) to allow for maximized cache reuse and reduce memory accesses that can hinder vector performances. The routines can be used directly on each particle tile. (2) You should compile your code with a Fortran 90 compiler (e.g Intel, gnu or cray) and provide proper alignment flags and compiler alignment directives (more details in README file).« less

  10. Dark Energy Camera for Blanco

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Binder, Gary A.; /Caltech /SLAC

    2010-08-25

    In order to make accurate measurements of dark energy, a system is needed to monitor the focus and alignment of the Dark Energy Camera (DECam) to be located on the Blanco 4m Telescope for the upcoming Dark Energy Survey. One new approach under development is to fit out-of-focus star images to a point spread function from which information about the focus and tilt of the camera can be obtained. As a first test of a new algorithm using this idea, simulated star images produced from a model of DECam in the optics software Zemax were fitted. Then, real images frommore » the Mosaic II imager currently installed on the Blanco telescope were used to investigate the algorithm's capabilities. A number of problems with the algorithm were found, and more work is needed to understand its limitations and improve its capabilities so it can reliably predict camera alignment and focus.« less

  11. Reducing the number of templates for aligned-spin compact binary coalescence gravitational wave searches using metric-agnostic template nudging

    NASA Astrophysics Data System (ADS)

    Indik, Nathaniel; Fehrmann, Henning; Harke, Franz; Krishnan, Badri; Nielsen, Alex B.

    2018-06-01

    Efficient multidimensional template placement is crucial in computationally intensive matched-filtering searches for gravitational waves (GWs). Here, we implement the neighboring cell algorithm (NCA) to improve the detection volume of an existing compact binary coalescence (CBC) template bank. This algorithm has already been successfully applied for a binary millisecond pulsar search in data from the Fermi satellite. It repositions templates from overdense regions to underdense regions and reduces the number of templates that would have been required by a stochastic method to achieve the same detection volume. Our method is readily generalizable to other CBC parameter spaces. Here we apply this method to the aligned-single-spin neutron star-black hole binary coalescence inspiral-merger-ringdown gravitational wave parameter space. We show that the template nudging algorithm can attain the equivalent effectualness of the stochastic method with 12% fewer templates.

  12. Successive approximation algorithm for beam-position-monitor-based LHC collimator alignment

    NASA Astrophysics Data System (ADS)

    Valentino, Gianluca; Nosych, Andriy A.; Bruce, Roderik; Gasior, Marek; Mirarchi, Daniele; Redaelli, Stefano; Salvachua, Belen; Wollmann, Daniel

    2014-02-01

    Collimators with embedded beam position monitor (BPM) button electrodes will be installed in the Large Hadron Collider (LHC) during the current long shutdown period. For the subsequent operation, BPMs will allow the collimator jaws to be kept centered around the beam orbit. In this manner, a better beam cleaning efficiency and machine protection can be provided at unprecedented higher beam energies and intensities. A collimator alignment algorithm is proposed to center the jaws automatically around the beam. The algorithm is based on successive approximation and takes into account a correction of the nonlinear BPM sensitivity to beam displacement and an asymmetry of the electronic channels processing the BPM electrode signals. A software implementation was tested with a prototype collimator in the Super Proton Synchrotron. This paper presents results of the tests along with some considerations for eventual operation in the LHC.

  13. A smart-pixel holographic competitive learning network

    NASA Astrophysics Data System (ADS)

    Slagle, Timothy Michael

    Neural networks are adaptive classifiers which modify their decision boundaries based on feedback from externally- or internally-generated error signals. Optics is an attractive technology for neural network implementation because it offers the possibility of parallel, nearly instantaneous computation of the weighted neuron inputs by the propagation of light through the optical system. Using current optical device technology, system performance levels of 3 × 1011 connection updates per second can be achieved. This thesis presents an architecture for an optical competitive learning network which offers advantages over previous optical implementations, including smart-pixel-based optical neurons, phase- conjugate self-alignment of a single neuron plane, and high-density, parallel-access weight storage, interconnection, and learning in a volume hologram. The competitive learning algorithm with modifications for optical implementation is described, and algorithm simulations are performed for an example problem. The optical competitive learning architecture is then introduced. The optical system is simulated using the ``beamprop'' algorithm at the level of light propagating through the system components, and results showing competitive learning operation in agreement with the algorithm simulations are presented. The optical competitive learning requires a non-linear, non-local ``winner-take-all'' (WTA) neuron function. Custom-designed smart-pixel WTA neuron arrays were fabricated using CMOS VLSI/liquid crystal technology. Results of laboratory tests of the WTA arrays' switching characteristics, time response, and uniformity are then presented. The system uses a phase-conjugate mirror to write the self-aligning interconnection weight holograms, and energy gain is required from the reflection to minimize erasure of the existing weights. An experimental system for characterizing the PCM response is described. Useful gains of 20 were obtained with a polarization-multiplexed PCM readout, and gains of up to 60 were observed when a time-sequential read-out technique was used. Finally, the optical competitive learning laboratory system is described, including some necessary modifications to the previous architectures, and the data acquisition and control system developed for the system. Experimental results showing phase conjugation of the WTA outputs, holographic interconnect storage, associative storage between input images and WTA neuron outputs, and WTA array switching are presented, demonstrating the functions necessary for the operation of the optical learning system.

  14. A New Polar Transfer Alignment Algorithm with the Aid of a Star Sensor and Based on an Adaptive Unscented Kalman Filter.

    PubMed

    Cheng, Jianhua; Wang, Tongda; Wang, Lu; Wang, Zhenmin

    2017-10-23

    Because of the harsh polar environment, the master strapdown inertial navigation system (SINS) has low accuracy and the system model information becomes abnormal. In this case, existing polar transfer alignment (TA) algorithms which use the measurement information provided by master SINS would lose their effectiveness. In this paper, a new polar TA algorithm with the aid of a star sensor and based on an adaptive unscented Kalman filter (AUKF) is proposed to deal with the problems. Since the measurement information provided by master SINS is inaccurate, the accurate information provided by the star sensor is chosen as the measurement. With the compensation of lever-arm effect and the model of star sensor, the nonlinear navigation equations are derived. Combined with the attitude matching method, the filter models for polar TA are designed. An AUKF is introduced to solve the abnormal information of system model. Then, the AUKF is used to estimate the states of TA. Results have demonstrated that the performance of the new polar TA algorithm is better than the state-of-the-art polar TA algorithms. Therefore, the new polar TA algorithm proposed in this paper is effectively to ensure and improve the accuracy of TA in the harsh polar environment.

  15. A New Polar Transfer Alignment Algorithm with the Aid of a Star Sensor and Based on an Adaptive Unscented Kalman Filter

    PubMed Central

    Cheng, Jianhua; Wang, Tongda; Wang, Lu; Wang, Zhenmin

    2017-01-01

    Because of the harsh polar environment, the master strapdown inertial navigation system (SINS) has low accuracy and the system model information becomes abnormal. In this case, existing polar transfer alignment (TA) algorithms which use the measurement information provided by master SINS would lose their effectiveness. In this paper, a new polar TA algorithm with the aid of a star sensor and based on an adaptive unscented Kalman filter (AUKF) is proposed to deal with the problems. Since the measurement information provided by master SINS is inaccurate, the accurate information provided by the star sensor is chosen as the measurement. With the compensation of lever-arm effect and the model of star sensor, the nonlinear navigation equations are derived. Combined with the attitude matching method, the filter models for polar TA are designed. An AUKF is introduced to solve the abnormal information of system model. Then, the AUKF is used to estimate the states of TA. Results have demonstrated that the performance of the new polar TA algorithm is better than the state-of-the-art polar TA algorithms. Therefore, the new polar TA algorithm proposed in this paper is effectively to ensure and improve the accuracy of TA in the harsh polar environment. PMID:29065521

  16. Modified compensation algorithm of lever-arm effect and flexural deformation for polar shipborne transfer alignment based on improved adaptive Kalman filter

    NASA Astrophysics Data System (ADS)

    Wang, Tongda; Cheng, Jianhua; Guan, Dongxue; Kang, Yingyao; Zhang, Wei

    2017-09-01

    Due to the lever-arm effect and flexural deformation in the practical application of transfer alignment (TA), the TA performance is decreased. The existing polar TA algorithm only compensates a fixed lever-arm without considering the dynamic lever-arm caused by flexural deformation; traditional non-polar TA algorithms also have some limitations. Thus, the performance of existing compensation algorithms is unsatisfactory. In this paper, a modified compensation algorithm of the lever-arm effect and flexural deformation is proposed to promote the accuracy and speed of the polar TA. On the basis of a dynamic lever-arm model and a noise compensation method for flexural deformation, polar TA equations are derived in grid frames. Based on the velocity-plus-attitude matching method, the filter models of polar TA are designed. An adaptive Kalman filter (AKF) is improved to promote the robustness and accuracy of the system, and then applied to the estimation of the misalignment angles. Simulation and experiment results have demonstrated that the modified compensation algorithm based on the improved AKF for polar TA can effectively compensate the lever-arm effect and flexural deformation, and then improve the accuracy and speed of TA in the polar region.

  17. The Effect of Professional Learning Activities on Implementation of California's Quality Professional Learning Standards in Alignment with the Local Control Funding Formula Priority 2

    ERIC Educational Resources Information Center

    Pinotti, Sadie

    2017-01-01

    The purpose of this Delphi study was to identify the professional learning activities that experts perceive are necessary for local education agencies (LEAs) to effectively implement California's Quality Professional Learning Standards (QPLS) in alignment with the Local Control Funding Formula (LCFF) Priority 2. The study also examined the degree…

  18. LCS-TA to identify similar fragments in RNA 3D structures.

    PubMed

    Wiedemann, Jakub; Zok, Tomasz; Milostan, Maciej; Szachniuk, Marta

    2017-10-23

    In modern structural bioinformatics, comparison of molecular structures aimed to identify and assess similarities and differences between them is one of the most commonly performed procedures. It gives the basis for evaluation of in silico predicted models. It constitutes the preliminary step in searching for structural motifs. In particular, it supports tracing the molecular evolution. Faced with an ever-increasing amount of available structural data, researchers need a range of methods enabling comparative analysis of the structures from either global or local perspective. Herein, we present a new, superposition-independent method which processes pairs of RNA 3D structures to identify their local similarities. The similarity is considered in the context of structure bending and bonds' rotation which are described by torsion angles. In the analyzed RNA structures, the method finds the longest continuous segments that show similar torsion within a user-defined threshold. The length of the segment is provided as local similarity measure. The method has been implemented as LCS-TA algorithm (Longest Continuous Segments in Torsion Angle space) and is incorporated into our MCQ4Structures application, freely available for download from http://www.cs.put.poznan.pl/tzok/mcq/ . The presented approach ties torsion-angle-based method of structure analysis with the idea of local similarity identification by handling continuous 3D structure segments. The first method, implemented in MCQ4Structures, has been successfully utilized in RNA-Puzzles initiative. The second one, originally applied in Euclidean space, is a component of LGA (Local-Global Alignment) algorithm commonly used in assessing protein models submitted to CASP. This unique combination of concepts implemented in LCS-TA provides a new perspective on structure quality assessment in local and quantitative aspect. A series of computational experiments show the first results of applying our method to comparison of RNA 3D models. LCS-TA can be used for identifying strengths and weaknesses in the prediction of RNA tertiary structures.

  19. Ontology Matching with Semantic Verification.

    PubMed

    Jean-Mary, Yves R; Shironoshita, E Patrick; Kabuka, Mansur R

    2009-09-01

    ASMOV (Automated Semantic Matching of Ontologies with Verification) is a novel algorithm that uses lexical and structural characteristics of two ontologies to iteratively calculate a similarity measure between them, derives an alignment, and then verifies it to ensure that it does not contain semantic inconsistencies. In this paper, we describe the ASMOV algorithm, and then present experimental results that measure its accuracy using the OAEI 2008 tests, and that evaluate its use with two different thesauri: WordNet, and the Unified Medical Language System (UMLS). These results show the increased accuracy obtained by combining lexical, structural and extensional matchers with semantic verification, and demonstrate the advantage of using a domain-specific thesaurus for the alignment of specialized ontologies.

  20. Knowledge-Guided Docking of WW Domain Proteins and Flexible Ligands

    NASA Astrophysics Data System (ADS)

    Lu, Haiyun; Li, Hao; Banu Bte Sm Rashid, Shamima; Leow, Wee Kheng; Liou, Yih-Cherng

    Studies of interactions between protein domains and ligands are important in many aspects such as cellular signaling. We present a knowledge-guided approach for docking protein domains and flexible ligands. The approach is applied to the WW domain, a small protein module mediating signaling complexes which have been implicated in diseases such as muscular dystrophy and Liddle’s syndrome. The first stage of the approach employs a substring search for two binding grooves of WW domains and possible binding motifs of peptide ligands based on known features. The second stage aligns the ligand’s peptide backbone to the two binding grooves using a quasi-Newton constrained optimization algorithm. The backbone-aligned ligands produced serve as good starting points to the third stage which uses any flexible docking algorithm to perform the docking. The experimental results demonstrate that the backbone alignment method in the second stage performs better than conventional rigid superposition given two binding constraints. It is also shown that using the backbone-aligned ligands as initial configurations improves the flexible docking in the third stage. The presented approach can also be applied to other protein domains that involve binding of flexible ligand to two or more binding sites.

  1. Alignment of the measurement scale mark during immersion hydrometer calibration using an image processing system.

    PubMed

    Peña-Perez, Luis Manuel; Pedraza-Ortega, Jesus Carlos; Ramos-Arreguin, Juan Manuel; Arriaga, Saul Tovar; Fernandez, Marco Antonio Aceves; Becerra, Luis Omar; Hurtado, Efren Gorrostieta; Vargas-Soto, Jose Emilio

    2013-10-24

    The present work presents an improved method to align the measurement scale mark in an immersion hydrometer calibration system of CENAM, the National Metrology Institute (NMI) of Mexico, The proposed method uses a vision system to align the scale mark of the hydrometer to the surface of the liquid where it is immersed by implementing image processing algorithms. This approach reduces the variability in the apparent mass determination during the hydrostatic weighing in the calibration process, therefore decreasing the relative uncertainty of calibration.

  2. Alignment of the Measurement Scale Mark during Immersion Hydrometer Calibration Using an Image Processing System

    PubMed Central

    Peña-Perez, Luis Manuel; Pedraza-Ortega, Jesus Carlos; Ramos-Arreguin, Juan Manuel; Arriaga, Saul Tovar; Fernandez, Marco Antonio Aceves; Becerra, Luis Omar; Hurtado, Efren Gorrostieta; Vargas-Soto, Jose Emilio

    2013-01-01

    The present work presents an improved method to align the measurement scale mark in an immersion hydrometer calibration system of CENAM, the National Metrology Institute (NMI) of Mexico, The proposed method uses a vision system to align the scale mark of the hydrometer to the surface of the liquid where it is immersed by implementing image processing algorithms. This approach reduces the variability in the apparent mass determination during the hydrostatic weighing in the calibration process, therefore decreasing the relative uncertainty of calibration. PMID:24284770

  3. Enhanced Energy Localization in Hyperthermia Treatment Based on Hybrid Electromagnetic and Ultrasonic System: Proof of Concept with Numerical Simulations.

    PubMed

    Nizam-Uddin, N; Elshafiey, Ibrahim

    2017-01-01

    This paper proposes a hybrid hyperthermia treatment system, utilizing two noninvasive modalities for treating brain tumors. The proposed system depends on focusing electromagnetic (EM) and ultrasound (US) energies. The EM hyperthermia subsystem enhances energy localization by incorporating a multichannel wideband setting and coherent-phased-array technique. A genetic algorithm based optimization tool is developed to enhance the specific absorption rate (SAR) distribution by reducing hotspots and maximizing energy deposition at tumor regions. The treatment performance is also enhanced by augmenting an ultrasonic subsystem to allow focused energy deposition into deep tumors. The therapeutic faculty of ultrasonic energy is assessed by examining the control of mechanical alignment of transducer array elements. A time reversal (TR) approach is then investigated to address challenges in energy focus in both subsystems. Simulation results of the synergetic effect of both modalities assuming a simplified model of human head phantom demonstrate the feasibility of the proposed hybrid technique as a noninvasive tool for thermal treatment of brain tumors.

  4. Simultaneous Optimization of Tooth Flank Form of Involute Helical Gears in Terms of Both Vibration and Load Carrying Capacity

    NASA Astrophysics Data System (ADS)

    Komori, Masaharu; Kubo, Aizoh; Suzuki, Yoshitomo

    The alignment condition of automotive gears changes considerably during operation due to the deformation of shafts, bearings, and gear box by transmission of load. Under such conditions, the gears are required to satisfy not only reliability in strength and durability under maximum loading conditions, but also low vibrational characteristics under light loading conditions during the cruising of a car. In this report, the characteristics of the optimum tooth flank form of gears in terms of both vibration and load carrying capacity are clarified. The local optimum tooth flank form appears in each excitation valley, where the vibrational excitation is low and the actual contact ratio takes a specific value. The influence of the choice of different local optimum solutions on the vibrational performance of the optimized gears is investigated. The practical design algorithm for the optimum tooth flank form of a gear set in terms of both vibration and load carrying capacity is then proposed and its result is evaluated by field experience.

  5. Enhanced Energy Localization in Hyperthermia Treatment Based on Hybrid Electromagnetic and Ultrasonic System: Proof of Concept with Numerical Simulations

    PubMed Central

    Elshafiey, Ibrahim

    2017-01-01

    This paper proposes a hybrid hyperthermia treatment system, utilizing two noninvasive modalities for treating brain tumors. The proposed system depends on focusing electromagnetic (EM) and ultrasound (US) energies. The EM hyperthermia subsystem enhances energy localization by incorporating a multichannel wideband setting and coherent-phased-array technique. A genetic algorithm based optimization tool is developed to enhance the specific absorption rate (SAR) distribution by reducing hotspots and maximizing energy deposition at tumor regions. The treatment performance is also enhanced by augmenting an ultrasonic subsystem to allow focused energy deposition into deep tumors. The therapeutic faculty of ultrasonic energy is assessed by examining the control of mechanical alignment of transducer array elements. A time reversal (TR) approach is then investigated to address challenges in energy focus in both subsystems. Simulation results of the synergetic effect of both modalities assuming a simplified model of human head phantom demonstrate the feasibility of the proposed hybrid technique as a noninvasive tool for thermal treatment of brain tumors. PMID:28840125

  6. Transformation diffusion reconstruction of three-dimensional histology volumes from two-dimensional image stacks.

    PubMed

    Casero, Ramón; Siedlecka, Urszula; Jones, Elizabeth S; Gruscheski, Lena; Gibb, Matthew; Schneider, Jürgen E; Kohl, Peter; Grau, Vicente

    2017-05-01

    Traditional histology is the gold standard for tissue studies, but it is intrinsically reliant on two-dimensional (2D) images. Study of volumetric tissue samples such as whole hearts produces a stack of misaligned and distorted 2D images that need to be reconstructed to recover a congruent volume with the original sample's shape. In this paper, we develop a mathematical framework called Transformation Diffusion (TD) for stack alignment refinement as a solution to the heat diffusion equation. This general framework does not require contour segmentation, is independent of the registration method used, and is trivially parallelizable. After the first stack sweep, we also replace registration operations by operations in the space of transformations, several orders of magnitude faster and less memory-consuming. Implementing TD with operations in the space of transformations produces our Transformation Diffusion Reconstruction (TDR) algorithm, applicable to general transformations that are closed under inversion and composition. In particular, we provide formulas for translation and affine transformations. We also propose an Approximated TDR (ATDR) algorithm that extends the same principles to tensor-product B-spline transformations. Using TDR and ATDR, we reconstruct a full mouse heart at pixel size 0.92µm×0.92µm, cut 10µm thick, spaced 20µm (84G). Our algorithms employ only local information from transformations between neighboring slices, but the TD framework allows theoretical analysis of the refinement as applying a global Gaussian low-pass filter to the unknown stack misalignments. We also show that reconstruction without an external reference produces large shape artifacts in a cardiac specimen while still optimizing slice-to-slice alignment. To overcome this problem, we use a pre-cutting blockface imaging process previously developed by our group that takes advantage of Brewster's angle and a polarizer to capture the outline of only the topmost layer of wax in the block containing embedded tissue for histological sectioning. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.

  7. High precision wavefront control in point spread function engineering for single emitter localization

    NASA Astrophysics Data System (ADS)

    Siemons, M.; Hulleman, C. N.; Thorsen, R. Ø.; Smith, C. S.; Stallinga, S.

    2018-04-01

    Point Spread Function (PSF) engineering is used in single emitter localization to measure the emitter position in 3D and possibly other parameters such as the emission color or dipole orientation as well. Advanced PSF models such as spline fits to experimental PSFs or the vectorial PSF model can be used in the corresponding localization algorithms in order to model the intricate spot shape and deformations correctly. The complexity of the optical architecture and fit model makes PSF engineering approaches particularly sensitive to optical aberrations. Here, we present a calibration and alignment protocol for fluorescence microscopes equipped with a spatial light modulator (SLM) with the goal of establishing a wavefront error well below the diffraction limit for optimum application of complex engineered PSFs. We achieve high-precision wavefront control, to a level below 20 m$\\lambda$ wavefront aberration over a 30 minute time window after the calibration procedure, using a separate light path for calibrating the pixel-to-pixel variations of the SLM, and alignment of the SLM with respect to the optical axis and Fourier plane within 3 $\\mu$m ($x/y$) and 100 $\\mu$m ($z$) error. Aberrations are retrieved from a fit of the vectorial PSF model to a bead $z$-stack and compensated with a residual wavefront error comparable to the error of the SLM calibration step. This well-calibrated and corrected setup makes it possible to create complex `3D+$\\lambda$' PSFs that fit very well to the vectorial PSF model. Proof-of-principle bead experiments show precisions below 10~nm in $x$, $y$, and $\\lambda$, and below 20~nm in $z$ over an axial range of 1 $\\mu$m with 2000 signal photons and 12 background photons.

  8. BatMis: a fast algorithm for k-mismatch mapping.

    PubMed

    Tennakoon, Chandana; Purbojati, Rikky W; Sung, Wing-Kin

    2012-08-15

    Second-generation sequencing (SGS) generates millions of reads that need to be aligned to a reference genome allowing errors. Although current aligners can efficiently map reads allowing a small number of mismatches, they are not well suited for handling a large number of mismatches. The efficiency of aligners can be improved using various heuristics, but the sensitivity and accuracy of the alignments are sacrificed. In this article, we introduce Basic Alignment tool for Mismatches (BatMis)--an efficient method to align short reads to a reference allowing k mismatches. BatMis is a Burrows-Wheeler transformation based aligner that uses a seed and extend approach, and it is an exact method. Benchmark tests show that BatMis performs better than competing aligners in solving the k-mismatch problem. Furthermore, it can compete favorably even when compared with the heuristic modes of the other aligners. BatMis is a useful alternative for applications where fast k-mismatch mappings, unique mappings or multiple mappings of SGS data are required. BatMis is written in C/C++ and is freely available from http://code.google.com/p/batmis/

  9. ChimeRScope: a novel alignment-free algorithm for fusion transcript prediction using paired-end RNA-Seq data

    PubMed Central

    Li, You; Heavican, Tayla B.; Vellichirammal, Neetha N.; Iqbal, Javeed

    2017-01-01

    Abstract The RNA-Seq technology has revolutionized transcriptome characterization not only by accurately quantifying gene expression, but also by the identification of novel transcripts like chimeric fusion transcripts. The ‘fusion’ or ‘chimeric’ transcripts have improved the diagnosis and prognosis of several tumors, and have led to the development of novel therapeutic regimen. The fusion transcript detection is currently accomplished by several software packages, primarily relying on sequence alignment algorithms. The alignment of sequencing reads from fusion transcript loci in cancer genomes can be highly challenging due to the incorrect mapping induced by genomic alterations, thereby limiting the performance of alignment-based fusion transcript detection methods. Here, we developed a novel alignment-free method, ChimeRScope that accurately predicts fusion transcripts based on the gene fingerprint (as k-mers) profiles of the RNA-Seq paired-end reads. Results on published datasets and in-house cancer cell line datasets followed by experimental validations demonstrate that ChimeRScope consistently outperforms other popular methods irrespective of the read lengths and sequencing depth. More importantly, results on our in-house datasets show that ChimeRScope is a better tool that is capable of identifying novel fusion transcripts with potential oncogenic functions. ChimeRScope is accessible as a standalone software at (https://github.com/ChimeRScope/ChimeRScope/wiki) or via the Galaxy web-interface at (https://galaxy.unmc.edu/). PMID:28472320

  10. Robust group-wise rigid registration of point sets using t-mixture model

    NASA Astrophysics Data System (ADS)

    Ravikumar, Nishant; Gooya, Ali; Frangi, Alejandro F.; Taylor, Zeike A.

    2016-03-01

    A probabilistic framework for robust, group-wise rigid alignment of point-sets using a mixture of Students t-distribution especially when the point sets are of varying lengths, are corrupted by an unknown degree of outliers or in the presence of missing data. Medical images (in particular magnetic resonance (MR) images), their segmentations and consequently point-sets generated from these are highly susceptible to corruption by outliers. This poses a problem for robust correspondence estimation and accurate alignment of shapes, necessary for training statistical shape models (SSMs). To address these issues, this study proposes to use a t-mixture model (TMM), to approximate the underlying joint probability density of a group of similar shapes and align them to a common reference frame. The heavy-tailed nature of t-distributions provides a more robust registration framework in comparison to state of the art algorithms. Significant reduction in alignment errors is achieved in the presence of outliers, using the proposed TMM-based group-wise rigid registration method, in comparison to its Gaussian mixture model (GMM) counterparts. The proposed TMM-framework is compared with a group-wise variant of the well-known Coherent Point Drift (CPD) algorithm and two other group-wise methods using GMMs, using both synthetic and real data sets. Rigid alignment errors for groups of shapes are quantified using the Hausdorff distance (HD) and quadratic surface distance (QSD) metrics.

  11. Embedding strategies for effective use of information from multiple sequence alignments.

    PubMed Central

    Henikoff, S.; Henikoff, J. G.

    1997-01-01

    We describe a new strategy for utilizing multiple sequence alignment information to detect distant relationships in searches of sequence databases. A single sequence representing a protein family is enriched by replacing conserved regions with position-specific scoring matrices (PSSMs) or consensus residues derived from multiple alignments of family members. In comprehensive tests of these and other family representations, PSSM-embedded queries produced the best results overall when used with a special version of the Smith-Waterman searching algorithm. Moreover, embedding consensus residues instead of PSSMs improved performance with readily available single sequence query searching programs, such as BLAST and FASTA. Embedding PSSMs or consensus residues into a representative sequence improves searching performance by extracting multiple alignment information from motif regions while retaining single sequence information where alignment is uncertain. PMID:9070452

  12. Attitude sensor alignment calibration for the solar maximum mission

    NASA Technical Reports Server (NTRS)

    Pitone, Daniel S.; Shuster, Malcolm D.

    1990-01-01

    An earlier heuristic study of the fine attitude sensors for the Solar Maximum Mission (SMM) revealed a temperature dependence of the alignment about the yaw axis of the pair of fixed-head star trackers relative to the fine pointing Sun sensor. Here, new sensor alignment algorithms which better quantify the dependence of the alignments on the temperature are developed and applied to the SMM data. Comparison with the results from the previous study reveals the limitations of the heuristic approach. In addition, some of the basic assumptions made in the prelaunch analysis of the alignments of the SMM are examined. The results of this work have important consequences for future missions with stringent attitude requirements and where misalignment variations due to variations in the temperature will be significant.

  13. Alignment of galaxies relative to their local environment in SDSS-DR8

    NASA Astrophysics Data System (ADS)

    Hirv, A.; Pelt, J.; Saar, E.; Tago, E.; Tamm, A.; Tempel, E.; Einasto, M.

    2017-03-01

    Aims: We study the alignment of galaxies relative to their local environment in SDSS-DR8 and, using these data, we discuss evolution scenarios for different types of galaxies. Methods: We defined a vector field of the direction of anisotropy of the local environment of galaxies. We summed the unit direction vectors of all close neighbours of a given galaxy in a particular way to estimate this field. We found the alignment angles between the spin axes of disc galaxies, or the minor axes of elliptical galaxies, and the direction of anisotropy. The distributions of cosines of these angles are compared to the random distributions to analyse the alignment of galaxies. Results: Sab galaxies show perpendicular alignment relative to the direction of anisotropy in a sparse environment, for single galaxies and galaxies of low luminosity. Most of the parallel alignment of Scd galaxies comes from dense regions, from 2...3 member groups and from galaxies with low luminosity. The perpendicular alignment of S0 galaxies does not depend strongly on environmental density nor luminosity; it is detected for single and 2...3 member group galaxies, and for main galaxies of 4...10 member groups. The perpendicular alignment of elliptical galaxies is clearly detected for single galaxies and for members of ≤10 member groups; the alignment increases with environmental density and luminosity. Conclusions: We confirm the existence of fossil tidally induced alignment of Sab galaxies at low z. The alignment of Scd galaxies can be explained via the infall of matter to filaments. S0 galaxies may have encountered relatively massive mergers along the direction of anisotropy. Major mergers along this direction can explain the alignment of elliptical galaxies. Less massive, but repeated mergers are possibly responsible for the formation of elliptical galaxies in sparser areas and for less luminous elliptical galaxies.

  14. Machinery running state identification based on discriminant semi-supervised local tangent space alignment for feature fusion and extraction

    NASA Astrophysics Data System (ADS)

    Su, Zuqiang; Xiao, Hong; Zhang, Yi; Tang, Baoping; Jiang, Yonghua

    2017-04-01

    Extraction of sensitive features is a challenging but key task in data-driven machinery running state identification. Aimed at solving this problem, a method for machinery running state identification that applies discriminant semi-supervised local tangent space alignment (DSS-LTSA) for feature fusion and extraction is proposed. Firstly, in order to extract more distinct features, the vibration signals are decomposed by wavelet packet decomposition WPD, and a mixed-domain feature set consisted of statistical features, autoregressive (AR) model coefficients, instantaneous amplitude Shannon entropy and WPD energy spectrum is extracted to comprehensively characterize the properties of machinery running state(s). Then, the mixed-dimension feature set is inputted into DSS-LTSA for feature fusion and extraction to eliminate redundant information and interference noise. The proposed DSS-LTSA can extract intrinsic structure information of both labeled and unlabeled state samples, and as a result the over-fitting problem of supervised manifold learning and blindness problem of unsupervised manifold learning are overcome. Simultaneously, class discrimination information is integrated within the dimension reduction process in a semi-supervised manner to improve sensitivity of the extracted fusion features. Lastly, the extracted fusion features are inputted into a pattern recognition algorithm to achieve the running state identification. The effectiveness of the proposed method is verified by a running state identification case in a gearbox, and the results confirm the improved accuracy of the running state identification.

  15. MetalS(3), a database-mining tool for the identification of structurally similar metal sites.

    PubMed

    Valasatava, Yana; Rosato, Antonio; Cavallaro, Gabriele; Andreini, Claudia

    2014-08-01

    We have developed a database search tool to identify metal sites having structural similarity to a query metal site structure within the MetalPDB database of minimal functional sites (MFSs) contained in metal-binding biological macromolecules. MFSs describe the local environment around the metal(s) independently of the larger context of the macromolecular structure. Such a local environment has a determinant role in tuning the chemical reactivity of the metal, ultimately contributing to the functional properties of the whole system. The database search tool, which we called MetalS(3) (Metal Sites Similarity Search), can be accessed through a Web interface at http://metalweb.cerm.unifi.it/tools/metals3/ . MetalS(3) uses a suitably adapted version of an algorithm that we previously developed to systematically compare the structure of the query metal site with each MFS in MetalPDB. For each MFS, the best superposition is kept. All these superpositions are then ranked according to the MetalS(3) scoring function and are presented to the user in tabular form. The user can interact with the output Web page to visualize the structural alignment or the sequence alignment derived from it. Options to filter the results are available. Test calculations show that the MetalS(3) output correlates well with expectations from protein homology considerations. Furthermore, we describe some usage scenarios that highlight the usefulness of MetalS(3) to obtain mechanistic and functional hints regardless of homology.

  16. Intra-operative registration for image enhanced endoscopic sinus surgery using photo-consistency.

    PubMed

    Chen, Min Si; Gonzalez, Gerardo; Lapeer, Rudy

    2007-01-01

    The purpose of this paper is to present an intensity based algorithm for aligning 2D endoscopic images with virtual images generated from pre-operative 3D data. The proposed algorithm uses photo-consistency as the measurement of similarity between images, provided the illumination is independent from the viewing direction.

  17. Feature-based three-dimensional registration for repetitive geometry in machine vision

    PubMed Central

    Gong, Yuanzheng; Seibel, Eric J.

    2016-01-01

    As an important step in three-dimensional (3D) machine vision, 3D registration is a process of aligning two or multiple 3D point clouds that are collected from different perspectives together into a complete one. The most popular approach to register point clouds is to minimize the difference between these point clouds iteratively by Iterative Closest Point (ICP) algorithm. However, ICP does not work well for repetitive geometries. To solve this problem, a feature-based 3D registration algorithm is proposed to align the point clouds that are generated by vision-based 3D reconstruction. By utilizing texture information of the object and the robustness of image features, 3D correspondences can be retrieved so that the 3D registration of two point clouds is to solve a rigid transformation. The comparison of our method and different ICP algorithms demonstrates that our proposed algorithm is more accurate, efficient and robust for repetitive geometry registration. Moreover, this method can also be used to solve high depth uncertainty problem caused by little camera baseline in vision-based 3D reconstruction. PMID:28286703

  18. SnapDock—template-based docking by Geometric Hashing

    PubMed Central

    Estrin, Michael; Wolfson, Haim J.

    2017-01-01

    Abstract Motivation: A highly efficient template-based protein–protein docking algorithm, nicknamed SnapDock, is presented. It employs a Geometric Hashing-based structural alignment scheme to align the target proteins to the interfaces of non-redundant protein–protein interface libraries. Docking of a pair of proteins utilizing the 22 600 interface PIFACE library is performed in < 2 min on the average. A flexible version of the algorithm allowing hinge motion in one of the proteins is presented as well. Results: To evaluate the performance of the algorithm a blind re-modelling of 3547 PDB complexes, which have been uploaded after the PIFACE publication has been performed with success ratio of about 35%. Interestingly, a similar experiment with the template free PatchDock docking algorithm yielded a success rate of about 23% with roughly 1/3 of the solutions different from those of SnapDock. Consequently, the combination of the two methods gave a 42% success ratio. Availability and implementation: A web server of the application is under development. Contact: michaelestrin@gmail.com or wolfson@tau.ac.il PMID:28881968

  19. QuickProbs—A Fast Multiple Sequence Alignment Algorithm Designed for Graphics Processors

    PubMed Central

    Gudyś, Adam; Deorowicz, Sebastian

    2014-01-01

    Multiple sequence alignment is a crucial task in a number of biological analyses like secondary structure prediction, domain searching, phylogeny, etc. MSAProbs is currently the most accurate alignment algorithm, but its effectiveness is obtained at the expense of computational time. In the paper we present QuickProbs, the variant of MSAProbs customised for graphics processors. We selected the two most time consuming stages of MSAProbs to be redesigned for GPU execution: the posterior matrices calculation and the consistency transformation. Experiments on three popular benchmarks (BAliBASE, PREFAB, OXBench-X) on quad-core PC equipped with high-end graphics card show QuickProbs to be 5.7 to 9.7 times faster than original CPU-parallel MSAProbs. Additional tests performed on several protein families from Pfam database give overall speed-up of 6.7. Compared to other algorithms like MAFFT, MUSCLE, or ClustalW, QuickProbs proved to be much more accurate at similar speed. Additionally we introduce a tuned variant of QuickProbs which is significantly more accurate on sets of distantly related sequences than MSAProbs without exceeding its computation time. The GPU part of QuickProbs was implemented in OpenCL, thus the package is suitable for graphics processors produced by all major vendors. PMID:24586435

  20. Search algorithm complexity modeling with application to image alignment and matching

    NASA Astrophysics Data System (ADS)

    DelMarco, Stephen

    2014-05-01

    Search algorithm complexity modeling, in the form of penetration rate estimation, provides a useful way to estimate search efficiency in application domains which involve searching over a hypothesis space of reference templates or models, as in model-based object recognition, automatic target recognition, and biometric recognition. The penetration rate quantifies the expected portion of the database that must be searched, and is useful for estimating search algorithm computational requirements. In this paper we perform mathematical modeling to derive general equations for penetration rate estimates that are applicable to a wide range of recognition problems. We extend previous penetration rate analyses to use more general probabilistic modeling assumptions. In particular we provide penetration rate equations within the framework of a model-based image alignment application domain in which a prioritized hierarchical grid search is used to rank subspace bins based on matching probability. We derive general equations, and provide special cases based on simplifying assumptions. We show how previously-derived penetration rate equations are special cases of the general formulation. We apply the analysis to model-based logo image alignment in which a hierarchical grid search is used over a geometric misalignment transform hypothesis space. We present numerical results validating the modeling assumptions and derived formulation.

  1. Group sparse multiview patch alignment framework with view consistency for image classification.

    PubMed

    Gui, Jie; Tao, Dacheng; Sun, Zhenan; Luo, Yong; You, Xinge; Tang, Yuan Yan

    2014-07-01

    No single feature can satisfactorily characterize the semantic concepts of an image. Multiview learning aims to unify different kinds of features to produce a consensual and efficient representation. This paper redefines part optimization in the patch alignment framework (PAF) and develops a group sparse multiview patch alignment framework (GSM-PAF). The new part optimization considers not only the complementary properties of different views, but also view consistency. In particular, view consistency models the correlations between all possible combinations of any two kinds of view. In contrast to conventional dimensionality reduction algorithms that perform feature extraction and feature selection independently, GSM-PAF enjoys joint feature extraction and feature selection by exploiting l(2,1)-norm on the projection matrix to achieve row sparsity, which leads to the simultaneous selection of relevant features and learning transformation, and thus makes the algorithm more discriminative. Experiments on two real-world image data sets demonstrate the effectiveness of GSM-PAF for image classification.

  2. Elaborate analysis and design of filter-bank-based sensing for wideband cognitive radios

    NASA Astrophysics Data System (ADS)

    Maliatsos, Konstantinos; Adamis, Athanasios; Kanatas, Athanasios G.

    2014-12-01

    The successful operation of a cognitive radio system strongly depends on its ability to sense the radio environment. With the use of spectrum sensing algorithms, the cognitive radio is required to detect co-existing licensed primary transmissions and to protect them from interference. This paper focuses on filter-bank-based sensing and provides a solid theoretical background for the design of these detectors. Optimum detectors based on the Neyman-Pearson theorem are developed for uniform discrete Fourier transform (DFT) and modified DFT filter banks with root-Nyquist filters. The proposed sensing framework does not require frequency alignment between the filter bank of the sensor and the primary signal. Each wideband primary channel is spanned and monitored by several sensor subchannels that analyse it in narrowband signals. Filter-bank-based sensing is proved to be robust and efficient under coloured noise. Moreover, the performance of the weighted energy detector as a sensing technique is evaluated. Finally, based on the Locally Most Powerful and the Generalized Likelihood Ratio test, real-world sensing algorithms that do not require a priori knowledge are proposed and tested.

  3. Prediction of subcellular localization of eukaryotic proteins using position-specific profiles and neural network with weighted inputs.

    PubMed

    Zou, Lingyun; Wang, Zhengzhi; Huang, Jiaomin

    2007-12-01

    Subcellular location is one of the key biological characteristics of proteins. Position-specific profiles (PSP) have been introduced as important characteristics of proteins in this article. In this study, to obtain position-specific profiles, the Position Specific Iterative-Basic Local Alignment Search Tool (PSI-BLAST) has been used to search for protein sequences in a database. Position-specific scoring matrices are extracted from the profiles as one class of characteristics. Four-part amino acid compositions and 1st-7th order dipeptide compositions have also been calculated as the other two classes of characteristics. Therefore, twelve characteristic vectors are extracted from each of the protein sequences. Next, the characteristic vectors are weighed by a simple weighing function and inputted into a BP neural network predictor named PSP-Weighted Neural Network (PSP-WNN). The Levenberg-Marquardt algorithm is employed to adjust the weight matrices and thresholds during the network training instead of the error back propagation algorithm. With a jackknife test on the RH2427 dataset, PSP-WNN has achieved a higher overall prediction accuracy of 88.4% rather than the prediction results by the general BP neural network, Markov model, and fuzzy k-nearest neighbors algorithm on this dataset. In addition, the prediction performance of PSP-WNN has been evaluated with a five-fold cross validation test on the PK7579 dataset and the prediction results have been consistently better than those of the previous method on the basis of several support vector machines, using compositions of both amino acids and amino acid pairs. These results indicate that PSP-WNN is a powerful tool for subcellular localization prediction. At the end of the article, influences on prediction accuracy using different weighting proportions among three characteristic vector categories have been discussed. An appropriate proportion is considered by increasing the prediction accuracy.

  4. In-Flight Alignment Using H ∞ Filter for Strapdown INS on Aircraft

    PubMed Central

    Pei, Fu-Jun; Liu, Xuan; Zhu, Li

    2014-01-01

    In-flight alignment is an effective way to improve the accuracy and speed of initial alignment for strapdown inertial navigation system (INS). During the aircraft flight, strapdown INS alignment was disturbed by lineal and angular movements of the aircraft. To deal with the disturbances in dynamic initial alignment, a novel alignment method for SINS is investigated in this paper. In this method, an initial alignment error model of SINS in the inertial frame is established. The observability of the system is discussed by piece-wise constant system (PWCS) theory and observable degree is computed by the singular value decomposition (SVD) theory. It is demonstrated that the system is completely observable, and all the system state parameters can be estimated by optimal filter. Then a H ∞ filter was designed to resolve the uncertainty of measurement noise. The simulation results demonstrate that the proposed algorithm can reach a better accuracy under the dynamic disturbance condition. PMID:24511300

  5. Collaborative Beamfocusing Radio (COBRA)

    NASA Astrophysics Data System (ADS)

    Rode, Jeremy P.; Hsu, Mark J.; Smith, David; Husain, Anis

    2013-05-01

    A Ziva team has recently demonstrated a novel technique called Collaborative Beamfocusing Radios (COBRA) which enables an ad-hoc collection of distributed commercial off-the-shelf software defined radios to coherently align and beamform to a remote radio. COBRA promises to operate even in high multipath and non-line-of-sight environments as well as mobile applications without resorting to computationally expensive closed loop techniques that are currently unable to operate with significant movement. COBRA exploits two key technologies to achieve coherent beamforming. The first is Time Reversal (TR) which compensates for multipath and automatically discovers the optimal spatio-temporal matched filter to enable peak signal gains (up to 20 dB) and diffraction-limited focusing at the intended receiver in NLOS and severe multipath environments. The second is time-aligned buffering which enables TR to synchronize distributed transmitters into a collaborative array. This time alignment algorithm avoids causality violations through the use of reciprocal buffering. Preserving spatio-temporal reciprocity through the TR capture and retransmission process achieves coherent alignment across multiple radios at ~GHz carriers using only standard quartz-oscillators. COBRA has been demonstrated in the lab, aligning two off-the-shelf software defined radios over-the-air to an accuracy of better than 2 degrees of carrier alignment at 450 MHz. The COBRA algorithms are lightweight, with computation in 5 ms on a smartphone class microprocessor. COBRA also has low start-up latency, achieving high accuracy from a cold-start in 30 ms. The COBRA technique opens up a large number of new capabilities in communications, and electronic warfare including selective spatial jamming, geolocation and anti-geolocation.

  6. SPHERE: SPherical Harmonic Elastic REgistration of HARDI Data

    PubMed Central

    Yap, Pew-Thian; Chen, Yasheng; An, Hongyu; Yang, Yang; Gilmore, John H.; Lin, Weili

    2010-01-01

    In contrast to the more common Diffusion Tensor Imaging (DTI), High Angular Resolution Diffusion Imaging (HARDI) allows superior delineation of angular microstructures of brain white matter, and makes possible multiple-fiber modeling of each voxel for better characterization of brain connectivity. However, the complex orientation information afforded by HARDI makes registration of HARDI images more complicated than scalar images. In particular, the question of how much orientation information is needed for satisfactory alignment has not been sufficiently addressed. Low order orientation representation is generally more robust than high order representation, although the latter provides more information for correct alignment of fiber pathways. However, high order representation, when naïvely utilized, might not necessarily be conducive to improving registration accuracy since similar structures with significant orientation differences prior to proper alignment might be mistakenly taken as non-matching structures. We present in this paper a HARDI registration algorithm, called SPherical Harmonic Elastic REgistration (SPHERE), which in a principled means hierarchically extracts orientation information from HARDI data for structural alignment. The image volumes are first registered using robust, relatively direction invariant features derived from the Orientation Distribution Function (ODF), and the alignment is then further refined using spherical harmonic (SH) representation with gradually increasing orders. This progression from non-directional, single-directional to multi-directional representation provides a systematic means of extracting directional information given by diffusion-weighted imaging. Coupled with a template-subject-consistent soft-correspondence-matching scheme, this approach allows robust and accurate alignment of HARDI data. Experimental results show marked increase in accuracy over a state-of-the-art DTI registration algorithm. PMID:21147231

  7. The protein structure prediction problem could be solved using the current PDB library

    PubMed Central

    Zhang, Yang; Skolnick, Jeffrey

    2005-01-01

    For single-domain proteins, we examine the completeness of the structures in the current Protein Data Bank (PDB) library for use in full-length model construction of unknown sequences. To address this issue, we employ a comprehensive benchmark set of 1,489 medium-size proteins that cover the PDB at the level of 35% sequence identity and identify templates by structure alignment. With homologous proteins excluded, we can always find similar folds to native with an average rms deviation (RMSD) from native of 2.5 Å with ≈82% alignment coverage. These template structures often contain a significant number of insertions/deletions. The tasser algorithm was applied to build full-length models, where continuous fragments are excised from the top-scoring templates and reassembled under the guide of an optimized force field, which includes consensus restraints taken from the templates and knowledge-based statistical potentials. For almost all targets (except for 2/1,489), the resultant full-length models have an RMSD to native below 6 Å (97% of them below 4 Å). On average, the RMSD of full-length models is 2.25 Å, with aligned regions improved from 2.5 Å to 1.88 Å, comparable with the accuracy of low-resolution experimental structures. Furthermore, starting from state-of-the-art structural alignments, we demonstrate a methodology that can consistently bring template-based alignments closer to native. These results are highly suggestive that the protein-folding problem can in principle be solved based on the current PDB library by developing efficient fold recognition algorithms that can recover such initial alignments. PMID:15653774

  8. Acceptance test of a commercially available software for automatic image registration of computed tomography (CT), magnetic resonance imaging (MRI) and 99mTc-methoxyisobutylisonitrile (MIBI) single-photon emission computed tomography (SPECT) brain images.

    PubMed

    Loi, Gianfranco; Dominietto, Marco; Manfredda, Irene; Mones, Eleonora; Carriero, Alessandro; Inglese, Eugenio; Krengli, Marco; Brambilla, Marco

    2008-09-01

    This note describes a method to characterize the performances of image fusion software (Syntegra) with respect to accuracy and robustness. Computed tomography (CT), magnetic resonance imaging (MRI), and single-photon emission computed tomography (SPECT) studies were acquired from two phantoms and 10 patients. Image registration was performed independently by two couples composed of one radiotherapist and one physicist by means of superposition of anatomic landmarks. Each couple performed jointly and saved the registration. The two solutions were averaged to obtain the gold standard registration. A new set of estimators was defined to identify translation and rotation errors in the coordinate axes, independently from point position in image field of view (FOV). Algorithms evaluated were local correlation (LC) for CT-MRI, normalized mutual information (MI) for CT-MRI, and CT-SPECT registrations. To evaluate accuracy, estimator values were compared to limiting values for the algorithms employed, both in phantoms and in patients. To evaluate robustness, different alignments between images taken from a sample patient were produced and registration errors determined. LC algorithm resulted accurate in CT-MRI registrations in phantoms, but exceeded limiting values in 3 of 10 patients. MI algorithm resulted accurate in CT-MRI and CT-SPECT registrations in phantoms; limiting values were exceeded in one case in CT-MRI and never reached in CT-SPECT registrations. Thus, the evaluation of robustness was restricted to the algorithm of MI both for CT-MRI and CT-SPECT registrations. The algorithm of MI proved to be robust: limiting values were not exceeded with translation perturbations up to 2.5 cm, rotation perturbations up to 10 degrees and roto-translational perturbation up to 3 cm and 5 degrees.

  9. Global Network Alignment in the Context of Aging.

    PubMed

    Faisal, Fazle Elahi; Zhao, Han; Milenkovic, Tijana

    2015-01-01

    Analogous to sequence alignment, network alignment (NA) can be used to transfer biological knowledge across species between conserved network regions. NA faces two algorithmic challenges: 1) Which cost function to use to capture "similarities" between nodes in different networks? 2) Which alignment strategy to use to rapidly identify "high-scoring" alignments from all possible alignments? We "break down" existing state-of-the-art methods that use both different cost functions and different alignment strategies to evaluate each combination of their cost functions and alignment strategies. We find that a combination of the cost function of one method and the alignment strategy of another method beats the existing methods. Hence, we propose this combination as a novel superior NA method. Then, since human aging is hard to study experimentally due to long lifespan, we use NA to transfer aging-related knowledge from well annotated model species to poorly annotated human. By doing so, we produce novel human aging-related knowledge, which complements currently available knowledge about aging that has been obtained mainly by sequence alignment. We demonstrate significant similarity between topological and functional properties of our novel predictions and those of known aging-related genes. We are the first to use NA to learn more about aging.

  10. libgapmis: extending short-read alignments

    PubMed Central

    2013-01-01

    Background A wide variety of short-read alignment programmes have been published recently to tackle the problem of mapping millions of short reads to a reference genome, focusing on different aspects of the procedure such as time and memory efficiency, sensitivity, and accuracy. These tools allow for a small number of mismatches in the alignment; however, their ability to allow for gaps varies greatly, with many performing poorly or not allowing them at all. The seed-and-extend strategy is applied in most short-read alignment programmes. After aligning a substring of the reference sequence against the high-quality prefix of a short read--the seed--an important problem is to find the best possible alignment between a substring of the reference sequence succeeding and the remaining suffix of low quality of the read--extend. The fact that the reads are rather short and that the gap occurrence frequency observed in various studies is rather low suggest that aligning (parts of) those reads with a single gap is in fact desirable. Results In this article, we present libgapmis, a library for extending pairwise short-read alignments. Apart from the standard CPU version, it includes ultrafast SSE- and GPU-based implementations. libgapmis is based on an algorithm computing a modified version of the traditional dynamic-programming matrix for sequence alignment. Extensive experimental results demonstrate that the functions of the CPU version provided in this library accelerate the computations by a factor of 20 compared to other programmes. The analogous SSE- and GPU-based implementations accelerate the computations by a factor of 6 and 11, respectively, compared to the CPU version. The library also provides the user the flexibility to split the read into fragments, based on the observed gap occurrence frequency and the length of the read, thereby allowing for a variable, but bounded, number of gaps in the alignment. Conclusions We present libgapmis, a library for extending pairwise short-read alignments. We show that libgapmis is better-suited and more efficient than existing algorithms for this task. The importance of our contribution is underlined by the fact that the provided functions may be seamlessly integrated into any short-read alignment pipeline. The open-source code of libgapmis is available at http://www.exelixis-lab.org/gapmis. PMID:24564250

  11. The relative phases of basal ganglia activities dynamically shape effective connectivity in Parkinson's disease.

    PubMed

    Cagnan, Hayriye; Duff, Eugene Paul; Brown, Peter

    2015-06-01

    Optimal phase alignment between oscillatory neural circuits is hypothesized to optimize information flow and enhance system performance. This theory is known as communication-through-coherence. The basal ganglia motor circuit exhibits exaggerated oscillatory and coherent activity patterns in Parkinson's disease. Such activity patterns are linked to compromised motor system performance as evinced by bradykinesia, rigidity and tremor, suggesting that network function might actually deteriorate once a certain level of net synchrony is exceeded in the motor circuit. Here, we characterize the processes underscoring excessive synchronization and its termination. To this end, we analysed local field potential recordings from the subthalamic nucleus and globus pallidus of five patients with Parkinson's disease (four male and one female, aged 37-64 years). We observed that certain phase alignments between subthalamic nucleus and globus pallidus amplified local neural synchrony in the beta frequency band while others either suppressed it or did not induce any significant change with respect to surrogates. The increase in local beta synchrony directly correlated with how long the two nuclei locked to beta-amplifying phase alignments. Crucially, administration of the dopamine prodrug, levodopa, reduced the frequency and duration of periods during which subthalamic and pallidal populations were phase-locked to beta-amplifying alignments. Conversely ON dopamine, the total duration over which subthalamic and pallidal populations were aligned to phases that left beta-amplitude unchanged with respect to surrogates increased. Thus dopaminergic input shifted circuit dynamics from persistent periods of locking to amplifying phase alignments, associated with compromised motoric function, to more dynamic phase alignment and improved motoric function. This effect of dopamine on local circuit resonance suggests means by which novel electrical interventions might prevent resonance-related pathological circuit interactions. © The Author (2015). Published by Oxford University Press on behalf of the Guarantors of Brain.

  12. Chemometric Analysis of Gas Chromatography – Mass Spectrometry Data using Fast Retention Time Alignment via a Total Ion Current Shift Function

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nadeau, Jeremy S.; Wright, Bob W.; Synovec, Robert E.

    2010-04-15

    A critical comparison of methods for correcting severely retention time shifted gas chromatography-mass spectrometry (GC-MS) data is presented. The method reported herein is an adaptation to the Piecewise Alignment Algorithm to quickly align severely shifted one-dimensional (1D) total ion current (TIC) data, then applying these shifts to broadly align all mass channels throughout the separation, referred to as a TIC shift function (SF). The maximum shift varied from (-) 5 s in the beginning of the chromatographic separation to (+) 20 s toward the end of the separation, equivalent to a maximum shift of over 5 peak widths. Implementing themore » TIC shift function (TIC SF) prior to Fisher Ratio (F-Ratio) feature selection and then principal component analysis (PCA) was found to be a viable approach to classify complex chromatograms, that in this study were obtained from GC-MS separations of three gasoline samples serving as complex test mixtures, referred to as types C, M and S. The reported alignment algorithm via the TIC SF approach corrects for large dynamic shifting in the data as well as subtle peak-to-peak shifts. The benefits of the overall TIC SF alignment and feature selection approach were quantified using the degree-of-class separation (DCS) metric of the PCA scores plots using the type C and M samples, since they were the most similar, and thus the most challenging samples to properly classify. The DCS values showed an increase from an initial value of essentially zero for the unaligned GC-TIC data to a value of 7.9 following alignment; however, the DCS was unchanged by feature selection using F-Ratios for the GC-TIC data. The full mass spectral data provided an increase to a final DCS of 13.7 after alignment and two-dimensional (2D) F-Ratio feature selection.« less

  13. Prospective MR image alignment between breath-holds: Application to renal BOLD MRI.

    PubMed

    Kalis, Inge M; Pilutti, David; Krafft, Axel J; Hennig, Jürgen; Bock, Michael

    2017-04-01

    To present an image registration method for renal blood oxygen level-dependent (BOLD) measurements that enables semiautomatic assessment of parenchymal and medullary R2* changes under a functional challenge. In a series of breath-hold acquisitions, three-dimensional data were acquired initially for prospective image registration of subsequent BOLD measurements. An algorithm for kidney alignment for BOLD renal imaging (KALIBRI) was implemented to detect the positions of the left and right kidney so that the kidneys were acquired in the subsequent BOLD measurement at consistent anatomical locations. Residual in-plane distortions were corrected retrospectively so that semiautomatic dynamic R2* measurements of the renal cortex and medulla become feasible. KALIBRI was tested in six healthy volunteers during a series of BOLD experiments, which included a 600- to 1000-mL water challenge. Prospective image registration and BOLD imaging of each kidney was achieved within a total measurement time of about 17 s, enabling its execution within a single breath-hold. KALIBRI improved the registration by up to 35% as found with mutual information measures. In four volunteers, a medullary R2* decrease of up to 40% was observed after water ingestion. KALIBRI improves the quality of two-dimensional time-resolved renal BOLD MRI by aligning local renal anatomy, which allows for consistent R2* measurements over many breath-holds. Magn Reson Med 77:1573-1582, 2017. © 2016 International Society for Magnetic Resonance in Medicine. © 2016 International Society for Magnetic Resonance in Medicine.

  14. Simulations in the Analysis of Experimental Data Measured by BM@N Drift Chambers

    NASA Astrophysics Data System (ADS)

    Fedorišin, Ján

    2018-02-01

    The drift chambers (DCH's) are an important part of the tracking system of the BM@N experiment designed to study the production of baryonic matter at the Nuclotron energies. The method of particle hit and track reconstruction in the drift chambers has been already proposed and tested on the BM@N deuteron beam data. In this study the DCH's are first locally and globally aligned, and subsequently the consistency of the track reconstruction chain is tested by two methods. The first one is based on the backward extrapolation of the DCH reconstructed deuteron beam to a position where its deflection in the BM@N magnetic field begins. The second method reconstructs the deuteron beam momentum through its deflection angle. Both methods confirm correctness of the track reconstruction algorithm.

  15. A distributed system for fast alignment of next-generation sequencing data.

    PubMed

    Srimani, Jaydeep K; Wu, Po-Yen; Phan, John H; Wang, May D

    2010-12-01

    We developed a scalable distributed computing system using the Berkeley Open Interface for Network Computing (BOINC) to align next-generation sequencing (NGS) data quickly and accurately. NGS technology is emerging as a promising platform for gene expression analysis due to its high sensitivity compared to traditional genomic microarray technology. However, despite the benefits, NGS datasets can be prohibitively large, requiring significant computing resources to obtain sequence alignment results. Moreover, as the data and alignment algorithms become more prevalent, it will become necessary to examine the effect of the multitude of alignment parameters on various NGS systems. We validate the distributed software system by (1) computing simple timing results to show the speed-up gained by using multiple computers, (2) optimizing alignment parameters using simulated NGS data, and (3) computing NGS expression levels for a single biological sample using optimal parameters and comparing these expression levels to that of a microarray sample. Results indicate that the distributed alignment system achieves approximately a linear speed-up and correctly distributes sequence data to and gathers alignment results from multiple compute clients.

  16. An Accurate Scalable Template-based Alignment Algorithm

    PubMed Central

    Gardner, David P.; Xu, Weijia; Miranker, Daniel P.; Ozer, Stuart; Cannone, Jamie J.; Gutell, Robin R.

    2013-01-01

    The rapid determination of nucleic acid sequences is increasing the number of sequences that are available. Inherent in a template or seed alignment is the culmination of structural and functional constraints that are selecting those mutations that are viable during the evolution of the RNA. While we might not understand these structural and functional, template-based alignment programs utilize the patterns of sequence conservation to encapsulate the characteristics of viable RNA sequences that are aligned properly. We have developed a program that utilizes the different dimensions of information in rCAD, a large RNA informatics resource, to establish a profile for each position in an alignment. The most significant include sequence identity and column composition in different phylogenetic taxa. We have compared our methods with a maximum of eight alternative alignment methods on different sets of 16S and 23S rRNA sequences with sequence percent identities ranging from 50% to 100%. The results showed that CRWAlign outperformed the other alignment methods in both speed and accuracy. A web-based alignment server is available at http://www.rna.ccbb.utexas.edu/SAE/2F/CRWAlign. PMID:24772376

  17. LZW-Kernel: fast kernel utilizing variable length code blocks from LZW compressors for protein sequence classification.

    PubMed

    Filatov, Gleb; Bauwens, Bruno; Kertész-Farkas, Attila

    2018-05-07

    Bioinformatics studies often rely on similarity measures between sequence pairs, which often pose a bottleneck in large-scale sequence analysis. Here, we present a new convolutional kernel function for protein sequences called the LZW-Kernel. It is based on code words identified with the Lempel-Ziv-Welch (LZW) universal text compressor. The LZW-Kernel is an alignment-free method, it is always symmetric, is positive, always provides 1.0 for self-similarity and it can directly be used with Support Vector Machines (SVMs) in classification problems, contrary to normalized compression distance (NCD), which often violates the distance metric properties in practice and requires further techniques to be used with SVMs. The LZW-Kernel is a one-pass algorithm, which makes it particularly plausible for big data applications. Our experimental studies on remote protein homology detection and protein classification tasks reveal that the LZW-Kernel closely approaches the performance of the Local Alignment Kernel (LAK) and the SVM-pairwise method combined with Smith-Waterman (SW) scoring at a fraction of the time. Moreover, the LZW-Kernel outperforms the SVM-pairwise method when combined with BLAST scores, which indicates that the LZW code words might be a better basis for similarity measures than local alignment approximations found with BLAST. In addition, the LZW-Kernel outperforms n-gram based mismatch kernels, hidden Markov model based SAM and Fisher kernel, and protein family based PSI-BLAST, among others. Further advantages include the LZW-Kernel's reliance on a simple idea, its ease of implementation, and its high speed, three times faster than BLAST and several magnitudes faster than SW or LAK in our tests. LZW-Kernel is implemented as a standalone C code and is a free open-source program distributed under GPLv3 license and can be downloaded from https://github.com/kfattila/LZW-Kernel. akerteszfarkas@hse.ru. Supplementary data are available at Bioinformatics Online.

  18. Partial discharge localization in power transformers based on the sequential quadratic programming-genetic algorithm adopting acoustic emission techniques

    NASA Astrophysics Data System (ADS)

    Liu, Hua-Long; Liu, Hua-Dong

    2014-10-01

    Partial discharge (PD) in power transformers is one of the prime reasons resulting in insulation degradation and power faults. Hence, it is of great importance to study the techniques of the detection and localization of PD in theory and practice. The detection and localization of PD employing acoustic emission (AE) techniques, as a kind of non-destructive testing, plus due to the advantages of powerful capability of locating and high precision, have been paid more and more attention. The localization algorithm is the key factor to decide the localization accuracy in AE localization of PD. Many kinds of localization algorithms exist for the PD source localization adopting AE techniques including intelligent and non-intelligent algorithms. However, the existed algorithms possess some defects such as the premature convergence phenomenon, poor local optimization ability and unsuitability for the field applications. To overcome the poor local optimization ability and easily caused premature convergence phenomenon of the fundamental genetic algorithm (GA), a new kind of improved GA is proposed, namely the sequence quadratic programming-genetic algorithm (SQP-GA). For the hybrid optimization algorithm, SQP-GA, the sequence quadratic programming (SQP) algorithm which is used as a basic operator is integrated into the fundamental GA, so the local searching ability of the fundamental GA is improved effectively and the premature convergence phenomenon is overcome. Experimental results of the numerical simulations of benchmark functions show that the hybrid optimization algorithm, SQP-GA, is better than the fundamental GA in the convergence speed and optimization precision, and the proposed algorithm in this paper has outstanding optimization effect. At the same time, the presented SQP-GA in the paper is applied to solve the ultrasonic localization problem of PD in transformers, then the ultrasonic localization method of PD in transformers based on the SQP-GA is proposed. And localization results based on the SQP-GA are compared with some algorithms such as the GA, some other intelligent and non-intelligent algorithms. The results of calculating examples both stimulated and spot experiments demonstrate that the localization method based on the SQP-GA can effectively prevent the results from getting trapped into the local optimum values, and the localization method is of great feasibility and very suitable for the field applications, and the precision of localization is enhanced, and the effectiveness of localization is ideal and satisfactory.

  19. Whole brain diffeomorphic metric mapping via integration of sulcal and gyral curves, cortical surfaces, and images

    PubMed Central

    Du, Jia; Younes, Laurent; Qiu, Anqi

    2011-01-01

    This paper introduces a novel large deformation diffeomorphic metric mapping algorithm for whole brain registration where sulcal and gyral curves, cortical surfaces, and intensity images are simultaneously carried from one subject to another through a flow of diffeomorphisms. To the best of our knowledge, this is the first time that the diffeomorphic metric from one brain to another is derived in a shape space of intensity images and point sets (such as curves and surfaces) in a unified manner. We describe the Euler–Lagrange equation associated with this algorithm with respect to momentum, a linear transformation of the velocity vector field of the diffeomorphic flow. The numerical implementation for solving this variational problem, which involves large-scale kernel convolution in an irregular grid, is made feasible by introducing a class of computationally friendly kernels. We apply this algorithm to align magnetic resonance brain data. Our whole brain mapping results show that our algorithm outperforms the image-based LDDMM algorithm in terms of the mapping accuracy of gyral/sulcal curves, sulcal regions, and cortical and subcortical segmentation. Moreover, our algorithm provides better whole brain alignment than combined volumetric and surface registration (Postelnicu et al., 2009) and hierarchical attribute matching mechanism for elastic registration (HAMMER) (Shen and Davatzikos, 2002) in terms of cortical and subcortical volume segmentation. PMID:21281722

  20. A Secure Alignment Algorithm for Mapping Short Reads to Human Genome.

    PubMed

    Zhao, Yongan; Wang, Xiaofeng; Tang, Haixu

    2018-05-09

    The elastic and inexpensive computing resources such as clouds have been recognized as a useful solution to analyzing massive human genomic data (e.g., acquired by using next-generation sequencers) in biomedical researches. However, outsourcing human genome computation to public or commercial clouds was hindered due to privacy concerns: even a small number of human genome sequences contain sufficient information for identifying the donor of the genomic data. This issue cannot be directly addressed by existing security and cryptographic techniques (such as homomorphic encryption), because they are too heavyweight to carry out practical genome computation tasks on massive data. In this article, we present a secure algorithm to accomplish the read mapping, one of the most basic tasks in human genomic data analysis based on a hybrid cloud computing model. Comparing with the existing approaches, our algorithm delegates most computation to the public cloud, while only performing encryption and decryption on the private cloud, and thus makes the maximum use of the computing resource of the public cloud. Furthermore, our algorithm reports similar results as the nonsecure read mapping algorithms, including the alignment between reads and the reference genome, which can be directly used in the downstream analysis such as the inference of genomic variations. We implemented the algorithm in C++ and Python on a hybrid cloud system, in which the public cloud uses an Apache Spark system.

  1. Increasing feasibility of the field-programmable gate array implementation of an iterative image registration using a kernel-warping algorithm

    NASA Astrophysics Data System (ADS)

    Nguyen, An Hung; Guillemette, Thomas; Lambert, Andrew J.; Pickering, Mark R.; Garratt, Matthew A.

    2017-09-01

    Image registration is a fundamental image processing technique. It is used to spatially align two or more images that have been captured at different times, from different sensors, or from different viewpoints. There have been many algorithms proposed for this task. The most common of these being the well-known Lucas-Kanade (LK) and Horn-Schunck approaches. However, the main limitation of these approaches is the computational complexity required to implement the large number of iterations necessary for successful alignment of the images. Previously, a multi-pass image interpolation algorithm (MP-I2A) was developed to considerably reduce the number of iterations required for successful registration compared with the LK algorithm. This paper develops a kernel-warping algorithm (KWA), a modified version of the MP-I2A, which requires fewer iterations to successfully register two images and less memory space for the field-programmable gate array (FPGA) implementation than the MP-I2A. These reductions increase feasibility of the implementation of the proposed algorithm on FPGAs with very limited memory space and other hardware resources. A two-FPGA system rather than single FPGA system is successfully developed to implement the KWA in order to compensate insufficiency of hardware resources supported by one FPGA, and increase parallel processing ability and scalability of the system.

  2. Multispectra CWT-based algorithm (MCWT) in mass spectra for peak extraction.

    PubMed

    Hsueh, Huey-Miin; Kuo, Hsun-Chih; Tsai, Chen-An

    2008-01-01

    An important objective in mass spectrometry (MS) is to identify a set of biomarkers that can be used to potentially distinguish patients between distinct treatments (or conditions) from tens or hundreds of spectra. A common two-step approach involving peak extraction and quantification is employed to identify the features of scientific interest. The selected features are then used for further investigation to understand underlying biological mechanism of individual protein or for development of genomic biomarkers to early diagnosis. However, the use of inadequate or ineffective peak detection and peak alignment algorithms in peak extraction step may lead to a high rate of false positives. Also, it is crucial to reduce the false positive rate in detecting biomarkers from ten or hundreds of spectra. Here a new procedure is introduced for feature extraction in mass spectrometry data that extends the continuous wavelet transform-based (CWT-based) algorithm to multiple spectra. The proposed multispectra CWT-based algorithm (MCWT) not only can perform peak detection for multiple spectra but also carry out peak alignment at the same time. The author' MCWT algorithm constructs a reference, which integrates information of multiple raw spectra, for feature extraction. The algorithm is applied to a SELDI-TOF mass spectra data set provided by CAMDA 2006 with known polypeptide m/z positions. This new approach is easy to implement and it outperforms the existing peak extraction method from the Bioconductor PROcess package.

  3. NERBio: using selected word conjunctions, term normalization, and global patterns to improve biomedical named entity recognition.

    PubMed

    Tsai, Richard Tzong-Han; Sung, Cheng-Lung; Dai, Hong-Jie; Hung, Hsieh-Chuan; Sung, Ting-Yi; Hsu, Wen-Lian

    2006-12-18

    Biomedical named entity recognition (Bio-NER) is a challenging problem because, in general, biomedical named entities of the same category (e.g., proteins and genes) do not follow one standard nomenclature. They have many irregularities and sometimes appear in ambiguous contexts. In recent years, machine-learning (ML) approaches have become increasingly common and now represent the cutting edge of Bio-NER technology. This paper addresses three problems faced by ML-based Bio-NER systems. First, most ML approaches usually employ singleton features that comprise one linguistic property (e.g., the current word is capitalized) and at least one class tag (e.g., B-protein, the beginning of a protein name). However, such features may be insufficient in cases where multiple properties must be considered. Adding conjunction features that contain multiple properties can be beneficial, but it would be infeasible to include all conjunction features in an NER model since memory resources are limited and some features are ineffective. To resolve the problem, we use a sequential forward search algorithm to select an effective set of features. Second, variations in the numerical parts of biomedical terms (e.g., "2" in the biomedical term IL2) cause data sparseness and generate many redundant features. In this case, we apply numerical normalization, which solves the problem by replacing all numerals in a term with one representative numeral to help classify named entities. Third, the assignment of NE tags does not depend solely on the target word's closest neighbors, but may depend on words outside the context window (e.g., a context window of five consists of the current word plus two preceding and two subsequent words). We use global patterns generated by the Smith-Waterman local alignment algorithm to identify such structures and modify the results of our ML-based tagger. This is called pattern-based post-processing. To develop our ML-based Bio-NER system, we employ conditional random fields, which have performed effectively in several well-known tasks, as our underlying ML model. Adding selected conjunction features, applying numerical normalization, and employing pattern-based post-processing improve the F-scores by 1.67%, 1.04%, and 0.57%, respectively. The combined increase of 3.28% yields a total score of 72.98%, which is better than the baseline system that only uses singleton features. We demonstrate the benefits of using the sequential forward search algorithm to select effective conjunction feature groups. In addition, we show that numerical normalization can effectively reduce the number of redundant and unseen features. Furthermore, the Smith-Waterman local alignment algorithm can help ML-based Bio-NER deal with difficult cases that need longer context windows.

  4. Full-waveform data for building roof step edge localization

    NASA Astrophysics Data System (ADS)

    Słota, Małgorzata

    2015-08-01

    Airborne laser scanning data perfectly represent flat or gently sloped areas; to date, however, accurate breakline detection is the main drawback of this technique. This issue becomes particularly important in the case of modeling buildings, where accuracy higher than the footprint size is often required. This article covers several issues related to full-waveform data registered on building step edges. First, the full-waveform data simulator was developed and presented in this paper. Second, this article provides a full description of the changes in echo amplitude, echo width and returned power caused by the presence of edges within the laser footprint. Additionally, two important properties of step edge echoes, peak shift and echo asymmetry, were noted and described. It was shown that these properties lead to incorrect echo positioning along the laser center line and can significantly reduce the edge points' accuracy. For these reasons and because all points are aligned with the center of the beam, regardless of the actual target position within the beam footprint, we can state that step edge points require geometric corrections. This article presents a novel algorithm for the refinement of step edge points. The main distinguishing advantage of the developed algorithm is the fact that none of the additional data, such as emitted signal parameters, beam divergence, approximate edge geometry or scanning settings, are required. The proposed algorithm works only on georeferenced profiles of reflected laser energy. Another major advantage is the simplicity of the calculation, allowing for very efficient data processing. Additionally, the developed method of point correction allows for the accurate determination of points lying on edges and edge point densification. For this reason, fully automatic localization of building roof step edges based on LiDAR full-waveform data with higher accuracy than the size of the lidar footprint is feasible.

  5. Using Variable-Length Aligned Fragment Pairs and an Improved Transition Function for Flexible Protein Structure Alignment.

    PubMed

    Cao, Hu; Lu, Yonggang

    2017-01-01

    With the rapid growth of known protein 3D structures in number, how to efficiently compare protein structures becomes an essential and challenging problem in computational structural biology. At present, many protein structure alignment methods have been developed. Among all these methods, flexible structure alignment methods are shown to be superior to rigid structure alignment methods in identifying structure similarities between proteins, which have gone through conformational changes. It is also found that the methods based on aligned fragment pairs (AFPs) have a special advantage over other approaches in balancing global structure similarities and local structure similarities. Accordingly, we propose a new flexible protein structure alignment method based on variable-length AFPs. Compared with other methods, the proposed method possesses three main advantages. First, it is based on variable-length AFPs. The length of each AFP is separately determined to maximally represent a local similar structure fragment, which reduces the number of AFPs. Second, it uses local coordinate systems, which simplify the computation at each step of the expansion of AFPs during the AFP identification. Third, it decreases the number of twists by rewarding the situation where nonconsecutive AFPs share the same transformation in the alignment, which is realized by dynamic programming with an improved transition function. The experimental data show that compared with FlexProt, FATCAT, and FlexSnap, the proposed method can achieve comparable results by introducing fewer twists. Meanwhile, it can generate results similar to those of the FATCAT method in much less running time due to the reduced number of AFPs.

  6. Comparison of Metabolic Pathways in Escherichia coli by Using Genetic Algorithms.

    PubMed

    Ortegon, Patricia; Poot-Hernández, Augusto C; Perez-Rueda, Ernesto; Rodriguez-Vazquez, Katya

    2015-01-01

    In order to understand how cellular metabolism has taken its modern form, the conservation and variations between metabolic pathways were evaluated by using a genetic algorithm (GA). The GA approach considered information on the complete metabolism of the bacterium Escherichia coli K-12, as deposited in the KEGG database, and the enzymes belonging to a particular pathway were transformed into enzymatic step sequences by using the breadth-first search algorithm. These sequences represent contiguous enzymes linked to each other, based on their catalytic activities as they are encoded in the Enzyme Commission numbers. In a posterior step, these sequences were compared using a GA in an all-against-all (pairwise comparisons) approach. Individual reactions were chosen based on their measure of fitness to act as parents of offspring, which constitute the new generation. The sequences compared were used to construct a similarity matrix (of fitness values) that was then considered to be clustered by using a k-medoids algorithm. A total of 34 clusters of conserved reactions were obtained, and their sequences were finally aligned with a multiple-sequence alignment GA optimized to align all the reaction sequences included in each group or cluster. From these comparisons, maps associated with the metabolism of similar compounds also contained similar enzymatic step sequences, reinforcing the Patchwork Model for the evolution of metabolism in E. coli K-12, an observation that can be expanded to other organisms, for which there is metabolism information. Finally, our mapping of these reactions is discussed, with illustrations from a particular case.

  7. Comparison of Metabolic Pathways in Escherichia coli by Using Genetic Algorithms

    PubMed Central

    Ortegon, Patricia; Poot-Hernández, Augusto C.; Perez-Rueda, Ernesto; Rodriguez-Vazquez, Katya

    2015-01-01

    In order to understand how cellular metabolism has taken its modern form, the conservation and variations between metabolic pathways were evaluated by using a genetic algorithm (GA). The GA approach considered information on the complete metabolism of the bacterium Escherichia coli K-12, as deposited in the KEGG database, and the enzymes belonging to a particular pathway were transformed into enzymatic step sequences by using the breadth-first search algorithm. These sequences represent contiguous enzymes linked to each other, based on their catalytic activities as they are encoded in the Enzyme Commission numbers. In a posterior step, these sequences were compared using a GA in an all-against-all (pairwise comparisons) approach. Individual reactions were chosen based on their measure of fitness to act as parents of offspring, which constitute the new generation. The sequences compared were used to construct a similarity matrix (of fitness values) that was then considered to be clustered by using a k-medoids algorithm. A total of 34 clusters of conserved reactions were obtained, and their sequences were finally aligned with a multiple-sequence alignment GA optimized to align all the reaction sequences included in each group or cluster. From these comparisons, maps associated with the metabolism of similar compounds also contained similar enzymatic step sequences, reinforcing the Patchwork Model for the evolution of metabolism in E. coli K-12, an observation that can be expanded to other organisms, for which there is metabolism information. Finally, our mapping of these reactions is discussed, with illustrations from a particular case. PMID:25973143

  8. Coval: Improving Alignment Quality and Variant Calling Accuracy for Next-Generation Sequencing Data

    PubMed Central

    Kosugi, Shunichi; Natsume, Satoshi; Yoshida, Kentaro; MacLean, Daniel; Cano, Liliana; Kamoun, Sophien; Terauchi, Ryohei

    2013-01-01

    Accurate identification of DNA polymorphisms using next-generation sequencing technology is challenging because of a high rate of sequencing error and incorrect mapping of reads to reference genomes. Currently available short read aligners and DNA variant callers suffer from these problems. We developed the Coval software to improve the quality of short read alignments. Coval is designed to minimize the incidence of spurious alignment of short reads, by filtering mismatched reads that remained in alignments after local realignment and error correction of mismatched reads. The error correction is executed based on the base quality and allele frequency at the non-reference positions for an individual or pooled sample. We demonstrated the utility of Coval by applying it to simulated genomes and experimentally obtained short-read data of rice, nematode, and mouse. Moreover, we found an unexpectedly large number of incorrectly mapped reads in ‘targeted’ alignments, where the whole genome sequencing reads had been aligned to a local genomic segment, and showed that Coval effectively eliminated such spurious alignments. We conclude that Coval significantly improves the quality of short-read sequence alignments, thereby increasing the calling accuracy of currently available tools for SNP and indel identification. Coval is available at http://sourceforge.net/projects/coval105/. PMID:24116042

  9. Discrete-Time Local Value Iteration Adaptive Dynamic Programming: Admissibility and Termination Analysis.

    PubMed

    Wei, Qinglai; Liu, Derong; Lin, Qiao

    In this paper, a novel local value iteration adaptive dynamic programming (ADP) algorithm is developed to solve infinite horizon optimal control problems for discrete-time nonlinear systems. The focuses of this paper are to study admissibility properties and the termination criteria of discrete-time local value iteration ADP algorithms. In the discrete-time local value iteration ADP algorithm, the iterative value functions and the iterative control laws are both updated in a given subset of the state space in each iteration, instead of the whole state space. For the first time, admissibility properties of iterative control laws are analyzed for the local value iteration ADP algorithm. New termination criteria are established, which terminate the iterative local ADP algorithm with an admissible approximate optimal control law. Finally, simulation results are given to illustrate the performance of the developed algorithm.In this paper, a novel local value iteration adaptive dynamic programming (ADP) algorithm is developed to solve infinite horizon optimal control problems for discrete-time nonlinear systems. The focuses of this paper are to study admissibility properties and the termination criteria of discrete-time local value iteration ADP algorithms. In the discrete-time local value iteration ADP algorithm, the iterative value functions and the iterative control laws are both updated in a given subset of the state space in each iteration, instead of the whole state space. For the first time, admissibility properties of iterative control laws are analyzed for the local value iteration ADP algorithm. New termination criteria are established, which terminate the iterative local ADP algorithm with an admissible approximate optimal control law. Finally, simulation results are given to illustrate the performance of the developed algorithm.

  10. A Motion Detection Algorithm Using Local Phase Information

    PubMed Central

    Lazar, Aurel A.; Ukani, Nikul H.; Zhou, Yiyin

    2016-01-01

    Previous research demonstrated that global phase alone can be used to faithfully represent visual scenes. Here we provide a reconstruction algorithm by using only local phase information. We also demonstrate that local phase alone can be effectively used to detect local motion. The local phase-based motion detector is akin to models employed to detect motion in biological vision, for example, the Reichardt detector. The local phase-based motion detection algorithm introduced here consists of two building blocks. The first building block measures/evaluates the temporal change of the local phase. The temporal derivative of the local phase is shown to exhibit the structure of a second order Volterra kernel with two normalized inputs. We provide an efficient, FFT-based algorithm for implementing the change of the local phase. The second processing building block implements the detector; it compares the maximum of the Radon transform of the local phase derivative with a chosen threshold. We demonstrate examples of applying the local phase-based motion detection algorithm on several video sequences. We also show how the locally detected motion can be used for segmenting moving objects in video scenes and compare our local phase-based algorithm to segmentation achieved with a widely used optic flow algorithm. PMID:26880882

  11. Automatic Insall-Salvati ratio measurement on lateral knee x-ray images using model-guided landmark localization

    NASA Astrophysics Data System (ADS)

    Chen, Hsin-Chen; Lin, Chii-Jeng; Wu, Chia-Hsing; Wang, Chien-Kuo; Sun, Yung-Nien

    2010-11-01

    The Insall-Salvati ratio (ISR) is important for detecting two common clinical signs of knee disease: patella alta and patella baja. Furthermore, large inter-operator differences in ISR measurement make an objective measurement system necessary for better clinical evaluation. In this paper, we define three specific bony landmarks for determining the ISR and then propose an x-ray image analysis system to localize these landmarks and measure the ISR. Due to inherent artifacts in x-ray images, such as unevenly distributed intensities, which make landmark localization difficult, we hence propose a registration-assisted active-shape model (RAASM) to localize these landmarks. We first construct a statistical model from a set of training images based on x-ray image intensity and patella shape. Since a knee x-ray image contains specific anatomical structures, we then design an algorithm, based on edge tracing, for patella feature extraction in order to automatically align the model to the patella image. We can estimate the landmark locations as well as the ISR after registration-assisted model fitting. Our proposed method successfully overcomes drawbacks caused by x-ray image artifacts. Experimental results show great agreement between the ISRs measured by the proposed method and by orthopedic clinicians.

  12. YAHA: fast and flexible long-read alignment with optimal breakpoint detection.

    PubMed

    Faust, Gregory G; Hall, Ira M

    2012-10-01

    With improved short-read assembly algorithms and the recent development of long-read sequencers, split mapping will soon be the preferred method for structural variant (SV) detection. Yet, current alignment tools are not well suited for this. We present YAHA, a fast and flexible hash-based aligner. YAHA is as fast and accurate as BWA-SW at finding the single best alignment per query and is dramatically faster and more sensitive than both SSAHA2 and MegaBLAST at finding all possible alignments. Unlike other aligners that report all, or one, alignment per query, or that use simple heuristics to select alignments, YAHA uses a directed acyclic graph to find the optimal set of alignments that cover a query using a biologically relevant breakpoint penalty. YAHA can also report multiple mappings per defined segment of the query. We show that YAHA detects more breakpoints in less time than BWA-SW across all SV classes, and especially excels at complex SVs comprising multiple breakpoints. YAHA is currently supported on 64-bit Linux systems. Binaries and sample data are freely available for download from http://faculty.virginia.edu/irahall/YAHA. imh4y@virginia.edu.

  13. Use of a Closed-Loop Tracking Algorithm for Orientation Bias Determination of an S-Band Ground Station

    NASA Technical Reports Server (NTRS)

    Welch, Bryan W.; Piasecki, Marie T.; Schrage, Dean S.

    2015-01-01

    The Space Communications and Navigation (SCaN) Testbed project completed installation and checkout testing of a new S-Band ground station at the NASA Glenn Research Center in Cleveland, Ohio in 2015. As with all ground stations, a key alignment process must be conducted to obtain offset angles in azimuth (AZ) and elevation (EL). In telescopes with AZ-EL gimbals, this is normally done with a two-star alignment process, where telescope-based pointing vectors are derived from catalogued locations with the AZ-EL bias angles derived from the pointing vector difference. For an antenna, the process is complicated without an optical asset. For the present study, the solution was to utilize the gimbal control algorithms closed-loop tracking capability to acquire the peak received power signal automatically from two distinct NASA Tracking and Data Relay Satellite (TDRS) spacecraft, without a human making the pointing adjustments. Briefly, the TDRS satellite acts as a simulated optical source and the alignment process proceeds exactly the same way as a one-star alignment. The data reduction process, which will be discussed in the paper, results in two bias angles which are retained for future pointing determination. Finally, the paper compares the test results and provides lessons learned from the activity.

  14. Data preprocessing method for liquid chromatography-mass spectrometry based metabolomics.

    PubMed

    Wei, Xiaoli; Shi, Xue; Kim, Seongho; Zhang, Li; Patrick, Jeffrey S; Binkley, Joe; McClain, Craig; Zhang, Xiang

    2012-09-18

    A set of data preprocessing algorithms for peak detection and peak list alignment are reported for analysis of liquid chromatography-mass spectrometry (LC-MS)-based metabolomics data. For spectrum deconvolution, peak picking is achieved at the selected ion chromatogram (XIC) level. To estimate and remove the noise in XICs, each XIC is first segmented into several peak groups based on the continuity of scan number, and the noise level is estimated by all the XIC signals, except the regions potentially with presence of metabolite ion peaks. After removing noise, the peaks of molecular ions are detected using both the first and the second derivatives, followed by an efficient exponentially modified Gaussian-based peak deconvolution method for peak fitting. A two-stage alignment algorithm is also developed, where the retention times of all peaks are first transferred into the z-score domain and the peaks are aligned based on the measure of their mixture scores after retention time correction using a partial linear regression. Analysis of a set of spike-in LC-MS data from three groups of samples containing 16 metabolite standards mixed with metabolite extract from mouse livers demonstrates that the developed data preprocessing method performs better than two of the existing popular data analysis packages, MZmine2.6 and XCMS(2), for peak picking, peak list alignment, and quantification.

  15. A Data Pre-processing Method for Liquid Chromatography Mass Spectrometry-based Metabolomics

    PubMed Central

    Wei, Xiaoli; Shi, Xue; Kim, Seongho; Zhang, Li; Patrick, Jeffrey S.; Binkley, Joe; McClain, Craig; Zhang, Xiang

    2012-01-01

    A set of data pre-processing algorithms for peak detection and peak list alignment are reported for analysis of LC-MS based metabolomics data. For spectrum deconvolution, peak picking is achieved at selected ion chromatogram (XIC) level. To estimate and remove the noise in XICs, each XIC is first segmented into several peak groups based on the continuity of scan number, and the noise level is estimated by all the XIC signals, except the regions potentially with presence of metabolite ion peaks. After removing noise, the peaks of molecular ions are detected using both the first and the second derivatives, followed by an efficient exponentially modified Gaussian-based peak deconvolution method for peak fitting. A two-stage alignment algorithm is also developed, where the retention times of all peaks are first transferred into z-score domain and the peaks are aligned based on the measure of their mixture scores after retention time correction using a partial linear regression. Analysis of a set of spike-in LC-MS data from three groups of samples containing 16 metabolite standards mixed with metabolite extract from mouse livers, demonstrates that the developed data pre-processing methods performs better than two of the existing popular data analysis packages, MZmine2.6 and XCMS2, for peak picking, peak list alignment and quantification. PMID:22931487

  16. A single camera photogrammetry system for multi-angle fast localization of EEG electrodes.

    PubMed

    Qian, Shuo; Sheng, Yang

    2011-11-01

    Photogrammetry has become an effective method for the determination of electroencephalography (EEG) electrode positions in three dimensions (3D). Capturing multi-angle images of the electrodes on the head is a fundamental objective in the design of photogrammetry system for EEG localization. Methods in previous studies are all based on the use of either a rotating camera or multiple cameras, which are time-consuming or not cost-effective. This study aims to present a novel photogrammetry system that can realize simultaneous acquisition of multi-angle head images in a single camera position. Aligning two planar mirrors with the angle of 51.4°, seven views of the head with 25 electrodes are captured simultaneously by the digital camera placed in front of them. A complete set of algorithms for electrode recognition, matching, and 3D reconstruction is developed. It is found that the elapsed time of the whole localization procedure is about 3 min, and camera calibration computation takes about 1 min, after the measurement of calibration points. The positioning accuracy with the maximum error of 1.19 mm is acceptable. Experimental results demonstrate that the proposed system provides a fast and cost-effective method for the EEG positioning.

  17. Alignment of leading-edge and peak-picking time of arrival methods to obtain accurate source locations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Roussel-Dupre, R.; Symbalisty, E.; Fox, C.

    2009-08-01

    The location of a radiating source can be determined by time-tagging the arrival of the radiated signal at a network of spatially distributed sensors. The accuracy of this approach depends strongly on the particular time-tagging algorithm employed at each of the sensors. If different techniques are used across the network, then the time tags must be referenced to a common fiducial for maximum location accuracy. In this report we derive the time corrections needed to temporally align leading-edge, time-tagging techniques with peak-picking algorithms. We focus on broadband radio frequency (RF) sources, an ionospheric propagation channel, and narrowband receivers, but themore » final results can be generalized to apply to any source, propagation environment, and sensor. Our analytic results are checked against numerical simulations for a number of representative cases and agree with the specific leading-edge algorithm studied independently by Kim and Eng (1995) and Pongratz (2005 and 2007).« less

  18. Simulations in site error estimation for direction finders

    NASA Astrophysics Data System (ADS)

    López, Raúl E.; Passi, Ranjit M.

    1991-08-01

    The performance of an algorithm for the recovery of site-specific errors of direction finder (DF) networks is tested under controlled simulated conditions. The simulations show that the algorithm has some inherent shortcomings for the recovery of site errors from the measured azimuth data. These limitations are fundamental to the problem of site error estimation using azimuth information. Several ways for resolving or ameliorating these basic complications are tested by means of simulations. From these it appears that for the effective implementation of the site error determination algorithm, one should design the networks with at least four DFs, improve the alignment of the antennas, and increase the gain of the DFs as much as it is compatible with other operational requirements. The use of a nonzero initial estimate of the site errors when working with data from networks of four or more DFs also improves the accuracy of the site error recovery. Even for networks of three DFs, reasonable site error corrections could be obtained if the antennas could be well aligned.

  19. An integrated workflow for robust alignment and simplified quantitative analysis of NMR spectrometry data.

    PubMed

    Vu, Trung N; Valkenborg, Dirk; Smets, Koen; Verwaest, Kim A; Dommisse, Roger; Lemière, Filip; Verschoren, Alain; Goethals, Bart; Laukens, Kris

    2011-10-20

    Nuclear magnetic resonance spectroscopy (NMR) is a powerful technique to reveal and compare quantitative metabolic profiles of biological tissues. However, chemical and physical sample variations make the analysis of the data challenging, and typically require the application of a number of preprocessing steps prior to data interpretation. For example, noise reduction, normalization, baseline correction, peak picking, spectrum alignment and statistical analysis are indispensable components in any NMR analysis pipeline. We introduce a novel suite of informatics tools for the quantitative analysis of NMR metabolomic profile data. The core of the processing cascade is a novel peak alignment algorithm, called hierarchical Cluster-based Peak Alignment (CluPA). The algorithm aligns a target spectrum to the reference spectrum in a top-down fashion by building a hierarchical cluster tree from peak lists of reference and target spectra and then dividing the spectra into smaller segments based on the most distant clusters of the tree. To reduce the computational time to estimate the spectral misalignment, the method makes use of Fast Fourier Transformation (FFT) cross-correlation. Since the method returns a high-quality alignment, we can propose a simple methodology to study the variability of the NMR spectra. For each aligned NMR data point the ratio of the between-group and within-group sum of squares (BW-ratio) is calculated to quantify the difference in variability between and within predefined groups of NMR spectra. This differential analysis is related to the calculation of the F-statistic or a one-way ANOVA, but without distributional assumptions. Statistical inference based on the BW-ratio is achieved by bootstrapping the null distribution from the experimental data. The workflow performance was evaluated using a previously published dataset. Correlation maps, spectral and grey scale plots show clear improvements in comparison to other methods, and the down-to-earth quantitative analysis works well for the CluPA-aligned spectra. The whole workflow is embedded into a modular and statistically sound framework that is implemented as an R package called "speaq" ("spectrum alignment and quantitation"), which is freely available from http://code.google.com/p/speaq/.

  20. rasbhari: Optimizing Spaced Seeds for Database Searching, Read Mapping and Alignment-Free Sequence Comparison.

    PubMed

    Hahn, Lars; Leimeister, Chris-André; Ounit, Rachid; Lonardi, Stefano; Morgenstern, Burkhard

    2016-10-01

    Many algorithms for sequence analysis rely on word matching or word statistics. Often, these approaches can be improved if binary patterns representing match and don't-care positions are used as a filter, such that only those positions of words are considered that correspond to the match positions of the patterns. The performance of these approaches, however, depends on the underlying patterns. Herein, we show that the overlap complexity of a pattern set that was introduced by Ilie and Ilie is closely related to the variance of the number of matches between two evolutionarily related sequences with respect to this pattern set. We propose a modified hill-climbing algorithm to optimize pattern sets for database searching, read mapping and alignment-free sequence comparison of nucleic-acid sequences; our implementation of this algorithm is called rasbhari. Depending on the application at hand, rasbhari can either minimize the overlap complexity of pattern sets, maximize their sensitivity in database searching or minimize the variance of the number of pattern-based matches in alignment-free sequence comparison. We show that, for database searching, rasbhari generates pattern sets with slightly higher sensitivity than existing approaches. In our Spaced Words approach to alignment-free sequence comparison, pattern sets calculated with rasbhari led to more accurate estimates of phylogenetic distances than the randomly generated pattern sets that we previously used. Finally, we used rasbhari to generate patterns for short read classification with CLARK-S. Here too, the sensitivity of the results could be improved, compared to the default patterns of the program. We integrated rasbhari into Spaced Words; the source code of rasbhari is freely available at http://rasbhari.gobics.de/.

  1. Assessment of Homology Templates and an Anesthetic Binding Site within the γ-Aminobutyric Acid Receptor

    PubMed Central

    Bertaccini, Edward J.; Yoluk, Ozge; Lindahl, Erik R.; Trudell, James R.

    2013-01-01

    Background Anesthetics mediate portions of their activity via modulation of the γ-aminobutyric acid receptor (GABAaR). While its molecular structure remains unknown, significant progress has been made towards understanding its interactions with anesthetics via molecular modeling. Methods The structure of the torpedo acetylcholine receptor (nAChRα), the structures of the α4 and β2 subunits of the human nAChR, the structures of the eukaryotic glutamate-gated chloride channel (GluCl), and the prokaryotic pH sensing channels, from Gloeobacter violaceus and Erwinia chrysanthemi, were aligned with the SAlign and 3DMA algorithms. A multiple sequence alignment from these structures and those of the GABAaR was performed with ClustalW. The Modeler and Rosetta algorithms independently created three-dimensional constructs of the GABAaR from the GluCl template. The CDocker algorithm docked a congeneric series of propofol derivatives into the binding pocket and scored calculated binding affinities for correlation with known GABAaR potentiation EC50’s. Results Multiple structure alignments of templates revealed a clear consensus of residue locations relevant to anesthetic effects except for torpedo nAChR. Within the GABAaR models generated from GluCl, the residues notable for modulating anesthetic action within transmembrane segments 1, 2, and 3 converged on the intersubunit interface between alpha and beta subunits. Docking scores of a propofol derivative series into this binding site showed strong linear correlation with GABAaR potentiation EC50. Conclusion Consensus structural alignment based on homologous templates revealed an intersubunit anesthetic binding cavity within the transmembrane domain of the GABAaR, which showed correlation of ligand docking scores with experimentally measured GABAaR potentiation. PMID:23770602

  2. Assessment of homology templates and an anesthetic binding site within the γ-aminobutyric acid receptor.

    PubMed

    Bertaccini, Edward J; Yoluk, Ozge; Lindahl, Erik R; Trudell, James R

    2013-11-01

    Anesthetics mediate portions of their activity via modulation of the γ-aminobutyric acid receptor (GABAaR). Although its molecular structure remains unknown, significant progress has been made toward understanding its interactions with anesthetics via molecular modeling. The structure of the torpedo acetylcholine receptor (nAChRα), the structures of the α4 and β2 subunits of the human nAChR, the structures of the eukaryotic glutamate-gated chloride channel (GluCl), and the prokaryotic pH-sensing channels, from Gloeobacter violaceus and Erwinia chrysanthemi, were aligned with the SAlign and 3DMA algorithms. A multiple sequence alignment from these structures and those of the GABAaR was performed with ClustalW. The Modeler and Rosetta algorithms independently created three-dimensional constructs of the GABAaR from the GluCl template. The CDocker algorithm docked a congeneric series of propofol derivatives into the binding pocket and scored calculated binding affinities for correlation with known GABAaR potentiation EC50s. Multiple structure alignments of templates revealed a clear consensus of residue locations relevant to anesthetic effects except for torpedo nAChR. Within the GABAaR models generated from GluCl, the residues notable for modulating anesthetic action within transmembrane segments 1, 2, and 3 converged on the intersubunit interface between α and β subunits. Docking scores of a propofol derivative series into this binding site showed strong linear correlation with GABAaR potentiation EC50. Consensus structural alignment based on homologous templates revealed an intersubunit anesthetic binding cavity within the transmembrane domain of the GABAaR, which showed a correlation of ligand docking scores with experimentally measured GABAaR potentiation.

  3. A Dimensionally Aligned Signal Projection for Classification of Unintended Radiated Emissions

    DOE PAGES

    Vann, Jason Michael; Karnowski, Thomas P.; Kerekes, Ryan; ...

    2017-04-24

    Characterization of unintended radiated emissions (URE) from electronic devices plays an important role in many research areas from electromagnetic interference to nonintrusive load monitoring to information system security. URE can provide insights for applications ranging from load disaggregation and energy efficiency to condition-based maintenance of equipment-based upon detected fault conditions. URE characterization often requires subject matter expertise to tailor transforms and feature extractors for the specific electrical devices of interest. We present a novel approach, named dimensionally aligned signal projection (DASP), for projecting aligned signal characteristics that are inherent to the physical implementation of many commercial electronic devices. These projectionsmore » minimize the need for an intimate understanding of the underlying physical circuitry and significantly reduce the number of features required for signal classification. We present three possible DASP algorithms that leverage frequency harmonics, modulation alignments, and frequency peak spacings, along with a two-dimensional image manipulation method for statistical feature extraction. To demonstrate the ability of DASP to generate relevant features from URE, we measured the conducted URE from 14 residential electronic devices using a 2 MS/s collection system. Furthermore, a linear discriminant analysis classifier was trained using DASP generated features and was blind tested resulting in a greater than 90% classification accuracy for each of the DASP algorithms and an accuracy of 99.1% when DASP features are used in combination. Furthermore, we show that a rank reduced feature set of the combined DASP algorithms provides a 98.9% classification accuracy with only three features and outperforms a set of spectral features in terms of general classification as well as applicability across a broad number of devices.« less

  4. Defining ‘Unhealthy’: A Systematic Analysis of Alignment between the Australian Dietary Guidelines and the Health Star Rating System

    PubMed Central

    Rådholm, Karin; Neal, Bruce

    2018-01-01

    The Australian Dietary Guidelines (ADGs) and Health Star Rating (HSR) front-of-pack labelling system are two national interventions to promote healthier diets. Our aim was to assess the degree of alignment between the two policies. Methods: Nutrition information was extracted for 65,660 packaged foods available in The George Institute’s Australian FoodSwitch database. Products were classified ‘core’ or ‘discretionary’ based on the ADGs, and a HSR generated irrespective of whether currently displayed on pack. Apparent outliers were identified as those products classified ‘core’ that received HSR ≤ 2.0; and those classified ‘discretionary’ that received HSR ≥ 3.5. Nutrient cut-offs were applied to determine whether apparent outliers were ‘high in’ salt, total sugar or saturated fat, and outlier status thereby attributed to a failure of the ADGs or HSR algorithm. Results: 47,116 products (23,460 core; 23,656 discretionary) were included. Median (Q1, Q3) HSRs were 4.0 (3.0 to 4.5) for core and 2.0 (1.0 to 3.0) for discretionary products. Overall alignment was good: 86.6% of products received a HSR aligned with their ADG classification. Among 6324 products identified as apparent outliers, 5246 (83.0%) were ultimately determined to be ADG failures, largely caused by challenges in defining foods as ‘core’ or ‘discretionary’. In total, 1078 (17.0%) were determined to be true failures of the HSR algorithm. Conclusion: The scope of genuine misalignment between the ADGs and HSR algorithm is very small. We provide evidence-informed recommendations for strengthening both policies to more effectively guide Australians towards healthier choices. PMID:29670024

  5. A Dimensionally Aligned Signal Projection for Classification of Unintended Radiated Emissions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vann, Jason Michael; Karnowski, Thomas P.; Kerekes, Ryan

    Characterization of unintended radiated emissions (URE) from electronic devices plays an important role in many research areas from electromagnetic interference to nonintrusive load monitoring to information system security. URE can provide insights for applications ranging from load disaggregation and energy efficiency to condition-based maintenance of equipment-based upon detected fault conditions. URE characterization often requires subject matter expertise to tailor transforms and feature extractors for the specific electrical devices of interest. We present a novel approach, named dimensionally aligned signal projection (DASP), for projecting aligned signal characteristics that are inherent to the physical implementation of many commercial electronic devices. These projectionsmore » minimize the need for an intimate understanding of the underlying physical circuitry and significantly reduce the number of features required for signal classification. We present three possible DASP algorithms that leverage frequency harmonics, modulation alignments, and frequency peak spacings, along with a two-dimensional image manipulation method for statistical feature extraction. To demonstrate the ability of DASP to generate relevant features from URE, we measured the conducted URE from 14 residential electronic devices using a 2 MS/s collection system. Furthermore, a linear discriminant analysis classifier was trained using DASP generated features and was blind tested resulting in a greater than 90% classification accuracy for each of the DASP algorithms and an accuracy of 99.1% when DASP features are used in combination. Furthermore, we show that a rank reduced feature set of the combined DASP algorithms provides a 98.9% classification accuracy with only three features and outperforms a set of spectral features in terms of general classification as well as applicability across a broad number of devices.« less

  6. Bioinformatics algorithm based on a parallel implementation of a machine learning approach using transducers

    NASA Astrophysics Data System (ADS)

    Roche-Lima, Abiel; Thulasiram, Ruppa K.

    2012-02-01

    Finite automata, in which each transition is augmented with an output label in addition to the familiar input label, are considered finite-state transducers. Transducers have been used to analyze some fundamental issues in bioinformatics. Weighted finite-state transducers have been proposed to pairwise alignments of DNA and protein sequences; as well as to develop kernels for computational biology. Machine learning algorithms for conditional transducers have been implemented and used for DNA sequence analysis. Transducer learning algorithms are based on conditional probability computation. It is calculated by using techniques, such as pair-database creation, normalization (with Maximum-Likelihood normalization) and parameters optimization (with Expectation-Maximization - EM). These techniques are intrinsically costly for computation, even worse when are applied to bioinformatics, because the databases sizes are large. In this work, we describe a parallel implementation of an algorithm to learn conditional transducers using these techniques. The algorithm is oriented to bioinformatics applications, such as alignments, phylogenetic trees, and other genome evolution studies. Indeed, several experiences were developed using the parallel and sequential algorithm on Westgrid (specifically, on the Breeze cluster). As results, we obtain that our parallel algorithm is scalable, because execution times are reduced considerably when the data size parameter is increased. Another experience is developed by changing precision parameter. In this case, we obtain smaller execution times using the parallel algorithm. Finally, number of threads used to execute the parallel algorithm on the Breezy cluster is changed. In this last experience, we obtain as result that speedup is considerably increased when more threads are used; however there is a convergence for number of threads equal to or greater than 16.

  7. Coordinate alignment of combined measurement systems using a modified common points method

    NASA Astrophysics Data System (ADS)

    Zhao, G.; Zhang, P.; Xiao, W.

    2018-03-01

    The co-ordinate metrology has been extensively researched for its outstanding advantages in measurement range and accuracy. The alignment of different measurement systems is usually achieved by integrating local coordinates via common points before measurement. The alignment errors would accumulate and significantly reduce the global accuracy, thus need to be minimized. In this thesis, a modified common points method (MCPM) is proposed to combine different traceable system errors of the cooperating machines, and optimize the global accuracy by introducing mutual geometric constraints. The geometric constraints, obtained by measuring the common points in individual local coordinate systems, provide the possibility to reduce the local measuring uncertainty whereby enhance the global measuring certainty. A simulation system is developed in Matlab to analyze the feature of MCPM using the Monto-Carlo method. An exemplary setup is constructed to verify the feasibility and efficiency of the proposed method associated with laser tracker and indoor iGPS systems. Experimental results show that MCPM could significantly improve the alignment accuracy.

  8. SPIN ALIGNMENTS OF SPIRAL GALAXIES WITHIN THE LARGE-SCALE STRUCTURE FROM SDSS DR7

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Youcai; Yang, Xiaohu; Luo, Wentao

    Using a sample of spiral galaxies selected from the Sloan Digital Sky Survey Data Release 7 and Galaxy Zoo 2, we investigate the alignment of spin axes of spiral galaxies with their surrounding large-scale structure, which is characterized by the large-scale tidal field reconstructed from the data using galaxy groups above a certain mass threshold. We find that the spin axes only have weak tendencies to be aligned with (or perpendicular to) the intermediate (or minor) axis of the local tidal tensor. The signal is the strongest in a cluster environment where all three eigenvalues of the local tidal tensor aremore » positive. Compared to the alignments between halo spins and the local tidal field obtained in N-body simulations, the above observational results are in best agreement with those for the spins of inner regions of halos, suggesting that the disk material traces the angular momentum of dark matter halos in the inner regions.« less

  9. A 2D eye gaze estimation system with low-resolution webcam images

    NASA Astrophysics Data System (ADS)

    Ince, Ibrahim Furkan; Kim, Jin Woo

    2011-12-01

    In this article, a low-cost system for 2D eye gaze estimation with low-resolution webcam images is presented. Two algorithms are proposed for this purpose, one for the eye-ball detection with stable approximate pupil-center and the other one for the eye movements' direction detection. Eyeball is detected using deformable angular integral search by minimum intensity (DAISMI) algorithm. Deformable template-based 2D gaze estimation (DTBGE) algorithm is employed as a noise filter for deciding the stable movement decisions. While DTBGE employs binary images, DAISMI employs gray-scale images. Right and left eye estimates are evaluated separately. DAISMI finds the stable approximate pupil-center location by calculating the mass-center of eyeball border vertices to be employed for initial deformable template alignment. DTBGE starts running with initial alignment and updates the template alignment with resulting eye movements and eyeball size frame by frame. The horizontal and vertical deviation of eye movements through eyeball size is considered as if it is directly proportional with the deviation of cursor movements in a certain screen size and resolution. The core advantage of the system is that it does not employ the real pupil-center as a reference point for gaze estimation which is more reliable against corneal reflection. Visual angle accuracy is used for the evaluation and benchmarking of the system. Effectiveness of the proposed system is presented and experimental results are shown.

  10. Comparing Four Age Model Techniques using Nine Sediment Cores from the Iberian Margin

    NASA Astrophysics Data System (ADS)

    Lisiecki, L. E.; Jones, A. M.; Lawrence, C.

    2017-12-01

    Interpretations of paleoclimate records from ocean sediment cores rely on age models, which provide estimates of age as a function of core depth. Here we compare four methods used to generate age models for sediment cores for the past 140 kyr. The first method is based on radiocarbon dating using the Bayesian statistical software, Bacon [Blaauw and Christen, 2011]. The second method aligns benthic δ18O to a target core using the probabilistic alignment algorithm, HMM-Match, which also generates age uncertainty estimates [Lin et al., 2014]. The third and fourth methods are planktonic δ18O and sea surface temperature (SST) alignments to the same target core, using the alignment algorithm Match [Lisiecki and Lisiecki, 2002]. Unlike HMM-Match, Match requires parameter tuning and does not produce uncertainty estimates. The results of these four age model techniques are compared for nine high-resolution cores from the Iberian margin. The root mean square error between the individual age model results and each core's average estimated age is 1.4 kyr. Additionally, HMM-Match and Bacon age estimates agree to within uncertainty and have similar 95% confidence widths of 1-2 kyr for the highest resolution records. In one core, the planktonic and SST alignments did not fall within the 95% confidence intervals from HMM-Match. For this core, the surface proxy alignments likely produce more reliable results due to millennial-scale SST variability and the presence of several gaps in the benthic δ18O data. Similar studies of other oceanographic regions are needed to determine the spatial extents over which these climate proxies may be stratigraphically correlated.

  11. ASPIC: a novel method to predict the exon-intron structure of a gene that is optimally compatible to a set of transcript sequences.

    PubMed

    Bonizzoni, Paola; Rizzi, Raffaella; Pesole, Graziano

    2005-10-05

    Currently available methods to predict splice sites are mainly based on the independent and progressive alignment of transcript data (mostly ESTs) to the genomic sequence. Apart from often being computationally expensive, this approach is vulnerable to several problems--hence the need to develop novel strategies. We propose a method, based on a novel multiple genome-EST alignment algorithm, for the detection of splice sites. To avoid limitations of splice sites prediction (mainly, over-predictions) due to independent single EST alignments to the genomic sequence our approach performs a multiple alignment of transcript data to the genomic sequence based on the combined analysis of all available data. We recast the problem of predicting constitutive and alternative splicing as an optimization problem, where the optimal multiple transcript alignment minimizes the number of exons and hence of splice site observations. We have implemented a splice site predictor based on this algorithm in the software tool ASPIC (Alternative Splicing PredICtion). It is distinguished from other methods based on BLAST-like tools by the incorporation of entirely new ad hoc procedures for accurate and computationally efficient transcript alignment and adopts dynamic programming for the refinement of intron boundaries. ASPIC also provides the minimal set of non-mergeable transcript isoforms compatible with the detected splicing events. The ASPIC web resource is dynamically interconnected with the Ensembl and Unigene databases and also implements an upload facility. Extensive bench marking shows that ASPIC outperforms other existing methods in the detection of novel splicing isoforms and in the minimization of over-predictions. ASPIC also requires a lower computation time for processing a single gene and an EST cluster. The ASPIC web resource is available at http://aspic.algo.disco.unimib.it/aspic-devel/.

  12. Polarization-dependent DANES study on vertically-aligned ZnO nanorods

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sun, Chengjun; Park, Chang-In; Jin, Zhenlan

    2016-05-01

    The local structural and local density of states of vertically-aligned ZnO nanorods were examined by using a polarization-dependent diffraction anomalous near edge structure (DANES) measurements from c-oriented ZnO nanorods at the Zn K edge with the incident x-ray electric field parallel and perpendicular to the x-ray momentum transfer direction. Orientation-dependent local structures determined by DANES were comparable with polarization-dependent EXAFS results. Unlike other techniques, polarization-dependent DANES can uniquely describe the orientation-dependent local structural properties and the local density of states of a selected element in selected-phased crystals of compounds or mixed-phased structures.

  13. Flexible, fast and accurate sequence alignment profiling on GPGPU with PaSWAS.

    PubMed

    Warris, Sven; Yalcin, Feyruz; Jackson, Katherine J L; Nap, Jan Peter

    2015-01-01

    To obtain large-scale sequence alignments in a fast and flexible way is an important step in the analyses of next generation sequencing data. Applications based on the Smith-Waterman (SW) algorithm are often either not fast enough, limited to dedicated tasks or not sufficiently accurate due to statistical issues. Current SW implementations that run on graphics hardware do not report the alignment details necessary for further analysis. With the Parallel SW Alignment Software (PaSWAS) it is possible (a) to have easy access to the computational power of NVIDIA-based general purpose graphics processing units (GPGPUs) to perform high-speed sequence alignments, and (b) retrieve relevant information such as score, number of gaps and mismatches. The software reports multiple hits per alignment. The added value of the new SW implementation is demonstrated with two test cases: (1) tag recovery in next generation sequence data and (2) isotype assignment within an immunoglobulin 454 sequence data set. Both cases show the usability and versatility of the new parallel Smith-Waterman implementation.

  14. Cross contrast multi-channel image registration using image synthesis for MR brain images.

    PubMed

    Chen, Min; Carass, Aaron; Jog, Amod; Lee, Junghoon; Roy, Snehashis; Prince, Jerry L

    2017-02-01

    Multi-modal deformable registration is important for many medical image analysis tasks such as atlas alignment, image fusion, and distortion correction. Whereas a conventional method would register images with different modalities using modality independent features or information theoretic metrics such as mutual information, this paper presents a new framework that addresses the problem using a two-channel registration algorithm capable of using mono-modal similarity measures such as sum of squared differences or cross-correlation. To make it possible to use these same-modality measures, image synthesis is used to create proxy images for the opposite modality as well as intensity-normalized images from each of the two available images. The new deformable registration framework was evaluated by performing intra-subject deformation recovery, intra-subject boundary alignment, and inter-subject label transfer experiments using multi-contrast magnetic resonance brain imaging data. Three different multi-channel registration algorithms were evaluated, revealing that the framework is robust to the multi-channel deformable registration algorithm that is used. With a single exception, all results demonstrated improvements when compared against single channel registrations using the same algorithm with mutual information. Copyright © 2016 Elsevier B.V. All rights reserved.

  15. MUSCLE: multiple sequence alignment with high accuracy and high throughput.

    PubMed

    Edgar, Robert C

    2004-01-01

    We describe MUSCLE, a new computer program for creating multiple alignments of protein sequences. Elements of the algorithm include fast distance estimation using kmer counting, progressive alignment using a new profile function we call the log-expectation score, and refinement using tree-dependent restricted partitioning. The speed and accuracy of MUSCLE are compared with T-Coffee, MAFFT and CLUSTALW on four test sets of reference alignments: BAliBASE, SABmark, SMART and a new benchmark, PREFAB. MUSCLE achieves the highest, or joint highest, rank in accuracy on each of these sets. Without refinement, MUSCLE achieves average accuracy statistically indistinguishable from T-Coffee and MAFFT, and is the fastest of the tested methods for large numbers of sequences, aligning 5000 sequences of average length 350 in 7 min on a current desktop computer. The MUSCLE program, source code and PREFAB test data are freely available at http://www.drive5. com/muscle.

  16. Primer-BLAST: A tool to design target-specific primers for polymerase chain reaction

    PubMed Central

    2012-01-01

    Background Choosing appropriate primers is probably the single most important factor affecting the polymerase chain reaction (PCR). Specific amplification of the intended target requires that primers do not have matches to other targets in certain orientations and within certain distances that allow undesired amplification. The process of designing specific primers typically involves two stages. First, the primers flanking regions of interest are generated either manually or using software tools; then they are searched against an appropriate nucleotide sequence database using tools such as BLAST to examine the potential targets. However, the latter is not an easy process as one needs to examine many details between primers and targets, such as the number and the positions of matched bases, the primer orientations and distance between forward and reverse primers. The complexity of such analysis usually makes this a time-consuming and very difficult task for users, especially when the primers have a large number of hits. Furthermore, although the BLAST program has been widely used for primer target detection, it is in fact not an ideal tool for this purpose as BLAST is a local alignment algorithm and does not necessarily return complete match information over the entire primer range. Results We present a new software tool called Primer-BLAST to alleviate the difficulty in designing target-specific primers. This tool combines BLAST with a global alignment algorithm to ensure a full primer-target alignment and is sensitive enough to detect targets that have a significant number of mismatches to primers. Primer-BLAST allows users to design new target-specific primers in one step as well as to check the specificity of pre-existing primers. Primer-BLAST also supports placing primers based on exon/intron locations and excluding single nucleotide polymorphism (SNP) sites in primers. Conclusions We describe a robust and fully implemented general purpose primer design tool that designs target-specific PCR primers. Primer-BLAST offers flexible options to adjust the specificity threshold and other primer properties. This tool is publicly available at http://www.ncbi.nlm.nih.gov/tools/primer-blast. PMID:22708584

  17. Local-global alignment for finding 3D similarities in protein structures

    DOEpatents

    Zemla, Adam T [Brentwood, CA

    2011-09-20

    A method of finding 3D similarities in protein structures of a first molecule and a second molecule. The method comprises providing preselected information regarding the first molecule and the second molecule. Comparing the first molecule and the second molecule using Longest Continuous Segments (LCS) analysis. Comparing the first molecule and the second molecule using Global Distance Test (GDT) analysis. Comparing the first molecule and the second molecule using Local Global Alignment Scoring function (LGA_S) analysis. Verifying constructed alignment and repeating the steps to find the regions of 3D similarities in protein structures.

  18. FoldMiner and LOCK 2: protein structure comparison and motif discovery on the web.

    PubMed

    Shapiro, Jessica; Brutlag, Douglas

    2004-07-01

    The FoldMiner web server (http://foldminer.stanford.edu/) provides remote access to methods for protein structure alignment and unsupervised motif discovery. FoldMiner is unique among such algorithms in that it improves both the motif definition and the sensitivity of a structural similarity search by combining the search and motif discovery methods and using information from each process to enhance the other. In a typical run, a query structure is aligned to all structures in one of several databases of single domain targets in order to identify its structural neighbors and to discover a motif that is the basis for the similarity among the query and statistically significant targets. This process is fully automated, but options for manual refinement of the results are available as well. The server uses the Chime plugin and customized controls to allow for visualization of the motif and of structural superpositions. In addition, we provide an interface to the LOCK 2 algorithm for rapid alignments of a query structure to smaller numbers of user-specified targets.

  19. [Motion control of moving mirror based on fixed-mirror adjustment in FTIR spectrometer].

    PubMed

    Li, Zhong-bing; Xu, Xian-ze; Le, Yi; Xu, Feng-qiu; Li, Jun-wei

    2012-08-01

    The performance of the uniform motion of the moving mirror, which is the only constant motion part in FTIR spectrometer, and the performance of the alignment of the fixed mirror play a key role in FTIR spectrometer, and affect the interference effect and the quality of the spectrogram and may restrict the precision and resolution of the instrument directly. The present article focuses on the research on the uniform motion of the moving mirror and the alignment of the fixed mirror. In order to improve the FTIR spectrometer, the maglev support system was designed for the moving mirror and the phase detection technology was adopted to adjust the tilt angle between the moving mirror and the fixed mirror. This paper also introduces an improved fuzzy PID control algorithm to get the accurate speed of the moving mirror and realize the control strategy from both hardware design and algorithm. The results show that the development of the moving mirror motion control system gets sufficient accuracy and real-time, which can ensure the uniform motion of the moving mirror and the alignment of the fixed mirror.

  20. A Coarse-Alignment Method Based on the Optimal-REQUEST Algorithm

    PubMed Central

    Zhu, Yongyun

    2018-01-01

    In this paper, we proposed a coarse-alignment method for strapdown inertial navigation systems based on attitude determination. The observation vectors, which can be obtained by inertial sensors, usually contain various types of noise, which affects the convergence rate and the accuracy of the coarse alignment. Given this drawback, we studied an attitude-determination method named optimal-REQUEST, which is an optimal method for attitude determination that is based on observation vectors. Compared to the traditional attitude-determination method, the filtering gain of the proposed method is tuned autonomously; thus, the convergence rate of the attitude determination is faster than in the traditional method. Within the proposed method, we developed an iterative method for determining the attitude quaternion. We carried out simulation and turntable tests, which we used to validate the proposed method’s performance. The experiment’s results showed that the convergence rate of the proposed optimal-REQUEST algorithm is faster and that the coarse alignment’s stability is higher. In summary, the proposed method has a high applicability to practical systems. PMID:29337895

  1. Local setup errors in image-guided radiotherapy for head and neck cancer patients immobilized with a custom-made device.

    PubMed

    Giske, Kristina; Stoiber, Eva M; Schwarz, Michael; Stoll, Armin; Muenter, Marc W; Timke, Carmen; Roeder, Falk; Debus, Juergen; Huber, Peter E; Thieke, Christian; Bendl, Rolf

    2011-06-01

    To evaluate the local positioning uncertainties during fractionated radiotherapy of head-and-neck cancer patients immobilized using a custom-made fixation device and discuss the effect of possible patient correction strategies for these uncertainties. A total of 45 head-and-neck patients underwent regular control computed tomography scanning using an in-room computed tomography scanner. The local and global positioning variations of all patients were evaluated by applying a rigid registration algorithm. One bounding box around the complete target volume and nine local registration boxes containing relevant anatomic structures were introduced. The resulting uncertainties for a stereotactic setup and the deformations referenced to one anatomic local registration box were determined. Local deformations of the patients immobilized using our custom-made device were compared with previously published results. Several patient positioning correction strategies were simulated, and the residual local uncertainties were calculated. The patient anatomy in the stereotactic setup showed local systematic positioning deviations of 1-4 mm. The deformations referenced to a particular anatomic local registration box were similar to the reported deformations assessed from patients immobilized with commercially available Aquaplast masks. A global correction, including the rotational error compensation, decreased the remaining local translational errors. Depending on the chosen patient positioning strategy, the remaining local uncertainties varied considerably. Local deformations in head-and-neck patients occur even if an elaborate, custom-made patient fixation method is used. A rotational error correction decreased the required margins considerably. None of the considered correction strategies achieved perfect alignment. Therefore, weighting of anatomic subregions to obtain the optimal correction vector should be investigated in the future. Copyright © 2011 Elsevier Inc. All rights reserved.

  2. An automatic approach for 3D registration of CT scans

    NASA Astrophysics Data System (ADS)

    Hu, Yang; Saber, Eli; Dianat, Sohail; Vantaram, Sreenath Rao; Abhyankar, Vishwas

    2012-03-01

    CT (Computed tomography) is a widely employed imaging modality in the medical field. Normally, a volume of CT scans is prescribed by a doctor when a specific region of the body (typically neck to groin) is suspected of being abnormal. The doctors are required to make professional diagnoses based upon the obtained datasets. In this paper, we propose an automatic registration algorithm that helps healthcare personnel to automatically align corresponding scans from 'Study' to 'Atlas'. The proposed algorithm is capable of aligning both 'Atlas' and 'Study' into the same resolution through 3D interpolation. After retrieving the scanned slice volume in the 'Study' and the corresponding volume in the original 'Atlas' dataset, a 3D cross correlation method is used to identify and register various body parts.

  3. Cover song identification by sequence alignment algorithms

    NASA Astrophysics Data System (ADS)

    Wang, Chih-Li; Zhong, Qian; Wang, Szu-Ying; Roychowdhury, Vwani

    2011-10-01

    Content-based music analysis has drawn much attention due to the rapidly growing digital music market. This paper describes a method that can be used to effectively identify cover songs. A cover song is a song that preserves only the crucial melody of its reference song but different in some other acoustic properties. Hence, the beat/chroma-synchronous chromagram, which is insensitive to the variation of the timber or rhythm of songs but sensitive to the melody, is chosen. The key transposition is achieved by cyclically shifting the chromatic domain of the chromagram. By using the Hidden Markov Model (HMM) to obtain the time sequences of songs, the system is made even more robust. Similar structure or length between the cover songs and its reference are not necessary by the Smith-Waterman Alignment Algorithm.

  4. The identification of complete domains within protein sequences using accurate E-values for semi-global alignment

    PubMed Central

    Kann, Maricel G.; Sheetlin, Sergey L.; Park, Yonil; Bryant, Stephen H.; Spouge, John L.

    2007-01-01

    The sequencing of complete genomes has created a pressing need for automated annotation of gene function. Because domains are the basic units of protein function and evolution, a gene can be annotated from a domain database by aligning domains to the corresponding protein sequence. Ideally, complete domains are aligned to protein subsequences, in a ‘semi-global alignment’. Local alignment, which aligns pieces of domains to subsequences, is common in high-throughput annotation applications, however. It is a mature technique, with the heuristics and accurate E-values required for screening large databases and evaluating the screening results. Hidden Markov models (HMMs) provide an alternative theoretical framework for semi-global alignment, but their use is limited because they lack heuristic acceleration and accurate E-values. Our new tool, GLOBAL, overcomes some limitations of previous semi-global HMMs: it has accurate E-values and the possibility of the heuristic acceleration required for high-throughput applications. Moreover, according to a standard of truth based on protein structure, two semi-global HMM alignment tools (GLOBAL and HMMer) had comparable performance in identifying complete domains, but distinctly outperformed two tools based on local alignment. When searching for complete protein domains, therefore, GLOBAL avoids disadvantages commonly associated with HMMs, yet maintains their superior retrieval performance. PMID:17596268

  5. Localization of ultra-low frequency waves in multi-ion plasmas of the planetary magnetosphere

    DOE PAGES

    Kim, Eun -Hwa; Johnson, Jay R.; Lee, Dong -Hun

    2015-01-01

    By adopting a 2D time-dependent wave code, we investigate how mode-converted waves at the Ion-Ion Hybrid (IIH) resonance and compressional waves propagate in 2D density structures with a wide range of field-aligned wavenumbers to background magnetic fields. The simulation results show that the mode-converted waves have continuous bands across the field line consistent with previous numerical studies. These waves also have harmonic structures in frequency domain and are localized in the field-aligned heavy ion density well. Lastly, our results thus emphasize the importance of a field-aligned heavy ion density structure for ultra-low frequency wave propagation, and suggest that IIH wavesmore » can be localized in different locations along the field line.« less

  6. Faster Smith-Waterman database searches with inter-sequence SIMD parallelisation

    PubMed Central

    2011-01-01

    Background The Smith-Waterman algorithm for local sequence alignment is more sensitive than heuristic methods for database searching, but also more time-consuming. The fastest approach to parallelisation with SIMD technology has previously been described by Farrar in 2007. The aim of this study was to explore whether further speed could be gained by other approaches to parallelisation. Results A faster approach and implementation is described and benchmarked. In the new tool SWIPE, residues from sixteen different database sequences are compared in parallel to one query residue. Using a 375 residue query sequence a speed of 106 billion cell updates per second (GCUPS) was achieved on a dual Intel Xeon X5650 six-core processor system, which is over six times more rapid than software based on Farrar's 'striped' approach. SWIPE was about 2.5 times faster when the programs used only a single thread. For shorter queries, the increase in speed was larger. SWIPE was about twice as fast as BLAST when using the BLOSUM50 score matrix, while BLAST was about twice as fast as SWIPE for the BLOSUM62 matrix. The software is designed for 64 bit Linux on processors with SSSE3. Source code is available from http://dna.uio.no/swipe/ under the GNU Affero General Public License. Conclusions Efficient parallelisation using SIMD on standard hardware makes it possible to run Smith-Waterman database searches more than six times faster than before. The approach described here could significantly widen the potential application of Smith-Waterman searches. Other applications that require optimal local alignment scores could also benefit from improved performance. PMID:21631914

  7. Faster Smith-Waterman database searches with inter-sequence SIMD parallelisation.

    PubMed

    Rognes, Torbjørn

    2011-06-01

    The Smith-Waterman algorithm for local sequence alignment is more sensitive than heuristic methods for database searching, but also more time-consuming. The fastest approach to parallelisation with SIMD technology has previously been described by Farrar in 2007. The aim of this study was to explore whether further speed could be gained by other approaches to parallelisation. A faster approach and implementation is described and benchmarked. In the new tool SWIPE, residues from sixteen different database sequences are compared in parallel to one query residue. Using a 375 residue query sequence a speed of 106 billion cell updates per second (GCUPS) was achieved on a dual Intel Xeon X5650 six-core processor system, which is over six times more rapid than software based on Farrar's 'striped' approach. SWIPE was about 2.5 times faster when the programs used only a single thread. For shorter queries, the increase in speed was larger. SWIPE was about twice as fast as BLAST when using the BLOSUM50 score matrix, while BLAST was about twice as fast as SWIPE for the BLOSUM62 matrix. The software is designed for 64 bit Linux on processors with SSSE3. Source code is available from http://dna.uio.no/swipe/ under the GNU Affero General Public License. Efficient parallelisation using SIMD on standard hardware makes it possible to run Smith-Waterman database searches more than six times faster than before. The approach described here could significantly widen the potential application of Smith-Waterman searches. Other applications that require optimal local alignment scores could also benefit from improved performance.

  8. An evaluation of talker localization based on direction of arrival estimation and statistical sound source identification

    NASA Astrophysics Data System (ADS)

    Nishiura, Takanobu; Nakamura, Satoshi

    2002-11-01

    It is very important to capture distant-talking speech for a hands-free speech interface with high quality. A microphone array is an ideal candidate for this purpose. However, this approach requires localizing the target talker. Conventional talker localization algorithms in multiple sound source environments not only have difficulty localizing the multiple sound sources accurately, but also have difficulty localizing the target talker among known multiple sound source positions. To cope with these problems, we propose a new talker localization algorithm consisting of two algorithms. One is DOA (direction of arrival) estimation algorithm for multiple sound source localization based on CSP (cross-power spectrum phase) coefficient addition method. The other is statistical sound source identification algorithm based on GMM (Gaussian mixture model) for localizing the target talker position among localized multiple sound sources. In this paper, we particularly focus on the talker localization performance based on the combination of these two algorithms with a microphone array. We conducted evaluation experiments in real noisy reverberant environments. As a result, we confirmed that multiple sound signals can be identified accurately between ''speech'' or ''non-speech'' by the proposed algorithm. [Work supported by ATR, and MEXT of Japan.

  9. CoMFA analyses of C-2 position salvinorin A analogs at the kappa-opioid receptor provides insights into epimer selectivity.

    PubMed

    McGovern, Donna L; Mosier, Philip D; Roth, Bryan L; Westkaemper, Richard B

    2010-04-01

    The highly potent and kappa-opioid (KOP) receptor-selective hallucinogen Salvinorin A and selected analogs have been analyzed using the 3D quantitative structure-affinity relationship technique Comparative Molecular Field Analysis (CoMFA) in an effort to derive a statistically significant and predictive model of salvinorin affinity at the KOP receptor and to provide additional statistical support for the validity of previously proposed structure-based interaction models. Two CoMFA models of Salvinorin A analogs substituted at the C-2 position are presented. Separate models were developed based on the radioligand used in the kappa-opioid binding assay, [(3)H]diprenorphine or [(125)I]6 beta-iodo-3,14-dihydroxy-17-cyclopropylmethyl-4,5 alpha-epoxymorphinan ([(125)I]IOXY). For each dataset, three methods of alignment were employed: a receptor-docked alignment derived from the structure-based docking algorithm GOLD, another from the ligand-based alignment algorithm FlexS, and a rigid realignment of the poses from the receptor-docked alignment. The receptor-docked alignment produced statistically superior results compared to either the FlexS alignment or the realignment in both datasets. The [(125)I]IOXY set (Model 1) and [(3)H]diprenorphine set (Model 2) gave q(2) values of 0.592 and 0.620, respectively, using the receptor-docked alignment, and both models produced similar CoMFA contour maps that reflected the stereoelectronic features of the receptor model from which they were derived. Each model gave significantly predictive CoMFA statistics (Model 1 PSET r(2)=0.833; Model 2 PSET r(2)=0.813). Based on the CoMFA contour maps, a binding mode was proposed for amine-containing Salvinorin A analogs that provides a rationale for the observation that the beta-epimers (R-configuration) of protonated amines at the C-2 position have a higher affinity than the corresponding alpha-epimers (S-configuration). (c) 2010. Published by Elsevier Inc.

  10. Alignment methods: strategies, challenges, benchmarking, and comparative overview.

    PubMed

    Löytynoja, Ari

    2012-01-01

    Comparative evolutionary analyses of molecular sequences are solely based on the identities and differences detected between homologous characters. Errors in this homology statement, that is errors in the alignment of the sequences, are likely to lead to errors in the downstream analyses. Sequence alignment and phylogenetic inference are tightly connected and many popular alignment programs use the phylogeny to divide the alignment problem into smaller tasks. They then neglect the phylogenetic tree, however, and produce alignments that are not evolutionarily meaningful. The use of phylogeny-aware methods reduces the error but the resulting alignments, with evolutionarily correct representation of homology, can challenge the existing practices and methods for viewing and visualising the sequences. The inter-dependency of alignment and phylogeny can be resolved by joint estimation of the two; methods based on statistical models allow for inferring the alignment parameters from the data and correctly take into account the uncertainty of the solution but remain computationally challenging. Widely used alignment methods are based on heuristic algorithms and unlikely to find globally optimal solutions. The whole concept of one correct alignment for the sequences is questionable, however, as there typically exist vast numbers of alternative, roughly equally good alignments that should also be considered. This uncertainty is hidden by many popular alignment programs and is rarely correctly taken into account in the downstream analyses. The quest for finding and improving the alignment solution is complicated by the lack of suitable measures of alignment goodness. The difficulty of comparing alternative solutions also affects benchmarks of alignment methods and the results strongly depend on the measure used. As the effects of alignment error cannot be predicted, comparing the alignments' performance in downstream analyses is recommended.

  11. Minimap2: pairwise alignment for nucleotide sequences.

    PubMed

    Li, Heng

    2018-05-10

    Recent advances in sequencing technologies promise ultra-long reads of ∼100 kilo bases (kb) in average, full-length mRNA or cDNA reads in high throughput and genomic contigs over 100 mega bases (Mb) in length. Existing alignment programs are unable or inefficient to process such data at scale, which presses for the development of new alignment algorithms. Minimap2 is a general-purpose alignment program to map DNA or long mRNA sequences against a large reference database. It works with accurate short reads of ≥ 100bp in length, ≥1kb genomic reads at error rate ∼15%, full-length noisy Direct RNA or cDNA reads, and assembly contigs or closely related full chromosomes of hundreds of megabases in length. Minimap2 does split-read alignment, employs concave gap cost for long insertions and deletions (INDELs) and introduces new heuristics to reduce spurious alignments. It is 3-4 times as fast as mainstream short-read mappers at comparable accuracy, and is ≥30 times faster than long-read genomic or cDNA mappers at higher accuracy, surpassing most aligners specialized in one type of alignment. https://github.com/lh3/minimap2. hengli@broadinstitute.org.

  12. Parametric inference for biological sequence analysis.

    PubMed

    Pachter, Lior; Sturmfels, Bernd

    2004-11-16

    One of the major successes in computational biology has been the unification, by using the graphical model formalism, of a multitude of algorithms for annotating and comparing biological sequences. Graphical models that have been applied to these problems include hidden Markov models for annotation, tree models for phylogenetics, and pair hidden Markov models for alignment. A single algorithm, the sum-product algorithm, solves many of the inference problems that are associated with different statistical models. This article introduces the polytope propagation algorithm for computing the Newton polytope of an observation from a graphical model. This algorithm is a geometric version of the sum-product algorithm and is used to analyze the parametric behavior of maximum a posteriori inference calculations for graphical models.

  13. Computer vision applications for coronagraphic optical alignment and image processing.

    PubMed

    Savransky, Dmitry; Thomas, Sandrine J; Poyneer, Lisa A; Macintosh, Bruce A

    2013-05-10

    Modern coronagraphic systems require very precise alignment between optical components and can benefit greatly from automated image processing. We discuss three techniques commonly employed in the fields of computer vision and image analysis as applied to the Gemini Planet Imager, a new facility instrument for the Gemini South Observatory. We describe how feature extraction and clustering methods can be used to aid in automated system alignment tasks, and also present a search algorithm for finding regular features in science images used for calibration and data processing. Along with discussions of each technique, we present our specific implementation and show results of each one in operation.

  14. Pairwise alignment of chromatograms using an extended Fisher-Rao metric.

    PubMed

    Wallace, W E; Srivastava, A; Telu, K H; Simón-Manso, Y

    2014-09-02

    A conceptually new approach for aligning chromatograms is introduced and applied to examples of metabolite identification in human blood plasma by liquid chromatography-mass spectrometry (LC-MS). A square-root representation of the chromatogram's derivative coupled with an extended Fisher-Rao metric enables the computation of relative differences between chromatograms. Minimization of these differences using a common dynamic programming algorithm brings the chromatograms into alignment. Application to a complex sample, National Institute of Standards and Technology (NIST) Standard Reference Material 1950, Metabolites in Human Plasma, analyzed by two different LC-MS methods having significantly different ranges of elution time is described. Published by Elsevier B.V.

  15. Real-time calibration and alignment of the LHCb RICH detectors

    NASA Astrophysics Data System (ADS)

    HE, Jibo

    2017-12-01

    In 2015, the LHCb experiment established a new and unique software trigger strategy with the purpose of increasing the purity of the signal events by applying the same algorithms online and offline. To achieve this, real-time calibration and alignment of all LHCb sub-systems is needed to provide vertexing, tracking, and particle identification of the best possible quality. The calibration of the refractive index of the RICH radiators, the calibration of the Hybrid Photon Detector image, and the alignment of the RICH mirror system, are reported in this contribution. The stability of the RICH performance and the particle identification performance are also discussed.

  16. BigFoot: Bayesian alignment and phylogenetic footprinting with MCMC

    PubMed Central

    Satija, Rahul; Novák, Ádám; Miklós, István; Lyngsø, Rune; Hein, Jotun

    2009-01-01

    Background We have previously combined statistical alignment and phylogenetic footprinting to detect conserved functional elements without assuming a fixed alignment. Considering a probability-weighted distribution of alignments removes sensitivity to alignment errors, properly accommodates regions of alignment uncertainty, and increases the accuracy of functional element prediction. Our method utilized standard dynamic programming hidden markov model algorithms to analyze up to four sequences. Results We present a novel approach, implemented in the software package BigFoot, for performing phylogenetic footprinting on greater numbers of sequences. We have developed a Markov chain Monte Carlo (MCMC) approach which samples both sequence alignments and locations of slowly evolving regions. We implement our method as an extension of the existing StatAlign software package and test it on well-annotated regions controlling the expression of the even-skipped gene in Drosophila and the α-globin gene in vertebrates. The results exhibit how adding additional sequences to the analysis has the potential to improve the accuracy of functional predictions, and demonstrate how BigFoot outperforms existing alignment-based phylogenetic footprinting techniques. Conclusion BigFoot extends a combined alignment and phylogenetic footprinting approach to analyze larger amounts of sequence data using MCMC. Our approach is robust to alignment error and uncertainty and can be applied to a variety of biological datasets. The source code and documentation are publicly available for download from PMID:19715598

  17. BigFoot: Bayesian alignment and phylogenetic footprinting with MCMC.

    PubMed

    Satija, Rahul; Novák, Adám; Miklós, István; Lyngsø, Rune; Hein, Jotun

    2009-08-28

    We have previously combined statistical alignment and phylogenetic footprinting to detect conserved functional elements without assuming a fixed alignment. Considering a probability-weighted distribution of alignments removes sensitivity to alignment errors, properly accommodates regions of alignment uncertainty, and increases the accuracy of functional element prediction. Our method utilized standard dynamic programming hidden markov model algorithms to analyze up to four sequences. We present a novel approach, implemented in the software package BigFoot, for performing phylogenetic footprinting on greater numbers of sequences. We have developed a Markov chain Monte Carlo (MCMC) approach which samples both sequence alignments and locations of slowly evolving regions. We implement our method as an extension of the existing StatAlign software package and test it on well-annotated regions controlling the expression of the even-skipped gene in Drosophila and the alpha-globin gene in vertebrates. The results exhibit how adding additional sequences to the analysis has the potential to improve the accuracy of functional predictions, and demonstrate how BigFoot outperforms existing alignment-based phylogenetic footprinting techniques. BigFoot extends a combined alignment and phylogenetic footprinting approach to analyze larger amounts of sequence data using MCMC. Our approach is robust to alignment error and uncertainty and can be applied to a variety of biological datasets. The source code and documentation are publicly available for download from http://www.stats.ox.ac.uk/~satija/BigFoot/

  18. AlexSys: a knowledge-based expert system for multiple sequence alignment construction and analysis

    PubMed Central

    Aniba, Mohamed Radhouene; Poch, Olivier; Marchler-Bauer, Aron; Thompson, Julie Dawn

    2010-01-01

    Multiple sequence alignment (MSA) is a cornerstone of modern molecular biology and represents a unique means of investigating the patterns of conservation and diversity in complex biological systems. Many different algorithms have been developed to construct MSAs, but previous studies have shown that no single aligner consistently outperforms the rest. This has led to the development of a number of ‘meta-methods’ that systematically run several aligners and merge the output into one single solution. Although these methods generally produce more accurate alignments, they are inefficient because all the aligners need to be run first and the choice of the best solution is made a posteriori. Here, we describe the development of a new expert system, AlexSys, for the multiple alignment of protein sequences. AlexSys incorporates an intelligent inference engine to automatically select an appropriate aligner a priori, depending only on the nature of the input sequences. The inference engine was trained on a large set of reference multiple alignments, using a novel machine learning approach. Applying AlexSys to a test set of 178 alignments, we show that the expert system represents a good compromise between alignment quality and running time, making it suitable for high throughput projects. AlexSys is freely available from http://alnitak.u-strasbg.fr/∼aniba/alexsys. PMID:20530533

  19. Accountability and Alignment under No Child Left Behind: Multi-Level Perspectives for Educational Leaders

    ERIC Educational Resources Information Center

    Choi, Daniel

    2011-01-01

    Educational leaders have faced the challenges of trying to align schoolwide reforms priorities with accountability demands under the No Child Left Behind law. This article examines the barriers that complicate meaningful alignment among federal, state and local levels. This article also offers the following recommendations: Schools and districts…

  20. Validation of Coevolving Residue Algorithms via Pipeline Sensitivity Analysis: ELSC and OMES and ZNMI, Oh My!

    PubMed Central

    Brown, Christopher A.; Brown, Kevin S.

    2010-01-01

    Correlated amino acid substitution algorithms attempt to discover groups of residues that co-fluctuate due to either structural or functional constraints. Although these algorithms could inform both ab initio protein folding calculations and evolutionary studies, their utility for these purposes has been hindered by a lack of confidence in their predictions due to hard to control sources of error. To complicate matters further, naive users are confronted with a multitude of methods to choose from, in addition to the mechanics of assembling and pruning a dataset. We first introduce a new pair scoring method, called ZNMI (Z-scored-product Normalized Mutual Information), which drastically improves the performance of mutual information for co-fluctuating residue prediction. Second and more important, we recast the process of finding coevolving residues in proteins as a data-processing pipeline inspired by the medical imaging literature. We construct an ensemble of alignment partitions that can be used in a cross-validation scheme to assess the effects of choices made during the procedure on the resulting predictions. This pipeline sensitivity study gives a measure of reproducibility (how similar are the predictions given perturbations to the pipeline?) and accuracy (are residue pairs with large couplings on average close in tertiary structure?). We choose a handful of published methods, along with ZNMI, and compare their reproducibility and accuracy on three diverse protein families. We find that (i) of the algorithms tested, while none appear to be both highly reproducible and accurate, ZNMI is one of the most accurate by far and (ii) while users should be wary of predictions drawn from a single alignment, considering an ensemble of sub-alignments can help to determine both highly accurate and reproducible couplings. Our cross-validation approach should be of interest both to developers and end users of algorithms that try to detect correlated amino acid substitutions. PMID:20531955

Top