alignment algorithm determined: Topics by Science.gov

Sample records for alignment algorithm determined

A novel approach to multiple sequence alignment using hadoop data grids.

PubMed

Sudha Sadasivam, G; Baktavatchalam, G

2010-01-01

Multiple alignment of protein sequences helps to determine evolutionary linkage and to predict molecular structures. The factors to be considered while aligning multiple sequences are speed and accuracy of alignment. Although dynamic programming algorithms produce accurate alignments, they are computation intensive. In this paper we propose a time efficient approach to sequence alignment that also produces quality alignment. The dynamic nature of the algorithm coupled with data and computational parallelism of hadoop data grids improves the accuracy and speed of sequence alignment. The principle of block splitting in hadoop coupled with its scalability facilitates alignment of very large sequences.
Coarse Alignment Technology on Moving base for SINS Based on the Improved Quaternion Filter Algorithm.

PubMed

Zhang, Tao; Zhu, Yongyun; Zhou, Feng; Yan, Yaxiong; Tong, Jinwu

2017-06-17

Initial alignment of the strapdown inertial navigation system (SINS) is intended to determine the initial attitude matrix in a short time with certain accuracy. The alignment accuracy of the quaternion filter algorithm is remarkable, but the convergence rate is slow. To solve this problem, this paper proposes an improved quaternion filter algorithm for faster initial alignment based on the error model of the quaternion filter algorithm. The improved quaternion filter algorithm constructs the K matrix based on the principle of optimal quaternion algorithm, and rebuilds the measurement model by containing acceleration and velocity errors to make the convergence rate faster. A doppler velocity log (DVL) provides the reference velocity for the improved quaternion filter alignment algorithm. In order to demonstrate the performance of the improved quaternion filter algorithm in the field, a turntable experiment and a vehicle test are carried out. The results of the experiments show that the convergence rate of the proposed improved quaternion filter is faster than that of the tradition quaternion filter algorithm. In addition, the improved quaternion filter algorithm also demonstrates advantages in terms of correctness, effectiveness, and practicability.
Accurate multiple sequence-structure alignment of RNA sequences using combinatorial optimization.

PubMed

Bauer, Markus; Klau, Gunnar W; Reinert, Knut

2007-07-27

The discovery of functional non-coding RNA sequences has led to an increasing interest in algorithms related to RNA analysis. Traditional sequence alignment algorithms, however, fail at computing reliable alignments of low-homology RNA sequences. The spatial conformation of RNA sequences largely determines their function, and therefore RNA alignment algorithms have to take structural information into account. We present a graph-based representation for sequence-structure alignments, which we model as an integer linear program (ILP). We sketch how we compute an optimal or near-optimal solution to the ILP using methods from combinatorial optimization, and present results on a recently published benchmark set for RNA alignments. The implementation of our algorithm yields better alignments in terms of two published scores than the other programs that we tested: This is especially the case with an increasing number of input sequences. Our program LARA is freely available for academic purposes from http://www.planet-lisa.net.
Mobile and replicated alignment of arrays in data-parallel programs

NASA Technical Reports Server (NTRS)

Chatterjee, Siddhartha; Gilbert, John R.; Schreiber, Robert

1993-01-01

When a data-parallel language like FORTRAN 90 is compiled for a distributed-memory machine, aggregate data objects (such as arrays) are distributed across the processor memories. The mapping determines the amount of residual communication needed to bring operands of parallel operations into alignment with each other. A common approach is to break the mapping into two stages: first, an alignment that maps all the objects to an abstract template, and then a distribution that maps the template to the processors. We solve two facets of the problem of finding alignments that reduce residual communication: we determine alignments that vary in loops, and objects that should have replicated alignments. We show that loop-dependent mobile alignment is sometimes necessary for optimum performance, and we provide algorithms with which a compiler can determine good mobile alignments for objects within do loops. We also identify situations in which replicated alignment is either required by the program itself (via spread operations) or can be used to improve performance. We propose an algorithm based on network flow that determines which objects to replicate so as to minimize the total amount of broadcast communication in replication. This work on mobile and replicated alignment extends our earlier work on determining static alignment.
ChromAlign: A two-step algorithmic procedure for time alignment of three-dimensional LC-MS chromatographic surfaces.

PubMed

Sadygov, Rovshan G; Maroto, Fernando Martin; Hühmer, Andreas F R

2006-12-15

We present an algorithmic approach to align three-dimensional chromatographic surfaces of LC-MS data of complex mixture samples. The approach consists of two steps. In the first step, we prealign chromatographic profiles: two-dimensional projections of chromatographic surfaces. This is accomplished by correlation analysis using fast Fourier transforms. In this step, a temporal offset that maximizes the overlap and dot product between two chromatographic profiles is determined. In the second step, the algorithm generates correlation matrix elements between full mass scans of the reference and sample chromatographic surfaces. The temporal offset from the first step indicates a range of the mass scans that are possibly correlated, then the correlation matrix is calculated only for these mass scans. The correlation matrix carries information on highly correlated scans, but it does not itself determine the scan or time alignment. Alignment is determined as a path in the correlation matrix that maximizes the sum of the correlation matrix elements. The computational complexity of the optimal path generation problem is reduced by the use of dynamic programming. The program produces time-aligned surfaces. The use of the temporal offset from the first step in the second step reduces the computation time for generating the correlation matrix and speeds up the process. The algorithm has been implemented in a program, ChromAlign, developed in C++ language for the .NET2 environment in WINDOWS XP. In this work, we demonstrate the applications of ChromAlign to alignment of LC-MS surfaces of several datasets: a mixture of known proteins, samples from digests of surface proteins of T-cells, and samples prepared from digests of cerebrospinal fluid. ChromAlign accurately aligns the LC-MS surfaces we studied. In these examples, we discuss various aspects of the alignment by ChromAlign, such as constant time axis shifts and warping of chromatographic surfaces.
Algorithms for Automatic Alignment of Arrays

NASA Technical Reports Server (NTRS)

Chatterjee, Siddhartha; Gilbert, John R.; Oliker, Leonid; Schreiber, Robert; Sheffler, Thomas J.

1996-01-01

Aggregate data objects (such as arrays) are distributed across the processor memories when compiling a data-parallel language for a distributed-memory machine. The mapping determines the amount of communication needed to bring operands of parallel operations into alignment with each other. A common approach is to break the mapping into two stages: an alignment that maps all the objects to an abstract template, followed by a distribution that maps the template to the processors. This paper describes algorithms for solving the various facets of the alignment problem: axis and stride alignment, static and mobile offset alignment, and replication labeling. We show that optimal axis and stride alignment is NP-complete for general program graphs, and give a heuristic method that can explore the space of possible solutions in a number of ways. We show that some of these strategies can give better solutions than a simple greedy approach proposed earlier. We also show how local graph contractions can reduce the size of the problem significantly without changing the best solution. This allows more complex and effective heuristics to be used. We show how to model the static offset alignment problem using linear programming, and we show that loop-dependent mobile offset alignment is sometimes necessary for optimum performance. We describe an algorithm with for determining mobile alignments for objects within do loops. We also identify situations in which replicated alignment is either required by the program itself or can be used to improve performance. We describe an algorithm based on network flow that replicates objects so as to minimize the total amount of broadcast communication in replication.
Evaluation of Laser Based Alignment Algorithms Under Additive Random and Diffraction Noise

DOE Office of Scientific and Technical Information (OSTI.GOV)

McClay, W A; Awwal, A; Wilhelmsen, K

2004-09-30

The purpose of the automatic alignment algorithm at the National Ignition Facility (NIF) is to determine the position of a laser beam based on the position of beam features from video images. The position information obtained is used to command motors and attenuators to adjust the beam lines to the desired position, which facilitates the alignment of all 192 beams. One of the goals of the algorithm development effort is to ascertain the performance, reliability, and uncertainty of the position measurement. This paper describes a method of evaluating the performance of algorithms using Monte Carlo simulation. In particular we showmore » the application of this technique to the LM1{_}LM3 algorithm, which determines the position of a series of two beam light sources. The performance of the algorithm was evaluated for an ensemble of over 900 simulated images with varying image intensities and noise counts, as well as varying diffraction noise amplitude and frequency. The performance of the algorithm on the image data set had a tolerance well beneath the 0.5-pixel system requirement.« less
Phase Retrieval Using a Genetic Algorithm on the Systematic Image-Based Optical Alignment Testbed

NASA Technical Reports Server (NTRS)

Taylor, Jaime R.

2003-01-01

NASA s Marshall Space Flight Center s Systematic Image-Based Optical Alignment (SIBOA) Testbed was developed to test phase retrieval algorithms and hardware techniques. Individuals working with the facility developed the idea of implementing phase retrieval by breaking the determination of the tip/tilt of each mirror apart from the piston motion (or translation) of each mirror. Presented in this report is an algorithm that determines the optimal phase correction associated only with the piston motion of the mirrors. A description of the Phase Retrieval problem is first presented. The Systematic Image-Based Optical Alignment (SIBOA) Testbeb is then described. A Discrete Fourier Transform (DFT) is necessary to transfer the incoming wavefront (or estimate of phase error) into the spatial frequency domain to compare it with the image. A method for reducing the DFT to seven scalar/matrix multiplications is presented. A genetic algorithm is then used to search for the phase error. The results of this new algorithm on a test problem are presented.
A Self-Alignment Algorithm for SINS Based on Gravitational Apparent Motion and Sensor Data Denoising

PubMed Central

Liu, Yiting; Xu, Xiaosu; Liu, Xixiang; Yao, Yiqing; Wu, Liang; Sun, Jin

2015-01-01

Initial alignment is always a key topic and difficult to achieve in an inertial navigation system (INS). In this paper a novel self-initial alignment algorithm is proposed using gravitational apparent motion vectors at three different moments and vector-operation. Simulation and analysis showed that this method easily suffers from the random noise contained in accelerometer measurements which are used to construct apparent motion directly. Aiming to resolve this problem, an online sensor data denoising method based on a Kalman filter is proposed and a novel reconstruction method for apparent motion is designed to avoid the collinearity among vectors participating in the alignment solution. Simulation, turntable tests and vehicle tests indicate that the proposed alignment algorithm can fulfill initial alignment of strapdown INS (SINS) under both static and swinging conditions. The accuracy can either reach or approach the theoretical values determined by sensor precision under static or swinging conditions. PMID:25923932
Optimal Parameter Design of Coarse Alignment for Fiber Optic Gyro Inertial Navigation System.

PubMed

Lu, Baofeng; Wang, Qiuying; Yu, Chunmei; Gao, Wei

2015-06-25

Two different coarse alignment algorithms for Fiber Optic Gyro (FOG) Inertial Navigation System (INS) based on inertial reference frame are discussed in this paper. Both of them are based on gravity vector integration, therefore, the performance of these algorithms is determined by integration time. In previous works, integration time is selected by experience. In order to give a criterion for the selection process, and make the selection of the integration time more accurate, optimal parameter design of these algorithms for FOG INS is performed in this paper. The design process is accomplished based on the analysis of the error characteristics of these two coarse alignment algorithms. Moreover, this analysis and optimal parameter design allow us to make an adequate selection of the most accurate algorithm for FOG INS according to the actual operational conditions. The analysis and simulation results show that the parameter provided by this work is the optimal value, and indicate that in different operational conditions, the coarse alignment algorithms adopted for FOG INS are different in order to achieve better performance. Lastly, the experiment results validate the effectiveness of the proposed algorithm.
Neurient: An Algorithm for Automatic Tracing of Confluent Neuronal Images to Determine Alignment

PubMed Central

Mitchel, J.A.; Martin, I.S.

2013-01-01

A goal of neural tissue engineering is the development and evaluation of materials that guide neuronal growth and alignment. However, the methods available to quantitatively evaluate the response of neurons to guidance materials are limited and/or expensive, and may require manual tracing to be performed by the researcher. We have developed an open source, automated Matlab-based algorithm, building on previously published methods, to trace and quantify alignment of fluorescent images of neurons in culture. The algorithm is divided into three phases, including computation of a lookup table which contains directional information for each image, location of a set of seed points which may lie along neurite centerlines, and tracing neurites starting with each seed point and indexing into the lookup table. This method was used to obtain quantitative alignment data for complex images of densely cultured neurons. Complete automation of tracing allows for unsupervised processing of large numbers of images. Following image processing with our algorithm, available metrics to quantify neurite alignment include angular histograms, percent of neurite segments in a given direction, and mean neurite angle. The alignment information obtained from traced images can be used to compare the response of neurons to a range of conditions. This tracing algorithm is freely available to the scientific community under the name Neurient, and its implementation in Matlab allows a wide range of researchers to use a standardized, open source method to quantitatively evaluate the alignment of dense neuronal cultures. PMID:23384629
Multiple nodes transfer alignment for airborne missiles based on inertial sensor network

NASA Astrophysics Data System (ADS)

Si, Fan; Zhao, Yan

2017-09-01

Transfer alignment is an important initialization method for airborne missiles because the alignment accuracy largely determines the performance of the missile. However, traditional alignment methods are limited by complicated and unknown flexure angle, and cannot meet the actual requirement when wing flexure deformation occurs. To address this problem, we propose a new method that uses the relative navigation parameters between the weapons and fighter to achieve transfer alignment. First, in the relative inertial navigation algorithm, the relative attitudes and positions are constantly computed in wing flexure deformation situations. Secondly, the alignment results of each weapon are processed using a data fusion algorithm to improve the overall performance. Finally, the feasibility and performance of the proposed method were evaluated under two typical types of deformation, and the simulation results demonstrated that the new transfer alignment method is practical and has high-precision.
A Comprehensive Two-Dimensional Retention Time Alignment Algorithm To Enhance Chemometric Analysis of Comprehensive Two-Dimensional Separation Data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pierce, Karisa M.; Wood, Lianna F.; Wright, Bob W.

2005-12-01

A comprehensive two-dimensional (2D) retention time alignment algorithm was developed using a novel indexing scheme. The algorithm is termed comprehensive because it functions to correct the entire chromatogram in both dimensions and it preserves the separation information in both dimensions. Although the algorithm is demonstrated by correcting comprehensive two-dimensional gas chromatography (GC x GC) data, the algorithm is designed to correct shifting in all forms of 2D separations, such as LC x LC, LC x CE, CE x CE, and LC x GC. This 2D alignment algorithm was applied to three different data sets composed of replicate GC x GCmore » separations of (1) three 22-component control mixtures, (2) three gasoline samples, and (3) three diesel samples. The three data sets were collected using slightly different temperature or pressure programs to engender significant retention time shifting in the raw data and then demonstrate subsequent corrections of that shifting upon comprehensive 2D alignment of the data sets. Thirty 12-min GC x GC separations from three 22-component control mixtures were used to evaluate the 2D alignment performance (10 runs/mixture). The average standard deviation of the first column retention time improved 5-fold from 0.020 min (before alignment) to 0.004 min (after alignment). Concurrently, the average standard deviation of second column retention time improved 4-fold from 3.5 ms (before alignment) to 0.8 ms (after alignment). Alignment of the 30 control mixture chromatograms took 20 min. The quantitative integrity of the GC x GC data following 2D alignment was also investigated. The mean integrated signal was determined for all components in the three 22-component mixtures for all 30 replicates. The average percent difference in the integrated signal for each component before and after alignment was 2.6%. Singular value decomposition (SVD) was applied to the 22-component control mixture data before and after alignment to show the restoration of trilinearity to the data, since trilinearity benefits chemometric analysis. By applying comprehensive 2D retention time alignment to all three data sets (control mixtures, gasoline samples, and diesel samples), classification by principal component analysis (PCA) substantially improved, resulting in 100% accurate scores clustering.« less
A Polar Initial Alignment Algorithm for Unmanned Underwater Vehicles

PubMed Central

Yan, Zheping; Wang, Lu; Wang, Tongda; Zhang, Honghan; Zhang, Xun; Liu, Xiangling

2017-01-01

Due to its highly autonomy, the strapdown inertial navigation system (SINS) is widely used in unmanned underwater vehicles (UUV) navigation. Initial alignment is crucial because the initial alignment results will be used as the initial SINS value, which might affect the subsequent SINS results. Due to the rapid convergence of Earth meridians, there is a calculation overflow in conventional initial alignment algorithms, making conventional initial algorithms are invalid for polar UUV navigation. To overcome these problems, a polar initial alignment algorithm for UUV is proposed in this paper, which consists of coarse and fine alignment algorithms. Based on the principle of the conical slow drift of gravity, the coarse alignment algorithm is derived under the grid frame. By choosing the velocity and attitude as the measurement, the fine alignment with the Kalman filter (KF) is derived under the grid frame. Simulation and experiment are realized among polar, conventional and transversal initial alignment algorithms for polar UUV navigation. Results demonstrate that the proposed polar initial alignment algorithm can complete the initial alignment of UUV in the polar region rapidly and accurately. PMID:29168735
Optimization of sequence alignment for simple sequence repeat regions.

PubMed

Jighly, Abdulqader; Hamwieh, Aladdin; Ogbonnaya, Francis C

2011-07-20

Microsatellites, or simple sequence repeats (SSRs), are tandemly repeated DNA sequences, including tandem copies of specific sequences no longer than six bases, that are distributed in the genome. SSR has been used as a molecular marker because it is easy to detect and is used in a range of applications, including genetic diversity, genome mapping, and marker assisted selection. It is also very mutable because of slipping in the DNA polymerase during DNA replication. This unique mutation increases the insertion/deletion (INDELs) mutation frequency to a high ratio - more than other types of molecular markers such as single nucleotide polymorphism (SNPs).SNPs are more frequent than INDELs. Therefore, all designed algorithms for sequence alignment fit the vast majority of the genomic sequence without considering microsatellite regions, as unique sequences that require special consideration. The old algorithm is limited in its application because there are many overlaps between different repeat units which result in false evolutionary relationships. To overcome the limitation of the aligning algorithm when dealing with SSR loci, a new algorithm was developed using PERL script with a Tk graphical interface. This program is based on aligning sequences after determining the repeated units first, and the last SSR nucleotides positions. This results in a shifting process according to the inserted repeated unit type.When studying the phylogenic relations before and after applying the new algorithm, many differences in the trees were obtained by increasing the SSR length and complexity. However, less distance between different linage had been observed after applying the new algorithm. The new algorithm produces better estimates for aligning SSR loci because it reflects more reliable evolutionary relations between different linages. It reduces overlapping during SSR alignment, which results in a more realistic phylogenic relationship.
Acceleration of the Smith-Waterman algorithm using single and multiple graphics processors

NASA Astrophysics Data System (ADS)

Khajeh-Saeed, Ali; Poole, Stephen; Blair Perot, J.

2010-06-01

Finding regions of similarity between two very long data streams is a computationally intensive problem referred to as sequence alignment. Alignment algorithms must allow for imperfect sequence matching with different starting locations and some gaps and errors between the two data sequences. Perhaps the most well known application of sequence matching is the testing of DNA or protein sequences against genome databases. The Smith-Waterman algorithm is a method for precisely characterizing how well two sequences can be aligned and for determining the optimal alignment of those two sequences. Like many applications in computational science, the Smith-Waterman algorithm is constrained by the memory access speed and can be accelerated significantly by using graphics processors (GPUs) as the compute engine. In this work we show that effective use of the GPU requires a novel reformulation of the Smith-Waterman algorithm. The performance of this new version of the algorithm is demonstrated using the SSCA#1 (Bioinformatics) benchmark running on one GPU and on up to four GPUs executing in parallel. The results indicate that for large problems a single GPU is up to 45 times faster than a CPU for this application, and the parallel implementation shows linear speed up on up to 4 GPUs.
Development and application of a modified dynamic time warping algorithm (DTW-S) to analyses of primate brain expression time series

PubMed Central

2011-01-01

Background Comparing biological time series data across different conditions, or different specimens, is a common but still challenging task. Algorithms aligning two time series represent a valuable tool for such comparisons. While many powerful computation tools for time series alignment have been developed, they do not provide significance estimates for time shift measurements. Results Here, we present an extended version of the original DTW algorithm that allows us to determine the significance of time shift estimates in time series alignments, the DTW-Significance (DTW-S) algorithm. The DTW-S combines important properties of the original algorithm and other published time series alignment tools: DTW-S calculates the optimal alignment for each time point of each gene, it uses interpolated time points for time shift estimation, and it does not require alignment of the time-series end points. As a new feature, we implement a simulation procedure based on parameters estimated from real time series data, on a series-by-series basis, allowing us to determine the false positive rate (FPR) and the significance of the estimated time shift values. We assess the performance of our method using simulation data and real expression time series from two published primate brain expression datasets. Our results show that this method can provide accurate and robust time shift estimates for each time point on a gene-by-gene basis. Using these estimates, we are able to uncover novel features of the biological processes underlying human brain development and maturation. Conclusions The DTW-S provides a convenient tool for calculating accurate and robust time shift estimates at each time point for each gene, based on time series data. The estimates can be used to uncover novel biological features of the system being studied. The DTW-S is freely available as an R package TimeShift at http://www.picb.ac.cn/Comparative/data.html. PMID:21851598
Development and application of a modified dynamic time warping algorithm (DTW-S) to analyses of primate brain expression time series.

PubMed

Yuan, Yuan; Chen, Yi-Ping Phoebe; Ni, Shengyu; Xu, Augix Guohua; Tang, Lin; Vingron, Martin; Somel, Mehmet; Khaitovich, Philipp

2011-08-18

Comparing biological time series data across different conditions, or different specimens, is a common but still challenging task. Algorithms aligning two time series represent a valuable tool for such comparisons. While many powerful computation tools for time series alignment have been developed, they do not provide significance estimates for time shift measurements. Here, we present an extended version of the original DTW algorithm that allows us to determine the significance of time shift estimates in time series alignments, the DTW-Significance (DTW-S) algorithm. The DTW-S combines important properties of the original algorithm and other published time series alignment tools: DTW-S calculates the optimal alignment for each time point of each gene, it uses interpolated time points for time shift estimation, and it does not require alignment of the time-series end points. As a new feature, we implement a simulation procedure based on parameters estimated from real time series data, on a series-by-series basis, allowing us to determine the false positive rate (FPR) and the significance of the estimated time shift values. We assess the performance of our method using simulation data and real expression time series from two published primate brain expression datasets. Our results show that this method can provide accurate and robust time shift estimates for each time point on a gene-by-gene basis. Using these estimates, we are able to uncover novel features of the biological processes underlying human brain development and maturation. The DTW-S provides a convenient tool for calculating accurate and robust time shift estimates at each time point for each gene, based on time series data. The estimates can be used to uncover novel biological features of the system being studied. The DTW-S is freely available as an R package TimeShift at http://www.picb.ac.cn/Comparative/data.html.
A Coarse-Alignment Method Based on the Optimal-REQUEST Algorithm

PubMed Central

Zhu, Yongyun

2018-01-01

In this paper, we proposed a coarse-alignment method for strapdown inertial navigation systems based on attitude determination. The observation vectors, which can be obtained by inertial sensors, usually contain various types of noise, which affects the convergence rate and the accuracy of the coarse alignment. Given this drawback, we studied an attitude-determination method named optimal-REQUEST, which is an optimal method for attitude determination that is based on observation vectors. Compared to the traditional attitude-determination method, the filtering gain of the proposed method is tuned autonomously; thus, the convergence rate of the attitude determination is faster than in the traditional method. Within the proposed method, we developed an iterative method for determining the attitude quaternion. We carried out simulation and turntable tests, which we used to validate the proposed method’s performance. The experiment’s results showed that the convergence rate of the proposed optimal-REQUEST algorithm is faster and that the coarse alignment’s stability is higher. In summary, the proposed method has a high applicability to practical systems. PMID:29337895
Rapid Quantification of 3D Collagen Fiber Alignment and Fiber Intersection Correlations with High Sensitivity

PubMed Central

Sun, Meng; Bloom, Alexander B.; Zaman, Muhammad H.

2015-01-01

Metastatic cancers aggressively reorganize collagen in their microenvironment. For example, radially orientated collagen fibers have been observed surrounding tumor cell clusters in vivo. The degree of fiber alignment, as a consequence of this remodeling, has often been difficult to quantify. In this paper, we present an easy to implement algorithm for accurate detection of collagen fiber orientation in a rapid pixel-wise manner. This algorithm quantifies the alignment of both computer generated and actual collagen fiber networks of varying degrees of alignment within 5°°. We also present an alternative easy method to calculate the alignment index directly from the standard deviation of fiber orientation. Using this quantitative method for determining collagen alignment, we demonstrate that the number of collagen fiber intersections has a negative correlation with the degree of fiber alignment. This decrease in intersections of aligned fibers could explain why cells move more rapidly along aligned fibers than unaligned fibers, as previously reported. Overall, our paper provides an easier, more quantitative and quicker way to quantify fiber orientation and alignment, and presents a platform in studying effects of matrix and cellular properties on fiber alignment in complex 3D environments. PMID:26158674

Experience from the in-flight calibration of the Extreme Ultraviolet Explorer (EUVE) and Upper Atmosphere Research Satellite (UARS) fixed head star trackers (FHSTs)

NASA Technical Reports Server (NTRS)

Lee, Michael

1995-01-01

Since the original post-launch calibration of the FHSTs (Fixed Head Star Trackers) on EUVE (Extreme Ultraviolet Explorer) and UARS (Upper Atmosphere Research Satellite), the Flight Dynamics task has continued to analyze the FHST performance. The algorithm used for inflight alignment of spacecraft sensors is described and the equations for the errors in the relative alignment for the simple 2 star tracker case are shown. Simulated data and real data are used to compute the covariance of the relative alignment errors. Several methods for correcting the alignment are compared and results analyzed. The specific problems seen on orbit with UARS and EUVE are then discussed. UARS has experienced anomalous tracker performance on an FHST resulting in continuous variation in apparent tracker alignment. On EUVE, the FHST residuals from the attitude determination algorithm showed a dependence on the direction of roll during survey mode. This dependence is traced back to time tagging errors and the original post launch alignment is found to be in error due to the impact of the time tagging errors on the alignment algorithm. The methods used by the FDF (Flight Dynamics Facility) to correct for these problems is described.
Implementation of a parallel protein structure alignment service on cloud.

PubMed

Hung, Che-Lun; Lin, Yaw-Ling

2013-01-01

Protein structure alignment has become an important strategy by which to identify evolutionary relationships between protein sequences. Several alignment tools are currently available for online comparison of protein structures. In this paper, we propose a parallel protein structure alignment service based on the Hadoop distribution framework. This service includes a protein structure alignment algorithm, a refinement algorithm, and a MapReduce programming model. The refinement algorithm refines the result of alignment. To process vast numbers of protein structures in parallel, the alignment and refinement algorithms are implemented using MapReduce. We analyzed and compared the structure alignments produced by different methods using a dataset randomly selected from the PDB database. The experimental results verify that the proposed algorithm refines the resulting alignments more accurately than existing algorithms. Meanwhile, the computational performance of the proposed service is proportional to the number of processors used in our cloud platform.
Implementation of a Parallel Protein Structure Alignment Service on Cloud

PubMed Central

Hung, Che-Lun; Lin, Yaw-Ling

2013-01-01

Protein structure alignment has become an important strategy by which to identify evolutionary relationships between protein sequences. Several alignment tools are currently available for online comparison of protein structures. In this paper, we propose a parallel protein structure alignment service based on the Hadoop distribution framework. This service includes a protein structure alignment algorithm, a refinement algorithm, and a MapReduce programming model. The refinement algorithm refines the result of alignment. To process vast numbers of protein structures in parallel, the alignment and refinement algorithms are implemented using MapReduce. We analyzed and compared the structure alignments produced by different methods using a dataset randomly selected from the PDB database. The experimental results verify that the proposed algorithm refines the resulting alignments more accurately than existing algorithms. Meanwhile, the computational performance of the proposed service is proportional to the number of processors used in our cloud platform. PMID:23671842
Initial Navigation Alignment of Optical Instruments on GOES-R

NASA Astrophysics Data System (ADS)

Isaacson, P.; DeLuccia, F.; Reth, A. D.; Igli, D. A.; Carter, D.

2016-12-01

The GOES-R satellite is the first in NOAA's next-generation series of geostationary weather satellites. In addition to a number of space weather sensors, it will carry two principal optical earth-observing instruments, the Advanced Baseline Imager (ABI) and the Geostationary Lightning Mapper (GLM). During launch, currently scheduled for November of 2016, the alignment of these optical instruments is anticipated to shift from that measured during pre-launch characterization. While both instruments have image navigation and registration (INR) processing algorithms to enable automated geolocation of the collected data, the launch-derived misalignment may be too large for these approaches to function without an initial adjustment to calibration parameters. The parameters that may require adjustment are for Line of Sight Motion Compensation (LMC), and the adjustments will be estimated on orbit during the post-launch test (PLT) phase. We have developed approaches to estimate the initial alignment errors for both ABI and GLM image products. Our approaches involve comparison of ABI and GLM images collected during PLT to a set of reference ("truth") images using custom image processing tools and other software (the INR Performance Assessment Tool Set, or "IPATS") being developed for other INR assessments of ABI and GLM data. IPATS is based on image correlation approaches to determine offsets between input and reference images, and these offsets are the fundamental input to our estimate of the initial alignment errors. Initial testing of our alignment algorithms on proxy datasets lends high confidence that their application will determine the initial alignment errors to within sufficient accuracy to enable the operational INR processing approaches to proceed in a nominal fashion. We will report on the algorithms, implementation approach, and status of these initial alignment tools being developed for the GOES-R ABI and GLM instruments.
A statistical physics perspective on alignment-independent protein sequence comparison.

PubMed

Chattopadhyay, Amit K; Nasiev, Diar; Flower, Darren R

2015-08-01

Within bioinformatics, the textual alignment of amino acid sequences has long dominated the determination of similarity between proteins, with all that implies for shared structure, function and evolutionary descent. Despite the relative success of modern-day sequence alignment algorithms, so-called alignment-free approaches offer a complementary means of determining and expressing similarity, with potential benefits in certain key applications, such as regression analysis of protein structure-function studies, where alignment-base similarity has performed poorly. Here, we offer a fresh, statistical physics-based perspective focusing on the question of alignment-free comparison, in the process adapting results from 'first passage probability distribution' to summarize statistics of ensemble averaged amino acid propensity values. In this article, we introduce and elaborate this approach. © The Author 2015. Published by Oxford University Press.
SEAN: SNP prediction and display program utilizing EST sequence clusters.

PubMed

Huntley, Derek; Baldo, Angela; Johri, Saurabh; Sergot, Marek

2006-02-15

SEAN is an application that predicts single nucleotide polymorphisms (SNPs) using multiple sequence alignments produced from expressed sequence tag (EST) clusters. The algorithm uses rules of sequence identity and SNP abundance to determine the quality of the prediction. A Java viewer is provided to display the EST alignments and predicted SNPs.
Inverting Image Data For Optical Testing And Alignment

NASA Technical Reports Server (NTRS)

Shao, Michael; Redding, David; Yu, Jeffrey W.; Dumont, Philip J.

1993-01-01

Data from images produced by slightly incorrectly figured concave primary mirror in telescope processed into estimate of spherical aberration of mirror, by use of algorithm finding nonlinear least-squares best fit between actual images and synthetic images produced by multiparameter mathematical model of telescope optical system. Estimated spherical aberration, in turn, converted into estimate of deviation of reflector surface from nominal precise shape. Algorithm devised as part of effort to determine error in surface figure of primary mirror of Hubble space telescope, so corrective lens designed. Modified versions of algorithm also used to find optical errors in other components of telescope or of other optical systems, for purposes of testing, alignment, and/or correction.
Multiple alignment analysis on phylogenetic tree of the spread of SARS epidemic using distance method

NASA Astrophysics Data System (ADS)

Amiroch, S.; Pradana, M. S.; Irawan, M. I.; Mukhlash, I.

2017-09-01

Multiple Alignment (MA) is a particularly important tool for studying the viral genome and determine the evolutionary process of the specific virus. Application of MA in the case of the spread of the Severe acute respiratory syndrome (SARS) epidemic is an interesting thing because this virus epidemic a few years ago spread so quickly that medical attention in many countries. Although there has been a lot of software to process multiple sequences, but the use of pairwise alignment to process MA is very important to consider. In previous research, the alignment between the sequences to process MA algorithm, Super Pairwise Alignment, but in this study used a dynamic programming algorithm Needleman wunchs simulated in Matlab. From the analysis of MA obtained and stable region and unstable which indicates the position where the mutation occurs, the system network topology that produced the phylogenetic tree of the SARS epidemic distance method, and system area networks mutation.
PROPER: global protein interaction network alignment through percolation matching.

PubMed

Kazemi, Ehsan; Hassani, Hamed; Grossglauser, Matthias; Pezeshgi Modarres, Hassan

2016-12-12

The alignment of protein-protein interaction (PPI) networks enables us to uncover the relationships between different species, which leads to a deeper understanding of biological systems. Network alignment can be used to transfer biological knowledge between species. Although different PPI-network alignment algorithms were introduced during the last decade, developing an accurate and scalable algorithm that can find alignments with high biological and structural similarities among PPI networks is still challenging. In this paper, we introduce a new global network alignment algorithm for PPI networks called PROPER. Compared to other global network alignment methods, our algorithm shows higher accuracy and speed over real PPI datasets and synthetic networks. We show that the PROPER algorithm can detect large portions of conserved biological pathways between species. Also, using a simple parsimonious evolutionary model, we explain why PROPER performs well based on several different comparison criteria. We highlight that PROPER has high potential in further applications such as detecting biological pathways, finding protein complexes and PPI prediction. The PROPER algorithm is available at http://proper.epfl.ch .
Tolerancing the alignment of large-core optical fibers, fiber bundles and light guides using a Fourier approach.

PubMed

Sawyer, Travis W; Petersburg, Ryan; Bohndiek, Sarah E

2017-04-20

Optical fiber technology is found in a wide variety of applications to flexibly relay light between two points, enabling information transfer across long distances and allowing access to hard-to-reach areas. Large-core optical fibers and light guides find frequent use in illumination and spectroscopic applications, for example, endoscopy and high-resolution astronomical spectroscopy. Proper alignment is critical for maximizing throughput in optical fiber coupling systems; however, there currently are no formal approaches to tolerancing the alignment of a light-guide coupling system. Here, we propose a Fourier alignment sensitivity (FAS) algorithm to determine the optimal tolerances on the alignment of a light guide by computing the alignment sensitivity. The algorithm shows excellent agreement with both simulated and experimentally measured values and improves on the computation time of equivalent ray-tracing simulations by two orders of magnitude. We then apply FAS to tolerance and fabricate a coupling system, which is shown to meet specifications, thus validating FAS as a tolerancing technique. These results indicate that FAS is a flexible and rapid means to quantify the alignment sensitivity of a light guide, widely informing the design and tolerancing of coupling systems.
Tolerancing the alignment of large-core optical fibers, fiber bundles and light guides using a Fourier approach

PubMed Central

Sawyer, Travis W.; Petersburg, Ryan; Bohndiek, Sarah E.

2017-01-01

Optical fiber technology is found in a wide variety of applications to flexibly relay light between two points, enabling information transfer across long distances and allowing access to hard-to-reach areas. Large-core optical fibers and light guides find frequent use in illumination and spectroscopic applications; for example, endoscopy and high-resolution astronomical spectroscopy. Proper alignment is critical for maximizing throughput in optical fiber coupling systems, however, there currently are no formal approaches to tolerancing the alignment of a light guide coupling system. Here, we propose a Fourier Alignment Sensitivity (FAS) algorithm to determine the optimal tolerances on the alignment of a light guide by computing the alignment sensitivity. The algorithm shows excellent agreement with both simulated and experimentally measured values and improves on the computation time of equivalent ray tracing simulations by two orders of magnitude. We then apply FAS to tolerance and fabricate a coupling system, which is shown to meet specifications, thus validating FAS as a tolerancing technique. These results indicate that FAS is a flexible and rapid means to quantify the alignment sensitivity of a light guide, widely informing the design and tolerancing of coupling systems. PMID:28430250
SPHINX--an algorithm for taxonomic binning of metagenomic sequences.

PubMed

Mohammed, Monzoorul Haque; Ghosh, Tarini Shankar; Singh, Nitin Kumar; Mande, Sharmila S

2011-01-01

Compared with composition-based binning algorithms, the binning accuracy and specificity of alignment-based binning algorithms is significantly higher. However, being alignment-based, the latter class of algorithms require enormous amount of time and computing resources for binning huge metagenomic datasets. The motivation was to develop a binning approach that can analyze metagenomic datasets as rapidly as composition-based approaches, but nevertheless has the accuracy and specificity of alignment-based algorithms. This article describes a hybrid binning approach (SPHINX) that achieves high binning efficiency by utilizing the principles of both 'composition'- and 'alignment'-based binning algorithms. Validation results with simulated sequence datasets indicate that SPHINX is able to analyze metagenomic sequences as rapidly as composition-based algorithms. Furthermore, the binning efficiency (in terms of accuracy and specificity of assignments) of SPHINX is observed to be comparable with results obtained using alignment-based algorithms. A web server for the SPHINX algorithm is available at http://metagenomics.atc.tcs.com/SPHINX/.
Accelerated probabilistic inference of RNA structure evolution

PubMed Central

Holmes, Ian

2005-01-01

Background Pairwise stochastic context-free grammars (Pair SCFGs) are powerful tools for evolutionary analysis of RNA, including simultaneous RNA sequence alignment and secondary structure prediction, but the associated algorithms are intensive in both CPU and memory usage. The same problem is faced by other RNA alignment-and-folding algorithms based on Sankoff's 1985 algorithm. It is therefore desirable to constrain such algorithms, by pre-processing the sequences and using this first pass to limit the range of structures and/or alignments that can be considered. Results We demonstrate how flexible classes of constraint can be imposed, greatly reducing the computational costs while maintaining a high quality of structural homology prediction. Any score-attributed context-free grammar (e.g. energy-based scoring schemes, or conditionally normalized Pair SCFGs) is amenable to this treatment. It is now possible to combine independent structural and alignment constraints of unprecedented general flexibility in Pair SCFG alignment algorithms. We outline several applications to the bioinformatics of RNA sequence and structure, including Waterman-Eggert N-best alignments and progressive multiple alignment. We evaluate the performance of the algorithm on test examples from the RFAM database. Conclusion A program, Stemloc, that implements these algorithms for efficient RNA sequence alignment and structure prediction is available under the GNU General Public License. PMID:15790387
Robust algorithm for aligning two-dimensional chromatograms.

PubMed

Gros, Jonas; Nabi, Deedar; Dimitriou-Christidis, Petros; Rutler, Rebecca; Arey, J Samuel

2012-11-06

Comprehensive two-dimensional gas chromatography (GC × GC) chromatograms typically exhibit run-to-run retention time variability. Chromatogram alignment is often a desirable step prior to further analysis of the data, for example, in studies of environmental forensics or weathering of complex mixtures. We present a new algorithm for aligning whole GC × GC chromatograms. This technique is based on alignment points that have locations indicated by the user both in a target chromatogram and in a reference chromatogram. We applied the algorithm to two sets of samples. First, we aligned the chromatograms of twelve compositionally distinct oil spill samples, all analyzed using the same instrument parameters. Second, we applied the algorithm to two compositionally distinct wastewater extracts analyzed using two different instrument temperature programs, thus involving larger retention time shifts than the first sample set. For both sample sets, the new algorithm performed favorably compared to two other available alignment algorithms: that of Pierce, K. M.; Wood, Lianna F.; Wright, B. W.; Synovec, R. E. Anal. Chem.2005, 77, 7735-7743 and 2-D COW from Zhang, D.; Huang, X.; Regnier, F. E.; Zhang, M. Anal. Chem.2008, 80, 2664-2671. The new algorithm achieves the best matches of retention times for test analytes, avoids some artifacts which result from the other alignment algorithms, and incurs the least modification of quantitative signal information.
Multiple sequence alignment using multi-objective based bacterial foraging optimization algorithm.

PubMed

Rani, R Ranjani; Ramyachitra, D

2016-12-01

Multiple sequence alignment (MSA) is a widespread approach in computational biology and bioinformatics. MSA deals with how the sequences of nucleotides and amino acids are sequenced with possible alignment and minimum number of gaps between them, which directs to the functional, evolutionary and structural relationships among the sequences. Still the computation of MSA is a challenging task to provide an efficient accuracy and statistically significant results of alignments. In this work, the Bacterial Foraging Optimization Algorithm was employed to align the biological sequences which resulted in a non-dominated optimal solution. It employs Multi-objective, such as: Maximization of Similarity, Non-gap percentage, Conserved blocks and Minimization of gap penalty. BAliBASE 3.0 benchmark database was utilized to examine the proposed algorithm against other methods In this paper, two algorithms have been proposed: Hybrid Genetic Algorithm with Artificial Bee Colony (GA-ABC) and Bacterial Foraging Optimization Algorithm. It was found that Hybrid Genetic Algorithm with Artificial Bee Colony performed better than the existing optimization algorithms. But still the conserved blocks were not obtained using GA-ABC. Then BFO was used for the alignment and the conserved blocks were obtained. The proposed Multi-Objective Bacterial Foraging Optimization Algorithm (MO-BFO) was compared with widely used MSA methods Clustal Omega, Kalign, MUSCLE, MAFFT, Genetic Algorithm (GA), Ant Colony Optimization (ACO), Artificial Bee Colony (ABC), Particle Swarm Optimization (PSO) and Hybrid Genetic Algorithm with Artificial Bee Colony (GA-ABC). The final results show that the proposed MO-BFO algorithm yields better alignment than most widely used methods. Copyright Â© 2016 Elsevier Ireland Ltd. All rights reserved.
Retention time alignment of LC/MS data by a divide-and-conquer algorithm.

PubMed

Zhang, Zhongqi

2012-04-01

Liquid chromatography-mass spectrometry (LC/MS) has become the method of choice for characterizing complex mixtures. These analyses often involve quantitative comparison of components in multiple samples. To achieve automated sample comparison, the components of interest must be detected and identified, and their retention times aligned and peak areas calculated. This article describes a simple pairwise iterative retention time alignment algorithm, based on the divide-and-conquer approach, for alignment of ion features detected in LC/MS experiments. In this iterative algorithm, ion features in the sample run are first aligned with features in the reference run by applying a single constant shift of retention time. The sample chromatogram is then divided into two shorter chromatograms, which are aligned to the reference chromatogram the same way. Each shorter chromatogram is further divided into even shorter chromatograms. This process continues until each chromatogram is sufficiently narrow so that ion features within it have a similar retention time shift. In six pairwise LC/MS alignment examples containing a total of 6507 confirmed true corresponding feature pairs with retention time shifts up to five peak widths, the algorithm successfully aligned these features with an error rate of 0.2%. The alignment algorithm is demonstrated to be fast, robust, fully automatic, and superior to other algorithms. After alignment and gap-filling of detected ion features, their abundances can be tabulated for direct comparison between samples.
Optimizing multiple sequence alignments using a genetic algorithm based on three objectives: structural information, non-gaps percentage and totally conserved columns.

PubMed

Ortuño, Francisco M; Valenzuela, Olga; Rojas, Fernando; Pomares, Hector; Florido, Javier P; Urquiza, Jose M; Rojas, Ignacio

2013-09-01

Multiple sequence alignments (MSAs) are widely used approaches in bioinformatics to carry out other tasks such as structure predictions, biological function analyses or phylogenetic modeling. However, current tools usually provide partially optimal alignments, as each one is focused on specific biological features. Thus, the same set of sequences can produce different alignments, above all when sequences are less similar. Consequently, researchers and biologists do not agree about which is the most suitable way to evaluate MSAs. Recent evaluations tend to use more complex scores including further biological features. Among them, 3D structures are increasingly being used to evaluate alignments. Because structures are more conserved in proteins than sequences, scores with structural information are better suited to evaluate more distant relationships between sequences. The proposed multiobjective algorithm, based on the non-dominated sorting genetic algorithm, aims to jointly optimize three objectives: STRIKE score, non-gaps percentage and totally conserved columns. It was significantly assessed on the BAliBASE benchmark according to the Kruskal-Wallis test (P < 0.01). This algorithm also outperforms other aligners, such as ClustalW, Multiple Sequence Alignment Genetic Algorithm (MSA-GA), PRRP, DIALIGN, Hidden Markov Model Training (HMMT), Pattern-Induced Multi-sequence Alignment (PIMA), MULTIALIGN, Sequence Alignment Genetic Algorithm (SAGA), PILEUP, Rubber Band Technique Genetic Algorithm (RBT-GA) and Vertical Decomposition Genetic Algorithm (VDGA), according to the Wilcoxon signed-rank test (P < 0.05), whereas it shows results not significantly different to 3D-COFFEE (P > 0.05) with the advantage of being able to use less structures. Structural information is included within the objective function to evaluate more accurately the obtained alignments. The source code is available at http://www.ugr.es/~fortuno/MOSAStrE/MO-SAStrE.zip.
Multiscale registration algorithm for alignment of meshes

NASA Astrophysics Data System (ADS)

Vadde, Srikanth; Kamarthi, Sagar V.; Gupta, Surendra M.

2004-03-01

Taking a multi-resolution approach, this research work proposes an effective algorithm for aligning a pair of scans obtained by scanning an object's surface from two adjacent views. This algorithm first encases each scan in the pair with an array of cubes of equal and fixed size. For each scan in the pair a surrogate scan is created by the centroids of the cubes that encase the scan. The Gaussian curvatures of points across the surrogate scan pair are compared to find the surrogate corresponding points. If the difference between the Gaussian curvatures of any two points on the surrogate scan pair is less than a predetermined threshold, then those two points are accepted as a pair of surrogate corresponding points. The rotation and translation values between the surrogate scan pair are determined by using a set of surrogate corresponding points. Using the same rotation and translation values the original scan pairs are aligned. The resulting registration (or alignment) error is computed to check the accuracy of the scan alignment. When the registration error becomes acceptably small, the algorithm is terminated. Otherwise the above process is continued with cubes of smaller and smaller sizes until the algorithm is terminated. However at each finer resolution the search space for finding the surrogate corresponding points is restricted to the regions in the neighborhood of the surrogate points that were at found at the preceding coarser level. The surrogate corresponding points, as the resolution becomes finer and finer, converge to the true corresponding points on the original scans. This approach offers three main benefits: it improves the chances of finding the true corresponding points on the scans, minimize the adverse effects of noise in the scans, and reduce the computational load for finding the corresponding points.
Introducing difference recurrence relations for faster semi-global alignment of long sequences.

PubMed

Suzuki, Hajime; Kasahara, Masahiro

2018-02-19

The read length of single-molecule DNA sequencers is reaching 1 Mb. Popular alignment software tools widely used for analyzing such long reads often take advantage of single-instruction multiple-data (SIMD) operations to accelerate calculation of dynamic programming (DP) matrices in the Smith-Waterman-Gotoh (SWG) algorithm with a fixed alignment start position at the origin. Nonetheless, 16-bit or 32-bit integers are necessary for storing the values in a DP matrix when sequences to be aligned are long; this situation hampers the use of the full SIMD width of modern processors. We proposed a faster semi-global alignment algorithm, "difference recurrence relations," that runs more rapidly than the state-of-the-art algorithm by a factor of 2.1. Instead of calculating and storing all the values in a DP matrix directly, our algorithm computes and stores mainly the differences between the values of adjacent cells in the matrix. Although the SWG algorithm and our algorithm can output exactly the same result, our algorithm mainly involves 8-bit integer operations, enabling us to exploit the full width of SIMD operations (e.g., 32) on modern processors. We also developed a library, libgaba, so that developers can easily integrate our algorithm into alignment programs. Our novel algorithm and optimized library implementation will facilitate accelerating nucleotide long-read analysis algorithms that use pairwise alignment stages. The library is implemented in the C programming language and available at https://github.com/ocxtal/libgaba .
StralSV: assessment of sequence variability within similar 3D structures and application to polio RNA-dependent RNA polymerase.

PubMed

Zemla, Adam T; Lang, Dorothy M; Kostova, Tanya; Andino, Raul; Ecale Zhou, Carol L

2011-06-02

Most of the currently used methods for protein function prediction rely on sequence-based comparisons between a query protein and those for which a functional annotation is provided. A serious limitation of sequence similarity-based approaches for identifying residue conservation among proteins is the low confidence in assigning residue-residue correspondences among proteins when the level of sequence identity between the compared proteins is poor. Multiple sequence alignment methods are more satisfactory--still, they cannot provide reliable results at low levels of sequence identity. Our goal in the current work was to develop an algorithm that could help overcome these difficulties by facilitating the identification of structurally (and possibly functionally) relevant residue-residue correspondences between compared protein structures. Here we present StralSV (structure-alignment sequence variability), a new algorithm for detecting closely related structure fragments and quantifying residue frequency from tight local structure alignments. We apply StralSV in a study of the RNA-dependent RNA polymerase of poliovirus, and we demonstrate that the algorithm can be used to determine regions of the protein that are relatively unique, or that share structural similarity with proteins that would be considered distantly related. By quantifying residue frequencies among many residue-residue pairs extracted from local structural alignments, one can infer potential structural or functional importance of specific residues that are determined to be highly conserved or that deviate from a consensus. We further demonstrate that considerable detailed structural and phylogenetic information can be derived from StralSV analyses. StralSV is a new structure-based algorithm for identifying and aligning structure fragments that have similarity to a reference protein. StralSV analysis can be used to quantify residue-residue correspondences and identify residues that may be of particular structural or functional importance, as well as unusual or unexpected residues at a given sequence position. StralSV is provided as a web service at http://proteinmodel.org/AS2TS/STRALSV/.

Aligning Greek-English parallel texts

NASA Astrophysics Data System (ADS)

Galiotou, Eleni; Koronakis, George; Lazari, Vassiliki

2015-02-01

In this paper, we discuss issues concerning the alignment of parallel texts written in languages with different alphabets based on an experiment of aligning texts from the proceedings of the European Parliament in Greek and English. First, we describe our implementation of the k-vec algorithm and its application to the bilingual corpus. Then the output of the algorithm is used as a starting point for an alignment procedure at a sentence level which also takes into account mark-ups of meta-information. The results of the implementation are compared to those of the application of the Church and Gale alignment algorithm on the Europarl corpus. The conclusions of this comparison can give useful insights as for the efficiency of alignment algorithms when applied to the particular bilingual corpus.
A generalized global alignment algorithm.

PubMed

Huang, Xiaoqiu; Chao, Kun-Mao

2003-01-22

Homologous sequences are sometimes similar over some regions but different over other regions. Homologous sequences have a much lower global similarity if the different regions are much longer than the similar regions. We present a generalized global alignment algorithm for comparing sequences with intermittent similarities, an ordered list of similar regions separated by different regions. A generalized global alignment model is defined to handle sequences with intermittent similarities. A dynamic programming algorithm is designed to compute an optimal general alignment in time proportional to the product of sequence lengths and in space proportional to the sum of sequence lengths. The algorithm is implemented as a computer program named GAP3 (Global Alignment Program Version 3). The generalized global alignment model is validated by experimental results produced with GAP3 on both DNA and protein sequences. The GAP3 program extends the ability of standard global alignment programs to recognize homologous sequences of lower similarity. The GAP3 program is freely available for academic use at http://bioinformatics.iastate.edu/aat/align/align.html.
ARYANA: Aligning Reads by Yet Another Approach

PubMed Central

2014-01-01

Motivation Although there are many different algorithms and software tools for aligning sequencing reads, fast gapped sequence search is far from solved. Strong interest in fast alignment is best reflected in the $106 prize for the Innocentive competition on aligning a collection of reads to a given database of reference genomes. In addition, de novo assembly of next-generation sequencing long reads requires fast overlap-layout-concensus algorithms which depend on fast and accurate alignment. Contribution We introduce ARYANA, a fast gapped read aligner, developed on the base of BWA indexing infrastructure with a completely new alignment engine that makes it significantly faster than three other aligners: Bowtie2, BWA and SeqAlto, with comparable generality and accuracy. Instead of the time-consuming backtracking procedures for handling mismatches, ARYANA comes with the seed-and-extend algorithmic framework and a significantly improved efficiency by integrating novel algorithmic techniques including dynamic seed selection, bidirectional seed extension, reset-free hash tables, and gap-filling dynamic programming. As the read length increases ARYANA's superiority in terms of speed and alignment rate becomes more evident. This is in perfect harmony with the read length trend as the sequencing technologies evolve. The algorithmic platform of ARYANA makes it easy to develop mission-specific aligners for other applications using ARYANA engine. Availability ARYANA with complete source code can be obtained from http://github.com/aryana-aligner PMID:25252881
ARYANA: Aligning Reads by Yet Another Approach.

PubMed

Gholami, Milad; Arbabi, Aryan; Sharifi-Zarchi, Ali; Chitsaz, Hamidreza; Sadeghi, Mehdi

2014-01-01

Although there are many different algorithms and software tools for aligning sequencing reads, fast gapped sequence search is far from solved. Strong interest in fast alignment is best reflected in the $10(6) prize for the Innocentive competition on aligning a collection of reads to a given database of reference genomes. In addition, de novo assembly of next-generation sequencing long reads requires fast overlap-layout-concensus algorithms which depend on fast and accurate alignment. We introduce ARYANA, a fast gapped read aligner, developed on the base of BWA indexing infrastructure with a completely new alignment engine that makes it significantly faster than three other aligners: Bowtie2, BWA and SeqAlto, with comparable generality and accuracy. Instead of the time-consuming backtracking procedures for handling mismatches, ARYANA comes with the seed-and-extend algorithmic framework and a significantly improved efficiency by integrating novel algorithmic techniques including dynamic seed selection, bidirectional seed extension, reset-free hash tables, and gap-filling dynamic programming. As the read length increases ARYANA's superiority in terms of speed and alignment rate becomes more evident. This is in perfect harmony with the read length trend as the sequencing technologies evolve. The algorithmic platform of ARYANA makes it easy to develop mission-specific aligners for other applications using ARYANA engine. ARYANA with complete source code can be obtained from http://github.com/aryana-aligner.
Defining ‘Unhealthy’: A Systematic Analysis of Alignment between the Australian Dietary Guidelines and the Health Star Rating System

PubMed Central

Rådholm, Karin; Neal, Bruce

2018-01-01

The Australian Dietary Guidelines (ADGs) and Health Star Rating (HSR) front-of-pack labelling system are two national interventions to promote healthier diets. Our aim was to assess the degree of alignment between the two policies. Methods: Nutrition information was extracted for 65,660 packaged foods available in The George Institute’s Australian FoodSwitch database. Products were classified ‘core’ or ‘discretionary’ based on the ADGs, and a HSR generated irrespective of whether currently displayed on pack. Apparent outliers were identified as those products classified ‘core’ that received HSR ≤ 2.0; and those classified ‘discretionary’ that received HSR ≥ 3.5. Nutrient cut-offs were applied to determine whether apparent outliers were ‘high in’ salt, total sugar or saturated fat, and outlier status thereby attributed to a failure of the ADGs or HSR algorithm. Results: 47,116 products (23,460 core; 23,656 discretionary) were included. Median (Q1, Q3) HSRs were 4.0 (3.0 to 4.5) for core and 2.0 (1.0 to 3.0) for discretionary products. Overall alignment was good: 86.6% of products received a HSR aligned with their ADG classification. Among 6324 products identified as apparent outliers, 5246 (83.0%) were ultimately determined to be ADG failures, largely caused by challenges in defining foods as ‘core’ or ‘discretionary’. In total, 1078 (17.0%) were determined to be true failures of the HSR algorithm. Conclusion: The scope of genuine misalignment between the ADGs and HSR algorithm is very small. We provide evidence-informed recommendations for strengthening both policies to more effectively guide Australians towards healthier choices. PMID:29670024
Node fingerprinting: an efficient heuristic for aligning biological networks.

PubMed

Radu, Alex; Charleston, Michael

2014-10-01

With the continuing increase in availability of biological data and improvements to biological models, biological network analysis has become a promising area of research. An emerging technique for the analysis of biological networks is through network alignment. Network alignment has been used to calculate genetic distance, similarities between regulatory structures, and the effect of external forces on gene expression, and to depict conditional activity of expression modules in cancer. Network alignment is algorithmically complex, and therefore we must rely on heuristics, ideally as efficient and accurate as possible. The majority of current techniques for network alignment rely on precomputed information, such as with protein sequence alignment, or on tunable network alignment parameters, which may introduce an increased computational overhead. Our presented algorithm, which we call Node Fingerprinting (NF), is appropriate for performing global pairwise network alignment without precomputation or tuning, can be fully parallelized, and is able to quickly compute an accurate alignment between two biological networks. It has performed as well as or better than existing algorithms on biological and simulated data, and with fewer computational resources. The algorithmic validation performed demonstrates the low computational resource requirements of NF.
Alignment of the measurement scale mark during immersion hydrometer calibration using an image processing system.

PubMed

Peña-Perez, Luis Manuel; Pedraza-Ortega, Jesus Carlos; Ramos-Arreguin, Juan Manuel; Arriaga, Saul Tovar; Fernandez, Marco Antonio Aceves; Becerra, Luis Omar; Hurtado, Efren Gorrostieta; Vargas-Soto, Jose Emilio

2013-10-24

The present work presents an improved method to align the measurement scale mark in an immersion hydrometer calibration system of CENAM, the National Metrology Institute (NMI) of Mexico, The proposed method uses a vision system to align the scale mark of the hydrometer to the surface of the liquid where it is immersed by implementing image processing algorithms. This approach reduces the variability in the apparent mass determination during the hydrostatic weighing in the calibration process, therefore decreasing the relative uncertainty of calibration.
Alignment of the Measurement Scale Mark during Immersion Hydrometer Calibration Using an Image Processing System

PubMed Central

Peña-Perez, Luis Manuel; Pedraza-Ortega, Jesus Carlos; Ramos-Arreguin, Juan Manuel; Arriaga, Saul Tovar; Fernandez, Marco Antonio Aceves; Becerra, Luis Omar; Hurtado, Efren Gorrostieta; Vargas-Soto, Jose Emilio

2013-01-01

The present work presents an improved method to align the measurement scale mark in an immersion hydrometer calibration system of CENAM, the National Metrology Institute (NMI) of Mexico, The proposed method uses a vision system to align the scale mark of the hydrometer to the surface of the liquid where it is immersed by implementing image processing algorithms. This approach reduces the variability in the apparent mass determination during the hydrostatic weighing in the calibration process, therefore decreasing the relative uncertainty of calibration. PMID:24284770
AntiClustal: Multiple Sequence Alignment by antipole clustering and linear approximate 1-median computation.

PubMed

Di Pietro, C; Di Pietro, V; Emmanuele, G; Ferro, A; Maugeri, T; Modica, E; Pigola, G; Pulvirenti, A; Purrello, M; Ragusa, M; Scalia, M; Shasha, D; Travali, S; Zimmitti, V

2003-01-01

In this paper we present a new Multiple Sequence Alignment (MSA) algorithm called AntiClusAl. The method makes use of the commonly use idea of aligning homologous sequences belonging to classes generated by some clustering algorithm, and then continue the alignment process ina bottom-up way along a suitable tree structure. The final result is then read at the root of the tree. Multiple sequence alignment in each cluster makes use of the progressive alignment with the 1-median (center) of the cluster. The 1-median of set S of sequences is the element of S which minimizes the average distance from any other sequence in S. Its exact computation requires quadratic time. The basic idea of our proposed algorithm is to make use of a simple and natural algorithmic technique based on randomized tournaments which has been successfully applied to large size search problems in general metric spaces. In particular a clustering algorithm called Antipole tree and an approximate linear 1-median computation are used. Our algorithm compared with Clustal W, a widely used tool to MSA, shows a better running time results with fully comparable alignment quality. A successful biological application showing high aminoacid conservation during evolution of Xenopus laevis SOD2 is also cited.
The Application of the Weighted k-Partite Graph Problem to the Multiple Alignment for Metabolic Pathways.

PubMed

Chen, Wenbin; Hendrix, William; Samatova, Nagiza F

2017-12-01

The problem of aligning multiple metabolic pathways is one of very challenging problems in computational biology. A metabolic pathway consists of three types of entities: reactions, compounds, and enzymes. Based on similarities between enzymes, Tohsato et al. gave an algorithm for aligning multiple metabolic pathways. However, the algorithm given by Tohsato et al. neglects the similarities among reactions, compounds, enzymes, and pathway topology. How to design algorithms for the alignment problem of multiple metabolic pathways based on the similarity of reactions, compounds, and enzymes? It is a difficult computational problem. In this article, we propose an algorithm for the problem of aligning multiple metabolic pathways based on the similarities among reactions, compounds, enzymes, and pathway topology. First, we compute a weight between each pair of like entities in different input pathways based on the entities' similarity score and topological structure using Ay et al.'s methods. We then construct a weighted k-partite graph for the reactions, compounds, and enzymes. We extract a mapping between these entities by solving the maximum-weighted k-partite matching problem by applying a novel heuristic algorithm. By analyzing the alignment results of multiple pathways in different organisms, we show that the alignments found by our algorithm correctly identify common subnetworks among multiple pathways.
Evaluation of mathematical algorithms for automatic patient alignment in radiosurgery.

PubMed

Williams, Kenneth M; Schulte, Reinhard W; Schubert, Keith E; Wroe, Andrew J

2015-06-01

Image registration techniques based on anatomical features can serve to automate patient alignment for intracranial radiosurgery procedures in an effort to improve the accuracy and efficiency of the alignment process as well as potentially eliminate the need for implanted fiducial markers. To explore this option, four two-dimensional (2D) image registration algorithms were analyzed: the phase correlation technique, mutual information (MI) maximization, enhanced correlation coefficient (ECC) maximization, and the iterative closest point (ICP) algorithm. Digitally reconstructed radiographs from the treatment planning computed tomography scan of a human skull were used as the reference images, while orthogonal digital x-ray images taken in the treatment room were used as the captured images to be aligned. The accuracy of aligning the skull with each algorithm was compared to the alignment of the currently practiced procedure, which is based on a manual process of selecting common landmarks, including implanted fiducials and anatomical skull features. Of the four algorithms, three (phase correlation, MI maximization, and ECC maximization) demonstrated clinically adequate (ie, comparable to the standard alignment technique) translational accuracy and improvements in speed compared to the interactive, user-guided technique; however, the ICP algorithm failed to give clinically acceptable results. The results of this work suggest that a combination of different algorithms may provide the best registration results. This research serves as the initial groundwork for the translation of automated, anatomy-based 2D algorithms into a real-world system for 2D-to-2D image registration and alignment for intracranial radiosurgery. This may obviate the need for invasive implantation of fiducial markers into the skull and may improve treatment room efficiency and accuracy. © The Author(s) 2014.
Pairagon: a highly accurate, HMM-based cDNA-to-genome aligner.

PubMed

Lu, David V; Brown, Randall H; Arumugam, Manimozhiyan; Brent, Michael R

2009-07-01

The most accurate way to determine the intron-exon structures in a genome is to align spliced cDNA sequences to the genome. Thus, cDNA-to-genome alignment programs are a key component of most annotation pipelines. The scoring system used to choose the best alignment is a primary determinant of alignment accuracy, while heuristics that prevent consideration of certain alignments are a primary determinant of runtime and memory usage. Both accuracy and speed are important considerations in choosing an alignment algorithm, but scoring systems have received much less attention than heuristics. We present Pairagon, a pair hidden Markov model based cDNA-to-genome alignment program, as the most accurate aligner for sequences with high- and low-identity levels. We conducted a series of experiments testing alignment accuracy with varying sequence identity. We first created 'perfect' simulated cDNA sequences by splicing the sequences of exons in the reference genome sequences of fly and human. The complete reference genome sequences were then mutated to various degrees using a realistic mutation simulator and the perfect cDNAs were aligned to them using Pairagon and 12 other aligners. To validate these results with natural sequences, we performed cross-species alignment using orthologous transcripts from human, mouse and rat. We found that aligner accuracy is heavily dependent on sequence identity. For sequences with 100% identity, Pairagon achieved accuracy levels of >99.6%, with one quarter of the errors of any other aligner. Furthermore, for human/mouse alignments, which are only 85% identical, Pairagon achieved 87% accuracy, higher than any other aligner. Pairagon source and executables are freely available at http://mblab.wustl.edu/software/pairagon/
A Novel Center Star Multiple Sequence Alignment Algorithm Based on Affine Gap Penalty and K-Band

NASA Astrophysics Data System (ADS)

Zou, Quan; Shan, Xiao; Jiang, Yi

Multiple sequence alignment is one of the most important topics in computational biology, but it cannot deal with the large data so far. As the development of copy-number variant(CNV) and Single Nucleotide Polymorphisms(SNP) research, many researchers want to align numbers of similar sequences for detecting CNV and SNP. In this paper, we propose a novel multiple sequence alignment algorithm based on affine gap penalty and k-band. It can align more quickly and accurately, that will be helpful for mining CNV and SNP. Experiments prove the performance of our algorithm.
Use of a Closed-Loop Tracking Algorithm for Orientation Bias Determination of an S-Band Ground Station

NASA Technical Reports Server (NTRS)

Welch, Bryan W.; Piasecki, Marie T.; Schrage, Dean S.

2015-01-01

The Space Communications and Navigation (SCaN) Testbed project completed installation and checkout testing of a new S-Band ground station at the NASA Glenn Research Center in Cleveland, Ohio in 2015. As with all ground stations, a key alignment process must be conducted to obtain offset angles in azimuth (AZ) and elevation (EL). In telescopes with AZ-EL gimbals, this is normally done with a two-star alignment process, where telescope-based pointing vectors are derived from catalogued locations with the AZ-EL bias angles derived from the pointing vector difference. For an antenna, the process is complicated without an optical asset. For the present study, the solution was to utilize the gimbal control algorithms closed-loop tracking capability to acquire the peak received power signal automatically from two distinct NASA Tracking and Data Relay Satellite (TDRS) spacecraft, without a human making the pointing adjustments. Briefly, the TDRS satellite acts as a simulated optical source and the alignment process proceeds exactly the same way as a one-star alignment. The data reduction process, which will be discussed in the paper, results in two bias angles which are retained for future pointing determination. Finally, the paper compares the test results and provides lessons learned from the activity.
Development of an embedded instrument for autofocus and polarization alignment of polarization maintaining fiber

NASA Astrophysics Data System (ADS)

Feng, Di; Fang, Qimeng; Huang, Huaibo; Zhao, Zhengqi; Song, Ningfang

2017-12-01

The development and implementation of a practical instrument based on an embedded technique for autofocus and polarization alignment of polarization maintaining fiber is presented. For focusing efficiency and stability, an image-based focusing algorithm fully considering the image definition evaluation and the focusing search strategy was used to accomplish autofocus. For improving the alignment accuracy, various image-based algorithms of alignment detection were developed with high calculation speed and strong robustness. The instrument can be operated as a standalone device with real-time processing and convenience operations. The hardware construction, software interface, and image-based algorithms of main modules are described. Additionally, several image simulation experiments were also carried out to analyze the accuracy of the above alignment detection algorithms. Both the simulation results and experiment results indicate that the instrument can achieve the accuracy of polarization alignment <±0.1 deg.
A New Continuous Rotation IMU Alignment Algorithm Based on Stochastic Modeling for Cost Effective North-Finding Applications

PubMed Central

Li, Yun; Wu, Wenqi; Jiang, Qingan; Wang, Jinling

2016-01-01

Based on stochastic modeling of Coriolis vibration gyros by the Allan variance technique, this paper discusses Angle Random Walk (ARW), Rate Random Walk (RRW) and Markov process gyroscope noises which have significant impacts on the North-finding accuracy. A new continuous rotation alignment algorithm for a Coriolis vibration gyroscope Inertial Measurement Unit (IMU) is proposed in this paper, in which the extended observation equations are used for the Kalman filter to enhance the estimation of gyro drift errors, thus improving the north-finding accuracy. Theoretical and numerical comparisons between the proposed algorithm and the traditional ones are presented. The experimental results show that the new continuous rotation alignment algorithm using the extended observation equations in the Kalman filter is more efficient than the traditional two-position alignment method. Using Coriolis vibration gyros with bias instability of 0.1°/h, a north-finding accuracy of 0.1° (1σ) is achieved by the new continuous rotation alignment algorithm, compared with 0.6° (1σ) north-finding accuracy for the two-position alignment and 1° (1σ) for the fixed-position alignment. PMID:27983585
Ontology Alignment Repair through Modularization and Confidence-Based Heuristics

PubMed Central

Santos, Emanuel; Faria, Daniel; Pesquita, Catia; Couto, Francisco M.

2015-01-01

Ontology Matching aims at identifying a set of semantic correspondences, called an alignment, between related ontologies. In recent years, there has been a growing interest in efficient and effective matching methods for large ontologies. However, alignments produced for large ontologies are often logically incoherent. It was only recently that the use of repair techniques to improve the coherence of ontology alignments began to be explored. This paper presents a novel modularization technique for ontology alignment repair which extracts fragments of the input ontologies that only contain the necessary classes and relations to resolve all detectable incoherences. The paper presents also an alignment repair algorithm that uses a global repair strategy to minimize both the degree of incoherence and the number of mappings removed from the alignment, while overcoming the scalability problem by employing the proposed modularization technique. Our evaluation shows that our modularization technique produces significantly small fragments of the ontologies and that our repair algorithm produces more complete alignments than other current alignment repair systems, while obtaining an equivalent degree of incoherence. Additionally, we also present a variant of our repair algorithm that makes use of the confidence values of the mappings to improve alignment repair. Our repair algorithm was implemented as part of AgreementMakerLight, a free and open-source ontology matching system. PMID:26710335
Ontology Alignment Repair through Modularization and Confidence-Based Heuristics.

PubMed

Santos, Emanuel; Faria, Daniel; Pesquita, Catia; Couto, Francisco M

2015-01-01

Ontology Matching aims at identifying a set of semantic correspondences, called an alignment, between related ontologies. In recent years, there has been a growing interest in efficient and effective matching methods for large ontologies. However, alignments produced for large ontologies are often logically incoherent. It was only recently that the use of repair techniques to improve the coherence of ontology alignments began to be explored. This paper presents a novel modularization technique for ontology alignment repair which extracts fragments of the input ontologies that only contain the necessary classes and relations to resolve all detectable incoherences. The paper presents also an alignment repair algorithm that uses a global repair strategy to minimize both the degree of incoherence and the number of mappings removed from the alignment, while overcoming the scalability problem by employing the proposed modularization technique. Our evaluation shows that our modularization technique produces significantly small fragments of the ontologies and that our repair algorithm produces more complete alignments than other current alignment repair systems, while obtaining an equivalent degree of incoherence. Additionally, we also present a variant of our repair algorithm that makes use of the confidence values of the mappings to improve alignment repair. Our repair algorithm was implemented as part of AgreementMakerLight, a free and open-source ontology matching system.
mTM-align: a server for fast protein structure database search and multiple protein structure alignment.

PubMed

Dong, Runze; Pan, Shuo; Peng, Zhenling; Zhang, Yang; Yang, Jianyi

2018-05-21

With the rapid increase of the number of protein structures in the Protein Data Bank, it becomes urgent to develop algorithms for efficient protein structure comparisons. In this article, we present the mTM-align server, which consists of two closely related modules: one for structure database search and the other for multiple structure alignment. The database search is speeded up based on a heuristic algorithm and a hierarchical organization of the structures in the database. The multiple structure alignment is performed using the recently developed algorithm mTM-align. Benchmark tests demonstrate that our algorithms outperform other peering methods for both modules, in terms of speed and accuracy. One of the unique features for the server is the interplay between database search and multiple structure alignment. The server provides service not only for performing fast database search, but also for making accurate multiple structure alignment with the structures found by the search. For the database search, it takes about 2-5 min for a structure of a medium size (∼300 residues). For the multiple structure alignment, it takes a few seconds for ∼10 structures of medium sizes. The server is freely available at: http://yanglab.nankai.edu.cn/mTM-align/.
Active control for stabilization of neoclassical tearing modesa)

NASA Astrophysics Data System (ADS)

Humphreys, D. A.; Ferron, J. R.; La Haye, R. J.; Luce, T. C.; Petty, C. C.; Prater, R.; Welander, A. S.

2006-05-01

This work describes active control algorithms used by DIII-D [J. L. Luxon, Nucl. Fusion 42, 614 (2002)] to stabilize and maintain suppression of 3/2 or 2/1 neoclassical tearing modes (NTMs) by application of electron cyclotron current drive (ECCD) at the rational q surface. The DIII-D NTM control system can determine the correct q-surface/ECCD alignment and stabilize existing modes within 100-500ms of activation, or prevent mode growth with preemptive application of ECCD, in both cases enabling stable operation at normalized beta values above 3.5. Because NTMs can limit performance or cause plasma-terminating disruptions in tokamaks, their stabilization is essential to the high performance operation of ITER [R. Aymar et al., ITER Joint Central Team, ITER Home Teams, Nucl. Fusion 41, 1301 (2001)]. The DIII-D NTM control system has demonstrated many elements of an eventual ITER solution, including general algorithms for robust detection of q-surface/ECCD alignment and for real-time maintenance of alignment following the disappearance of the mode. This latter capability, unique to DIII-D, is based on real-time reconstruction of q-surface geometry by a Grad-Shafranov solver using external magnetics and internal motional Stark effect measurements. Alignment is achieved by varying either the plasma major radius (and the rational q surface) or the toroidal field (and the deposition location). The requirement to achieve and maintain q-surface/ECCD alignment with accuracy on the order of 1cm is routinely met by the DIII-D Plasma Control System and these algorithms. We discuss the integrated plasma control design process used for developing these and other general control algorithms, which includes physics-based modeling and testing of the algorithm implementation against simulations of actuator and plasma responses. This systematic design/test method and modeling environment enabled successful mode suppression by the NTM control system upon first-time use in an experimental discharge.

Feature Based Retention Time Alignment for Improved HDX MS Analysis

NASA Astrophysics Data System (ADS)

Venable, John D.; Scuba, William; Brock, Ansgar

2013-04-01

An algorithm for retention time alignment of mass shifted hydrogen-deuterium exchange (HDX) data based on an iterative distance minimization procedure is described. The algorithm performs pairwise comparisons in an iterative fashion between a list of features from a reference file and a file to be time aligned to calculate a retention time mapping function. Features are characterized by their charge, retention time and mass of the monoisotopic peak. The algorithm is able to align datasets with mass shifted features, which is a prerequisite for aligning hydrogen-deuterium exchange mass spectrometry datasets. Confidence assignments from the fully automated processing of a commercial HDX software package are shown to benefit significantly from retention time alignment prior to extraction of deuterium incorporation values.
Simultaneous phylogeny reconstruction and multiple sequence alignment

PubMed Central

Yue, Feng; Shi, Jian; Tang, Jijun

2009-01-01

Background A phylogeny is the evolutionary history of a group of organisms. To date, sequence data is still the most used data type for phylogenetic reconstruction. Before any sequences can be used for phylogeny reconstruction, they must be aligned, and the quality of the multiple sequence alignment has been shown to affect the quality of the inferred phylogeny. At the same time, all the current multiple sequence alignment programs use a guide tree to produce the alignment and experiments showed that good guide trees can significantly improve the multiple alignment quality. Results We devise a new algorithm to simultaneously align multiple sequences and search for the phylogenetic tree that leads to the best alignment. We also implemented the algorithm as a C program package, which can handle both DNA and protein data and can take simple cost model as well as complex substitution matrices, such as PAM250 or BLOSUM62. The performance of the new method are compared with those from other popular multiple sequence alignment tools, including the widely used programs such as ClustalW and T-Coffee. Experimental results suggest that this method has good performance in terms of both phylogeny accuracy and alignment quality. Conclusion We present an algorithm to align multiple sequences and reconstruct the phylogenies that minimize the alignment score, which is based on an efficient algorithm to solve the median problems for three sequences. Our extensive experiments suggest that this method is very promising and can produce high quality phylogenies and alignments. PMID:19208110
A greedy, graph-based algorithm for the alignment of multiple homologous gene lists.

PubMed

Fostier, Jan; Proost, Sebastian; Dhoedt, Bart; Saeys, Yvan; Demeester, Piet; Van de Peer, Yves; Vandepoele, Klaas

2011-03-15

Many comparative genomics studies rely on the correct identification of homologous genomic regions using accurate alignment tools. In such case, the alphabet of the input sequences consists of complete genes, rather than nucleotides or amino acids. As optimal multiple sequence alignment is computationally impractical, a progressive alignment strategy is often employed. However, such an approach is susceptible to the propagation of alignment errors in early pairwise alignment steps, especially when dealing with strongly diverged genomic regions. In this article, we present a novel accurate and efficient greedy, graph-based algorithm for the alignment of multiple homologous genomic segments, represented as ordered gene lists. Based on provable properties of the graph structure, several heuristics are developed to resolve local alignment conflicts that occur due to gene duplication and/or rearrangement events on the different genomic segments. The performance of the algorithm is assessed by comparing the alignment results of homologous genomic segments in Arabidopsis thaliana to those obtained by using both a progressive alignment method and an earlier graph-based implementation. Especially for datasets that contain strongly diverged segments, the proposed method achieves a substantially higher alignment accuracy, and proves to be sufficiently fast for large datasets including a few dozens of eukaryotic genomes. http://bioinformatics.psb.ugent.be/software. The algorithm is implemented as a part of the i-ADHoRe 3.0 package.
Use artificial neural network to align biological ontologies.

PubMed

Huang, Jingshan; Dang, Jiangbo; Huhns, Michael N; Zheng, W Jim

2008-09-16

Being formal, declarative knowledge representation models, ontologies help to address the problem of imprecise terminologies in biological and biomedical research. However, ontologies constructed under the auspices of the Open Biomedical Ontologies (OBO) group have exhibited a great deal of variety, because different parties can design ontologies according to their own conceptual views of the world. It is therefore becoming critical to align ontologies from different parties. During automated/semi-automated alignment across biological ontologies, different semantic aspects, i.e., concept name, concept properties, and concept relationships, contribute in different degrees to alignment results. Therefore, a vector of weights must be assigned to these semantic aspects. It is not trivial to determine what those weights should be, and current methodologies depend a lot on human heuristics. In this paper, we take an artificial neural network approach to learn and adjust these weights, and thereby support a new ontology alignment algorithm, customized for biological ontologies, with the purpose of avoiding some disadvantages in both rule-based and learning-based aligning algorithms. This approach has been evaluated by aligning two real-world biological ontologies, whose features include huge file size, very few instances, concept names in numerical strings, and others. The promising experiment results verify our proposed hypothesis, i.e., three weights for semantic aspects learned from a subset of concepts are representative of all concepts in the same ontology. Therefore, our method represents a large leap forward towards automating biological ontology alignment.
Image stack alignment in full-field X-ray absorption spectroscopy using SIFT_PyOCL.

PubMed

Paleo, Pierre; Pouyet, Emeline; Kieffer, Jérôme

2014-03-01

Full-field X-ray absorption spectroscopy experiments allow the acquisition of millions of spectra within minutes. However, the construction of the hyperspectral image requires an image alignment procedure with sub-pixel precision. While the image correlation algorithm has originally been used for image re-alignment using translations, the Scale Invariant Feature Transform (SIFT) algorithm (which is by design robust versus rotation, illumination change, translation and scaling) presents an additional advantage: the alignment can be limited to a region of interest of any arbitrary shape. In this context, a Python module, named SIFT_PyOCL, has been developed. It implements a parallel version of the SIFT algorithm in OpenCL, providing high-speed image registration and alignment both on processors and graphics cards. The performance of the algorithm allows online processing of large datasets.
Optimal Alignment of Structures for Finite and Periodic Systems.

PubMed

Griffiths, Matthew; Niblett, Samuel P; Wales, David J

2017-10-10

Finding the optimal alignment between two structures is important for identifying the minimum root-mean-square distance (RMSD) between them and as a starting point for calculating pathways. Most current algorithms for aligning structures are stochastic, scale exponentially with the size of structure, and the performance can be unreliable. We present two complementary methods for aligning structures corresponding to isolated clusters of atoms and to condensed matter described by a periodic cubic supercell. The first method (Go-PERMDIST), a branch and bound algorithm, locates the global minimum RMSD deterministically in polynomial time. The run time increases for larger RMSDs. The second method (FASTOVERLAP) is a heuristic algorithm that aligns structures by finding the global maximum kernel correlation between them using fast Fourier transforms (FFTs) and fast SO(3) transforms (SOFTs). For periodic systems, FASTOVERLAP scales with the square of the number of identical atoms in the system, reliably finds the best alignment between structures that are not too distant, and shows significantly better performance than existing algorithms. The expected run time for Go-PERMDIST is longer than FASTOVERLAP for periodic systems. For finite clusters, the FASTOVERLAP algorithm is competitive with existing algorithms. The expected run time for Go-PERMDIST to find the global RMSD between two structures deterministically is generally longer than for existing stochastic algorithms. However, with an earlier exit condition, Go-PERMDIST exhibits similar or better performance.
StralSV: assessment of sequence variability within similar 3D structures and application to polio RNA-dependent RNA polymerase

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zemla, A; Lang, D; Kostova, T

2010-11-29

Most of the currently used methods for protein function prediction rely on sequence-based comparisons between a query protein and those for which a functional annotation is provided. A serious limitation of sequence similarity-based approaches for identifying residue conservation among proteins is the low confidence in assigning residue-residue correspondences among proteins when the level of sequence identity between the compared proteins is poor. Multiple sequence alignment methods are more satisfactory - still, they cannot provide reliable results at low levels of sequence identity. Our goal in the current work was to develop an algorithm that could overcome these difficulties and facilitatemore » the identification of structurally (and possibly functionally) relevant residue-residue correspondences between compared protein structures. Here we present StralSV, a new algorithm for detecting closely related structure fragments and quantifying residue frequency from tight local structure alignments. We apply StralSV in a study of the RNA-dependent RNA polymerase of poliovirus and demonstrate that the algorithm can be used to determine regions of the protein that are relatively unique or that shared structural similarity with structures that are distantly related. By quantifying residue frequencies among many residue-residue pairs extracted from local alignments, one can infer potential structural or functional importance of specific residues that are determined to be highly conserved or that deviate from a consensus. We further demonstrate that considerable detailed structural and phylogenetic information can be derived from StralSV analyses. StralSV is a new structure-based algorithm for identifying and aligning structure fragments that have similarity to a reference protein. StralSV analysis can be used to quantify residue-residue correspondences and identify residues that may be of particular structural or functional importance, as well as unusual or unexpected residues at a given sequence position.« less
Pairwise Sequence Alignment Library

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jeff Daily, PNNL

2015-05-20

Vector extensions, such as SSE, have been part of the x86 CPU since the 1990s, with applications in graphics, signal processing, and scientific applications. Although many algorithms and applications can naturally benefit from automatic vectorization techniques, there are still many that are difficult to vectorize due to their dependence on irregular data structures, dense branch operations, or data dependencies. Sequence alignment, one of the most widely used operations in bioinformatics workflows, has a computational footprint that features complex data dependencies. The trend of widening vector registers adversely affects the state-of-the-art sequence alignment algorithm based on striped data layouts. Therefore, amore » novel SIMD implementation of a parallel scan-based sequence alignment algorithm that can better exploit wider SIMD units was implemented as part of the Parallel Sequence Alignment Library (parasail). Parasail features: Reference implementations of all known vectorized sequence alignment approaches. Implementations of Smith Waterman (SW), semi-global (SG), and Needleman Wunsch (NW) sequence alignment algorithms. Implementations across all modern CPU instruction sets including AVX2 and KNC. Language interfaces for C/C++ and Python.« less
Differential evolution-simulated annealing for multiple sequence alignment

NASA Astrophysics Data System (ADS)

Addawe, R. C.; Addawe, J. M.; Sueño, M. R. K.; Magadia, J. C.

2017-10-01

Multiple sequence alignments (MSA) are used in the analysis of molecular evolution and sequence structure relationships. In this paper, a hybrid algorithm, Differential Evolution - Simulated Annealing (DESA) is applied in optimizing multiple sequence alignments (MSAs) based on structural information, non-gaps percentage and totally conserved columns. DESA is a robust algorithm characterized by self-organization, mutation, crossover, and SA-like selection scheme of the strategy parameters. Here, the MSA problem is treated as a multi-objective optimization problem of the hybrid evolutionary algorithm, DESA. Thus, we name the algorithm as DESA-MSA. Simulated sequences and alignments were generated to evaluate the accuracy and efficiency of DESA-MSA using different indel sizes, sequence lengths, deletion rates and insertion rates. The proposed hybrid algorithm obtained acceptable solutions particularly for the MSA problem evaluated based on the three objectives.
High-speed peak matching algorithm for retention time alignment of gas chromatographic data for chemometric analysis.

PubMed

Johnson, Kevin J; Wright, Bob W; Jarman, Kristin H; Synovec, Robert E

2003-05-09

A rapid retention time alignment algorithm was developed as a preprocessing utility to be used prior to chemometric analysis of large datasets of diesel fuel profiles obtained using gas chromatography (GC). Retention time variation from chromatogram-to-chromatogram has been a significant impediment against the use of chemometric techniques in the analysis of chromatographic data due to the inability of current chemometric techniques to correctly model information that shifts from variable to variable within a dataset. The alignment algorithm developed is shown to increase the efficacy of pattern recognition methods applied to diesel fuel chromatograms by retaining chemical selectivity while reducing chromatogram-to-chromatogram retention time variations and to do so on a time scale that makes analysis of large sets of chromatographic data practical. Two sets of diesel fuel gas chromatograms were studied using the novel alignment algorithm followed by principal component analysis (PCA). In the first study, retention times for corresponding chromatographic peaks in 60 chromatograms varied by as much as 300 ms between chromatograms before alignment. In the second study of 42 chromatograms, the retention time shifting exhibited was on the order of 10 s between corresponding chromatographic peaks, and required a coarse retention time correction prior to alignment with the algorithm. In both cases, an increase in retention time precision afforded by the algorithm was clearly visible in plots of overlaid chromatograms before and then after applying the retention time alignment algorithm. Using the alignment algorithm, the standard deviation for corresponding peak retention times following alignment was 17 ms throughout a given chromatogram, corresponding to a relative standard deviation of 0.003% at an average retention time of 8 min. This level of retention time precision is a 5-fold improvement over the retention time precision initially provided by a state-of-the-art GC instrument equipped with electronic pressure control and was critical to the performance of the chemometric analysis. This increase in retention time precision does not come at the expense of chemical selectivity, since the PCA results suggest that essentially all of the chemical selectivity is preserved. Cluster resolution between dissimilar groups of diesel fuel chromatograms in a two-dimensional scores space generated with PCA is shown to substantially increase after alignment. The alignment method is robust against missing or extra peaks relative to a target chromatogram used in the alignment, and operates at high speed, requiring roughly 1 s of computation time per GC chromatogram.
Linking GPS and travel diary data using sequence alignment in a study of children's independent mobility

PubMed Central

2011-01-01

Background Global positioning systems (GPS) are increasingly being used in health research to determine the location of study participants. Combining GPS data with data collected via travel/activity diaries allows researchers to assess where people travel in conjunction with data about trip purpose and accompaniment. However, linking GPS and diary data is problematic and to date the only method has been to match the two datasets manually, which is time consuming and unlikely to be practical for larger data sets. This paper assesses the feasibility of a new sequence alignment method of linking GPS and travel diary data in comparison with the manual matching method. Methods GPS and travel diary data obtained from a study of children's independent mobility were linked using sequence alignment algorithms to test the proof of concept. Travel diaries were assessed for quality by counting the number of errors and inconsistencies in each participant's set of diaries. The success of the sequence alignment method was compared for higher versus lower quality travel diaries, and for accompanied versus unaccompanied trips. Time taken and percentage of trips matched were compared for the sequence alignment method and the manual method. Results The sequence alignment method matched 61.9% of all trips. Higher quality travel diaries were associated with higher match rates in both the sequence alignment and manual matching methods. The sequence alignment method performed almost as well as the manual method and was an order of magnitude faster. However, the sequence alignment method was less successful at fully matching trips and at matching unaccompanied trips. Conclusions Sequence alignment is a promising method of linking GPS and travel diary data in large population datasets, especially if limitations in the trip detection algorithm are addressed. PMID:22142322
DNA motif alignment by evolving a population of Markov chains.

PubMed

Bi, Chengpeng

2009-01-30

Deciphering cis-regulatory elements or de novo motif-finding in genomes still remains elusive although much algorithmic effort has been expended. The Markov chain Monte Carlo (MCMC) method such as Gibbs motif samplers has been widely employed to solve the de novo motif-finding problem through sequence local alignment. Nonetheless, the MCMC-based motif samplers still suffer from local maxima like EM. Therefore, as a prerequisite for finding good local alignments, these motif algorithms are often independently run a multitude of times, but without information exchange between different chains. Hence it would be worth a new algorithm design enabling such information exchange. This paper presents a novel motif-finding algorithm by evolving a population of Markov chains with information exchange (PMC), each of which is initialized as a random alignment and run by the Metropolis-Hastings sampler (MHS). It is progressively updated through a series of local alignments stochastically sampled. Explicitly, the PMC motif algorithm performs stochastic sampling as specified by a population-based proposal distribution rather than individual ones, and adaptively evolves the population as a whole towards a global maximum. The alignment information exchange is accomplished by taking advantage of the pooled motif site distributions. A distinct method for running multiple independent Markov chains (IMC) without information exchange, or dubbed as the IMC motif algorithm, is also devised to compare with its PMC counterpart. Experimental studies demonstrate that the performance could be improved if pooled information were used to run a population of motif samplers. The new PMC algorithm was able to improve the convergence and outperformed other popular algorithms tested using simulated and biological motif sequences.
Improvements on a privacy-protection algorithm for DNA sequences with generalization lattices.

PubMed

Li, Guang; Wang, Yadong; Su, Xiaohong

2012-10-01

When developing personal DNA databases, there must be an appropriate guarantee of anonymity, which means that the data cannot be related back to individuals. DNA lattice anonymization (DNALA) is a successful method for making personal DNA sequences anonymous. However, it uses time-consuming multiple sequence alignment and a low-accuracy greedy clustering algorithm. Furthermore, DNALA is not an online algorithm, and so it cannot quickly return results when the database is updated. This study improves the DNALA method. Specifically, we replaced the multiple sequence alignment in DNALA with global pairwise sequence alignment to save time, and we designed a hybrid clustering algorithm comprised of a maximum weight matching (MWM)-based algorithm and an online algorithm. The MWM-based algorithm is more accurate than the greedy algorithm in DNALA and has the same time complexity. The online algorithm can process data quickly when the database is updated. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
Protein alignment algorithms with an efficient backtracking routine on multiple GPUs.

PubMed

Blazewicz, Jacek; Frohmberg, Wojciech; Kierzynka, Michal; Pesch, Erwin; Wojciechowski, Pawel

2011-05-20

Pairwise sequence alignment methods are widely used in biological research. The increasing number of sequences is perceived as one of the upcoming challenges for sequence alignment methods in the nearest future. To overcome this challenge several GPU (Graphics Processing Unit) computing approaches have been proposed lately. These solutions show a great potential of a GPU platform but in most cases address the problem of sequence database scanning and computing only the alignment score whereas the alignment itself is omitted. Thus, the need arose to implement the global and semiglobal Needleman-Wunsch, and Smith-Waterman algorithms with a backtracking procedure which is needed to construct the alignment. In this paper we present the solution that performs the alignment of every given sequence pair, which is a required step for progressive multiple sequence alignment methods, as well as for DNA recognition at the DNA assembly stage. Performed tests show that the implementation, with performance up to 6.3 GCUPS on a single GPU for affine gap penalties, is very efficient in comparison to other CPU and GPU-based solutions. Moreover, multiple GPUs support with load balancing makes the application very scalable. The article shows that the backtracking procedure of the sequence alignment algorithms may be designed to fit in with the GPU architecture. Therefore, our algorithm, apart from scores, is able to compute pairwise alignments. This opens a wide range of new possibilities, allowing other methods from the area of molecular biology to take advantage of the new computational architecture. Performed tests show that the efficiency of the implementation is excellent. Moreover, the speed of our GPU-based algorithms can be almost linearly increased when using more than one graphics card.
Concept of AHRS Algorithm Designed for Platform Independent Imu Attitude Alignment

NASA Astrophysics Data System (ADS)

Tomaszewski, Dariusz; Rapiński, Jacek; Pelc-Mieczkowska, Renata

2017-12-01

Nowadays, along with the advancement of technology one can notice the rapid development of various types of navigation systems. So far the most popular satellite navigation, is now supported by positioning results calculated with use of other measurement system. The method and manner of integration will depend directly on the destination of system being developed. To increase the frequency of readings and improve the operation of outdoor navigation systems, one will support satellite navigation systems (GPS, GLONASS ect.) with inertial navigation. Such method of navigation consists of several steps. The first stage is the determination of initial orientation of inertial measurement unit, called INS alignment. During this process, on the basis of acceleration and the angular velocity readings, values of Euler angles (pitch, roll, yaw) are calculated allowing for unambiguous orientation of the sensor coordinate system relative to external coordinate system. The following study presents the concept of AHRS (Attitude and heading reference system) algorithm, allowing to define the Euler angles.The study were conducted with the use of readings from low-cost MEMS cell phone sensors. Subsequently the results of the study were analyzed to determine the accuracy of featured algorithm. On the basis of performed experiments the legitimacy of developed algorithm was stated.
Net2Align: An Algorithm For Pairwise Global Alignment of Biological Networks

PubMed Central

Wadhwab, Gulshan; Upadhyayaa, K. C.

2016-01-01

The amount of data on molecular interactions is growing at an enormous pace, whereas the progress of methods for analysing this data is still lacking behind. Particularly, in the area of comparative analysis of biological networks, where one wishes to explore the similarity between two biological networks, this holds a potential problem. In consideration that the functionality primarily runs at the network level, it advocates the need for robust comparison methods. In this paper, we describe Net2Align, an algorithm for pairwise global alignment that can perform node-to-node correspondences as well as edge-to-edge correspondences into consideration. The uniqueness of our algorithm is in the fact that it is also able to detect the type of interaction, which is essential in case of directed graphs. The existing algorithm is only able to identify the common nodes but not the common edges. Another striking feature of the algorithm is that it is able to remove duplicate entries in case of variable datasets being aligned. This is achieved through creation of a local database which helps exclude duplicate links. In a pervasive computational study on gene regulatory network, we establish that our algorithm surpasses its counterparts in its results. Net2Align has been implemented in Java 7 and the source code is available as supplementary files. PMID:28356678
CAMPways: constrained alignment framework for the comparative analysis of a pair of metabolic pathways.

PubMed

Abaka, Gamze; Bıyıkoğlu, Türker; Erten, Cesim

2013-07-01

Given a pair of metabolic pathways, an alignment of the pathways corresponds to a mapping between similar substructures of the pair. Successful alignments may provide useful applications in phylogenetic tree reconstruction, drug design and overall may enhance our understanding of cellular metabolism. We consider the problem of providing one-to-many alignments of reactions in a pair of metabolic pathways. We first provide a constrained alignment framework applicable to the problem. We show that the constrained alignment problem even in a primitive setting is computationally intractable, which justifies efforts for designing efficient heuristics. We present our Constrained Alignment of Metabolic Pathways (CAMPways) algorithm designed for this purpose. Through extensive experiments involving a large pathway database, we demonstrate that when compared with a state-of-the-art alternative, the CAMPways algorithm provides better alignment results on metabolic networks as far as measures based on same-pathway inclusion and biochemical significance are concerned. The execution speed of our algorithm constitutes yet another important improvement over alternative algorithms. Open source codes, executable binary, useful scripts, all the experimental data and the results are freely available as part of the Supplementary Material at http://code.google.com/p/campways/. Supplementary data are available at Bioinformatics online.
Iterative refinement of structure-based sequence alignments by Seed Extension

PubMed Central

Kim, Changhoon; Tai, Chin-Hsien; Lee, Byungkook

2009-01-01

Background Accurate sequence alignment is required in many bioinformatics applications but, when sequence similarity is low, it is difficult to obtain accurate alignments based on sequence similarity alone. The accuracy improves when the structures are available, but current structure-based sequence alignment procedures still mis-align substantial numbers of residues. In order to correct such errors, we previously explored the possibility of replacing the residue-based dynamic programming algorithm in structure alignment procedures with the Seed Extension algorithm, which does not use a gap penalty. Here, we describe a new procedure called RSE (Refinement with Seed Extension) that iteratively refines a structure-based sequence alignment. Results RSE uses SE (Seed Extension) in its core, which is an algorithm that we reported recently for obtaining a sequence alignment from two superimposed structures. The RSE procedure was evaluated by comparing the correctly aligned fractions of residues before and after the refinement of the structure-based sequence alignments produced by popular programs. CE, DaliLite, FAST, LOCK2, MATRAS, MATT, TM-align, SHEBA and VAST were included in this analysis and the NCBI's CDD root node set was used as the reference alignments. RSE improved the average accuracy of sequence alignments for all programs tested when no shift error was allowed. The amount of improvement varied depending on the program. The average improvements were small for DaliLite and MATRAS but about 5% for CE and VAST. More substantial improvements have been seen in many individual cases. The additional computation times required for the refinements were negligible compared to the times taken by the structure alignment programs. Conclusion RSE is a computationally inexpensive way of improving the accuracy of a structure-based sequence alignment. It can be used as a standalone procedure following a regular structure-based sequence alignment or to replace the traditional iterative refinement procedures based on residue-level dynamic programming algorithm in many structure alignment programs. PMID:19589133
Aligning Biomolecular Networks Using Modular Graph Kernels

NASA Astrophysics Data System (ADS)

Towfic, Fadi; Greenlee, M. Heather West; Honavar, Vasant

Comparative analysis of biomolecular networks constructed using measurements from different conditions, tissues, and organisms offer a powerful approach to understanding the structure, function, dynamics, and evolution of complex biological systems. We explore a class of algorithms for aligning large biomolecular networks by breaking down such networks into subgraphs and computing the alignment of the networks based on the alignment of their subgraphs. The resulting subnetworks are compared using graph kernels as scoring functions. We provide implementations of the resulting algorithms as part of BiNA, an open source biomolecular network alignment toolkit. Our experiments using Drosophila melanogaster, Saccharomyces cerevisiae, Mus musculus and Homo sapiens protein-protein interaction networks extracted from the DIP repository of protein-protein interaction data demonstrate that the performance of the proposed algorithms (as measured by % GO term enrichment of subnetworks identified by the alignment) is competitive with some of the state-of-the-art algorithms for pair-wise alignment of large protein-protein interaction networks. Our results also show that the inter-species similarity scores computed based on graph kernels can be used to cluster the species into a species tree that is consistent with the known phylogenetic relationships among the species.
Fitting-free algorithm for efficient quantification of collagen fiber alignment in SHG imaging applications.

PubMed

Hall, Gunnsteinn; Liang, Wenxuan; Li, Xingde

2017-10-01

Collagen fiber alignment derived from second harmonic generation (SHG) microscopy images can be important for disease diagnostics. Image processing algorithms are needed to robustly quantify the alignment in images with high sensitivity and reliability. Fourier transform (FT) magnitude, 2D power spectrum, and image autocorrelation have previously been used to extract fiber information from images by assuming a certain mathematical model (e.g. Gaussian distribution of the fiber-related parameters) and fitting. The fitting process is slow and fails to converge when the data is not Gaussian. Herein we present an efficient constant-time deterministic algorithm which characterizes the symmetricity of the FT magnitude image in terms of a single parameter, named the fiber alignment anisotropy R ranging from 0 (randomized fibers) to 1 (perfect alignment). This represents an important improvement of the technology and may bring us one step closer to utilizing the technology for various applications in real time. In addition, we present a digital image phantom-based framework for characterizing and validating the algorithm, as well as assessing the robustness of the algorithm against different perturbations.

A Rapid Convergent Low Complexity Interference Alignment Algorithm for Wireless Sensor Networks.

PubMed

Jiang, Lihui; Wu, Zhilu; Ren, Guanghui; Wang, Gangyi; Zhao, Nan

2015-07-29

Interference alignment (IA) is a novel technique that can effectively eliminate the interference and approach the sum capacity of wireless sensor networks (WSNs) when the signal-to-noise ratio (SNR) is high, by casting the desired signal and interference into different signal subspaces. The traditional alternating minimization interference leakage (AMIL) algorithm for IA shows good performance in high SNR regimes, however, the complexity of the AMIL algorithm increases dramatically as the number of users and antennas increases, posing limits to its applications in the practical systems. In this paper, a novel IA algorithm, called directional quartic optimal (DQO) algorithm, is proposed to minimize the interference leakage with rapid convergence and low complexity. The properties of the AMIL algorithm are investigated, and it is discovered that the difference between the two consecutive iteration results of the AMIL algorithm will approximately point to the convergence solution when the precoding and decoding matrices obtained from the intermediate iterations are sufficiently close to their convergence values. Based on this important property, the proposed DQO algorithm employs the line search procedure so that it can converge to the destination directly. In addition, the optimal step size can be determined analytically by optimizing a quartic function. Numerical results show that the proposed DQO algorithm can suppress the interference leakage more rapidly than the traditional AMIL algorithm, and can achieve the same level of sum rate as that of AMIL algorithm with far less iterations and execution time.
PSO-based methods for medical image registration and change assessment of pigmented skin

NASA Astrophysics Data System (ADS)

Kacenjar, Steve; Zook, Matthew; Balint, Michael

2011-03-01

There are various scientific and technological areas in which it is imperative to rapidly detect and quantify changes in imagery over time. In fields such as earth remote sensing, aerospace systems, and medical imaging, searching for timedependent, regional changes across deformable topographies is complicated by varying camera acquisition geometries, lighting environments, background clutter conditions, and occlusion. Under these constantly-fluctuating conditions, the use of standard, rigid-body registration approaches often fail to provide sufficient fidelity to overlay image scenes together. This is problematic because incorrect assessments of the underlying changes of high-level topography can result in systematic errors in the quantification and classification of interested areas. For example, in the current naked-eye detection strategies of melanoma, a dermatologist often uses static morphological attributes to identify suspicious skin lesions for biopsy. This approach does not incorporate temporal changes which suggest malignant degeneration. By performing the co-registration of time-separated skin imagery, a dermatologist may more effectively detect and identify early morphological changes in pigmented lesions; enabling the physician to detect cancers at an earlier stage resulting in decreased morbidity and mortality. This paper describes an image processing system which will be used to detect changes in the characteristics of skin lesions over time. The proposed system consists of three main functional elements: 1.) coarse alignment of timesequenced imagery, 2.) refined alignment of local skin topographies, and 3.) assessment of local changes in lesion size. During the coarse alignment process, various approaches can be used to obtain a rough alignment, including: 1.) a manual landmark/intensity-based registration method1, and 2.) several flavors of autonomous optical matched filter methods2. These procedures result in the rough alignment of a patient's back topography. Since the skin is a deformable membrane, this process only provides an initial condition for subsequent refinements in aligning the localized topography of the skin. To achieve a refined enhancement, a Particle Swarm Optimizer (PSO) is used to optimally determine the local camera models associated with a generalized geometric transform. Here the optimization process is driven using the minimization of entropy between the multiple time-separated images. Once the camera models are corrected for local skin deformations, the images are compared using both pixel-based and regional-based methods. Limits on the detectability of change are established by the fidelity to which the algorithm corrects for local skin deformation and background alterations. These limits provide essential information in establishing early-warning thresholds for Melanoma detection. Key to this work is the development of a PSO alignment algorithm to perform the refined alignment in local skin topography between the time sequenced imagery (TSI). Test and validation of this alignment process is achieved using a forward model producing known geometric artifacts in the images and afterwards using a PSO algorithm to demonstrate the ability to identify and correct for these artifacts. Specifically, the forward model introduces local translational, rotational, and magnification changes within the image. These geometric modifiers are expected during TSI acquisition because of logistical issues to precisely align the patient to the image recording geometry and is therefore of paramount importance to any viable image registration system. This paper shows that the PSO alignment algorithm is effective in autonomously determining and mitigating these geometric modifiers. The degree of efficacy is measured by several statistically and morphologically based pre-image filtering operations applied to the TSI imagery before applying the PSO alignment algorithm. These trade studies show that global image threshold binarization provides rapid and superior convergence characteristics relative to that of morphologically based methods.
Automated and Adaptable Quantification of Cellular Alignment from Microscopic Images for Tissue Engineering Applications

PubMed Central

Xu, Feng; Beyazoglu, Turker; Hefner, Evan; Gurkan, Umut Atakan

2011-01-01

Cellular alignment plays a critical role in functional, physical, and biological characteristics of many tissue types, such as muscle, tendon, nerve, and cornea. Current efforts toward regeneration of these tissues include replicating the cellular microenvironment by developing biomaterials that facilitate cellular alignment. To assess the functional effectiveness of the engineered microenvironments, one essential criterion is quantification of cellular alignment. Therefore, there is a need for rapid, accurate, and adaptable methodologies to quantify cellular alignment for tissue engineering applications. To address this need, we developed an automated method, binarization-based extraction of alignment score (BEAS), to determine cell orientation distribution in a wide variety of microscopic images. This method combines a sequenced application of median and band-pass filters, locally adaptive thresholding approaches and image processing techniques. Cellular alignment score is obtained by applying a robust scoring algorithm to the orientation distribution. We validated the BEAS method by comparing the results with the existing approaches reported in literature (i.e., manual, radial fast Fourier transform-radial sum, and gradient based approaches). Validation results indicated that the BEAS method resulted in statistically comparable alignment scores with the manual method (coefficient of determination R2=0.92). Therefore, the BEAS method introduced in this study could enable accurate, convenient, and adaptable evaluation of engineered tissue constructs and biomaterials in terms of cellular alignment and organization. PMID:21370940
Alignment-free detection of horizontal gene transfer between closely related bacterial genomes.

PubMed

Domazet-Lošo, Mirjana; Haubold, Bernhard

2011-09-01

Bacterial epidemics are often caused by strains that have acquired their increased virulence through horizontal gene transfer. Due to this association with disease, the detection of horizontal gene transfer continues to receive attention from microbiologists and bioinformaticians alike. Most software for detecting transfer events is based on alignments of sets of genes or of entire genomes. But despite great advances in the design of algorithms and computer programs, genome alignment remains computationally challenging. We have therefore developed an alignment-free algorithm for rapidly detecting horizontal gene transfer between closely related bacterial genomes. Our implementation of this algorithm is called alfy for "ALignment Free local homologY" and is freely available from http://guanine.evolbio.mpg.de/alfy/. In this comment we demonstrate the application of alfy to the genomes of Staphylococcus aureus. We also argue that-contrary to popular belief and in spite of increasing computer speed-algorithmic optimization is becoming more, not less, important if genome data continues to accumulate at the present rate.
Transcript mapping for handwritten English documents

NASA Astrophysics Data System (ADS)

Jose, Damien; Bharadwaj, Anurag; Govindaraju, Venu

2008-01-01

Transcript mapping or text alignment with handwritten documents is the automatic alignment of words in a text file with word images in a handwritten document. Such a mapping has several applications in fields ranging from machine learning where large quantities of truth data are required for evaluating handwriting recognition algorithms, to data mining where word image indexes are used in ranked retrieval of scanned documents in a digital library. The alignment also aids "writer identity" verification algorithms. Interfaces which display scanned handwritten documents may use this alignment to highlight manuscript tokens when a person examines the corresponding transcript word. We propose an adaptation of the True DTW dynamic programming algorithm for English handwritten documents. The integration of the dissimilarity scores from a word-model word recognizer and Levenshtein distance between the recognized word and lexicon word, as a cost metric in the DTW algorithm leading to a fast and accurate alignment, is our primary contribution. Results provided, confirm the effectiveness of our approach.
Genetic algorithms for protein threading.

PubMed

Yadgari, J; Amir, A; Unger, R

1998-01-01

Despite many years of efforts, a direct prediction of protein structure from sequence is still not possible. As a result, in the last few years researchers have started to address the "inverse folding problem": Identifying and aligning a sequence to the fold with which it is most compatible, a process known as "threading". In two meetings in which protein folding predictions were objectively evaluated, it became clear that threading as a concept promises a real breakthrough, but that much improvement is still needed in the technique itself. Threading is a NP-hard problem, and thus no general polynomial solution can be expected. Still a practical approach with demonstrated ability to find optimal solutions in many cases, and acceptable solutions in other cases, is needed. We applied the technique of Genetic Algorithms in order to significantly improve the ability of threading algorithms to find the optimal alignment of a sequence to a structure, i.e. the alignment with the minimum free energy. A major progress reported here is the design of a representation of the threading alignment as a string of fixed length. With this representation validation of alignments and genetic operators are effectively implemented. Appropriate data structure and parameters have been selected. It is shown that Genetic Algorithm threading is effective and is able to find the optimal alignment in a few test cases. Furthermore, the described algorithm is shown to perform well even without pre-definition of core elements. Existing threading methods are dependent on such constraints to make their calculations feasible. But the concept of core elements is inherently arbitrary and should be avoided if possible. While a rigorous proof is hard to submit yet an, we present indications that indeed Genetic Algorithm threading is capable of finding consistently good solutions of full alignments in search spaces of size up to 10(70).
Computing sparse derivatives and consecutive zeros problem

NASA Astrophysics Data System (ADS)

Chandra, B. V. Ravi; Hossain, Shahadat

2013-02-01

We describe a substitution based sparse Jacobian matrix determination method using algorithmic differentiation. Utilizing the a priori known sparsity pattern, a compression scheme is determined using graph coloring. The "compressed pattern" of the Jacobian matrix is then reordered into a form suitable for computation by substitution. We show that the column reordering of the compressed pattern matrix (so as to align the zero entries into consecutive locations in each row) can be viewed as a variant of traveling salesman problem. Preliminary computational results show that on the test problems the performance of nearest-neighbor type heuristic algorithms is highly encouraging.
An Accurate Scalable Template-based Alignment Algorithm

PubMed Central

Gardner, David P.; Xu, Weijia; Miranker, Daniel P.; Ozer, Stuart; Cannone, Jamie J.; Gutell, Robin R.

2013-01-01

The rapid determination of nucleic acid sequences is increasing the number of sequences that are available. Inherent in a template or seed alignment is the culmination of structural and functional constraints that are selecting those mutations that are viable during the evolution of the RNA. While we might not understand these structural and functional, template-based alignment programs utilize the patterns of sequence conservation to encapsulate the characteristics of viable RNA sequences that are aligned properly. We have developed a program that utilizes the different dimensions of information in rCAD, a large RNA informatics resource, to establish a profile for each position in an alignment. The most significant include sequence identity and column composition in different phylogenetic taxa. We have compared our methods with a maximum of eight alternative alignment methods on different sets of 16S and 23S rRNA sequences with sequence percent identities ranging from 50% to 100%. The results showed that CRWAlign outperformed the other alignment methods in both speed and accuracy. A web-based alignment server is available at http://www.rna.ccbb.utexas.edu/SAE/2F/CRWAlign. PMID:24772376
Alignment of leading-edge and peak-picking time of arrival methods to obtain accurate source locations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Roussel-Dupre, R.; Symbalisty, E.; Fox, C.

2009-08-01

The location of a radiating source can be determined by time-tagging the arrival of the radiated signal at a network of spatially distributed sensors. The accuracy of this approach depends strongly on the particular time-tagging algorithm employed at each of the sensors. If different techniques are used across the network, then the time tags must be referenced to a common fiducial for maximum location accuracy. In this report we derive the time corrections needed to temporally align leading-edge, time-tagging techniques with peak-picking algorithms. We focus on broadband radio frequency (RF) sources, an ionospheric propagation channel, and narrowband receivers, but themore » final results can be generalized to apply to any source, propagation environment, and sensor. Our analytic results are checked against numerical simulations for a number of representative cases and agree with the specific leading-edge algorithm studied independently by Kim and Eng (1995) and Pongratz (2005 and 2007).« less
Simulations in site error estimation for direction finders

NASA Astrophysics Data System (ADS)

López, Raúl E.; Passi, Ranjit M.

1991-08-01

The performance of an algorithm for the recovery of site-specific errors of direction finder (DF) networks is tested under controlled simulated conditions. The simulations show that the algorithm has some inherent shortcomings for the recovery of site errors from the measured azimuth data. These limitations are fundamental to the problem of site error estimation using azimuth information. Several ways for resolving or ameliorating these basic complications are tested by means of simulations. From these it appears that for the effective implementation of the site error determination algorithm, one should design the networks with at least four DFs, improve the alignment of the antennas, and increase the gain of the DFs as much as it is compatible with other operational requirements. The use of a nonzero initial estimate of the site errors when working with data from networks of four or more DFs also improves the accuracy of the site error recovery. Even for networks of three DFs, reasonable site error corrections could be obtained if the antennas could be well aligned.
Experimental image alignment system

NASA Technical Reports Server (NTRS)

Moyer, A. L.; Kowel, S. T.; Kornreich, P. G.

1980-01-01

A microcomputer-based instrument for image alignment with respect to a reference image is described which uses the DEFT sensor (Direct Electronic Fourier Transform) for image sensing and preprocessing. The instrument alignment algorithm which uses the two-dimensional Fourier transform as input is also described. It generates signals used to steer the stage carrying the test image into the correct orientation. This algorithm has computational advantages over algorithms which use image intensity data as input and is suitable for a microcomputer-based instrument since the two-dimensional Fourier transform is provided by the DEFT sensor.
FEAST: sensitive local alignment with multiple rates of evolution.

PubMed

Hudek, Alexander K; Brown, Daniel G

2011-01-01

We present a pairwise local aligner, FEAST, which uses two new techniques: a sensitive extension algorithm for identifying homologous subsequences, and a descriptive probabilistic alignment model. We also present a new procedure for training alignment parameters and apply it to the human and mouse genomes, producing a better parameter set for these sequences. Our extension algorithm identifies homologous subsequences by considering all evolutionary histories. It has higher maximum sensitivity than Viterbi extensions, and better balances specificity. We model alignments with several submodels, each with unique statistical properties, describing strongly similar and weakly similar regions of homologous DNA. Training parameters using two submodels produces superior alignments, even when we align with only the parameters from the weaker submodel. Our extension algorithm combined with our new parameter set achieves sensitivity 0.59 on synthetic tests. In contrast, LASTZ with default settings achieves sensitivity 0.35 with the same false positive rate. Using the weak submodel as parameters for LASTZ increases its sensitivity to 0.59 with high error. FEAST is available at http://monod.uwaterloo.ca/feast/.
GCALIGNER 1.0: an alignment program to compute a multiple sample comparison data matrix from large eco-chemical datasets obtained by GC.

PubMed

Dellicour, Simon; Lecocq, Thomas

2013-10-01

GCALIGNER 1.0 is a computer program designed to perform a preliminary data comparison matrix of chemical data obtained by GC without MS information. The alignment algorithm is based on the comparison between the retention times of each detected compound in a sample. In this paper, we test the GCALIGNER efficiency on three datasets of the chemical secretions of bumble bees. The algorithm performs the alignment with a low error rate (<3%). GCALIGNER 1.0 is a useful, simple and free program based on an algorithm that enables the alignment of table-type data from GC. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Robust prediction of consensus secondary structures using averaged base pairing probability matrices.

PubMed

Kiryu, Hisanori; Kin, Taishin; Asai, Kiyoshi

2007-02-15

Recent transcriptomic studies have revealed the existence of a considerable number of non-protein-coding RNA transcripts in higher eukaryotic cells. To investigate the functional roles of these transcripts, it is of great interest to find conserved secondary structures from multiple alignments on a genomic scale. Since multiple alignments are often created using alignment programs that neglect the special conservation patterns of RNA secondary structures for computational efficiency, alignment failures can cause potential risks of overlooking conserved stem structures. We investigated the dependence of the accuracy of secondary structure prediction on the quality of alignments. We compared three algorithms that maximize the expected accuracy of secondary structures as well as other frequently used algorithms. We found that one of our algorithms, called McCaskill-MEA, was more robust against alignment failures than others. The McCaskill-MEA method first computes the base pairing probability matrices for all the sequences in the alignment and then obtains the base pairing probability matrix of the alignment by averaging over these matrices. The consensus secondary structure is predicted from this matrix such that the expected accuracy of the prediction is maximized. We show that the McCaskill-MEA method performs better than other methods, particularly when the alignment quality is low and when the alignment consists of many sequences. Our model has a parameter that controls the sensitivity and specificity of predictions. We discussed the uses of that parameter for multi-step screening procedures to search for conserved secondary structures and for assigning confidence values to the predicted base pairs. The C++ source code that implements the McCaskill-MEA algorithm and the test dataset used in this paper are available at http://www.ncrna.org/papers/McCaskillMEA/. Supplementary data are available at Bioinformatics online.
Evaluating the efficacy of a structure-derived amino acid substitution matrix in detecting protein homologs by BLAST and PSI-BLAST.

PubMed

Goonesekere, Nalin Cw

2009-01-01

The large numbers of protein sequences generated by whole genome sequencing projects require rapid and accurate methods of annotation. The detection of homology through computational sequence analysis is a powerful tool in determining the complex evolutionary and functional relationships that exist between proteins. Homology search algorithms employ amino acid substitution matrices to detect similarity between proteins sequences. The substitution matrices in common use today are constructed using sequences aligned without reference to protein structure. Here we present amino acid substitution matrices constructed from the alignment of a large number of protein domain structures from the structural classification of proteins (SCOP) database. We show that when incorporated into the homology search algorithms BLAST and PSI-blast, the structure-based substitution matrices enhance the efficacy of detecting remote homologs.
Real-time driver fatigue detection based on face alignment

NASA Astrophysics Data System (ADS)

Tao, Huanhuan; Zhang, Guiying; Zhao, Yong; Zhou, Yi

2017-07-01

The performance and robustness of fatigue detection largely decrease if the driver with glasses. To address this issue, this paper proposes a practical driver fatigue detection method based on face alignment at 3000 FPS algorithm. Firstly, the eye regions of the driver are localized by exploiting 6 landmarks surrounding each eye. Secondly, the HOG features of the extracted eye regions are calculated and put into SVM classifier to recognize the eye state. Finally, the value of PERCLOS is calculated to determine whether the driver is drowsy or not. An alarm will be generated if the eye is closed for a specified period of time. The accuracy and real-time on testing videos with different drivers demonstrate that the proposed algorithm is robust and obtain better accuracy for driver fatigue detection compared with some previous method.
Functional Alignment of Metabolic Networks.

PubMed

Mazza, Arnon; Wagner, Allon; Ruppin, Eytan; Sharan, Roded

2016-05-01

Network alignment has become a standard tool in comparative biology, allowing the inference of protein function, interaction, and orthology. However, current alignment techniques are based on topological properties of networks and do not take into account their functional implications. Here we propose, for the first time, an algorithm to align two metabolic networks by taking advantage of their coupled metabolic models. These models allow us to assess the functional implications of genes or reactions, captured by the metabolic fluxes that are altered following their deletion from the network. Such implications may spread far beyond the region of the network where the gene or reaction lies. We apply our algorithm to align metabolic networks from various organisms, ranging from bacteria to humans, showing that our alignment can reveal functional orthology relations that are missed by conventional topological alignments.
Combining peak- and chromatogram-based retention time alignment algorithms for multiple chromatography-mass spectrometry datasets.

PubMed

Hoffmann, Nils; Keck, Matthias; Neuweger, Heiko; Wilhelm, Mathias; Högy, Petra; Niehaus, Karsten; Stoye, Jens

2012-08-27

Modern analytical methods in biology and chemistry use separation techniques coupled to sensitive detectors, such as gas chromatography-mass spectrometry (GC-MS) and liquid chromatography-mass spectrometry (LC-MS). These hyphenated methods provide high-dimensional data. Comparing such data manually to find corresponding signals is a laborious task, as each experiment usually consists of thousands of individual scans, each containing hundreds or even thousands of distinct signals. In order to allow for successful identification of metabolites or proteins within such data, especially in the context of metabolomics and proteomics, an accurate alignment and matching of corresponding features between two or more experiments is required. Such a matching algorithm should capture fluctuations in the chromatographic system which lead to non-linear distortions on the time axis, as well as systematic changes in recorded intensities. Many different algorithms for the retention time alignment of GC-MS and LC-MS data have been proposed and published, but all of them focus either on aligning previously extracted peak features or on aligning and comparing the complete raw data containing all available features. In this paper we introduce two algorithms for retention time alignment of multiple GC-MS datasets: multiple alignment by bidirectional best hits peak assignment and cluster extension (BIPACE) and center-star multiple alignment by pairwise partitioned dynamic time warping (CeMAPP-DTW). We show how the similarity-based peak group matching method BIPACE may be used for multiple alignment calculation individually and how it can be used as a preprocessing step for the pairwise alignments performed by CeMAPP-DTW. We evaluate the algorithms individually and in combination on a previously published small GC-MS dataset studying the Leishmania parasite and on a larger GC-MS dataset studying grains of wheat (Triticum aestivum). We have shown that BIPACE achieves very high precision and recall and a very low number of false positive peak assignments on both evaluation datasets. CeMAPP-DTW finds a high number of true positives when executed on its own, but achieves even better results when BIPACE is used to constrain its search space. The source code of both algorithms is included in the OpenSource software framework Maltcms, which is available from http://maltcms.sf.net. The evaluation scripts of the present study are available from the same source.
Combining peak- and chromatogram-based retention time alignment algorithms for multiple chromatography-mass spectrometry datasets

PubMed Central

2012-01-01

Background Modern analytical methods in biology and chemistry use separation techniques coupled to sensitive detectors, such as gas chromatography-mass spectrometry (GC-MS) and liquid chromatography-mass spectrometry (LC-MS). These hyphenated methods provide high-dimensional data. Comparing such data manually to find corresponding signals is a laborious task, as each experiment usually consists of thousands of individual scans, each containing hundreds or even thousands of distinct signals. In order to allow for successful identification of metabolites or proteins within such data, especially in the context of metabolomics and proteomics, an accurate alignment and matching of corresponding features between two or more experiments is required. Such a matching algorithm should capture fluctuations in the chromatographic system which lead to non-linear distortions on the time axis, as well as systematic changes in recorded intensities. Many different algorithms for the retention time alignment of GC-MS and LC-MS data have been proposed and published, but all of them focus either on aligning previously extracted peak features or on aligning and comparing the complete raw data containing all available features. Results In this paper we introduce two algorithms for retention time alignment of multiple GC-MS datasets: multiple alignment by bidirectional best hits peak assignment and cluster extension (BIPACE) and center-star multiple alignment by pairwise partitioned dynamic time warping (CeMAPP-DTW). We show how the similarity-based peak group matching method BIPACE may be used for multiple alignment calculation individually and how it can be used as a preprocessing step for the pairwise alignments performed by CeMAPP-DTW. We evaluate the algorithms individually and in combination on a previously published small GC-MS dataset studying the Leishmania parasite and on a larger GC-MS dataset studying grains of wheat (Triticum aestivum). Conclusions We have shown that BIPACE achieves very high precision and recall and a very low number of false positive peak assignments on both evaluation datasets. CeMAPP-DTW finds a high number of true positives when executed on its own, but achieves even better results when BIPACE is used to constrain its search space. The source code of both algorithms is included in the OpenSource software framework Maltcms, which is available from http://maltcms.sf.net. The evaluation scripts of the present study are available from the same source. PMID:22920415
An Efficient Correction Algorithm for Eliminating Image Misalignment Effects on Co-Phasing Measurement Accuracy for Segmented Active Optics Systems

PubMed Central

Yue, Dan; Xu, Shuyan; Nie, Haitao; Wang, Zongyang

2016-01-01

The misalignment between recorded in-focus and out-of-focus images using the Phase Diversity (PD) algorithm leads to a dramatic decline in wavefront detection accuracy and image recovery quality for segmented active optics systems. This paper demonstrates the theoretical relationship between the image misalignment and tip-tilt terms in Zernike polynomials of the wavefront phase for the first time, and an efficient two-step alignment correction algorithm is proposed to eliminate these misalignment effects. This algorithm processes a spatial 2-D cross-correlation of the misaligned images, revising the offset to 1 or 2 pixels and narrowing the search range for alignment. Then, it eliminates the need for subpixel fine alignment to achieve adaptive correction by adding additional tip-tilt terms to the Optical Transfer Function (OTF) of the out-of-focus channel. The experimental results demonstrate the feasibility and validity of the proposed correction algorithm to improve the measurement accuracy during the co-phasing of segmented mirrors. With this alignment correction, the reconstructed wavefront is more accurate, and the recovered image is of higher quality. PMID:26934045

A method to align the coordinate system of accelerometers to the axes of a human body: The depitch algorithm.

PubMed

Gietzelt, Matthias; Schnabel, Stephan; Wolf, Klaus-Hendrik; Büsching, Felix; Song, Bianying; Rust, Stefan; Marschollek, Michael

2012-05-01

One of the key problems in accelerometry based gait analyses is that it may not be possible to attach an accelerometer to the lower trunk so that its axes are perfectly aligned to the axes of the subject. In this paper we will present an algorithm that was designed to virtually align the axes of the accelerometer to the axes of the subject during walking sections. This algorithm is based on a physically reasonable approach and built for measurements in unsupervised settings, where the test persons are applying the sensors by themselves. For evaluation purposes we conducted a study with 6 healthy subjects and measured their gait with a manually aligned and a skewed accelerometer attached to the subject's lower trunk. After applying the algorithm the intra-axis correlation of both sensors was on average 0.89±0.1 with a mean absolute error of 0.05g. We concluded that the algorithm was able to adjust the skewed sensor node virtually to the coordinate system of the subject. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
Design of multiple sequence alignment algorithms on parallel, distributed memory supercomputers.

PubMed

Church, Philip C; Goscinski, Andrzej; Holt, Kathryn; Inouye, Michael; Ghoting, Amol; Makarychev, Konstantin; Reumann, Matthias

2011-01-01

The challenge of comparing two or more genomes that have undergone recombination and substantial amounts of segmental loss and gain has recently been addressed for small numbers of genomes. However, datasets of hundreds of genomes are now common and their sizes will only increase in the future. Multiple sequence alignment of hundreds of genomes remains an intractable problem due to quadratic increases in compute time and memory footprint. To date, most alignment algorithms are designed for commodity clusters without parallelism. Hence, we propose the design of a multiple sequence alignment algorithm on massively parallel, distributed memory supercomputers to enable research into comparative genomics on large data sets. Following the methodology of the sequential progressiveMauve algorithm, we design data structures including sequences and sorted k-mer lists on the IBM Blue Gene/P supercomputer (BG/P). Preliminary results show that we can reduce the memory footprint so that we can potentially align over 250 bacterial genomes on a single BG/P compute node. We verify our results on a dataset of E.coli, Shigella and S.pneumoniae genomes. Our implementation returns results matching those of the original algorithm but in 1/2 the time and with 1/4 the memory footprint for scaffold building. In this study, we have laid the basis for multiple sequence alignment of large-scale datasets on a massively parallel, distributed memory supercomputer, thus enabling comparison of hundreds instead of a few genome sequences within reasonable time.
Evaluation of peak picking quality in LC-MS metabolomics data.

PubMed

Brodsky, Leonid; Moussaieff, Arieh; Shahaf, Nir; Aharoni, Asaph; Rogachev, Ilana

2010-11-15

The output of LC-MS metabolomics experiments consists of mass-peak intensities identified through a peak-picking/alignment procedure. Besides imperfections in biological samples and instrumentation, data accuracy is highly dependent on the applied algorithms and their parameters. Consequently, quality control (QC) is essential for further data analysis. Here, we present a QC approach that is based on discrepancies between replicate samples. First, the quantile normalization of per-sample log-signal distributions is applied to each group of biologically homogeneous samples. Next, the overall quality of each replicate group is characterized by the Z-transformed correlation coefficients between samples. This general QC allows a tuning of the procedure's parameters which minimizes the inter-replicate discrepancies in the generated output. Subsequently, an in-depth QC measure detects local neighborhoods on a template of aligned chromatograms that are enriched by divergences between intensity profiles of replicate samples. These neighborhoods are determined through a segmentation algorithm. The retention time (RT)-m/z positions of the neighborhoods with local divergences are indicative of either: incorrect alignment of chromatographic features, technical problems in the chromatograms, or to a true biological discrepancy between replicates for particular metabolites. We expect this method to aid in the accurate analysis of metabolomics data and in the development of new peak-picking/alignment procedures.
An efficient algorithm for pairwise local alignment of protein interaction networks

DOE PAGES

Chen, Wenbin; Schmidt, Matthew; Tian, Wenhong; ...

2015-04-01

Recently, researchers seeking to understand, modify, and create beneficial traits in organisms have looked for evolutionarily conserved patterns of protein interactions. Their conservation likely means that the proteins of these conserved functional modules are important to the trait's expression. In this paper, we formulate the problem of identifying these conserved patterns as a graph optimization problem, and develop a fast heuristic algorithm for this problem. We compare the performance of our network alignment algorithm to that of the MaWISh algorithm [Koyuturk M, Kim Y, Topkara U, Subramaniam S, Szpankowski W, Grama A, Pairwise alignment of protein interaction networks, J Computmore » Biol 13(2): 182-199, 2006.], which bases its search algorithm on a related decision problem formulation. We find that our algorithm discovers conserved modules with a larger number of proteins in an order of magnitude less time. In conclusion, the protein sets found by our algorithm correspond to known conserved functional modules at comparable precision and recall rates as those produced by the MaWISh algorithm.« less
Projected power iteration for network alignment

NASA Astrophysics Data System (ADS)

Onaran, Efe; Villar, Soledad

2017-08-01

The network alignment problem asks for the best correspondence between two given graphs, so that the largest possible number of edges are matched. This problem appears in many scientific problems (like the study of protein-protein interactions) and it is very closely related to the quadratic assignment problem which has graph isomorphism, traveling salesman and minimum bisection problems as particular cases. The graph matching problem is NP-hard in general. However, under some restrictive models for the graphs, algorithms can approximate the alignment efficiently. In that spirit the recent work by Feizi and collaborators introduce EigenAlign, a fast spectral method with convergence guarantees for Erd-s-Renyí graphs. In this work we propose the algorithm Projected Power Alignment, which is a projected power iteration version of EigenAlign. We numerically show it improves the recovery rates of EigenAlign and we describe the theory that may be used to provide performance guarantees for Projected Power Alignment.
AlignNemo: a local network alignment method to integrate homology and topology.

PubMed

Ciriello, Giovanni; Mina, Marco; Guzzi, Pietro H; Cannataro, Mario; Guerra, Concettina

2012-01-01

Local network alignment is an important component of the analysis of protein-protein interaction networks that may lead to the identification of evolutionary related complexes. We present AlignNemo, a new algorithm that, given the networks of two organisms, uncovers subnetworks of proteins that relate in biological function and topology of interactions. The discovered conserved subnetworks have a general topology and need not to correspond to specific interaction patterns, so that they more closely fit the models of functional complexes proposed in the literature. The algorithm is able to handle sparse interaction data with an expansion process that at each step explores the local topology of the networks beyond the proteins directly interacting with the current solution. To assess the performance of AlignNemo, we ran a series of benchmarks using statistical measures as well as biological knowledge. Based on reference datasets of protein complexes, AlignNemo shows better performance than other methods in terms of both precision and recall. We show our solutions to be biologically sound using the concept of semantic similarity applied to Gene Ontology vocabularies. The binaries of AlignNemo and supplementary details about the algorithms and the experiments are available at: sourceforge.net/p/alignnemo.
Sparse alignment for robust tensor learning.

PubMed

Lai, Zhihui; Wong, Wai Keung; Xu, Yong; Zhao, Cairong; Sun, Mingming

2014-10-01

Multilinear/tensor extensions of manifold learning based algorithms have been widely used in computer vision and pattern recognition. This paper first provides a systematic analysis of the multilinear extensions for the most popular methods by using alignment techniques, thereby obtaining a general tensor alignment framework. From this framework, it is easy to show that the manifold learning based tensor learning methods are intrinsically different from the alignment techniques. Based on the alignment framework, a robust tensor learning method called sparse tensor alignment (STA) is then proposed for unsupervised tensor feature extraction. Different from the existing tensor learning methods, L1- and L2-norms are introduced to enhance the robustness in the alignment step of the STA. The advantage of the proposed technique is that the difficulty in selecting the size of the local neighborhood can be avoided in the manifold learning based tensor feature extraction algorithms. Although STA is an unsupervised learning method, the sparsity encodes the discriminative information in the alignment step and provides the robustness of STA. Extensive experiments on the well-known image databases as well as action and hand gesture databases by encoding object images as tensors demonstrate that the proposed STA algorithm gives the most competitive performance when compared with the tensor-based unsupervised learning methods.
On the Impact of Widening Vector Registers on Sequence Alignment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Daily, Jeffrey A.; Kalyanaraman, Anantharaman; Krishnamoorthy, Sriram

2016-09-22

Vector extensions, such as SSE, have been part of the x86 since the 1990s, with applications in graphics, signal processing, and scientific applications. Although many algorithms and applications can naturally benefit from automatic vectorization techniques, there are still many that are difficult to vectorize due to their dependence on irregular data structures, dense branch operations, or data dependencies. Sequence alignment, one of the most widely used operations in bioinformatics workflows, has a computational footprint that features complex data dependencies. In this paper, we demonstrate that the trend of widening vector registers adversely affects the state-of-the-art sequence alignment algorithm based onmore » striped data layouts. We present a practically efficient SIMD implementation of a parallel scan based sequence alignment algorithm that can better exploit wider SIMD units. We conduct comprehensive workload and use case analyses to characterize the relative behavior of the striped and scan approaches and identify the best choice of algorithm based on input length and SIMD width.« less
Rapid Transfer Alignment of MEMS SINS Based on Adaptive Incremental Kalman Filter.

PubMed

Chu, Hairong; Sun, Tingting; Zhang, Baiqiang; Zhang, Hongwei; Chen, Yang

2017-01-14

In airborne MEMS SINS transfer alignment, the error of MEMS IMU is highly environment-dependent and the parameters of the system model are also uncertain, which may lead to large error and bad convergence of the Kalman filter. In order to solve this problem, an improved adaptive incremental Kalman filter (AIKF) algorithm is proposed. First, the model of SINS transfer alignment is defined based on the "Velocity and Attitude" matching method. Then the detailed algorithm progress of AIKF and its recurrence formulas are presented. The performance and calculation amount of AKF and AIKF are also compared. Finally, a simulation test is designed to verify the accuracy and the rapidity of the AIKF algorithm by comparing it with KF and AKF. The results show that the AIKF algorithm has better estimation accuracy and shorter convergence time, especially for the bias of the gyroscope and the accelerometer, which can meet the accuracy and rapidity requirement of transfer alignment.
Rapid Transfer Alignment of MEMS SINS Based on Adaptive Incremental Kalman Filter

PubMed Central

Chu, Hairong; Sun, Tingting; Zhang, Baiqiang; Zhang, Hongwei; Chen, Yang

2017-01-01

In airborne MEMS SINS transfer alignment, the error of MEMS IMU is highly environment-dependent and the parameters of the system model are also uncertain, which may lead to large error and bad convergence of the Kalman filter. In order to solve this problem, an improved adaptive incremental Kalman filter (AIKF) algorithm is proposed. First, the model of SINS transfer alignment is defined based on the “Velocity and Attitude” matching method. Then the detailed algorithm progress of AIKF and its recurrence formulas are presented. The performance and calculation amount of AKF and AIKF are also compared. Finally, a simulation test is designed to verify the accuracy and the rapidity of the AIKF algorithm by comparing it with KF and AKF. The results show that the AIKF algorithm has better estimation accuracy and shorter convergence time, especially for the bias of the gyroscope and the accelerometer, which can meet the accuracy and rapidity requirement of transfer alignment. PMID:28098829
RNA-TVcurve: a Web server for RNA secondary structure comparison based on a multi-scale similarity of its triple vector curve representation.

PubMed

Li, Ying; Shi, Xiaohu; Liang, Yanchun; Xie, Juan; Zhang, Yu; Ma, Qin

2017-01-21

RNAs have been found to carry diverse functionalities in nature. Inferring the similarity between two given RNAs is a fundamental step to understand and interpret their functional relationship. The majority of functional RNAs show conserved secondary structures, rather than sequence conservation. Those algorithms relying on sequence-based features usually have limitations in their prediction performance. Hence, integrating RNA structure features is very critical for RNA analysis. Existing algorithms mainly fall into two categories: alignment-based and alignment-free. The alignment-free algorithms of RNA comparison usually have lower time complexity than alignment-based algorithms. An alignment-free RNA comparison algorithm was proposed, in which novel numerical representations RNA-TVcurve (triple vector curve representation) of RNA sequence and corresponding secondary structure features are provided. Then a multi-scale similarity score of two given RNAs was designed based on wavelet decomposition of their numerical representation. In support of RNA mutation and phylogenetic analysis, a web server (RNA-TVcurve) was designed based on this alignment-free RNA comparison algorithm. It provides three functional modules: 1) visualization of numerical representation of RNA secondary structure; 2) detection of single-point mutation based on secondary structure; and 3) comparison of pairwise and multiple RNA secondary structures. The inputs of the web server require RNA primary sequences, while corresponding secondary structures are optional. For the primary sequences alone, the web server can compute the secondary structures using free energy minimization algorithm in terms of RNAfold tool from Vienna RNA package. RNA-TVcurve is the first integrated web server, based on an alignment-free method, to deliver a suite of RNA analysis functions, including visualization, mutation analysis and multiple RNAs structure comparison. The comparison results with two popular RNA comparison tools, RNApdist and RNAdistance, showcased that RNA-TVcurve can efficiently capture subtle relationships among RNAs for mutation detection and non-coding RNA classification. All the relevant results were shown in an intuitive graphical manner, and can be freely downloaded from this server. RNA-TVcurve, along with test examples and detailed documents, are available at: http://ml.jlu.edu.cn/tvcurve/ .
An Automatic Registration Algorithm for 3D Maxillofacial Model

NASA Astrophysics Data System (ADS)

Qiu, Luwen; Zhou, Zhongwei; Guo, Jixiang; Lv, Jiancheng

2016-09-01

3D image registration aims at aligning two 3D data sets in a common coordinate system, which has been widely used in computer vision, pattern recognition and computer assisted surgery. One challenging problem in 3D registration is that point-wise correspondences between two point sets are often unknown apriori. In this work, we develop an automatic algorithm for 3D maxillofacial models registration including facial surface model and skull model. Our proposed registration algorithm can achieve a good alignment result between partial and whole maxillofacial model in spite of ambiguous matching, which has a potential application in the oral and maxillofacial reparative and reconstructive surgery. The proposed algorithm includes three steps: (1) 3D-SIFT features extraction and FPFH descriptors construction; (2) feature matching using SAC-IA; (3) coarse rigid alignment and refinement by ICP. Experiments on facial surfaces and mandible skull models demonstrate the efficiency and robustness of our algorithm.
Automatic Data Distribution for CFD Applications on Structured Grids

NASA Technical Reports Server (NTRS)

Frumkin, Michael; Yan, Jerry

2000-01-01

Data distribution is an important step in implementation of any parallel algorithm. The data distribution determines data traffic, utilization of the interconnection network and affects the overall code efficiency. In recent years a number data distribution methods have been developed and used in real programs for improving data traffic. We use some of the methods for translating data dependence and affinity relations into data distribution directives. We describe an automatic data alignment and placement tool (ADAFT) which implements these methods and show it results for some CFD codes (NPB and ARC3D). Algorithms for program analysis and derivation of data distribution implemented in ADAFT are efficient three pass algorithms. Most algorithms have linear complexity with the exception of some graph algorithms having complexity O(n(sup 4)) in the worst case.
Automatic Data Distribution for CFD Applications on Structured Grids

NASA Technical Reports Server (NTRS)

Frumkin, Michael; Yan, Jerry

1999-01-01

Data distribution is an important step in implementation of any parallel algorithm. The data distribution determines data traffic, utilization of the interconnection network and affects the overall code efficiency. In recent years a number data distribution methods have been developed and used in real programs for improving data traffic. We use some of the methods for translating data dependence and affinity relations into data distribution directives. We describe an automatic data alignment and placement tool (ADAPT) which implements these methods and show it results for some CFD codes (NPB and ARC3D). Algorithms for program analysis and derivation of data distribution implemented in ADAPT are efficient three pass algorithms. Most algorithms have linear complexity with the exception of some graph algorithms having complexity O(n(sup 4)) in the worst case.
GRIM-Filter: Fast seed location filtering in DNA read mapping using processing-in-memory technologies.

PubMed

Kim, Jeremie S; Senol Cali, Damla; Xin, Hongyi; Lee, Donghyuk; Ghose, Saugata; Alser, Mohammed; Hassan, Hasan; Ergin, Oguz; Alkan, Can; Mutlu, Onur

2018-05-09

Seed location filtering is critical in DNA read mapping, a process where billions of DNA fragments (reads) sampled from a donor are mapped onto a reference genome to identify genomic variants of the donor. State-of-the-art read mappers 1) quickly generate possible mapping locations for seeds (i.e., smaller segments) within each read, 2) extract reference sequences at each of the mapping locations, and 3) check similarity between each read and its associated reference sequences with a computationally-expensive algorithm (i.e., sequence alignment) to determine the origin of the read. A seed location filter comes into play before alignment, discarding seed locations that alignment would deem a poor match. The ideal seed location filter would discard all poor match locations prior to alignment such that there is no wasted computation on unnecessary alignments. We propose a novel seed location filtering algorithm, GRIM-Filter, optimized to exploit 3D-stacked memory systems that integrate computation within a logic layer stacked under memory layers, to perform processing-in-memory (PIM). GRIM-Filter quickly filters seed locations by 1) introducing a new representation of coarse-grained segments of the reference genome, and 2) using massively-parallel in-memory operations to identify read presence within each coarse-grained segment. Our evaluations show that for a sequence alignment error tolerance of 0.05, GRIM-Filter 1) reduces the false negative rate of filtering by 5.59x-6.41x, and 2) provides an end-to-end read mapper speedup of 1.81x-3.65x, compared to a state-of-the-art read mapper employing the best previous seed location filtering algorithm. GRIM-Filter exploits 3D-stacked memory, which enables the efficient use of processing-in-memory, to overcome the memory bandwidth bottleneck in seed location filtering. We show that GRIM-Filter significantly improves the performance of a state-of-the-art read mapper. GRIM-Filter is a universal seed location filter that can be applied to any read mapper. We hope that our results provide inspiration for new works to design other bioinformatics algorithms that take advantage of emerging technologies and new processing paradigms, such as processing-in-memory using 3D-stacked memory devices.
Constructing Aligned Assessments Using Automated Test Construction

ERIC Educational Resources Information Center

Porter, Andrew; Polikoff, Morgan S.; Barghaus, Katherine M.; Yang, Rui

2013-01-01

We describe an innovative automated test construction algorithm for building aligned achievement tests. By incorporating the algorithm into the test construction process, along with other test construction procedures for building reliable and unbiased assessments, the result is much more valid tests than result from current test construction…
FPFH-based graph matching for 3D point cloud registration

NASA Astrophysics Data System (ADS)

Zhao, Jiapeng; Li, Chen; Tian, Lihua; Zhu, Jihua

2018-04-01

Correspondence detection is a vital step in point cloud registration and it can help getting a reliable initial alignment. In this paper, we put forward an advanced point feature-based graph matching algorithm to solve the initial alignment problem of rigid 3D point cloud registration with partial overlap. Specifically, Fast Point Feature Histograms are used to determine the initial possible correspondences firstly. Next, a new objective function is provided to make the graph matching more suitable for partially overlapping point cloud. The objective function is optimized by the simulated annealing algorithm for final group of correct correspondences. Finally, we present a novel set partitioning method which can transform the NP-hard optimization problem into a O(n3)-solvable one. Experiments on the Stanford and UWA public data sets indicates that our method can obtain better result in terms of both accuracy and time cost compared with other point cloud registration methods.
Quadrupole Alignment and Trajectory Correction for Future Linear Colliders: SLC Tests of a Dispersion-Free Steering Algorithm

DOE Office of Scientific and Technical Information (OSTI.GOV)

Assmann, R

2004-06-08

The feasibility of future linear colliders depends on achieving very tight alignment and steering tolerances. All proposals (NLC, JLC, CLIC, TESLA and S-BAND) currently require a total emittance growth in the main linac of less than 30-100% [1]. This should be compared with a 100% emittance growth in the much smaller SLC linac [2]. Major advances in alignment and beam steering techniques beyond those used in the SLC are necessary for the next generation of linear colliders. In this paper, we present an experimental study of quadrupole alignment with a dispersion-free steering algorithm. A closely related method (wakefield-free steering) takesmore » into account wakefield effects [3]. However, this method can not be studied at the SLC. The requirements for future linear colliders lead to new and unconventional ideas about alignment and beam steering. For example, no dipole correctors are foreseen for the standard trajectory correction in the NLC [4]; beam steering will be done by moving the quadrupole positions with magnet movers. This illustrates the close symbiosis between alignment, beam steering and beam dynamics that will emerge. It is no longer possible to consider the accelerator alignment as static with only a few surveys and realignments per year. The alignment in future linear colliders will be a dynamic process in which the whole linac, with thousands of beam-line elements, is aligned in a few hours or minutes, while the required accuracy of about 5 pm for the NLC quadrupole alignment [4] is a factor of 20 higher than in existing accelerators. The major task in alignment and steering is the accurate determination of the optimum beam-line position. Ideally one would like all elements to be aligned along a straight line. However, this is not practical. Instead a ''smooth curve'' is acceptable as long as its wavelength is much longer than the betatron wavelength of the accelerated beam. Conventional alignment methods are limited in accuracy by errors in the survey and the fiducials. Beam-based alignment methods ideally only depend upon the BPM resolution and generally provide much better precision. Many of those techniques are described in other contributions to this workshop. In this paper we describe our experiences with a dispersion-free steering algorithm for linacs. This algorithm was first suggested by Raubenheimer and Ruth in 1990 [5]. It h as been studied in simulations for NLC [5], TESLA [6], the S-BAND proposal [7] and CLIC [8]. The dispersion-free steering technique can be applied to the whole linac at once and returns the alignment (or trajectory) that minimizes the dispersive emittance growth of the beam. Thus it allows an extremely fast alignment of the beam-line. As we will show dispersion-free steering is only sensitive to quadrupole misalignments. Wakefield-free steering [3] as mentioned before is a closely related technique that minimizes the emittance growth caused by both dispersion and wakefields. Due to hardware limitations (i.e. insufficient relative range of power supplies) we could not study this method experimentally in the SLC. However, its systematics are very similar to those of dispersion-free steering. The studies of dispersion-free steering which are presented made extensive use of the unique potential of the SLC as the only operating linear collider. We used it to study the performance and problems of advanced beam-based optimization tools in a real beam-line environment and on a large scale. We should mention that the SLC has utilized beam-based alignment for years [9], using the difference of electron and positron trajectories. This method, however, cannot be used in future linear colliders. The goal of our work is to demonstrate the performance of advanced beam-based alignment techniques in linear colliders and to anticipate possible reality-related problems. Those can then be solved in the design state for the next generation of linear colliders.« less
MultiSETTER: web server for multiple RNA structure comparison.

PubMed

Čech, Petr; Hoksza, David; Svozil, Daniel

2015-08-12

Understanding the architecture and function of RNA molecules requires methods for comparing and analyzing their tertiary and quaternary structures. While structural superposition of short RNAs is achievable in a reasonable time, large structures represent much bigger challenge. Therefore, we have developed a fast and accurate algorithm for RNA pairwise structure superposition called SETTER and implemented it in the SETTER web server. However, though biological relationships can be inferred by a pairwise structure alignment, key features preserved by evolution can be identified only from a multiple structure alignment. Thus, we extended the SETTER algorithm to the alignment of multiple RNA structures and developed the MultiSETTER algorithm. In this paper, we present the updated version of the SETTER web server that implements a user friendly interface to the MultiSETTER algorithm. The server accepts RNA structures either as the list of PDB IDs or as user-defined PDB files. After the superposition is computed, structures are visualized in 3D and several reports and statistics are generated. To the best of our knowledge, the MultiSETTER web server is the first publicly available tool for a multiple RNA structure alignment. The MultiSETTER server offers the visual inspection of an alignment in 3D space which may reveal structural and functional relationships not captured by other multiple alignment methods based either on a sequence or on secondary structure motifs.
Iterative pass optimization of sequence data

NASA Technical Reports Server (NTRS)

Wheeler, Ward C.

2003-01-01

The problem of determining the minimum-cost hypothetical ancestral sequences for a given cladogram is known to be NP-complete. This "tree alignment" problem has motivated the considerable effort placed in multiple sequence alignment procedures. Wheeler in 1996 proposed a heuristic method, direct optimization, to calculate cladogram costs without the intervention of multiple sequence alignment. This method, though more efficient in time and more effective in cladogram length than many alignment-based procedures, greedily optimizes nodes based on descendent information only. In their proposal of an exact multiple alignment solution, Sankoff et al. in 1976 described a heuristic procedure--the iterative improvement method--to create alignments at internal nodes by solving a series of median problems. The combination of a three-sequence direct optimization with iterative improvement and a branch-length-based cladogram cost procedure, provides an algorithm that frequently results in superior (i.e., lower) cladogram costs. This iterative pass optimization is both computation and memory intensive, but economies can be made to reduce this burden. An example in arthropod systematics is discussed. c2003 The Willi Hennig Society. Published by Elsevier Science (USA). All rights reserved.

icoshift: A versatile tool for the rapid alignment of 1D NMR spectra

NASA Astrophysics Data System (ADS)

Savorani, F.; Tomasi, G.; Engelsen, S. B.

2010-02-01

The increasing scientific and industrial interest towards metabonomics takes advantage from the high qualitative and quantitative information level of nuclear magnetic resonance (NMR) spectroscopy. However, several chemical and physical factors can affect the absolute and the relative position of an NMR signal and it is not always possible or desirable to eliminate these effects a priori. To remove misalignment of NMR signals a posteriori, several algorithms have been proposed in the literature. The icoshift program presented here is an open source and highly efficient program designed for solving signal alignment problems in metabonomic NMR data analysis. The icoshift algorithm is based on correlation shifting of spectral intervals and employs an FFT engine that aligns all spectra simultaneously. The algorithm is demonstrated to be faster than similar methods found in the literature making full-resolution alignment of large datasets feasible and thus avoiding down-sampling steps such as binning. The algorithm uses missing values as a filling alternative in order to avoid spectral artifacts at the segment boundaries. The algorithm is made open source and the Matlab code including documentation can be downloaded from www.models.life.ku.dk.
Genetic Algorithm Phase Retrieval for the Systematic Image-Based Optical Alignment Testbed

NASA Technical Reports Server (NTRS)

Rakoczy, John; Steincamp, James; Taylor, Jaime

2003-01-01

A reduced surrogate, one point crossover genetic algorithm with random rank-based selection was used successfully to estimate the multiple phases of a segmented optical system modeled on the seven-mirror Systematic Image-Based Optical Alignment testbed located at NASA's Marshall Space Flight Center.
Triangular Alignment (TAME). A Tensor-based Approach for Higher-order Network Alignment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mohammadi, Shahin; Gleich, David F.; Kolda, Tamara G.

2015-11-01

Network alignment is an important tool with extensive applications in comparative interactomics. Traditional approaches aim to simultaneously maximize the number of conserved edges and the underlying similarity of aligned entities. We propose a novel formulation of the network alignment problem that extends topological similarity to higher-order structures and provide a new objective function that maximizes the number of aligned substructures. This objective function corresponds to an integer programming problem, which is NP-hard. Consequently, we approximate this objective function as a surrogate function whose maximization results in a tensor eigenvalue problem. Based on this formulation, we present an algorithm called Triangularmore » AlignMEnt (TAME), which attempts to maximize the number of aligned triangles across networks. We focus on alignment of triangles because of their enrichment in complex networks; however, our formulation and resulting algorithms can be applied to general motifs. Using a case study on the NAPABench dataset, we show that TAME is capable of producing alignments with up to 99% accuracy in terms of aligned nodes. We further evaluate our method by aligning yeast and human interactomes. Our results indicate that TAME outperforms the state-of-art alignment methods both in terms of biological and topological quality of the alignments.« less
Accelerating large-scale protein structure alignments with graphics processing units

PubMed Central

2012-01-01

Background Large-scale protein structure alignment, an indispensable tool to structural bioinformatics, poses a tremendous challenge on computational resources. To ensure structure alignment accuracy and efficiency, efforts have been made to parallelize traditional alignment algorithms in grid environments. However, these solutions are costly and of limited accessibility. Others trade alignment quality for speedup by using high-level characteristics of structure fragments for structure comparisons. Findings We present ppsAlign, a parallel protein structure Alignment framework designed and optimized to exploit the parallelism of Graphics Processing Units (GPUs). As a general-purpose GPU platform, ppsAlign could take many concurrent methods, such as TM-align and Fr-TM-align, into the parallelized algorithm design. We evaluated ppsAlign on an NVIDIA Tesla C2050 GPU card, and compared it with existing software solutions running on an AMD dual-core CPU. We observed a 36-fold speedup over TM-align, a 65-fold speedup over Fr-TM-align, and a 40-fold speedup over MAMMOTH. Conclusions ppsAlign is a high-performance protein structure alignment tool designed to tackle the computational complexity issues from protein structural data. The solution presented in this paper allows large-scale structure comparisons to be performed using massive parallel computing power of GPU. PMID:22357132
Some aspects of SR beamline alignment

NASA Astrophysics Data System (ADS)

Gaponov, Yu. A.; Cerenius, Y.; Nygaard, J.; Ursby, T.; Larsson, K.

2011-09-01

Based on the Synchrotron Radiation (SR) beamline optical element-by-element alignment with analysis of the alignment results an optimized beamline alignment algorithm has been designed and developed. The alignment procedures have been designed and developed for the MAX-lab I911-4 fixed energy beamline. It has been shown that the intermediate information received during the monochromator alignment stage can be used for the correction of both monochromator and mirror without the next stages of alignment of mirror, slits, sample holder, etc. Such an optimization of the beamline alignment procedures decreases the time necessary for the alignment and becomes useful and helpful in the case of any instability of the beamline optical elements, storage ring electron orbit or the wiggler insertion device, which could result in the instability of angular and positional parameters of the SR beam. A general purpose software package for manual, semi-automatic and automatic SR beamline alignment has been designed and developed using the developed algorithm. The TANGO control system is used as the middle-ware between the stand-alone beamline control applications BLTools, BPMonitor and the beamline equipment.
A new algorithm for distorted fingerprints matching based on normalized fuzzy similarity measure.

PubMed

Chen, Xinjian; Tian, Jie; Yang, Xin

2006-03-01

Coping with nonlinear distortions in fingerprint matching is a challenging task. This paper proposes a novel algorithm, normalized fuzzy similarity measure (NFSM), to deal with the nonlinear distortions. The proposed algorithm has two main steps. First, the template and input fingerprints were aligned. In this process, the local topological structure matching was introduced to improve the robustness of global alignment. Second, the method NFSM was introduced to compute the similarity between the template and input fingerprints. The proposed algorithm was evaluated on fingerprints databases of FVC2004. Experimental results confirm that NFSM is a reliable and effective algorithm for fingerprint matching with nonliner distortions. The algorithm gives considerably higher matching scores compared to conventional matching algorithms for the deformed fingerprints.
Attitude algorithm and initial alignment method for SINS applied in short-range aircraft

NASA Astrophysics Data System (ADS)

Zhang, Rong-Hui; He, Zhao-Cheng; You, Feng; Chen, Bo

2017-07-01

This paper presents an attitude solution algorithm based on the Micro-Electro-Mechanical System and quaternion method. We completed the numerical calculation and engineering practice by adopting fourth-order Runge-Kutta algorithm in the digital signal processor. The state space mathematical model of initial alignment in static base was established, and the initial alignment method based on Kalman filter was proposed. Based on the hardware in the loop simulation platform, the short-range flight simulation test and the actual flight test were carried out. The results show that the error of pitch, yaw and roll angle is fast convergent, and the fitting rate between flight simulation and flight test is more than 85%.
Quasiparticle Level Alignment for Photocatalytic Interfaces.

PubMed

Migani, Annapaoala; Mowbray, Duncan J; Zhao, Jin; Petek, Hrvoje; Rubio, Angel

2014-05-13

Electronic level alignment at the interface between an adsorbed molecular layer and a semiconducting substrate determines the activity and efficiency of many photocatalytic materials. Standard density functional theory (DFT)-based methods have proven unable to provide a quantitative description of this level alignment. This requires a proper treatment of the anisotropic screening, necessitating the use of quasiparticle (QP) techniques. However, the computational complexity of QP algorithms has meant a quantitative description of interfacial levels has remained elusive. We provide a systematic study of a prototypical interface, bare and methanol-covered rutile TiO2(110) surfaces, to determine the type of many-body theory required to obtain an accurate description of the level alignment. This is accomplished via a direct comparison with metastable impact electron spectroscopy (MIES), ultraviolet photoelectron spectroscopy (UPS), and two-photon photoemission (2PP) spectroscopy. We consider GGA DFT, hybrid DFT, and G0W0, scQPGW1, scQPGW0, and scQPGW QP calculations. Our results demonstrate that G0W0, or our recently introduced scQPGW1 approach, are required to obtain the correct alignment of both the highest occupied and lowest unoccupied interfacial molecular levels (HOMO/LUMO). These calculations set a new standard in the interpretation of electronic structure probe experiments of complex organic molecule/semiconductor interfaces.
Comparing Four Age Model Techniques using Nine Sediment Cores from the Iberian Margin

NASA Astrophysics Data System (ADS)

Lisiecki, L. E.; Jones, A. M.; Lawrence, C.

2017-12-01

Interpretations of paleoclimate records from ocean sediment cores rely on age models, which provide estimates of age as a function of core depth. Here we compare four methods used to generate age models for sediment cores for the past 140 kyr. The first method is based on radiocarbon dating using the Bayesian statistical software, Bacon [Blaauw and Christen, 2011]. The second method aligns benthic δ18O to a target core using the probabilistic alignment algorithm, HMM-Match, which also generates age uncertainty estimates [Lin et al., 2014]. The third and fourth methods are planktonic δ18O and sea surface temperature (SST) alignments to the same target core, using the alignment algorithm Match [Lisiecki and Lisiecki, 2002]. Unlike HMM-Match, Match requires parameter tuning and does not produce uncertainty estimates. The results of these four age model techniques are compared for nine high-resolution cores from the Iberian margin. The root mean square error between the individual age model results and each core's average estimated age is 1.4 kyr. Additionally, HMM-Match and Bacon age estimates agree to within uncertainty and have similar 95% confidence widths of 1-2 kyr for the highest resolution records. In one core, the planktonic and SST alignments did not fall within the 95% confidence intervals from HMM-Match. For this core, the surface proxy alignments likely produce more reliable results due to millennial-scale SST variability and the presence of several gaps in the benthic δ18O data. Similar studies of other oceanographic regions are needed to determine the spatial extents over which these climate proxies may be stratigraphically correlated.
A portable foot-parameter-extracting system

NASA Astrophysics Data System (ADS)

Zhang, MingKai; Liang, Jin; Li, Wenpan; Liu, Shifan

2016-03-01

In order to solve the problem of automatic foot measurement in garment customization, a new automatic footparameter- extracting system based on stereo vision, photogrammetry and heterodyne multiple frequency phase shift technology is proposed and implemented. The key technologies applied in the system are studied, including calibration of projector, alignment of point clouds, and foot measurement. Firstly, a new projector calibration algorithm based on plane model has been put forward to get the initial calibration parameters and a feature point detection scheme of calibration board image is developed. Then, an almost perfect match of two clouds is achieved by performing a first alignment using the Sampled Consensus - Initial Alignment algorithm (SAC-IA) and refining the alignment using the Iterative Closest Point algorithm (ICP). Finally, the approaches used for foot-parameterextracting and the system scheme are presented in detail. Experimental results show that the RMS error of the calibration result is 0.03 pixel and the foot parameter extracting experiment shows the feasibility of the extracting algorithm. Compared with the traditional measurement method, the system can be more portable, accurate and robust.
Heat-Treatment-Responsive Proteins in Different Developmental Stages of Tomato Pollen Detected by Targeted Mass Accuracy Precursor Alignment (tMAPA).

PubMed

Chaturvedi, Palak; Doerfler, Hannes; Jegadeesan, Sridharan; Ghatak, Arindam; Pressman, Etan; Castillejo, Maria Angeles; Wienkoop, Stefanie; Egelhofer, Volker; Firon, Nurit; Weckwerth, Wolfram

2015-11-06

Recently, we have developed a quantitative shotgun proteomics strategy called mass accuracy precursor alignment (MAPA). The MAPA algorithm uses high mass accuracy to bin mass-to-charge (m/z) ratios of precursor ions from LC-MS analyses, determines their intensities, and extracts a quantitative sample versus m/z ratio data alignment matrix from a multitude of samples. Here, we introduce a novel feature of this algorithm that allows the extraction and alignment of proteotypic peptide precursor ions or any other target peptide from complex shotgun proteomics data for accurate quantification of unique proteins. This strategy circumvents the problem of confusing the quantification of proteins due to indistinguishable protein isoforms by a typical shotgun proteomics approach. We applied this strategy to a comparison of control and heat-treated tomato pollen grains at two developmental stages, post-meiotic and mature. Pollen is a temperature-sensitive tissue involved in the reproductive cycle of plants and plays a major role in fruit setting and yield. By LC-MS-based shotgun proteomics, we identified more than 2000 proteins in total for all different tissues. By applying the targeted MAPA data-processing strategy, 51 unique proteins were identified as heat-treatment-responsive protein candidates. The potential function of the identified candidates in a specific developmental stage is discussed.
An extensive assessment of network alignment algorithms for comparison of brain connectomes.

PubMed

Milano, Marianna; Guzzi, Pietro Hiram; Tymofieva, Olga; Xu, Duan; Hess, Christofer; Veltri, Pierangelo; Cannataro, Mario

2017-06-06

Recently the study of the complex system of connections in neural systems, i.e. the connectome, has gained a central role in neurosciences. The modeling and analysis of connectomes are therefore a growing area. Here we focus on the representation of connectomes by using graph theory formalisms. Macroscopic human brain connectomes are usually derived from neuroimages; the analyzed brains are co-registered in the image domain and brought to a common anatomical space. An atlas is then applied in order to define anatomically meaningful regions that will serve as the nodes of the network - this process is referred to as parcellation. The atlas-based parcellations present some known limitations in cases of early brain development and abnormal anatomy. Consequently, it has been recently proposed to perform atlas-free random brain parcellation into nodes and align brains in the network space instead of the anatomical image space, as a way to deal with the unknown correspondences of the parcels. Such process requires modeling of the brain using graph theory and the subsequent comparison of the structure of graphs. The latter step may be modeled as a network alignment (NA) problem. In this work, we first define the problem formally, then we test six existing state of the art of network aligners on diffusion MRI-derived brain networks. We compare the performances of algorithms by assessing six topological measures. We also evaluated the robustness of algorithms to alterations of the dataset. The results confirm that NA algorithms may be applied in cases of atlas-free parcellation for a fully network-driven comparison of connectomes. The analysis shows MAGNA++ is the best global alignment algorithm. The paper presented a new analysis methodology that uses network alignment for validating atlas-free parcellation brain connectomes. The methodology has been experimented on several brain datasets.
Navigation strategy and filter design for solar electric missions

NASA Technical Reports Server (NTRS)

Tapley, B. D.; Hagar, H., Jr.

1972-01-01

Methods which have been proposed to improve the navigation accuracy for the low-thrust space vehicle include modifications to the standard Sequential- and Batch-type orbit determination procedures and the use of inertial measuring units (IMU) which measures directly the acceleration applied to the vehicle. The navigation accuracy obtained using one of the more promising modifications to the orbit determination procedures is compared with a combined IMU-Standard. The unknown accelerations are approximated as both first-order and second-order Gauss-Markov processes. The comparison is based on numerical results obtained in a study of the navigation requirements of a numerically simulated 152-day low-thrust mission to the asteroid Eros. The results obtained in the simulation indicate that the DMC algorithm will yield a significant improvement over the navigation accuracies achieved with previous estimation algorithms. In addition, the DMC algorithms will yield better navigation accuracies than the IMU-Standard Orbit Determination algorithm, except for extremely precise IMU measurements, i.e., gyroplatform alignment .01 deg and accelerometer signal-to-noise ratio .07. Unless these accuracies are achieved, the IMU navigation accuracies are generally unacceptable.
Improved alignment evaluation and optimization : final report.

DOT National Transportation Integrated Search

2007-09-11

This report outlines the development of an enhanced highway alignment evaluation and optimization : model. A GIS-based software tool is prepared for alignment optimization that uses genetic algorithms for : optimal search. The software is capable of ...
Can a semi-automated surface matching and principal axis-based algorithm accurately quantify femoral shaft fracture alignment in six degrees of freedom?

PubMed

Crookshank, Meghan C; Beek, Maarten; Singh, Devin; Schemitsch, Emil H; Whyne, Cari M

2013-07-01

Accurate alignment of femoral shaft fractures treated with intramedullary nailing remains a challenge for orthopaedic surgeons. The aim of this study is to develop and validate a cone-beam CT-based, semi-automated algorithm to quantify the malalignment in six degrees of freedom (6DOF) using a surface matching and principal axes-based approach. Complex comminuted diaphyseal fractures were created in nine cadaveric femora and cone-beam CT images were acquired (27 cases total). Scans were cropped and segmented using intensity-based thresholding, producing superior, inferior and comminution volumes. Cylinders were fit to estimate the long axes of the superior and inferior fragments. The angle and distance between the two cylindrical axes were calculated to determine flexion/extension and varus/valgus angulation and medial/lateral and anterior/posterior translations, respectively. Both surfaces were unwrapped about the cylindrical axes. Three methods of matching the unwrapped surface for determination of periaxial rotation were compared based on minimizing the distance between features. The calculated corrections were compared to the input malalignment conditions. All 6DOF were calculated to within current clinical tolerances for all but two cases. This algorithm yielded accurate quantification of malalignment of femoral shaft fractures for fracture gaps up to 60 mm, based on a single CBCT image of the fractured limb. Copyright © 2012 IPEM. Published by Elsevier Ltd. All rights reserved.
BiPACE 2D--graph-based multiple alignment for comprehensive 2D gas chromatography-mass spectrometry.

PubMed

Hoffmann, Nils; Wilhelm, Mathias; Doebbe, Anja; Niehaus, Karsten; Stoye, Jens

2014-04-01

Comprehensive 2D gas chromatography-mass spectrometry is an established method for the analysis of complex mixtures in analytical chemistry and metabolomics. It produces large amounts of data that require semiautomatic, but preferably automatic handling. This involves the location of significant signals (peaks) and their matching and alignment across different measurements. To date, there exist only a few openly available algorithms for the retention time alignment of peaks originating from such experiments that scale well with increasing sample and peak numbers, while providing reliable alignment results. We describe BiPACE 2D, an automated algorithm for retention time alignment of peaks from 2D gas chromatography-mass spectrometry experiments and evaluate it on three previously published datasets against the mSPA, SWPA and Guineu algorithms. We also provide a fourth dataset from an experiment studying the H2 production of two different strains of Chlamydomonas reinhardtii that is available from the MetaboLights database together with the experimental protocol, peak-detection results and manually curated multiple peak alignment for future comparability with newly developed algorithms. BiPACE 2D is contained in the freely available Maltcms framework, version 1.3, hosted at http://maltcms.sf.net, under the terms of the L-GPL v3 or Eclipse Open Source licenses. The software used for the evaluation along with the underlying datasets is available at the same location. The C.reinhardtii dataset is freely available at http://www.ebi.ac.uk/metabolights/MTBLS37.
QuickProbs 2: Towards rapid construction of high-quality alignments of large protein families

PubMed Central

Gudyś, Adam; Deorowicz, Sebastian

2017-01-01

The ever-increasing size of sequence databases caused by the development of high throughput sequencing, poses to multiple alignment algorithms one of the greatest challenges yet. As we show, well-established techniques employed for increasing alignment quality, i.e., refinement and consistency, are ineffective when large protein families are investigated. We present QuickProbs 2, an algorithm for multiple sequence alignment. Based on probabilistic models, equipped with novel column-oriented refinement and selective consistency, it offers outstanding accuracy. When analysing hundreds of sequences, Quick-Probs 2 is noticeably better than ClustalΩ and MAFFT, the previous leaders for processing numerous protein families. In the case of smaller sets, for which consistency-based methods are the best performing, QuickProbs 2 is also superior to the competitors. Due to low computational requirements of selective consistency and utilization of massively parallel architectures, presented algorithm has similar execution times to ClustalΩ, and is orders of magnitude faster than full consistency approaches, like MSAProbs or PicXAA. All these make QuickProbs 2 an excellent tool for aligning families ranging from few, to hundreds of proteins. PMID:28139687
Modular and configurable optimal sequence alignment software: Cola.

PubMed

Zamani, Neda; Sundström, Görel; Höppner, Marc P; Grabherr, Manfred G

2014-01-01

The fundamental challenge in optimally aligning homologous sequences is to define a scoring scheme that best reflects the underlying biological processes. Maximising the overall number of matches in the alignment does not always reflect the patterns by which nucleotides mutate. Efficiently implemented algorithms that can be parameterised to accommodate more complex non-linear scoring schemes are thus desirable. We present Cola, alignment software that implements different optimal alignment algorithms, also allowing for scoring contiguous matches of nucleotides in a nonlinear manner. The latter places more emphasis on short, highly conserved motifs, and less on the surrounding nucleotides, which can be more diverged. To illustrate the differences, we report results from aligning 14,100 sequences from 3' untranslated regions of human genes to 25 of their mammalian counterparts, where we found that a nonlinear scoring scheme is more consistent than a linear scheme in detecting short, conserved motifs. Cola is freely available under LPGL from https://github.com/nedaz/cola.
Application of Improved 5th-Cubature Kalman Filter in Initial Strapdown Inertial Navigation System Alignment for Large Misalignment Angles.

PubMed

Wang, Wei; Chen, Xiyuan

2018-02-23

In view of the fact the accuracy of the third-degree Cubature Kalman Filter (CKF) used for initial alignment under large misalignment angle conditions is insufficient, an improved fifth-degree CKF algorithm is proposed in this paper. In order to make full use of the innovation on filtering, the innovation covariance matrix is calculated recursively by an innovative sequence with an exponent fading factor. Then a new adaptive error covariance matrix scaling algorithm is proposed. The Singular Value Decomposition (SVD) method is used for improving the numerical stability of the fifth-degree CKF in this paper. In order to avoid the overshoot caused by excessive scaling of error covariance matrix during the convergence stage, the scaling scheme is terminated when the gradient of azimuth reaches the maximum. The experimental results show that the improved algorithm has better alignment accuracy with large misalignment angles than the traditional algorithm.
Clustalnet: the joining of Clustal and CORBA.

PubMed

Campagne, F

2000-07-01

Performing sequence alignment operations from a different program than the original sequence alignment code, and/or through a network connection, is often required. Interactive alignment editors and large-scale biological data analysis are common examples where such a flexibility is important. Interoperability between the alignment engine and the client should be obtained regardless of the architectures and programming languages of the server and client. Clustalnet, a Clustal alignment CORBA server is described, which was developed on the basis of Clustalw. This server brings the robustness of the algorithms and implementations of Clustal to a new level of reuse. A Clustalnet server object can be accessed from a program, transparently through the network. We present interfaces to perform the alignment operations and to control these operations via immutable contexts. The interfaces that select the contexts do not depend on the nature of the operation to be performed, making the design modular. The IDL interfaces presented here are not specific to Clustal and can be implemented on top of different sequence alignment algorithm implementations.

SPARSE: quadratic time simultaneous alignment and folding of RNAs without sequence-based heuristics.

PubMed

Will, Sebastian; Otto, Christina; Miladi, Milad; Möhl, Mathias; Backofen, Rolf

2015-08-01

RNA-Seq experiments have revealed a multitude of novel ncRNAs. The gold standard for their analysis based on simultaneous alignment and folding suffers from extreme time complexity of [Formula: see text]. Subsequently, numerous faster 'Sankoff-style' approaches have been suggested. Commonly, the performance of such methods relies on sequence-based heuristics that restrict the search space to optimal or near-optimal sequence alignments; however, the accuracy of sequence-based methods breaks down for RNAs with sequence identities below 60%. Alignment approaches like LocARNA that do not require sequence-based heuristics, have been limited to high complexity ([Formula: see text] quartic time). Breaking this barrier, we introduce the novel Sankoff-style algorithm 'sparsified prediction and alignment of RNAs based on their structure ensembles (SPARSE)', which runs in quadratic time without sequence-based heuristics. To achieve this low complexity, on par with sequence alignment algorithms, SPARSE features strong sparsification based on structural properties of the RNA ensembles. Following PMcomp, SPARSE gains further speed-up from lightweight energy computation. Although all existing lightweight Sankoff-style methods restrict Sankoff's original model by disallowing loop deletions and insertions, SPARSE transfers the Sankoff algorithm to the lightweight energy model completely for the first time. Compared with LocARNA, SPARSE achieves similar alignment and better folding quality in significantly less time (speedup: 3.7). At similar run-time, it aligns low sequence identity instances substantially more accurate than RAF, which uses sequence-based heuristics. © The Author 2015. Published by Oxford University Press.
Automatic laser beam alignment using blob detection for an environment monitoring spectroscopy

NASA Astrophysics Data System (ADS)

Khidir, Jarjees; Chen, Youhua; Anderson, Gary

2013-05-01

This paper describes a fully automated system to align an infra-red laser beam with a small retro-reflector over a wide range of distances. The component development and test were especially used for an open-path spectrometer gas detection system. Using blob detection under OpenCV library, an automatic alignment algorithm was designed to achieve fast and accurate target detection in a complex background environment. Test results are presented to show that the proposed algorithm has been successfully applied to various target distances and environment conditions.
HubAlign: an accurate and efficient method for global alignment of protein-protein interaction networks.

PubMed

Hashemifar, Somaye; Xu, Jinbo

2014-09-01

High-throughput experimental techniques have produced a large amount of protein-protein interaction (PPI) data. The study of PPI networks, such as comparative analysis, shall benefit the understanding of life process and diseases at the molecular level. One way of comparative analysis is to align PPI networks to identify conserved or species-specific subnetwork motifs. A few methods have been developed for global PPI network alignment, but it still remains challenging in terms of both accuracy and efficiency. This paper presents a novel global network alignment algorithm, denoted as HubAlign, that makes use of both network topology and sequence homology information, based upon the observation that topologically important proteins in a PPI network usually are much more conserved and thus, more likely to be aligned. HubAlign uses a minimum-degree heuristic algorithm to estimate the topological and functional importance of a protein from the global network topology information. Then HubAlign aligns topologically important proteins first and gradually extends the alignment to the whole network. Extensive tests indicate that HubAlign greatly outperforms several popular methods in terms of both accuracy and efficiency, especially in detecting functionally similar proteins. HubAlign is available freely for non-commercial purposes at http://ttic.uchicago.edu/∼hashemifar/software/HubAlign.zip. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
SVM-dependent pairwise HMM: an application to protein pairwise alignments.

PubMed

Orlando, Gabriele; Raimondi, Daniele; Khan, Taushif; Lenaerts, Tom; Vranken, Wim F

2017-12-15

Methods able to provide reliable protein alignments are crucial for many bioinformatics applications. In the last years many different algorithms have been developed and various kinds of information, from sequence conservation to secondary structure, have been used to improve the alignment performances. This is especially relevant for proteins with highly divergent sequences. However, recent works suggest that different features may have different importance in diverse protein classes and it would be an advantage to have more customizable approaches, capable to deal with different alignment definitions. Here we present Rigapollo, a highly flexible pairwise alignment method based on a pairwise HMM-SVM that can use any type of information to build alignments. Rigapollo lets the user decide the optimal features to align their protein class of interest. It outperforms current state of the art methods on two well-known benchmark datasets when aligning highly divergent sequences. A Python implementation of the algorithm is available at http://ibsquare.be/rigapollo. wim.vranken@vub.be. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
K2 and K2*: efficient alignment-free sequence similarity measurement based on Kendall statistics.

PubMed

Lin, Jie; Adjeroh, Donald A; Jiang, Bing-Hua; Jiang, Yue

2018-05-15

Alignment-free sequence comparison methods can compute the pairwise similarity between a huge number of sequences much faster than sequence-alignment based methods. We propose a new non-parametric alignment-free sequence comparison method, called K2, based on the Kendall statistics. Comparing to the other state-of-the-art alignment-free comparison methods, K2 demonstrates competitive performance in generating the phylogenetic tree, in evaluating functionally related regulatory sequences, and in computing the edit distance (similarity/dissimilarity) between sequences. Furthermore, the K2 approach is much faster than the other methods. An improved method, K2*, is also proposed, which is able to determine the appropriate algorithmic parameter (length) automatically, without first considering different values. Comparative analysis with the state-of-the-art alignment-free sequence similarity methods demonstrates the superiority of the proposed approaches, especially with increasing sequence length, or increasing dataset sizes. The K2 and K2* approaches are implemented in the R language as a package and is freely available for open access (http://community.wvu.edu/daadjeroh/projects/K2/K2_1.0.tar.gz). yueljiang@163.com. Supplementary data are available at Bioinformatics online.
Validation of Coevolving Residue Algorithms via Pipeline Sensitivity Analysis: ELSC and OMES and ZNMI, Oh My!

PubMed Central

Brown, Christopher A.; Brown, Kevin S.

2010-01-01

Correlated amino acid substitution algorithms attempt to discover groups of residues that co-fluctuate due to either structural or functional constraints. Although these algorithms could inform both ab initio protein folding calculations and evolutionary studies, their utility for these purposes has been hindered by a lack of confidence in their predictions due to hard to control sources of error. To complicate matters further, naive users are confronted with a multitude of methods to choose from, in addition to the mechanics of assembling and pruning a dataset. We first introduce a new pair scoring method, called ZNMI (Z-scored-product Normalized Mutual Information), which drastically improves the performance of mutual information for co-fluctuating residue prediction. Second and more important, we recast the process of finding coevolving residues in proteins as a data-processing pipeline inspired by the medical imaging literature. We construct an ensemble of alignment partitions that can be used in a cross-validation scheme to assess the effects of choices made during the procedure on the resulting predictions. This pipeline sensitivity study gives a measure of reproducibility (how similar are the predictions given perturbations to the pipeline?) and accuracy (are residue pairs with large couplings on average close in tertiary structure?). We choose a handful of published methods, along with ZNMI, and compare their reproducibility and accuracy on three diverse protein families. We find that (i) of the algorithms tested, while none appear to be both highly reproducible and accurate, ZNMI is one of the most accurate by far and (ii) while users should be wary of predictions drawn from a single alignment, considering an ensemble of sub-alignments can help to determine both highly accurate and reproducible couplings. Our cross-validation approach should be of interest both to developers and end users of algorithms that try to detect correlated amino acid substitutions. PMID:20531955
Interactive software tool to comprehend the calculation of optimal sequence alignments with dynamic programming.

PubMed

Ibarra, Ignacio L; Melo, Francisco

2010-07-01

Dynamic programming (DP) is a general optimization strategy that is successfully used across various disciplines of science. In bioinformatics, it is widely applied in calculating the optimal alignment between pairs of protein or DNA sequences. These alignments form the basis of new, verifiable biological hypothesis. Despite its importance, there are no interactive tools available for training and education on understanding the DP algorithm. Here, we introduce an interactive computer application with a graphical interface, for the purpose of educating students about DP. The program displays the DP scoring matrix and the resulting optimal alignment(s), while allowing the user to modify key parameters such as the values in the similarity matrix, the sequence alignment algorithm version and the gap opening/extension penalties. We hope that this software will be useful to teachers and students of bioinformatics courses, as well as researchers who implement the DP algorithm for diverse applications. The software is freely available at: http:/melolab.org/sat. The software is written in the Java computer language, thus it runs on all major platforms and operating systems including Windows, Mac OS X and LINUX. All inquiries or comments about this software should be directed to Francisco Melo at fmelo@bio.puc.cl.
Addition of Improved Shock-Capturing Schemes to OVERFLOW 2.1

NASA Technical Reports Server (NTRS)

Burning, Pieter G.; Nichols, Robert H.; Tramel, Robert W.

2009-01-01

Existing approximate Riemann solvers do not perform well when the grid is not aligned with strong shocks in the flow field. Three new approximate Riemann algorithms are investigated to improve solution accuracy and stability in the vicinity of strong shocks. The new algorithms are compared to the existing upwind algorithms in OVERFLOW 2.1. The new algorithms use a multidimensional pressure gradient based switch to transition to a more numerically dissipative algorithm in the vicinity of strong shocks. One new algorithm also attempts to artificially thicken captured shocks in order to alleviate the errors in the solution introduced by "stair-stepping" of the shock resulting from the approximate Riemann solver. This algorithm performed well for all the example cases and produced results that were almost insensitive to the alignment of the grid and the shock.
A basic analysis toolkit for biological sequences

PubMed Central

Giancarlo, Raffaele; Siragusa, Alessandro; Siragusa, Enrico; Utro, Filippo

2007-01-01

This paper presents a software library, nicknamed BATS, for some basic sequence analysis tasks. Namely, local alignments, via approximate string matching, and global alignments, via longest common subsequence and alignments with affine and concave gap cost functions. Moreover, it also supports filtering operations to select strings from a set and establish their statistical significance, via z-score computation. None of the algorithms is new, but although they are generally regarded as fundamental for sequence analysis, they have not been implemented in a single and consistent software package, as we do here. Therefore, our main contribution is to fill this gap between algorithmic theory and practice by providing an extensible and easy to use software library that includes algorithms for the mentioned string matching and alignment problems. The library consists of C/C++ library functions as well as Perl library functions. It can be interfaced with Bioperl and can also be used as a stand-alone system with a GUI. The software is available at under the GNU GPL. PMID:17877802
Cryo-EM image alignment based on nonuniform fast Fourier transform.

PubMed

Yang, Zhengfan; Penczek, Pawel A

2008-08-01

In single particle analysis, two-dimensional (2-D) alignment is a fundamental step intended to put into register various particle projections of biological macromolecules collected at the electron microscope. The efficiency and quality of three-dimensional (3-D) structure reconstruction largely depends on the computational speed and alignment accuracy of this crucial step. In order to improve the performance of alignment, we introduce a new method that takes advantage of the highly accurate interpolation scheme based on the gridding method, a version of the nonuniform fast Fourier transform, and utilizes a multi-dimensional optimization algorithm for the refinement of the orientation parameters. Using simulated data, we demonstrate that by using less than half of the sample points and taking twice the runtime, our new 2-D alignment method achieves dramatically better alignment accuracy than that based on quadratic interpolation. We also apply our method to image to volume registration, the key step in the single particle EM structure refinement protocol. We find that in this case the accuracy of the method not only surpasses the accuracy of the commonly used real-space implementation, but results are achieved in much shorter time, making gridding-based alignment a perfect candidate for efficient structure determination in single particle analysis.
Cryo-EM Image Alignment Based on Nonuniform Fast Fourier Transform

PubMed Central

Yang, Zhengfan; Penczek, Pawel A.

2008-01-01

In single particle analysis, two-dimensional (2-D) alignment is a fundamental step intended to put into register various particle projections of biological macromolecules collected at the electron microscope. The efficiency and quality of three-dimensional (3-D) structure reconstruction largely depends on the computational speed and alignment accuracy of this crucial step. In order to improve the performance of alignment, we introduce a new method that takes advantage of the highly accurate interpolation scheme based on the gridding method, a version of the nonuniform Fast Fourier Transform, and utilizes a multi-dimensional optimization algorithm for the refinement of the orientation parameters. Using simulated data, we demonstrate that by using less than half of the sample points and taking twice the runtime, our new 2-D alignment method achieves dramatically better alignment accuracy than that based on quadratic interpolation. We also apply our method to image to volume registration, the key step in the single particle EM structure refinement protocol. We find that in this case the accuracy of the method not only surpasses the accuracy of the commonly used real-space implementation, but results are achieved in much shorter time, making gridding-based alignment a perfect candidate for efficient structure determination in single particle analysis. PMID:18499351
Initial Alignment for SINS Based on Pseudo-Earth Frame in Polar Regions.

PubMed

Gao, Yanbin; Liu, Meng; Li, Guangchun; Guang, Xingxing

2017-06-16

An accurate initial alignment must be required for inertial navigation system (INS). The performance of initial alignment directly affects the following navigation accuracy. However, the rapid convergence of meridians and the small horizontalcomponent of rotation of Earth make the traditional alignment methods ineffective in polar regions. In this paper, from the perspective of global inertial navigation, a novel alignment algorithm based on pseudo-Earth frame and backward process is proposed to implement the initial alignment in polar regions. Considering that an accurate coarse alignment of azimuth is difficult to obtain in polar regions, the dynamic error modeling with large azimuth misalignment angle is designed. At the end of alignment phase, the strapdown attitude matrix relative to local geographic frame is obtained without influence of position errors and cumbersome computation. As a result, it would be more convenient to access the following polar navigation system. Then, it is also expected to unify the polar alignment algorithm as much as possible, thereby further unifying the form of external reference information. Finally, semi-physical static simulation and in-motion tests with large azimuth misalignment angle assisted by unscented Kalman filter (UKF) validate the effectiveness of the proposed method.
A short note on dynamic programming in a band.

PubMed

Gibrat, Jean-François

2018-06-15

Third generation sequencing technologies generate long reads that exhibit high error rates, in particular for insertions and deletions which are usually the most difficult errors to cope with. The only exact algorithm capable of aligning sequences with insertions and deletions is a dynamic programming algorithm. In this note, for the sake of efficiency, we consider dynamic programming in a band. We show how to choose the band width in function of the long reads' error rates, thus obtaining an [Formula: see text] algorithm in space and time. We also propose a procedure to decide whether this algorithm, when applied to semi-global alignments, provides the optimal score. We suggest that dynamic programming in a band is well suited to the problem of aligning long reads between themselves and can be used as a core component of methods for obtaining a consensus sequence from the long reads alone. The function implementing the dynamic programming algorithm in a band is available, as a standalone program, at: https://forgemia.inra.fr/jean-francois.gibrat/BAND_DYN_PROG.git.
LPV Modeling of a Flexible Wing Aircraft Using Modal Alignment and Adaptive Gridding Methods

NASA Technical Reports Server (NTRS)

Al-Jiboory, Ali Khudhair; Zhu, Guoming; Swei, Sean Shan-Min; Su, Weihua; Nguyen, Nhan T.

2017-01-01

One of the earliest approaches in gain-scheduling control is the gridding based approach, in which a set of local linear time-invariant models are obtained at various gridded points corresponding to the varying parameters within the flight envelop. In order to ensure smooth and effective Linear Parameter-Varying control, aligning all the flexible modes within each local model and maintaining small number of representative local models over the gridded parameter space are crucial. In addition, since the flexible structural models tend to have large dimensions, a tractable model reduction process is necessary. In this paper, the notion of s-shifted H2- and H Infinity-norm are introduced and used as a metric to measure the model mismatch. A new modal alignment algorithm is developed which utilizes the defined metric for aligning all the local models over the entire gridded parameter space. Furthermore, an Adaptive Grid Step Size Determination algorithm is developed to minimize the number of local models required to represent the gridded parameter space. For model reduction, we propose to utilize the concept of Composite Modal Cost Analysis, through which the collective contribution of each flexible mode is computed and ranked. Therefore, a reduced-order model is constructed by retaining only those modes with significant contribution. The NASA Generic Transport Model operating at various flight speeds is studied for verification purpose, and the analysis and simulation results demonstrate the effectiveness of the proposed modeling approach.
Spacecraft alignment estimation. [for onboard sensors

NASA Technical Reports Server (NTRS)

Shuster, Malcolm D.; Bierman, Gerald J.

1988-01-01

A numerically well-behaved factorized methodology is developed for estimating spacecraft sensor alignments from prelaunch and inflight data without the need to compute the spacecraft attitude or angular velocity. Such a methodology permits the estimation of sensor alignments (or other biases) in a framework free of unknown dynamical variables. In actual mission implementation such an algorithm is usually better behaved than one that must compute sensor alignments simultaneously with the spacecraft attitude, for example by means of a Kalman filter. In particular, such a methodology is less sensitive to data dropouts of long duration, and the derived measurement used in the attitude-independent algorithm usually makes data checking and editing of outliers much simpler than would be the case in the filter.
Unsupervised parameter optimization for automated retention time alignment of severely shifted gas chromatographic data using the piecework alignment algorithm.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pierce, Karisa M.; Wright, Bob W.; Synovec, Robert E.

2007-02-02

First, simulated chromatographic separations with declining retention time precision were used to study the performance of the piecewise retention time alignment algorithm and to demonstrate an unsupervised parameter optimization method. The average correlation coefficient between the first chromatogram and every other chromatogram in the data set was used to optimize the alignment parameters. This correlation method does not require a training set, so it is unsupervised and automated. This frees the user from needing to provide class information and makes the alignment algorithm more generally applicable to classifying completely unknown data sets. For a data set of simulated chromatograms wheremore » the average chromatographic peak was shifted past two neighboring peaks between runs, the average correlation coefficient of the raw data was 0.46 ± 0.25. After automated, optimized piecewise alignment, the average correlation coefficient was 0.93 ± 0.02. Additionally, a relative shift metric and principal component analysis (PCA) were used to independently quantify and categorize the alignment performance, respectively. The relative shift metric was defined as four times the standard deviation of a given peak’s retention time in all of the chromatograms, divided by the peak-width-at-base. The raw simulated data sets that were studied contained peaks with average relative shifts ranging between 0.3 and 3.0. Second, a “real” data set of gasoline separations was gathered using three different GC methods to induce severe retention time shifting. In these gasoline separations, retention time precision improved ~8 fold following alignment. Finally, piecewise alignment and the unsupervised correlation optimization method were applied to severely shifted GC separations of reformate distillation fractions. The effect of piecewise alignment on peak heights and peak areas is also reported. Piecewise alignment either did not change the peak height, or caused it to slightly decrease. The average relative difference in peak height after piecewise alignment was –0.20%. Piecewise alignment caused the peak areas to either stay the same, slightly increase, or slightly decrease. The average absolute relative difference in area after piecewise alignment was 0.15%.« less
Advanced Dispersed Fringe Sensing Algorithm for Coarse Phasing Segmented Mirror Telescopes

NASA Technical Reports Server (NTRS)

Spechler, Joshua A.; Hoppe, Daniel J.; Sigrist, Norbert; Shi, Fang; Seo, Byoung-Joon; Bikkannavar, Siddarayappa A.

2013-01-01

Segment mirror phasing, a critical step of segment mirror alignment, requires the ability to sense and correct the relative pistons between segments from up to a few hundred microns to a fraction of wavelength in order to bring the mirror system to its full diffraction capability. When sampling the aperture of a telescope, using auto-collimating flats (ACFs) is more economical. The performance of a telescope with a segmented primary mirror strongly depends on how well those primary mirror segments can be phased. One such process to phase primary mirror segments in the axial piston direction is dispersed fringe sensing (DFS). DFS technology can be used to co-phase the ACFs. DFS is essentially a signal fitting and processing operation. It is an elegant method of coarse phasing segmented mirrors. DFS performance accuracy is dependent upon careful calibration of the system as well as other factors such as internal optical alignment, system wavefront errors, and detector quality. Novel improvements to the algorithm have led to substantial enhancements in DFS performance. The Advanced Dispersed Fringe Sensing (ADFS) Algorithm is designed to reduce the sensitivity to calibration errors by determining the optimal fringe extraction line. Applying an angular extraction line dithering procedure and combining this dithering process with an error function while minimizing the phase term of the fitted signal, defines in essence the ADFS algorithm.
Dinucleotide controlled null models for comparative RNA gene prediction.

PubMed

Gesell, Tanja; Washietl, Stefan

2008-05-27

Comparative prediction of RNA structures can be used to identify functional noncoding RNAs in genomic screens. It was shown recently by Babak et al. [BMC Bioinformatics. 8:33] that RNA gene prediction programs can be biased by the genomic dinucleotide content, in particular those programs using a thermodynamic folding model including stacking energies. As a consequence, there is need for dinucleotide-preserving control strategies to assess the significance of such predictions. While there have been randomization algorithms for single sequences for many years, the problem has remained challenging for multiple alignments and there is currently no algorithm available. We present a program called SISSIz that simulates multiple alignments of a given average dinucleotide content. Meeting additional requirements of an accurate null model, the randomized alignments are on average of the same sequence diversity and preserve local conservation and gap patterns. We make use of a phylogenetic substitution model that includes overlapping dependencies and site-specific rates. Using fast heuristics and a distance based approach, a tree is estimated under this model which is used to guide the simulations. The new algorithm is tested on vertebrate genomic alignments and the effect on RNA structure predictions is studied. In addition, we directly combined the new null model with the RNAalifold consensus folding algorithm giving a new variant of a thermodynamic structure based RNA gene finding program that is not biased by the dinucleotide content. SISSIz implements an efficient algorithm to randomize multiple alignments preserving dinucleotide content. It can be used to get more accurate estimates of false positive rates of existing programs, to produce negative controls for the training of machine learning based programs, or as standalone RNA gene finding program. Other applications in comparative genomics that require randomization of multiple alignments can be considered. SISSIz is available as open source C code that can be compiled for every major platform and downloaded here: http://sourceforge.net/projects/sissiz.
SATCHMO-JS: a webserver for simultaneous protein multiple sequence alignment and phylogenetic tree construction.

PubMed

Hagopian, Raffi; Davidson, John R; Datta, Ruchira S; Samad, Bushra; Jarvis, Glen R; Sjölander, Kimmen

2010-07-01

We present the jump-start simultaneous alignment and tree construction using hidden Markov models (SATCHMO-JS) web server for simultaneous estimation of protein multiple sequence alignments (MSAs) and phylogenetic trees. The server takes as input a set of sequences in FASTA format, and outputs a phylogenetic tree and MSA; these can be viewed online or downloaded from the website. SATCHMO-JS is an extension of the SATCHMO algorithm, and employs a divide-and-conquer strategy to jump-start SATCHMO at a higher point in the phylogenetic tree, reducing the computational complexity of the progressive all-versus-all HMM-HMM scoring and alignment. Results on a benchmark dataset of 983 structurally aligned pairs from the PREFAB benchmark dataset show that SATCHMO-JS provides a statistically significant improvement in alignment accuracy over MUSCLE, Multiple Alignment using Fast Fourier Transform (MAFFT), ClustalW and the original SATCHMO algorithm. The SATCHMO-JS webserver is available at http://phylogenomics.berkeley.edu/satchmo-js. The datasets used in these experiments are available for download at http://phylogenomics.berkeley.edu/satchmo-js/supplementary/.
Application of Improved 5th-Cubature Kalman Filter in Initial Strapdown Inertial Navigation System Alignment for Large Misalignment Angles

PubMed Central

Wang, Wei; Chen, Xiyuan

2018-01-01

In view of the fact the accuracy of the third-degree Cubature Kalman Filter (CKF) used for initial alignment under large misalignment angle conditions is insufficient, an improved fifth-degree CKF algorithm is proposed in this paper. In order to make full use of the innovation on filtering, the innovation covariance matrix is calculated recursively by an innovative sequence with an exponent fading factor. Then a new adaptive error covariance matrix scaling algorithm is proposed. The Singular Value Decomposition (SVD) method is used for improving the numerical stability of the fifth-degree CKF in this paper. In order to avoid the overshoot caused by excessive scaling of error covariance matrix during the convergence stage, the scaling scheme is terminated when the gradient of azimuth reaches the maximum. The experimental results show that the improved algorithm has better alignment accuracy with large misalignment angles than the traditional algorithm. PMID:29473912

An alternative view of protein fold space.

PubMed

Shindyalov, I N; Bourne, P E

2000-02-15

Comparing and subsequently classifying protein structures information has received significant attention concurrent with the increase in the number of experimentally derived 3-dimensional structures. Classification schemes have focused on biological function found within protein domains and on structure classification based on topology. Here an alternative view is presented that groups substructures. Substructures are long (50-150 residue) highly repetitive near-contiguous pieces of polypeptide chain that occur frequently in a set of proteins from the PDB defined as structurally non-redundant over the complete polypeptide chain. The substructure classification is based on a previously reported Combinatorial Extension (CE) algorithm that provides a significantly different set of structure alignments than those previously described, having, for example, only a 40% overlap with FSSP. Qualitatively the algorithm provides longer contiguous aligned segments at the price of a slightly higher root-mean-square deviation (rmsd). Clustering these alignments gives a discreet and highly repetitive set of substructures not detectable by sequence similarity alone. In some cases different substructures represent all or different parts of well known folds indicative of the Russian doll effect--the continuity of protein fold space. In other cases they fall into different structure and functional classifications. It is too early to determine whether these newly classified substructures represent new insights into the evolution of a structural framework important to many proteins. What is apparent from on-going work is that these substructures have the potential to be useful probes in finding remote sequence homology and in structure prediction studies. The characteristics of the complete all-by-all comparison of the polypeptide chains present in the PDB and details of the filtering procedure by pair-wise structure alignment that led to the emergent substructure gallery are discussed. Substructure classification, alignments, and tools to analyze them are available at http://cl.sdsc.edu/ce.html.
An algorithm of improving speech emotional perception for hearing aid

NASA Astrophysics Data System (ADS)

Xi, Ji; Liang, Ruiyu; Fei, Xianju

2017-07-01

In this paper, a speech emotion recognition (SER) algorithm was proposed to improve the emotional perception of hearing-impaired people. The algorithm utilizes multiple kernel technology to overcome the drawback of SVM: slow training speed. Firstly, in order to improve the adaptive performance of Gaussian Radial Basis Function (RBF), the parameter determining the nonlinear mapping was optimized on the basis of Kernel target alignment. Then, the obtained Kernel Function was used as the basis kernel of Multiple Kernel Learning (MKL) with slack variable that could solve the over-fitting problem. However, the slack variable also brings the error into the result. Therefore, a soft-margin MKL was proposed to balance the margin against the error. Moreover, the relatively iterative algorithm was used to solve the combination coefficients and hyper-plane equations. Experimental results show that the proposed algorithm can acquire an accuracy of 90% for five kinds of emotions including happiness, sadness, anger, fear and neutral. Compared with KPCA+CCA and PIM-FSVM, the proposed algorithm has the highest accuracy.
NetCoffee: a fast and accurate global alignment approach to identify functionally conserved proteins in multiple networks.

PubMed

Hu, Jialu; Kehr, Birte; Reinert, Knut

2014-02-15

Owing to recent advancements in high-throughput technologies, protein-protein interaction networks of more and more species become available in public databases. The question of how to identify functionally conserved proteins across species attracts a lot of attention in computational biology. Network alignments provide a systematic way to solve this problem. However, most existing alignment tools encounter limitations in tackling this problem. Therefore, the demand for faster and more efficient alignment tools is growing. We present a fast and accurate algorithm, NetCoffee, which allows to find a global alignment of multiple protein-protein interaction networks. NetCoffee searches for a global alignment by maximizing a target function using simulated annealing on a set of weighted bipartite graphs that are constructed using a triplet approach similar to T-Coffee. To assess its performance, NetCoffee was applied to four real datasets. Our results suggest that NetCoffee remedies several limitations of previous algorithms, outperforms all existing alignment tools in terms of speed and nevertheless identifies biologically meaningful alignments. The source code and data are freely available for download under the GNU GPL v3 license at https://code.google.com/p/netcoffee/.
A rib-specific multimodal registration algorithm for fused unfolded rib visualization using PET/CT

NASA Astrophysics Data System (ADS)

Kaftan, Jens N.; Kopaczka, Marcin; Wimmer, Andreas; Platsch, Günther; Declerck, Jérôme

2014-03-01

Respiratory motion affects the alignment of PET and CT volumes from PET/CT examinations in a non-rigid manner. This becomes particularly apparent if reviewing fine anatomical structures such as ribs when assessing bone metastases, which frequently occur in many advanced cancers. To make this routine diagnostic task more efficient, a fused unfolded rib visualization for 18F-NaF PET/CT is presented. It allows to review the whole rib cage in a single image. This advanced visualization is enabled by a novel rib-specific registration algorithm that rigidly optimizes the local alignment of each individual rib in both modalities based on a matched filter response function. More specifically, rib centerlines are automatically extracted from CT and subsequently individually aligned to the corresponding bone-specific PET rib uptake pattern. The proposed method has been validated on 20 PET/CT scans acquired at different clinical sites. It has been demonstrated that the presented rib- specific registration method significantly improves the rib alignment without having to run complex deformable registration algorithms. At the same time, it guarantees that rib lesions are not further deformed, which may otherwise affect quantitative measurements such as SUVs. Considering clinically relevant distance thresholds, the centerline portion with good alignment compared to the ground truth improved from 60:6% to 86:7% after registration while approximately 98% can be still considered as acceptably aligned.
Approximate matching of regular expressions.

PubMed

Myers, E W; Miller, W

1989-01-01

Given a sequence A and regular expression R, the approximate regular expression matching problem is to find a sequence matching R whose optimal alignment with A is the highest scoring of all such sequences. This paper develops an algorithm to solve the problem in time O(MN), where M and N are the lengths of A and R. Thus, the time requirement is asymptotically no worse than for the simpler problem of aligning two fixed sequences. Our method is superior to an earlier algorithm by Wagner and Seiferas in several ways. First, it treats real-valued costs, in addition to integer costs, with no loss of asymptotic efficiency. Second, it requires only O(N) space to deliver just the score of the best alignment. Finally, its structure permits implementation techniques that make it extremely fast in practice. We extend the method to accommodate gap penalties, as required for typical applications in molecular biology, and further refine it to search for sub-strings of A that strongly align with a sequence in R, as required for typical data base searches. We also show how to deliver an optimal alignment between A and R in only O(N + log M) space using O(MN log M) time. Finally, an O(MN(M + N) + N2log N) time algorithm is presented for alignment scoring schemes where the cost of a gap is an arbitrary increasing function of its length.
Sequence analysis of Leukemia DNA

NASA Astrophysics Data System (ADS)

Nacong, Nasria; Lusiyanti, Desy; Irawan, Muhammad. Isa

2018-03-01

Cancer is a very deadly disease, one of which is leukemia disease or better known as blood cancer. The cancer cell can be detected by taking DNA in laboratory test. This study focused on local alignment of leukemia and non leukemia data resulting from NCBI in the form of DNA sequences by using Smith-Waterman algorithm. SmithWaterman algorithm was invented by TF Smith and MS Waterman in 1981. These algorithms try to find as much as possible similarity of a pair of sequences, by giving a negative value to the unequal base pair (mismatch), and positive values on the same base pair (match). So that will obtain the maximum positive value as the end of the alignment, and the minimum value as the initial alignment. This study will use sequences of leukemia and 3 sequences of non leukemia.
Least Squares Approach to the Alignment of the Generic High Precision Tracking System

NASA Astrophysics Data System (ADS)

de Renstrom, Pawel Brückman; Haywood, Stephen

2006-04-01

A least squares method to solve a generic alignment problem of a high granularity tracking system is presented. The algorithm is based on an analytical linear expansion and allows for multiple nested fits, e.g. imposing a common vertex for groups of particle tracks is of particular interest. We present a consistent and complete recipe to impose constraints on either implicit or explicit parameters. The method has been applied to the full simulation of a subset of the ATLAS silicon tracking system. The ultimate goal is to determine ≈35,000 degrees of freedom (DoF's). We present a limited scale exercise exploring various aspects of the solution.
HUGO: Hierarchical mUlti-reference Genome cOmpression for aligned reads

PubMed Central

Li, Pinghao; Jiang, Xiaoqian; Wang, Shuang; Kim, Jihoon; Xiong, Hongkai; Ohno-Machado, Lucila

2014-01-01

Background and objective Short-read sequencing is becoming the standard of practice for the study of structural variants associated with disease. However, with the growth of sequence data largely surpassing reasonable storage capability, the biomedical community is challenged with the management, transfer, archiving, and storage of sequence data. Methods We developed Hierarchical mUlti-reference Genome cOmpression (HUGO), a novel compression algorithm for aligned reads in the sorted Sequence Alignment/Map (SAM) format. We first aligned short reads against a reference genome and stored exactly mapped reads for compression. For the inexact mapped or unmapped reads, we realigned them against different reference genomes using an adaptive scheme by gradually shortening the read length. Regarding the base quality value, we offer lossy and lossless compression mechanisms. The lossy compression mechanism for the base quality values uses k-means clustering, where a user can adjust the balance between decompression quality and compression rate. The lossless compression can be produced by setting k (the number of clusters) to the number of different quality values. Results The proposed method produced a compression ratio in the range 0.5–0.65, which corresponds to 35–50% storage savings based on experimental datasets. The proposed approach achieved 15% more storage savings over CRAM and comparable compression ratio with Samcomp (CRAM and Samcomp are two of the state-of-the-art genome compression algorithms). The software is freely available at https://sourceforge.net/projects/hierachicaldnac/with a General Public License (GPL) license. Limitation Our method requires having different reference genomes and prolongs the execution time for additional alignments. Conclusions The proposed multi-reference-based compression algorithm for aligned reads outperforms existing single-reference based algorithms. PMID:24368726
TU-D-209-03: Alignment of the Patient Graphic Model Using Fluoroscopic Images for Skin Dose Mapping

DOE Office of Scientific and Technical Information (OSTI.GOV)

Oines, A; Oines, A; Kilian-Meneghin, J

2016-06-15

Purpose: The Dose Tracking System (DTS) was developed to provide realtime feedback of skin dose and dose rate during interventional fluoroscopic procedures. A color map on a 3D graphic of the patient represents the cumulative dose distribution on the skin. Automated image correlation algorithms are described which use the fluoroscopic procedure images to align and scale the patient graphic for more accurate dose mapping. Methods: Currently, the DTS employs manual patient graphic selection and alignment. To improve the accuracy of dose mapping and automate the software, various methods are explored to extract information about the beam location and patient morphologymore » from the procedure images. To match patient anatomy with a reference projection image, preprocessing is first used, including edge enhancement, edge detection, and contour detection. Template matching algorithms from OpenCV are then employed to find the location of the beam. Once a match is found, the reference graphic is scaled and rotated to fit the patient, using image registration correlation functions in Matlab. The algorithm runs correlation functions for all points and maps all correlation confidences to a surface map. The highest point of correlation is used for alignment and scaling. The transformation data is saved for later model scaling. Results: Anatomic recognition is used to find matching features between model and image and image registration correlation provides for alignment and scaling at any rotation angle with less than onesecond runtime, and at noise levels in excess of 150% of those found in normal procedures. Conclusion: The algorithm provides the necessary scaling and alignment tools to improve the accuracy of dose distribution mapping on the patient graphic with the DTS. Partial support from NIH Grant R01-EB002873 and Toshiba Medical Systems Corp.« less
Iterative Stable Alignment and Clustering of 2D Transmission Electron Microscope Images

PubMed Central

Yang, Zhengfan; Fang, Jia; Chittuluru, Johnathan; Asturias, Francisco J.; Penczek, Pawel A.

2012-01-01

SUMMARY Identification of homogeneous subsets of images in a macromolecular electron microscopy (EM) image data set is a critical step in single-particle analysis. The task is handled by iterative algorithms, whose performance is compromised by the compounded limitations of image alignment and K-means clustering. Here we describe an approach, iterative stable alignment and clustering (ISAC) that, relying on a new clustering method and on the concepts of stability and reproducibility, can extract validated, homogeneous subsets of images. ISAC requires only a small number of simple parameters and, with minimal human intervention, can eliminate bias from two-dimensional image clustering and maximize the quality of group averages that can be used for ab initio three-dimensional structural determination and analysis of macromolecular conformational variability. Repeated testing of the stability and reproducibility of a solution within ISAC eliminates heterogeneous or incorrect classes and introduces critical validation to the process of EM image clustering. PMID:22325773
Clustering algorithms for identifying core atom sets and for assessing the precision of protein structure ensembles.

PubMed

Snyder, David A; Montelione, Gaetano T

2005-06-01

An important open question in the field of NMR-based biomolecular structure determination is how best to characterize the precision of the resulting ensemble of structures. Typically, the RMSD, as minimized in superimposing the ensemble of structures, is the preferred measure of precision. However, the presence of poorly determined atomic coordinates and multiple "RMSD-stable domains"--locally well-defined regions that are not aligned in global superimpositions--complicate RMSD calculations. In this paper, we present a method, based on a novel, structurally defined order parameter, for identifying a set of core atoms to use in determining superimpositions for RMSD calculations. In addition we present a method for deciding whether to partition that core atom set into "RMSD-stable domains" and, if so, how to determine partitioning of the core atom set. We demonstrate our algorithm and its application in calculating statistically sound RMSD values by applying it to a set of NMR-derived structural ensembles, superimposing each RMSD-stable domain (or the entire core atom set, where appropriate) found in each protein structure under consideration. A parameter calculated by our algorithm using a novel, kurtosis-based criterion, the epsilon-value, is a measure of precision of the superimposition that complements the RMSD. In addition, we compare our algorithm with previously described algorithms for determining core atom sets. The methods presented in this paper for biomolecular structure superimposition are quite general, and have application in many areas of structural bioinformatics and structural biology.
Dynamic programming algorithms for biological sequence comparison.

PubMed

Pearson, W R; Miller, W

1992-01-01

Efficient dynamic programming algorithms are available for a broad class of protein and DNA sequence comparison problems. These algorithms require computer time proportional to the product of the lengths of the two sequences being compared [O(N2)] but require memory space proportional only to the sum of these lengths [O(N)]. Although the requirement for O(N2) time limits use of the algorithms to the largest computers when searching protein and DNA sequence databases, many other applications of these algorithms, such as calculation of distances for evolutionary trees and comparison of a new sequence to a library of sequence profiles, are well within the capabilities of desktop computers. In particular, the results of library searches with rapid searching programs, such as FASTA or BLAST, should be confirmed by performing a rigorous optimal alignment. Whereas rapid methods do not overlook significant sequence similarities, FASTA limits the number of gaps that can be inserted into an alignment, so that a rigorous alignment may extend the alignment substantially in some cases. BLAST does not allow gaps in the local regions that it reports; a calculation that allows gaps is very likely to extend the alignment substantially. Although a Monte Carlo evaluation of the statistical significance of a similarity score with a rigorous algorithm is much slower than the heuristic approach used by the RDF2 program, the dynamic programming approach should take less than 1 hr on a 386-based PC or desktop Unix workstation. For descriptive purposes, we have limited our discussion to methods for calculating similarity scores and distances that use gap penalties of the form g = rk. Nevertheless, programs for the more general case (g = q+rk) are readily available. Versions of these programs that run either on Unix workstations, IBM-PC class computers, or the Macintosh can be obtained from either of the authors.
A Fast Approximate Algorithm for Mapping Long Reads to Large Reference Databases.

PubMed

Jain, Chirag; Dilthey, Alexander; Koren, Sergey; Aluru, Srinivas; Phillippy, Adam M

2018-04-30

Emerging single-molecule sequencing technologies from Pacific Biosciences and Oxford Nanopore have revived interest in long-read mapping algorithms. Alignment-based seed-and-extend methods demonstrate good accuracy, but face limited scalability, while faster alignment-free methods typically trade decreased precision for efficiency. In this article, we combine a fast approximate read mapping algorithm based on minimizers with a novel MinHash identity estimation technique to achieve both scalability and precision. In contrast to prior methods, we develop a mathematical framework that defines the types of mapping targets we uncover, establish probabilistic estimates of p-value and sensitivity, and demonstrate tolerance for alignment error rates up to 20%. With this framework, our algorithm automatically adapts to different minimum length and identity requirements and provides both positional and identity estimates for each mapping reported. For mapping human PacBio reads to the hg38 reference, our method is 290 × faster than Burrows-Wheeler Aligner-MEM with a lower memory footprint and recall rate of 96%. We further demonstrate the scalability of our method by mapping noisy PacBio reads (each ≥5 kbp in length) to the complete NCBI RefSeq database containing 838 Gbp of sequence and >60,000 genomes.
Cascaded face alignment via intimacy definition feature

NASA Astrophysics Data System (ADS)

Li, Hailiang; Lam, Kin-Man; Chiu, Man-Yau; Wu, Kangheng; Lei, Zhibin

2017-09-01

Recent years have witnessed the emerging popularity of regression-based face aligners, which directly learn mappings between facial appearance and shape-increment manifolds. We propose a random-forest based, cascaded regression model for face alignment by using a locally lightweight feature, namely intimacy definition feature. This feature is more discriminative than the pose-indexed feature, more efficient than the histogram of oriented gradients feature and the scale-invariant feature transform feature, and more compact than the local binary feature (LBF). Experimental validation of our algorithm shows that our approach achieves state-of-the-art performance when testing on some challenging datasets. Compared with the LBF-based algorithm, our method achieves about twice the speed, 20% improvement in terms of alignment accuracy and saves an order of magnitude on memory requirement.
Sparse Unorganized Point Cloud Based Relative Pose Estimation for Uncooperative Space Target.

PubMed

Yin, Fang; Chou, Wusheng; Wu, Yun; Yang, Guang; Xu, Song

2018-03-28

This paper proposes an autonomous algorithm to determine the relative pose between the chaser spacecraft and the uncooperative space target, which is essential in advanced space applications, e.g., on-orbit serving missions. The proposed method, named Congruent Tetrahedron Align (CTA) algorithm, uses the very sparse unorganized 3D point cloud acquired by a LIDAR sensor, and does not require any prior pose information. The core of the method is to determine the relative pose by looking for the congruent tetrahedron in scanning point cloud and model point cloud on the basis of its known model. The two-level index hash table is built for speeding up the search speed. In addition, the Iterative Closest Point (ICP) algorithm is used for pose tracking after CTA. In order to evaluate the method in arbitrary initial attitude, a simulated system is presented. Specifically, the performance of the proposed method to provide the initial pose needed for the tracking algorithm is demonstrated, as well as their robustness against noise. Finally, a field experiment is conducted and the results demonstrated the effectiveness of the proposed method.
Sparse Unorganized Point Cloud Based Relative Pose Estimation for Uncooperative Space Target

PubMed Central

Chou, Wusheng; Wu, Yun; Yang, Guang; Xu, Song

2018-01-01

This paper proposes an autonomous algorithm to determine the relative pose between the chaser spacecraft and the uncooperative space target, which is essential in advanced space applications, e.g., on-orbit serving missions. The proposed method, named Congruent Tetrahedron Align (CTA) algorithm, uses the very sparse unorganized 3D point cloud acquired by a LIDAR sensor, and does not require any prior pose information. The core of the method is to determine the relative pose by looking for the congruent tetrahedron in scanning point cloud and model point cloud on the basis of its known model. The two-level index hash table is built for speeding up the search speed. In addition, the Iterative Closest Point (ICP) algorithm is used for pose tracking after CTA. In order to evaluate the method in arbitrary initial attitude, a simulated system is presented. Specifically, the performance of the proposed method to provide the initial pose needed for the tracking algorithm is demonstrated, as well as their robustness against noise. Finally, a field experiment is conducted and the results demonstrated the effectiveness of the proposed method. PMID:29597323
Chunk Alignment for Corpus-Based Machine Translation

ERIC Educational Resources Information Center

Kim, Jae Dong

2011-01-01

Since sub-sentential alignment is critically important to the translation quality of an Example-Based Machine Translation (EBMT) system, which operates by finding and combining phrase-level matches against the training examples, we developed a new alignment algorithm for the purpose of improving the EBMT system's performance. This new…
SPARSE: quadratic time simultaneous alignment and folding of RNAs without sequence-based heuristics

PubMed Central

Will, Sebastian; Otto, Christina; Miladi, Milad; Möhl, Mathias; Backofen, Rolf

2015-01-01

Motivation: RNA-Seq experiments have revealed a multitude of novel ncRNAs. The gold standard for their analysis based on simultaneous alignment and folding suffers from extreme time complexity of O(n6). Subsequently, numerous faster ‘Sankoff-style’ approaches have been suggested. Commonly, the performance of such methods relies on sequence-based heuristics that restrict the search space to optimal or near-optimal sequence alignments; however, the accuracy of sequence-based methods breaks down for RNAs with sequence identities below 60%. Alignment approaches like LocARNA that do not require sequence-based heuristics, have been limited to high complexity (≥ quartic time). Results: Breaking this barrier, we introduce the novel Sankoff-style algorithm ‘sparsified prediction and alignment of RNAs based on their structure ensembles (SPARSE)’, which runs in quadratic time without sequence-based heuristics. To achieve this low complexity, on par with sequence alignment algorithms, SPARSE features strong sparsification based on structural properties of the RNA ensembles. Following PMcomp, SPARSE gains further speed-up from lightweight energy computation. Although all existing lightweight Sankoff-style methods restrict Sankoff’s original model by disallowing loop deletions and insertions, SPARSE transfers the Sankoff algorithm to the lightweight energy model completely for the first time. Compared with LocARNA, SPARSE achieves similar alignment and better folding quality in significantly less time (speedup: 3.7). At similar run-time, it aligns low sequence identity instances substantially more accurate than RAF, which uses sequence-based heuristics. Availability and implementation: SPARSE is freely available at http://www.bioinf.uni-freiburg.de/Software/SPARSE. Contact: backofen@informatik.uni-freiburg.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25838465
Aligning a Receiving Antenna Array to Reduce Interference

NASA Technical Reports Server (NTRS)

Jongeling, Andre P.; Rogstad, David H.

2009-01-01

A digital signal-processing algorithm has been devised as a means of aligning (as defined below) the outputs of multiple receiving radio antennas in a large array for the purpose of receiving a desired weak signal transmitted by a single distant source in the presence of an interfering signal that (1) originates at another source lying within the antenna beam and (2) occupies a frequency band significantly wider than that of the desired signal. In the original intended application of the algorithm, the desired weak signal is a spacecraft telemetry signal, the antennas are spacecraft-tracking antennas in NASA s Deep Space Network, and the source of the wide-band interfering signal is typically a radio galaxy or a planet that lies along or near the line of sight to the spacecraft. The algorithm could also afford the ability to discriminate between desired narrow-band and nearby undesired wide-band sources in related applications that include satellite and terrestrial radio communications and radio astronomy. The development of the present algorithm involved modification of a prior algorithm called SUMPLE and a predecessor called SIMPLE. SUMPLE was described in Algorithm for Aligning an Array of Receiving Radio Antennas (NPO-40574), NASA Tech Briefs Vol. 30, No. 4 (April 2006), page 54. To recapitulate: As used here, aligning signifies adjusting the delays and phases of the outputs from the various antennas so that their relatively weak replicas of the desired signal can be added coherently to increase the signal-to-noise ratio (SNR) for improved reception, as though one had a single larger antenna. Prior to the development of SUMPLE, it was common practice to effect alignment by means of a process that involves correlation of signals in pairs. SIMPLE is an example of an algorithm that effects such a process. SUMPLE also involves correlations, but the correlations are not performed in pairs. Instead, in a partly iterative process, each signal is appropriately weighted and then correlated with a composite signal equal to the sum of the other signals.
A novel image toggle tool for comparison of serial mammograms: automatic density normalization and alignment-development of the tool and initial experience.

PubMed

Honda, Satoshi; Tsunoda, Hiroko; Fukuda, Wataru; Saida, Yukihisa

2014-12-01

The purpose is to develop a new image toggle tool with automatic density normalization (ADN) and automatic alignment (AA) for comparing serial digital mammograms (DMGs). We developed an ADN and AA process to compare the images of serial DMGs. In image density normalization, a linear interpolation was applied by taking two points of high- and low-brightness areas. The alignment was calculated by determining the point of the greatest correlation while shifting the alignment between the current and prior images. These processes were performed on a PC with a 3.20-GHz Xeon processor and 8 GB of main memory. We selected 12 suspected breast cancer patients who had undergone screening DMGs in the past. Automatic processing was retrospectively performed on these images. Two radiologists subjectively evaluated them. The process of the developed algorithm took approximately 1 s per image. In our preliminary experience, two images could not be aligned approximately. When they were aligned, image toggling allowed detection of differences between examinations easily. We developed a new tool to facilitate comparative reading of DMGs on a mammography viewing system. Using this tool for toggling comparisons might improve the interpretation efficiency of serial DMGs.

Overcoming Sequence Misalignments with Weighted Structural Superposition

PubMed Central

Khazanov, Nickolay A.; Damm-Ganamet, Kelly L.; Quang, Daniel X.; Carlson, Heather A.

2012-01-01

An appropriate structural superposition identifies similarities and differences between homologous proteins that are not evident from sequence alignments alone. We have coupled our Gaussian-weighted RMSD (wRMSD) tool with a sequence aligner and seed extension (SE) algorithm to create a robust technique for overlaying structures and aligning sequences of homologous proteins (HwRMSD). HwRMSD overcomes errors in the initial sequence alignment that would normally propagate into a standard RMSD overlay. SE can generate a corrected sequence alignment from the improved structural superposition obtained by wRMSD. HwRMSD’s robust performance and its superiority over standard RMSD are demonstrated over a range of homologous proteins. Its better overlay results in corrected sequence alignments with good agreement to HOMSTRAD. Finally, HwRMSD is compared to established structural alignment methods: FATCAT, SSM, CE, and Dalilite. Most methods are comparable at placing residue pairs within 2 Å, but HwRMSD places many more residue pairs within 1 Å, providing a clear advantage. Such high accuracy is essential in drug design, where small distances can have a large impact on computational predictions. This level of accuracy is also needed to correct sequence alignments in an automated fashion, especially for omics-scale analysis. HwRMSD can align homologs with low sequence identity and large conformational differences, cases where both sequence-based and structural-based methods may fail. The HwRMSD pipeline overcomes the dependency of structural overlays on initial sequence pairing and removes the need to determine the best sequence-alignment method, substitution matrix, and gap parameters for each unique pair of homologs. PMID:22733542
Modified Mahalanobis Taguchi System for Imbalance Data Classification

PubMed Central

2017-01-01

The Mahalanobis Taguchi System (MTS) is considered one of the most promising binary classification algorithms to handle imbalance data. Unfortunately, MTS lacks a method for determining an efficient threshold for the binary classification. In this paper, a nonlinear optimization model is formulated based on minimizing the distance between MTS Receiver Operating Characteristics (ROC) curve and the theoretical optimal point named Modified Mahalanobis Taguchi System (MMTS). To validate the MMTS classification efficacy, it has been benchmarked with Support Vector Machines (SVMs), Naive Bayes (NB), Probabilistic Mahalanobis Taguchi Systems (PTM), Synthetic Minority Oversampling Technique (SMOTE), Adaptive Conformal Transformation (ACT), Kernel Boundary Alignment (KBA), Hidden Naive Bayes (HNB), and other improved Naive Bayes algorithms. MMTS outperforms the benchmarked algorithms especially when the imbalance ratio is greater than 400. A real life case study on manufacturing sector is used to demonstrate the applicability of the proposed model and to compare its performance with Mahalanobis Genetic Algorithm (MGA). PMID:28811820
Accurate and robust brain image alignment using boundary-based registration.

PubMed

Greve, Douglas N; Fischl, Bruce

2009-10-15

The fine spatial scales of the structures in the human brain represent an enormous challenge to the successful integration of information from different images for both within- and between-subject analysis. While many algorithms to register image pairs from the same subject exist, visual inspection shows that their accuracy and robustness to be suspect, particularly when there are strong intensity gradients and/or only part of the brain is imaged. This paper introduces a new algorithm called Boundary-Based Registration, or BBR. The novelty of BBR is that it treats the two images very differently. The reference image must be of sufficient resolution and quality to extract surfaces that separate tissue types. The input image is then aligned to the reference by maximizing the intensity gradient across tissue boundaries. Several lower quality images can be aligned through their alignment with the reference. Visual inspection and fMRI results show that BBR is more accurate than correlation ratio or normalized mutual information and is considerably more robust to even strong intensity inhomogeneities. BBR also excels at aligning partial-brain images to whole-brain images, a domain in which existing registration algorithms frequently fail. Even in the limit of registering a single slice, we show the BBR results to be robust and accurate.
Selection of optimal oligonucleotide probes for microarrays usingmultiple criteria, global alignment and parameter estimation.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Li, Xingyuan; He, Zhili; Zhou, Jizhong

2005-10-30

The oligonucleotide specificity for microarray hybridizationcan be predicted by its sequence identity to non-targets, continuousstretch to non-targets, and/or binding free energy to non-targets. Mostcurrently available programs only use one or two of these criteria, whichmay choose 'false' specific oligonucleotides or miss 'true' optimalprobes in a considerable proportion. We have developed a software tool,called CommOligo using new algorithms and all three criteria forselection of optimal oligonucleotide probes. A series of filters,including sequence identity, free energy, continuous stretch, GC content,self-annealing, distance to the 3'-untranslated region (3'-UTR) andmelting temperature (Tm), are used to check each possibleoligonucleotide. A sequence identity is calculated based onmore » gapped globalalignments. A traversal algorithm is used to generate alignments for freeenergy calculation. The optimal Tm interval is determined based on probecandidates that have passed all other filters. Final probes are pickedusing a combination of user-configurable piece-wise linear functions andan iterative process. The thresholds for identity, stretch and freeenergy filters are automatically determined from experimental data by anaccessory software tool, CommOligo_PE (CommOligo Parameter Estimator).The program was used to design probes for both whole-genome and highlyhomologous sequence data. CommOligo and CommOligo_PE are freely availableto academic users upon request.« less
Alignment and Calibration of Optical and Inertial Sensors Using Stellar Observations

DTIC Science & Technology

2007-01-01

Force, Department of Defense, or the U.S Government. References [1] R. G. Brown and P. Y. Hwang . Introduction to Ran- dom Signals and Applied Kalman ...and stellar observations using an extended Kalman filter algorithm. The approach is verified using simulation and experimental data, and con- clusions...an extended Kalman filter (EKF) algorithm (see [10], [11]) to recur- sively estimate camera alignment and calibration param- eters by measuring the
An additional reference axis improves femoral rotation alignment in image-free computer navigation assisted total knee arthroplasty.

PubMed

Inui, Hiroshi; Taketomi, Shuji; Nakamura, Kensuke; Sanada, Takaki; Tanaka, Sakae; Nakagawa, Takumi

2013-05-01

Few studies have demonstrated improvement in accuracy of rotational alignment using image-free navigation systems mainly due to the inconsistent registration of anatomical landmarks. We have used an image-free navigation for total knee arthroplasty, which adopts the average algorithm between two reference axes (transepicondylar axis and axis perpendicular to the Whiteside axis) for femoral component rotation control. We hypothesized that addition of another axis (condylar twisting axis measured on a preoperative radiograph) would improve the accuracy. One group using the average algorithm (double-axis group) was compared with the other group using another axis to confirm the accuracy of the average algorithm (triple-axis group). Femoral components were more accurately implanted for rotational alignment in the triple-axis group (ideal: triple-axis group 100%, double-axis group 82%, P<0.05). Copyright © 2013 Elsevier Inc. All rights reserved.
A new graph-based method for pairwise global network alignment

PubMed Central

Klau, Gunnar W

2009-01-01

Background In addition to component-based comparative approaches, network alignments provide the means to study conserved network topology such as common pathways and more complex network motifs. Yet, unlike in classical sequence alignment, the comparison of networks becomes computationally more challenging, as most meaningful assumptions instantly lead to NP-hard problems. Most previous algorithmic work on network alignments is heuristic in nature. Results We introduce the graph-based maximum structural matching formulation for pairwise global network alignment. We relate the formulation to previous work and prove NP-hardness of the problem. Based on the new formulation we build upon recent results in computational structural biology and present a novel Lagrangian relaxation approach that, in combination with a branch-and-bound method, computes provably optimal network alignments. The Lagrangian algorithm alone is a powerful heuristic method, which produces solutions that are often near-optimal and – unlike those computed by pure heuristics – come with a quality guarantee. Conclusion Computational experiments on the alignment of protein-protein interaction networks and on the classification of metabolic subnetworks demonstrate that the new method is reasonably fast and has advantages over pure heuristics. Our software tool is freely available as part of the LISA library. PMID:19208162
What's in your next-generation sequence data? An exploration of unmapped DNA and RNA sequence reads from the bovine reference individual

USDA-ARS?s Scientific Manuscript database

BACKGROUND: Next-generation sequencing projects commonly commence by aligning reads to a reference genome assembly. While improvements in alignment algorithms and computational hardware have greatly enhanced the efficiency and accuracy of alignments, a significant percentage of reads often remain u...
Single-particle cryo-EM using alignment by classification (ABC): the structure of Lumbricus terrestris haemoglobin.

PubMed

Afanasyev, Pavel; Seer-Linnemayr, Charlotte; Ravelli, Raimond B G; Matadeen, Rishi; De Carlo, Sacha; Alewijnse, Bart; Portugal, Rodrigo V; Pannu, Navraj S; Schatz, Michael; van Heel, Marin

2017-09-01

Single-particle cryogenic electron microscopy (cryo-EM) can now yield near-atomic resolution structures of biological complexes. However, the reference-based alignment algorithms commonly used in cryo-EM suffer from reference bias, limiting their applicability (also known as the 'Einstein from random noise' problem). Low-dose cryo-EM therefore requires robust and objective approaches to reveal the structural information contained in the extremely noisy data, especially when dealing with small structures. A reference-free pipeline is presented for obtaining near-atomic resolution three-dimensional reconstructions from heterogeneous ('four-dimensional') cryo-EM data sets. The methodologies integrated in this pipeline include a posteriori camera correction, movie-based full-data-set contrast transfer function determination, movie-alignment algorithms, (Fourier-space) multivariate statistical data compression and unsupervised classification, 'random-startup' three-dimensional reconstructions, four-dimensional structural refinements and Fourier shell correlation criteria for evaluating anisotropic resolution. The procedures exclusively use information emerging from the data set itself, without external 'starting models'. Euler-angle assignments are performed by angular reconstitution rather than by the inherently slower projection-matching approaches. The comprehensive 'ABC-4D' pipeline is based on the two-dimensional reference-free 'alignment by classification' (ABC) approach, where similar images in similar orientations are grouped by unsupervised classification. Some fundamental differences between X-ray crystallography versus single-particle cryo-EM data collection and data processing are discussed. The structure of the giant haemoglobin from Lumbricus terrestris at a global resolution of ∼3.8 Å is presented as an example of the use of the ABC-4D procedure.
Incremental Ontology-Based Extraction and Alignment in Semi-structured Documents

NASA Astrophysics Data System (ADS)

Thiam, Mouhamadou; Bennacer, Nacéra; Pernelle, Nathalie; Lô, Moussa

SHIRIis an ontology-based system for integration of semi-structured documents related to a specific domain. The system’s purpose is to allow users to access to relevant parts of documents as answers to their queries. SHIRI uses RDF/OWL for representation of resources and SPARQL for their querying. It relies on an automatic, unsupervised and ontology-driven approach for extraction, alignment and semantic annotation of tagged elements of documents. In this paper, we focus on the Extract-Align algorithm which exploits a set of named entity and term patterns to extract term candidates to be aligned with the ontology. It proceeds in an incremental manner in order to populate the ontology with terms describing instances of the domain and to reduce the access to extern resources such as Web. We experiment it on a HTML corpus related to call for papers in computer science and the results that we obtain are very promising. These results show how the incremental behaviour of Extract-Align algorithm enriches the ontology and the number of terms (or named entities) aligned directly with the ontology increases.
Joint Multi-Leaf Segmentation, Alignment, and Tracking for Fluorescence Plant Videos.

PubMed

Yin, Xi; Liu, Xiaoming; Chen, Jin; Kramer, David M

2018-06-01

This paper proposes a novel framework for fluorescence plant video processing. The plant research community is interested in the leaf-level photosynthetic analysis within a plant. A prerequisite for such analysis is to segment all leaves, estimate their structures, and track them over time. We identify this as a joint multi-leaf segmentation, alignment, and tracking problem. First, leaf segmentation and alignment are applied on the last frame of a plant video to find a number of well-aligned leaf candidates. Second, leaf tracking is applied on the remaining frames with leaf candidate transformation from the previous frame. We form two optimization problems with shared terms in their objective functions for leaf alignment and tracking respectively. A quantitative evaluation framework is formulated to evaluate the performance of our algorithm with four metrics. Two models are learned to predict the alignment accuracy and detect tracking failure respectively in order to provide guidance for subsequent plant biology analysis. The limitation of our algorithm is also studied. Experimental results show the effectiveness, efficiency, and robustness of the proposed method.
Transcription Factor Map Alignment of Promoter Regions

PubMed Central

Blanco, Enrique; Messeguer, Xavier; Smith, Temple F; Guigó, Roderic

2006-01-01

We address the problem of comparing and characterizing the promoter regions of genes with similar expression patterns. This remains a challenging problem in sequence analysis, because often the promoter regions of co-expressed genes do not show discernible sequence conservation. In our approach, thus, we have not directly compared the nucleotide sequence of promoters. Instead, we have obtained predictions of transcription factor binding sites, annotated the predicted sites with the labels of the corresponding binding factors, and aligned the resulting sequences of labels—to which we refer here as transcription factor maps (TF-maps). To obtain the global pairwise alignment of two TF-maps, we have adapted an algorithm initially developed to align restriction enzyme maps. We have optimized the parameters of the algorithm in a small, but well-curated, collection of human–mouse orthologous gene pairs. Results in this dataset, as well as in an independent much larger dataset from the CISRED database, indicate that TF-map alignments are able to uncover conserved regulatory elements, which cannot be detected by the typical sequence alignments. PMID:16733547
Image Registration for Stability Testing of MEMS

NASA Technical Reports Server (NTRS)

Memarsadeghi, Nargess; LeMoigne, Jacqueline; Blake, Peter N.; Morey, Peter A.; Landsman, Wayne B.; Chambers, Victor J.; Moseley, Samuel H.

2011-01-01

Image registration, or alignment of two or more images covering the same scenes or objects, is of great interest in many disciplines such as remote sensing, medical imaging. astronomy, and computer vision. In this paper, we introduce a new application of image registration algorithms. We demonstrate how through a wavelet based image registration algorithm, engineers can evaluate stability of Micro-Electro-Mechanical Systems (MEMS). In particular, we applied image registration algorithms to assess alignment stability of the MicroShutters Subsystem (MSS) of the Near Infrared Spectrograph (NIRSpec) instrument of the James Webb Space Telescope (JWST). This work introduces a new methodology for evaluating stability of MEMS devices to engineers as well as a new application of image registration algorithms to computer scientists.
Vertical decomposition with Genetic Algorithm for Multiple Sequence Alignment

PubMed Central

2011-01-01

Background Many Bioinformatics studies begin with a multiple sequence alignment as the foundation for their research. This is because multiple sequence alignment can be a useful technique for studying molecular evolution and analyzing sequence structure relationships. Results In this paper, we have proposed a Vertical Decomposition with Genetic Algorithm (VDGA) for Multiple Sequence Alignment (MSA). In VDGA, we divide the sequences vertically into two or more subsequences, and then solve them individually using a guide tree approach. Finally, we combine all the subsequences to generate a new multiple sequence alignment. This technique is applied on the solutions of the initial generation and of each child generation within VDGA. We have used two mechanisms to generate an initial population in this research: the first mechanism is to generate guide trees with randomly selected sequences and the second is shuffling the sequences inside such trees. Two different genetic operators have been implemented with VDGA. To test the performance of our algorithm, we have compared it with existing well-known methods, namely PRRP, CLUSTALX, DIALIGN, HMMT, SB_PIMA, ML_PIMA, MULTALIGN, and PILEUP8, and also other methods, based on Genetic Algorithms (GA), such as SAGA, MSA-GA and RBT-GA, by solving a number of benchmark datasets from BAliBase 2.0. Conclusions The experimental results showed that the VDGA with three vertical divisions was the most successful variant for most of the test cases in comparison to other divisions considered with VDGA. The experimental results also confirmed that VDGA outperformed the other methods considered in this research. PMID:21867510
Simulator for beam-based LHC collimator alignment

NASA Astrophysics Data System (ADS)

Valentino, Gianluca; Aßmann, Ralph; Redaelli, Stefano; Sammut, Nicholas

2014-02-01

In the CERN Large Hadron Collider, collimators need to be set up to form a multistage hierarchy to ensure efficient multiturn cleaning of halo particles. Automatic algorithms were introduced during the first run to reduce the beam time required for beam-based setup, improve the alignment accuracy, and reduce the risk of human errors. Simulating the alignment procedure would allow for off-line tests of alignment policies and algorithms. A simulator was developed based on a diffusion beam model to generate the characteristic beam loss signal spike and decay produced when a collimator jaw touches the beam, which is observed in a beam loss monitor (BLM). Empirical models derived from the available measurement data are used to simulate the steady-state beam loss and crosstalk between multiple BLMs. The simulator design is presented, together with simulation results and comparison to measurement data.
MANGO: a new approach to multiple sequence alignment.

PubMed

Zhang, Zefeng; Lin, Hao; Li, Ming

2007-01-01

Multiple sequence alignment is a classical and challenging task for biological sequence analysis. The problem is NP-hard. The full dynamic programming takes too much time. The progressive alignment heuristics adopted by most state of the art multiple sequence alignment programs suffer from the 'once a gap, always a gap' phenomenon. Is there a radically new way to do multiple sequence alignment? This paper introduces a novel and orthogonal multiple sequence alignment method, using multiple optimized spaced seeds and new algorithms to handle these seeds efficiently. Our new algorithm processes information of all sequences as a whole, avoiding problems caused by the popular progressive approaches. Because the optimized spaced seeds are provably significantly more sensitive than the consecutive k-mers, the new approach promises to be more accurate and reliable. To validate our new approach, we have implemented MANGO: Multiple Alignment with N Gapped Oligos. Experiments were carried out on large 16S RNA benchmarks showing that MANGO compares favorably, in both accuracy and speed, against state-of-art multiple sequence alignment methods, including ClustalW 1.83, MUSCLE 3.6, MAFFT 5.861, Prob-ConsRNA 1.11, Dialign 2.2.1, DIALIGN-T 0.2.1, T-Coffee 4.85, POA 2.0 and Kalign 2.0.
A Lossy Compression Technique Enabling Duplication-Aware Sequence Alignment

PubMed Central

Freschi, Valerio; Bogliolo, Alessandro

2012-01-01

In spite of the recognized importance of tandem duplications in genome evolution, commonly adopted sequence comparison algorithms do not take into account complex mutation events involving more than one residue at the time, since they are not compliant with the underlying assumption of statistical independence of adjacent residues. As a consequence, the presence of tandem repeats in sequences under comparison may impair the biological significance of the resulting alignment. Although solutions have been proposed, repeat-aware sequence alignment is still considered to be an open problem and new efficient and effective methods have been advocated. The present paper describes an alternative lossy compression scheme for genomic sequences which iteratively collapses repeats of increasing length. The resulting approximate representations do not contain tandem duplications, while retaining enough information for making their comparison even more significant than the edit distance between the original sequences. This allows us to exploit traditional alignment algorithms directly on the compressed sequences. Results confirm the validity of the proposed approach for the problem of duplication-aware sequence alignment. PMID:22518086
Validation of Splicing Events in Transcriptome Sequencing Data

PubMed Central

Kaisers, Wolfgang; Ptok, Johannes; Schwender, Holger; Schaal, Heiner

2017-01-01

Genomic alignments of sequenced cellular messenger RNA contain gapped alignments which are interpreted as consequence of intron removal. The resulting gap-sites, genomic locations of alignment gaps, are landmarks representing potential splice-sites. As alignment algorithms report gap-sites with a considerable false discovery rate, validations are required. We describe two quality scores, gap quality score (gqs) and weighted gap information score (wgis), developed for validation of putative splicing events: While gqs solely relies on alignment data wgis additionally considers information from the genomic sequence. FASTQ files obtained from 54 human dermal fibroblast samples were aligned against the human genome (GRCh38) using TopHat and STAR aligner. Statistical properties of gap-sites validated by gqs and wgis were evaluated by their sequence similarity to known exon-intron borders. Within the 54 samples, TopHat identifies 1,000,380 and STAR reports 6,487,577 gap-sites. Due to the lack of strand information, however, the percentage of identified GT-AG gap-sites is rather low. While gap-sites from TopHat contain ≈89% GT-AG, gap-sites from STAR only contain ≈42% GT-AG dinucleotide pairs in merged data from 54 fibroblast samples. Validation with gqs yields 156,251 gap-sites from TopHat alignments and 166,294 from STAR alignments. Validation with wgis yields 770,327 gap-sites from TopHat alignments and 1,065,596 from STAR alignments. Both alignment algorithms, TopHat and STAR, report gap-sites with considerable false discovery rate, which can drastically be reduced by validation with gqs and wgis. PMID:28545234
Evaluation of GMI and PMI diffeomorphic‐based demons algorithms for aligning PET and CT Images

PubMed Central

Yang, Juan; Zhang, You; Yin, Yong

2015-01-01

Fusion of anatomic information in computed tomography (CT) and functional information in F18‐FDG positron emission tomography (PET) is crucial for accurate differentiation of tumor from benign masses, designing radiotherapy treatment plan and staging of cancer. Although current PET and CT images can be acquired from combined F18‐FDG PET/CT scanner, the two acquisitions are scanned separately and take a long time, which may induce potential positional errors in global and local caused by respiratory motion or organ peristalsis. So registration (alignment) of whole‐body PET and CT images is a prerequisite for their meaningful fusion. The purpose of this study was to assess the performance of two multimodal registration algorithms for aligning PET and CT images. The proposed gradient of mutual information (GMI)‐based demons algorithm, which incorporated the GMI between two images as an external force to facilitate the alignment, was compared with the point‐wise mutual information (PMI) diffeomorphic‐based demons algorithm whose external force was modified by replacing the image intensity difference in diffeomorphic demons algorithm with the PMI to make it appropriate for multimodal image registration. Eight patients with esophageal cancer(s) were enrolled in this IRB‐approved study. Whole‐body PET and CT images were acquired from a combined F18‐FDG PET/CT scanner for each patient. The modified Hausdorff distance (dMH) was used to evaluate the registration accuracy of the two algorithms. Of all patients, the mean values and standard deviations (SDs) of dMH were 6.65 (± 1.90) voxels and 6.01 (± 1.90) after the GMI‐based demons and the PMI diffeomorphic‐based demons registration algorithms respectively. Preliminary results on oncological patients showed that the respiratory motion and organ peristalsis in PET/CT esophageal images could not be neglected, although a combined F18‐FDG PET/CT scanner was used for image acquisition. The PMI diffeomorphic‐based demons algorithm was more accurate than the GMI‐based demons algorithm in registering PET/CT esophageal images. PACS numbers: 87.57.nj, 87.57. Q‐, 87.57.uk PMID:26218993
Evaluation of GMI and PMI diffeomorphic-based demons algorithms for aligning PET and CT Images.

PubMed

Yang, Juan; Wang, Hongjun; Zhang, You; Yin, Yong

2015-07-08

Fusion of anatomic information in computed tomography (CT) and functional information in 18F-FDG positron emission tomography (PET) is crucial for accurate differentiation of tumor from benign masses, designing radiotherapy treatment plan and staging of cancer. Although current PET and CT images can be acquired from combined 18F-FDG PET/CT scanner, the two acquisitions are scanned separately and take a long time, which may induce potential positional errors in global and local caused by respiratory motion or organ peristalsis. So registration (alignment) of whole-body PET and CT images is a prerequisite for their meaningful fusion. The purpose of this study was to assess the performance of two multimodal registration algorithms for aligning PET and CT images. The proposed gradient of mutual information (GMI)-based demons algorithm, which incorporated the GMI between two images as an external force to facilitate the alignment, was compared with the point-wise mutual information (PMI) diffeomorphic-based demons algorithm whose external force was modified by replacing the image intensity difference in diffeomorphic demons algorithm with the PMI to make it appropriate for multimodal image registration. Eight patients with esophageal cancer(s) were enrolled in this IRB-approved study. Whole-body PET and CT images were acquired from a combined 18F-FDG PET/CT scanner for each patient. The modified Hausdorff distance (d(MH)) was used to evaluate the registration accuracy of the two algorithms. Of all patients, the mean values and standard deviations (SDs) of d(MH) were 6.65 (± 1.90) voxels and 6.01 (± 1.90) after the GMI-based demons and the PMI diffeomorphic-based demons registration algorithms respectively. Preliminary results on oncological patients showed that the respiratory motion and organ peristalsis in PET/CT esophageal images could not be neglected, although a combined 18F-FDG PET/CT scanner was used for image acquisition. The PMI diffeomorphic-based demons algorithm was more accurate than the GMI-based demons algorithm in registering PET/CT esophageal images.

Goddard Space Flight Center (GSFC) Flight Dynamics Facility (FDF) calibration of the Upper Atmosphere Research Satellite (UARS) sensors

NASA Technical Reports Server (NTRS)

Hashmall, J.; Garrick, J.

1993-01-01

Flight Dynamics Facility (FDF) responsibilities for calibration of Upper Atmosphere Research Satellite (UARS) sensors included alignment calibration of the fixed-head star trackers (FHST's) and the fine Sun sensor (FSS), determination of misalignments and scale factors for the inertial reference units (IRU's), determination of biases for the three-axis magnetometers (TAM's) and Earth sensor assemblies (ESA's), determination of gimbal misalignments of the Solar/Stellar Pointing Platform (SSPP), and field-of-view calibration for the FSS's mounted both on the Modular Attitude Control System (MACS) and on the SSPP. The calibrations, which used a combination of new and established algorithms, gave excellent results. Alignment calibration results markedly improved the accuracy of both ground and onboard Computer (OBC) attitude determination. SSPP calibration results allowed UARS to identify stars in the period immediately after yaw maneuvers, removing the delay required for the OBC to reacquire its fine pointing attitude mode. SSPP calibration considerably improved the pointing accuracy of the attached science instrument package. This paper presents a summary of the methods used and the results of all FDF UARS sensor calibration.
A novel fully automatic scheme for fiducial marker-based alignment in electron tomography.

PubMed

Han, Renmin; Wang, Liansan; Liu, Zhiyong; Sun, Fei; Zhang, Fa

2015-12-01

Although the topic of fiducial marker-based alignment in electron tomography (ET) has been widely discussed for decades, alignment without human intervention remains a difficult problem. Specifically, the emergence of subtomogram averaging has increased the demand for batch processing during tomographic reconstruction; fully automatic fiducial marker-based alignment is the main technique in this process. However, the lack of an accurate method for detecting and tracking fiducial markers precludes fully automatic alignment. In this paper, we present a novel, fully automatic alignment scheme for ET. Our scheme has two main contributions: First, we present a series of algorithms to ensure a high recognition rate and precise localization during the detection of fiducial markers. Our proposed solution reduces fiducial marker detection to a sampling and classification problem and further introduces an algorithm to solve the parameter dependence of marker diameter and marker number. Second, we propose a novel algorithm to solve the tracking of fiducial markers by reducing the tracking problem to an incomplete point set registration problem. Because a global optimization of a point set registration occurs, the result of our tracking is independent of the initial image position in the tilt series, allowing for the robust tracking of fiducial markers without pre-alignment. The experimental results indicate that our method can achieve an accurate tracking, almost identical to the current best one in IMOD with half automatic scheme. Furthermore, our scheme is fully automatic, depends on fewer parameters (only requires a gross value of the marker diameter) and does not require any manual interaction, providing the possibility of automatic batch processing of electron tomographic reconstruction. Copyright © 2015 Elsevier Inc. All rights reserved.
Automatic detection of left and right ventricles from CTA enables efficient alignment of anatomy with myocardial perfusion data.

PubMed

Piccinelli, Marina; Faber, Tracy L; Arepalli, Chesnal D; Appia, Vikram; Vinten-Johansen, Jakob; Schmarkey, Susan L; Folks, Russell D; Garcia, Ernest V; Yezzi, Anthony

2014-02-01

Accurate alignment between cardiac CT angiographic studies (CTA) and nuclear perfusion images is crucial for improved diagnosis of coronary artery disease. This study evaluated in an animal model the accuracy of a CTA fully automated biventricular segmentation algorithm, a necessary step for automatic and thus efficient PET/CT alignment. Twelve pigs with acute infarcts were imaged using Rb-82 PET and 64-slice CTA. Post-mortem myocardium mass measurements were obtained. Endocardial and epicardial myocardial boundaries were manually and automatically detected on the CTA and both segmentations used to perform PET/CT alignment. To assess the segmentation performance, image-based myocardial masses were compared to experimental data; the hand-traced profiles were used as a reference standard to assess the global and slice-by-slice robustness of the automated algorithm in extracting myocardium, LV, and RV. Mean distances between the automated and the manual 3D segmented surfaces were computed. Finally, differences in rotations and translations between the manual and automatic surfaces were estimated post-PET/CT alignment. The largest, smallest, and median distances between interactive and automatic surfaces averaged 1.2 ± 2.1, 0.2 ± 1.6, and 0.7 ± 1.9 mm. The average angular and translational differences in CT/PET alignments were 0.4°, -0.6°, and -2.3° about x, y, and z axes, and 1.8, -2.1, and 2.0 mm in x, y, and z directions. Our automatic myocardial boundary detection algorithm creates surfaces from CTA that are similar in accuracy and provide similar alignments with PET as those obtained from interactive tracing. Specific difficulties in a reliable segmentation of the apex and base regions will require further improvements in the automated technique.
Multiple DNA and protein sequence alignment on a workstation and a supercomputer.

PubMed

Tajima, K

1988-11-01

This paper describes a multiple alignment method using a workstation and supercomputer. The method is based on the alignment of a set of aligned sequences with the new sequence, and uses a recursive procedure of such alignment. The alignment is executed in a reasonable computation time on diverse levels from a workstation to a supercomputer, from the viewpoint of alignment results and computational speed by parallel processing. The application of the algorithm is illustrated by several examples of multiple alignment of 12 amino acid and DNA sequences of HIV (human immunodeficiency virus) env genes. Colour graphic programs on a workstation and parallel processing on a supercomputer are discussed.
MC64-ClustalWP2: A Highly-Parallel Hybrid Strategy to Align Multiple Sequences in Many-Core Architectures

PubMed Central

Díaz, David; Esteban, Francisco J.; Hernández, Pilar; Caballero, Juan Antonio; Guevara, Antonio

2014-01-01

We have developed the MC64-ClustalWP2 as a new implementation of the Clustal W algorithm, integrating a novel parallelization strategy and significantly increasing the performance when aligning long sequences in architectures with many cores. It must be stressed that in such a process, the detailed analysis of both the software and hardware features and peculiarities is of paramount importance to reveal key points to exploit and optimize the full potential of parallelism in many-core CPU systems. The new parallelization approach has focused into the most time-consuming stages of this algorithm. In particular, the so-called progressive alignment has drastically improved the performance, due to a fine-grained approach where the forward and backward loops were unrolled and parallelized. Another key approach has been the implementation of the new algorithm in a hybrid-computing system, integrating both an Intel Xeon multi-core CPU and a Tilera Tile64 many-core card. A comparison with other Clustal W implementations reveals the high-performance of the new algorithm and strategy in many-core CPU architectures, in a scenario where the sequences to align are relatively long (more than 10 kb) and, hence, a many-core GPU hardware cannot be used. Thus, the MC64-ClustalWP2 runs multiple alignments more than 18x than the original Clustal W algorithm, and more than 7x than the best x86 parallel implementation to date, being publicly available through a web service. Besides, these developments have been deployed in cost-effective personal computers and should be useful for life-science researchers, including the identification of identities and differences for mutation/polymorphism analyses, biodiversity and evolutionary studies and for the development of molecular markers for paternity testing, germplasm management and protection, to assist breeding, illegal traffic control, fraud prevention and for the protection of the intellectual property (identification/traceability), including the protected designation of origin, among other applications. PMID:24710354
Structural alignment of protein descriptors - a combinatorial model.

PubMed

Antczak, Maciej; Kasprzak, Marta; Lukasiak, Piotr; Blazewicz, Jacek

2016-09-17

Structural alignment of proteins is one of the most challenging problems in molecular biology. The tertiary structure of a protein strictly correlates with its function and computationally predicted structures are nowadays a main premise for understanding the latter. However, computationally derived 3D models often exhibit deviations from the native structure. A way to confirm a model is a comparison with other structures. The structural alignment of a pair of proteins can be defined with the use of a concept of protein descriptors. The protein descriptors are local substructures of protein molecules, which allow us to divide the original problem into a set of subproblems and, consequently, to propose a more efficient algorithmic solution. In the literature, one can find many applications of the descriptors concept that prove its usefulness for insight into protein 3D structures, but the proposed approaches are presented rather from the biological perspective than from the computational or algorithmic point of view. Efficient algorithms for identification and structural comparison of descriptors can become crucial components of methods for structural quality assessment as well as tertiary structure prediction. In this paper, we propose a new combinatorial model and new polynomial-time algorithms for the structural alignment of descriptors. The model is based on the maximum-size assignment problem, which we define here and prove that it can be solved in polynomial time. We demonstrate suitability of this approach by comparison with an exact backtracking algorithm. Besides a simplification coming from the combinatorial modeling, both on the conceptual and complexity level, we gain with this approach high quality of obtained results, in terms of 3D alignment accuracy and processing efficiency. All the proposed algorithms were developed and integrated in a computationally efficient tool descs-standalone, which allows the user to identify and structurally compare descriptors of biological molecules, such as proteins and RNAs. Both PDB (Protein Data Bank) and mmCIF (macromolecular Crystallographic Information File) formats are supported. The proposed tool is available as an open source project stored on GitHub ( https://github.com/mantczak/descs-standalone ).
An efficient direct method for image registration of flat objects

NASA Astrophysics Data System (ADS)

Nikolaev, Dmitry; Tihonkih, Dmitrii; Makovetskii, Artyom; Voronin, Sergei

2017-09-01

Image alignment of rigid surfaces is a rapidly developing area of research and has many practical applications. Alignment methods can be roughly divided into two types: feature-based methods and direct methods. Known SURF and SIFT algorithms are examples of the feature-based methods. Direct methods refer to those that exploit the pixel intensities without resorting to image features and image-based deformations are general direct method to align images of deformable objects in 3D space. Nevertheless, it is not good for the registration of images of 3D rigid objects since the underlying structure cannot be directly evaluated. In the article, we propose a model that is suitable for image alignment of rigid flat objects under various illumination models. The brightness consistency assumptions used for reconstruction of optimal geometrical transformation. Computer simulation results are provided to illustrate the performance of the proposed algorithm for computing of an accordance between pixels of two images.
Viewing Angle Classification of Cryo-Electron Microscopy Images Using Eigenvectors

PubMed Central

Singer, A.; Zhao, Z.; Shkolnisky, Y.; Hadani, R.

2012-01-01

The cryo-electron microscopy (cryo-EM) reconstruction problem is to find the three-dimensional structure of a macromolecule given noisy versions of its two-dimensional projection images at unknown random directions. We introduce a new algorithm for identifying noisy cryo-EM images of nearby viewing angles. This identification is an important first step in three-dimensional structure determination of macromolecules from cryo-EM, because once identified, these images can be rotationally aligned and averaged to produce “class averages” of better quality. The main advantage of our algorithm is its extreme robustness to noise. The algorithm is also very efficient in terms of running time and memory requirements, because it is based on the computation of the top few eigenvectors of a specially designed sparse Hermitian matrix. These advantages are demonstrated in numerous numerical experiments. PMID:22506089
Precise 3D Lug Pose Detection Sensor for Automatic Robot Welding Using a Structured-Light Vision System

PubMed Central

Park, Jae Byung; Lee, Seung Hun; Lee, Il Jae

2009-01-01

In this study, we propose a precise 3D lug pose detection sensor for automatic robot welding of a lug to a huge steel plate used in shipbuilding, where the lug is a handle to carry the huge steel plate. The proposed sensor consists of a camera and four laser line diodes, and its design parameters are determined by analyzing its detectable range and resolution. For the lug pose acquisition, four laser lines are projected on both lug and plate, and the projected lines are detected by the camera. For robust detection of the projected lines against the illumination change, the vertical threshold, thinning, Hough transform and separated Hough transform algorithms are successively applied to the camera image. The lug pose acquisition is carried out by two stages: the top view alignment and the side view alignment. The top view alignment is to detect the coarse lug pose relatively far from the lug, and the side view alignment is to detect the fine lug pose close to the lug. After the top view alignment, the robot is controlled to move close to the side of the lug for the side view alignment. By this way, the precise 3D lug pose can be obtained. Finally, experiments with the sensor prototype are carried out to verify the feasibility and effectiveness of the proposed sensor. PMID:22400007
On decomposing stimulus and response waveforms in event-related potentials recordings.

PubMed

Yin, Gang; Zhang, Jun

2011-06-01

Event-related potentials (ERPs) reflect the brain activities related to specific behavioral events, and are obtained by averaging across many trial repetitions with individual trials aligned to the onset of a specific event, e.g., the onset of stimulus (s-aligned) or the onset of the behavioral response (r-aligned). However, the s-aligned and r-aligned ERP waveforms do not purely reflect, respectively, underlying stimulus (S-) or response (R-) component waveform, due to their cross-contaminations in the recorded ERP waveforms. Zhang [J. Neurosci. Methods, 80, pp. 49-63, 1998] proposed an algorithm to recover the pure S-component waveform and the pure R-component waveform from the s-aligned and r-aligned ERP average waveforms-however, due to the nature of this inverse problem, a direct solution is sensitive to noise that disproportionally affects low-frequency components, hindering the practical implementation of this algorithm. Here, we apply the Wiener deconvolution technique to deal with noise in input data, and investigate a Tikhonov regularization approach to obtain a stable solution that is robust against variances in the sampling of reaction-time distribution (when number of trials is low). Our method is demonstrated using data from a Go/NoGo experiment about image classification and recognition.
Indoor positioning algorithm combined with angular vibration compensation and the trust region technique based on received signal strength-visible light communication

NASA Astrophysics Data System (ADS)

Wang, Jin; Li, Haoxu; Zhang, Xiaofeng; Wu, Rangzhong

2017-05-01

Indoor positioning using visible light communication has become a topic of intensive research in recent years. Because the normal of the receiver always deviates from that of the transmitter in application, the positioning systems which require that the normal of the receiver be aligned with that of the transmitter have large positioning errors. Some algorithms take the angular vibrations into account; nevertheless, these positioning algorithms cannot meet the requirement of high accuracy or low complexity. A visible light positioning algorithm combined with angular vibration compensation is proposed. The angle information from the accelerometer or other angle acquisition devices is used to calculate the angle of incidence even when the receiver is not horizontal. Meanwhile, a received signal strength technique with high accuracy is employed to determine the location. Moreover, an eight-light-emitting-diode (LED) system model is provided to improve the accuracy. The simulation results show that the proposed system can achieve a low positioning error with low complexity, and the eight-LED system exhibits improved performance. Furthermore, trust region-based positioning is proposed to determine three-dimensional locations and achieves high accuracy in both the horizontal and the vertical components.
B-cell Ligand Processing Pathways Detected by Large-scale Comparative Analysis

PubMed Central

Towfic, Fadi; Gupta, Shakti; Honavar, Vasant; Subramaniam, Shankar

2012-01-01

The initiation of B-cell ligand recognition is a critical step for the generation of an immune response against foreign bodies. We sought to identify the biochemical pathways involved in the B-cell ligand recognition cascade and sets of ligands that trigger similar immunological responses. We utilized several comparative approaches to analyze the gene coexpression networks generated from a set of microarray experiments spanning 33 different ligands. First, we compared the degree distributions of the generated networks. Second, we utilized a pairwise network alignment algorithm, BiNA, to align the networks based on the hubs in the networks. Third, we aligned the networks based on a set of KEGG pathways. We summarized our results by constructing a consensus hierarchy of pathways that are involved in B cell ligand recognition. The resulting pathways were further validated through literature for their common physiological responses. Collectively, the results based on our comparative analyses of degree distributions, alignment of hubs, and alignment based on KEGG pathways provide a basis for molecular characterization of the immune response states of B-cells and demonstrate the power of comparative approaches (e.g., gene coexpression network alignment algorithms) in elucidating biochemical pathways involved in complex signaling events in cells. PMID:22917187
Application of fast Fourier transform cross-correlation and mass spectrometry data for accurate alignment of chromatograms.

PubMed

Zheng, Yi-Bao; Zhang, Zhi-Min; Liang, Yi-Zeng; Zhan, De-Jian; Huang, Jian-Hua; Yun, Yong-Huan; Xie, Hua-Lin

2013-04-19

Chromatography has been established as one of the most important analytical methods in the modern analytical laboratory. However, preprocessing of the chromatograms, especially peak alignment, is usually a time-consuming task prior to extracting useful information from the datasets because of the small unavoidable differences in the experimental conditions caused by minor changes and drift. Most of the alignment algorithms are performed on reduced datasets using only the detected peaks in the chromatograms, which means a loss of data and introduces the problem of extraction of peak data from the chromatographic profiles. These disadvantages can be overcome by using the full chromatographic information that is generated from hyphenated chromatographic instruments. A new alignment algorithm called CAMS (Chromatogram Alignment via Mass Spectra) is present here to correct the retention time shifts among chromatograms accurately and rapidly. In this report, peaks of each chromatogram were detected based on Continuous Wavelet Transform (CWT) with Haar wavelet and were aligned against the reference chromatogram via the correlation of mass spectra. The aligning procedure was accelerated by Fast Fourier Transform cross correlation (FFT cross correlation). This approach has been compared with several well-known alignment methods on real chromatographic datasets, which demonstrates that CAMS can preserve the shape of peaks and achieve a high quality alignment result. Furthermore, the CAMS method was implemented in the Matlab language and available as an open source package at http://www.github.com/matchcoder/CAMS. Copyright © 2013. Published by Elsevier B.V.
A novel surface registration algorithm with biomedical modeling applications.

PubMed

Huang, Heng; Shen, Li; Zhang, Rong; Makedon, Fillia; Saykin, Andrew; Pearlman, Justin

2007-07-01

In this paper, we propose a novel surface matching algorithm for arbitrarily shaped but simply connected 3-D objects. The spherical harmonic (SPHARM) method is used to describe these 3-D objects, and a novel surface registration approach is presented. The proposed technique is applied to various applications of medical image analysis. The results are compared with those using the traditional method, in which the first-order ellipsoid is used for establishing surface correspondence and aligning objects. In these applications, our surface alignment method is demonstrated to be more accurate and flexible than the traditional approach. This is due in large part to the fact that a new surface parameterization is generated by a shortcut that employs a useful rotational property of spherical harmonic basis functions for a fast implementation. In order to achieve a suitable computational speed for practical applications, we propose a fast alignment algorithm that improves computational complexity of the new surface registration method from O(n3) to O(n2).
Initial Navigation Alignment of Optical Instruments on GOES-R

NASA Technical Reports Server (NTRS)

Isaacson, Peter J.; DeLuccia, Frank J.; Reth, Alan D.; Igli, David A.; Carter, Delano R.

2016-01-01

Post-launch alignment errors for the Advanced Baseline Imager (ABI) and Geospatial Lightning Mapper (GLM) on GOES-R may be too large for the image navigation and registration (INR) processing algorithms to function without an initial adjustment to calibration parameters. We present an approach that leverages a combination of user-selected image-to-image tie points and image correlation algorithms to estimate this initial launch-induced offset and calculate adjustments to the Line of Sight Motion Compensation (LMC) parameters. We also present an approach to generate synthetic test images, to which shifts and rotations of known magnitude are applied. Results of applying the initial alignment tools to a subset of these synthetic test images are presented. The results for both ABI and GLM are within the specifications established for these tools, and indicate that application of these tools during the post-launch test (PLT) phase of GOES-R operations will enable the automated INR algorithms for both instruments to function as intended.
Design and implementation of a hybrid MPI-CUDA model for the Smith-Waterman algorithm.

PubMed

Khaled, Heba; Faheem, Hossam El Deen Mostafa; El Gohary, Rania

2015-01-01

This paper provides a novel hybrid model for solving the multiple pair-wise sequence alignment problem combining message passing interface and CUDA, the parallel computing platform and programming model invented by NVIDIA. The proposed model targets homogeneous cluster nodes equipped with similar Graphical Processing Unit (GPU) cards. The model consists of the Master Node Dispatcher (MND) and the Worker GPU Nodes (WGN). The MND distributes the workload among the cluster working nodes and then aggregates the results. The WGN performs the multiple pair-wise sequence alignments using the Smith-Waterman algorithm. We also propose a modified implementation to the Smith-Waterman algorithm based on computing the alignment matrices row-wise. The experimental results demonstrate a considerable reduction in the running time by increasing the number of the working GPU nodes. The proposed model achieved a performance of about 12 Giga cell updates per second when we tested against the SWISS-PROT protein knowledge base running on four nodes.
A Systolic Array-Based FPGA Parallel Architecture for the BLAST Algorithm

PubMed Central

Guo, Xinyu; Wang, Hong; Devabhaktuni, Vijay

2012-01-01

A design of systolic array-based Field Programmable Gate Array (FPGA) parallel architecture for Basic Local Alignment Search Tool (BLAST) Algorithm is proposed. BLAST is a heuristic biological sequence alignment algorithm which has been used by bioinformatics experts. In contrast to other designs that detect at most one hit in one-clock-cycle, our design applies a Multiple Hits Detection Module which is a pipelining systolic array to search multiple hits in a single-clock-cycle. Further, we designed a Hits Combination Block which combines overlapping hits from systolic array into one hit. These implementations completed the first and second step of BLAST architecture and achieved significant speedup comparing with previously published architectures. PMID:25969747
Single-particle cryo-EM using alignment by classification (ABC): the structure of Lumbricus terrestris haemoglobin

PubMed Central

Seer-Linnemayr, Charlotte; Ravelli, Raimond B. G.; Matadeen, Rishi; De Carlo, Sacha; Alewijnse, Bart; Portugal, Rodrigo V.; Pannu, Navraj S.; Schatz, Michael; van Heel, Marin

2017-01-01

Single-particle cryogenic electron microscopy (cryo-EM) can now yield near-atomic resolution structures of biological complexes. However, the reference-based alignment algorithms commonly used in cryo-EM suffer from reference bias, limiting their applicability (also known as the ‘Einstein from random noise’ problem). Low-dose cryo-EM therefore requires robust and objective approaches to reveal the structural information contained in the extremely noisy data, especially when dealing with small structures. A reference-free pipeline is presented for obtaining near-atomic resolution three-dimensional reconstructions from heterogeneous (‘four-dimensional’) cryo-EM data sets. The methodologies integrated in this pipeline include a posteriori camera correction, movie-based full-data-set contrast transfer function determination, movie-alignment algorithms, (Fourier-space) multivariate statistical data compression and unsupervised classification, ‘random-startup’ three-dimensional reconstructions, four-dimensional structural refinements and Fourier shell correlation criteria for evaluating anisotropic resolution. The procedures exclusively use information emerging from the data set itself, without external ‘starting models’. Euler-angle assignments are performed by angular reconstitution rather than by the inherently slower projection-matching approaches. The comprehensive ‘ABC-4D’ pipeline is based on the two-dimensional reference-free ‘alignment by classification’ (ABC) approach, where similar images in similar orientations are grouped by unsupervised classification. Some fundamental differences between X-ray crystallography versus single-particle cryo-EM data collection and data processing are discussed. The structure of the giant haemoglobin from Lumbricus terrestris at a global resolution of ∼3.8 Å is presented as an example of the use of the ABC-4D procedure. PMID:28989723
Vertebra identification using template matching modelmp and K-means clustering.

PubMed

Larhmam, Mohamed Amine; Benjelloun, Mohammed; Mahmoudi, Saïd

2014-03-01

Accurate vertebra detection and segmentation are essential steps for automating the diagnosis of spinal disorders. This study is dedicated to vertebra alignment measurement, the first step in a computer-aided diagnosis tool for cervical spine trauma. Automated vertebral segment alignment determination is a challenging task due to low contrast imaging and noise. A software tool for segmenting vertebrae and detecting subluxations has clinical significance. A robust method was developed and tested for cervical vertebra identification and segmentation that extracts parameters used for vertebra alignment measurement. Our contribution involves a novel combination of a template matching method and an unsupervised clustering algorithm. In this method, we build a geometric vertebra mean model. To achieve vertebra detection, manual selection of the region of interest is performed initially on the input image. Subsequent preprocessing is done to enhance image contrast and detect edges. Candidate vertebra localization is then carried out by using a modified generalized Hough transform (GHT). Next, an adapted cost function is used to compute local voted centers and filter boundary data. Thereafter, a K-means clustering algorithm is applied to obtain clusters distribution corresponding to the targeted vertebrae. These clusters are combined with the vote parameters to detect vertebra centers. Rigid segmentation is then carried out by using GHT parameters. Finally, cervical spine curves are extracted to measure vertebra alignment. The proposed approach was successfully applied to a set of 66 high-resolution X-ray images. Robust detection was achieved in 97.5 % of the 330 tested cervical vertebrae. An automated vertebral identification method was developed and demonstrated to be robust to noise and occlusion. This work presents a first step toward an automated computer-aided diagnosis system for cervical spine trauma detection.
Algorithm for Aligning an Array of Receiving Radio Antennas

NASA Technical Reports Server (NTRS)

Rogstad, David

2006-01-01

A digital-signal-processing algorithm (somewhat arbitrarily) called SUMPLE has been devised as a means of aligning the outputs of multiple receiving radio antennas in a large array for the purpose of receiving a weak signal transmitted by a single distant source. As used here, aligning signifies adjusting the delays and phases of the outputs from the various antennas so that their relatively weak replicas of the desired signal can be added coherently to increase the signal-to-noise ratio (SNR) for improved reception, as though one had a single larger antenna. The method was devised to enhance spacecraft-tracking and telemetry operations in NASA's Deep Space Network (DSN); the method could also be useful in such other applications as both satellite and terrestrial radio communications and radio astronomy. Heretofore, most commonly, alignment has been effected by a process that involves correlation of signals in pairs. This approach necessitates the use of a large amount of hardware most notably, the N(N - 1)/2 correlators needed to process signals from all possible pairs of N antennas. Moreover, because the incoming signals typically have low SNRs, the delay and phase adjustments are poorly determined from the pairwise correlations. SUMPLE also involves correlations, but the correlations are not performed in pairs. Instead, in a partly iterative process, each signal is appropriately weighted and then correlated with a composite signal equal to the sum of the other signals (see Figure 1). One benefit of this approach is that only N correlators are needed; in an array of N much greater than 1 antennas, this results in a significant reduction of the amount of hardware. Another benefit is that once the array achieves coherence, the correlation SNR is N - 1 times that of a pair of antennas.

LS-align: an atom-level, flexible ligand structural alignment algorithm for high-throughput virtual screening.

PubMed

Hu, Jun; Liu, Zi; Yu, Dong-Jun; Zhang, Yang

2018-02-15

Sequence-order independent structural comparison, also called structural alignment, of small ligand molecules is often needed for computer-aided virtual drug screening. Although many ligand structure alignment programs are proposed, most of them build the alignments based on rigid-body shape comparison which cannot provide atom-specific alignment information nor allow structural variation; both abilities are critical to efficient high-throughput virtual screening. We propose a novel ligand comparison algorithm, LS-align, to generate fast and accurate atom-level structural alignments of ligand molecules, through an iterative heuristic search of the target function that combines inter-atom distance with mass and chemical bond comparisons. LS-align contains two modules of Rigid-LS-align and Flexi-LS-align, designed for rigid-body and flexible alignments, respectively, where a ligand-size independent, statistics-based scoring function is developed to evaluate the similarity of ligand molecules relative to random ligand pairs. Large-scale benchmark tests are performed on prioritizing chemical ligands of 102 protein targets involving 1,415,871 candidate compounds from the DUD-E (Database of Useful Decoys: Enhanced) database, where LS-align achieves an average enrichment factor (EF) of 22.0 at the 1% cutoff and the AUC score of 0.75, which are significantly higher than other state-of-the-art methods. Detailed data analyses show that the advanced performance is mainly attributed to the design of the target function that combines structural and chemical information to enhance the sensitivity of recognizing subtle difference of ligand molecules and the introduces of structural flexibility that help capture the conformational changes induced by the ligand-receptor binding interactions. These data demonstrate a new avenue to improve the virtual screening efficiency through the development of sensitive ligand structural alignments. http://zhanglab.ccmb.med.umich.edu/LS-align/. njyudj@njust.edu.cn or zhng@umich.edu. Supplementary data are available at Bioinformatics online.
FASMA: a service to format and analyze sequences in multiple alignments.

PubMed

Costantini, Susan; Colonna, Giovanni; Facchiano, Angelo M

2007-12-01

Multiple sequence alignments are successfully applied in many studies for under- standing the structural and functional relations among single nucleic acids and protein sequences as well as whole families. Because of the rapid growth of sequence databases, multiple sequence alignments can often be very large and difficult to visualize and analyze. We offer a new service aimed to visualize and analyze the multiple alignments obtained with different external algorithms, with new features useful for the comparison of the aligned sequences as well as for the creation of a final image of the alignment. The service is named FASMA and is available at http://bioinformatica.isa.cnr.it/FASMA/.
PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences.

PubMed

Mirarab, Siavash; Nguyen, Nam; Guo, Sheng; Wang, Li-San; Kim, Junhyong; Warnow, Tandy

2015-05-01

We introduce PASTA, a new multiple sequence alignment algorithm. PASTA uses a new technique to produce an alignment given a guide tree that enables it to be both highly scalable and very accurate. We present a study on biological and simulated data with up to 200,000 sequences, showing that PASTA produces highly accurate alignments, improving on the accuracy and scalability of the leading alignment methods (including SATé). We also show that trees estimated on PASTA alignments are highly accurate--slightly better than SATé trees, but with substantial improvements relative to other methods. Finally, PASTA is faster than SATé, highly parallelizable, and requires relatively little memory.
Parallel-SymD: A Parallel Approach to Detect Internal Symmetry in Protein Domains.

PubMed

Jha, Ashwani; Flurchick, K M; Bikdash, Marwan; Kc, Dukka B

2016-01-01

Internally symmetric proteins are proteins that have a symmetrical structure in their monomeric single-chain form. Around 10-15% of the protein domains can be regarded as having some sort of internal symmetry. In this regard, we previously published SymD (symmetry detection), an algorithm that determines whether a given protein structure has internal symmetry by attempting to align the protein to its own copy after the copy is circularly permuted by all possible numbers of residues. SymD has proven to be a useful algorithm to detect symmetry. In this paper, we present a new parallelized algorithm called Parallel-SymD for detecting symmetry of proteins on clusters of computers. The achieved speedup of the new Parallel-SymD algorithm scales well with the number of computing processors. Scaling is better for proteins with a larger number of residues. For a protein of 509 residues, a speedup of 63 was achieved on a parallel system with 100 processors.
Parallel-SymD: A Parallel Approach to Detect Internal Symmetry in Protein Domains

PubMed Central

Jha, Ashwani; Flurchick, K. M.; Bikdash, Marwan

2016-01-01

Internally symmetric proteins are proteins that have a symmetrical structure in their monomeric single-chain form. Around 10–15% of the protein domains can be regarded as having some sort of internal symmetry. In this regard, we previously published SymD (symmetry detection), an algorithm that determines whether a given protein structure has internal symmetry by attempting to align the protein to its own copy after the copy is circularly permuted by all possible numbers of residues. SymD has proven to be a useful algorithm to detect symmetry. In this paper, we present a new parallelized algorithm called Parallel-SymD for detecting symmetry of proteins on clusters of computers. The achieved speedup of the new Parallel-SymD algorithm scales well with the number of computing processors. Scaling is better for proteins with a larger number of residues. For a protein of 509 residues, a speedup of 63 was achieved on a parallel system with 100 processors. PMID:27747230
Sequence comparison alignment-free approach based on suffix tree and L-words frequency.

PubMed

Soares, Inês; Goios, Ana; Amorim, António

2012-01-01

The vast majority of methods available for sequence comparison rely on a first sequence alignment step, which requires a number of assumptions on evolutionary history and is sometimes very difficult or impossible to perform due to the abundance of gaps (insertions/deletions). In such cases, an alternative alignment-free method would prove valuable. Our method starts by a computation of a generalized suffix tree of all sequences, which is completed in linear time. Using this tree, the frequency of all possible words with a preset length L-L-words--in each sequence is rapidly calculated. Based on the L-words frequency profile of each sequence, a pairwise standard Euclidean distance is then computed producing a symmetric genetic distance matrix, which can be used to generate a neighbor joining dendrogram or a multidimensional scaling graph. We present an improvement to word counting alignment-free approaches for sequence comparison, by determining a single optimal word length and combining suffix tree structures to the word counting tasks. Our approach is, thus, a fast and simple application that proved to be efficient and powerful when applied to mitochondrial genomes. The algorithm was implemented in Python language and is freely available on the web.
Determination of stores pointing error due to wing flexibility under flight load

NASA Technical Reports Server (NTRS)

Lokos, William A.; Bahm, Catherine M.; Heinle, Robert A.

1995-01-01

The in-flight elastic wing twist of a fighter-type aircraft was studied to provide for an improved on-board real-time computed prediction of pointing variations of three wing store stations. This is an important capability to correct sensor pod alignment variation or to establish initial conditions of iron bombs or smart weapons prior to release. The original algorithm was based upon coarse measurements. The electro-optical Flight Deflection Measurement System measured the deformed wing shape in flight under maneuver loads to provide a higher resolution database from which an improved twist prediction algorithm could be developed. The FDMS produced excellent repeatable data. In addition, a NASTRAN finite-element analysis was performed to provide additional elastic deformation data. The FDMS data combined with the NASTRAN analysis indicated that an improved prediction algorithm could be derived by using a different set of aircraft parameters, namely normal acceleration, stores configuration, Mach number, and gross weight.
A new method for recognizing quadric surfaces from range data and its application to telerobotics and automation, final phase

NASA Technical Reports Server (NTRS)

Mielke, Roland; Dcunha, Ivan; Alvertos, Nicolas

1994-01-01

In the final phase of the proposed research a complete top to down three dimensional object recognition scheme has been proposed. The various three dimensional objects included spheres, cones, cylinders, ellipsoids, paraboloids, and hyperboloids. Utilizing a newly developed blob determination technique, a given range scene with several non-cluttered quadric surfaces is segmented. Next, using the earlier (phase 1) developed alignment scheme, each of the segmented objects are then aligned in a desired coordinate system. For each of the quadric surfaces based upon their intersections with certain pre-determined planes, a set of distinct features (curves) are obtained. A database with entities such as the equations of the planes and angular bounds of these planes has been created for each of the quadric surfaces. Real range data of spheres, cones, cylinders, and parallelpipeds have been utilized for the recognition process. The developed algorithm gave excellent results for the real data as well as for several sets of simulated range data.
Assessment of the Clinical Relevance of BRCA2 Missense Variants by Functional and Computational Approaches.

PubMed

Guidugli, Lucia; Shimelis, Hermela; Masica, David L; Pankratz, Vernon S; Lipton, Gary B; Singh, Namit; Hu, Chunling; Monteiro, Alvaro N A; Lindor, Noralane M; Goldgar, David E; Karchin, Rachel; Iversen, Edwin S; Couch, Fergus J

2018-01-17

Many variants of uncertain significance (VUS) have been identified in BRCA2 through clinical genetic testing. VUS pose a significant clinical challenge because the contribution of these variants to cancer risk has not been determined. We conducted a comprehensive assessment of VUS in the BRCA2 C-terminal DNA binding domain (DBD) by using a validated functional assay of BRCA2 homologous recombination (HR) DNA-repair activity and defined a classifier of variant pathogenicity. Among 139 variants evaluated, 54 had ≥99% probability of pathogenicity, and 73 had ≥95% probability of neutrality. Functional assay results were compared with predictions of variant pathogenicity from the Align-GVGD protein-sequence-based prediction algorithm, which has been used for variant classification. Relative to the HR assay, Align-GVGD significantly (p < 0.05) over-predicted pathogenic variants. We subsequently combined functional and Align-GVGD prediction results in a Bayesian hierarchical model (VarCall) to estimate the overall probability of pathogenicity for each VUS. In addition, to predict the effects of all other BRCA2 DBD variants and to prioritize variants for functional studies, we used the endoPhenotype-Optimized Sequence Ensemble (ePOSE) algorithm to train classifiers for BRCA2 variants by using data from the HR functional assay. Together, the results show that systematic functional assays in combination with in silico predictors of pathogenicity provide robust tools for clinical annotation of BRCA2 VUS. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Phylogenomic analyses data of the avian phylogenomics project.

PubMed

Jarvis, Erich D; Mirarab, Siavash; Aberer, Andre J; Li, Bo; Houde, Peter; Li, Cai; Ho, Simon Y W; Faircloth, Brant C; Nabholz, Benoit; Howard, Jason T; Suh, Alexander; Weber, Claudia C; da Fonseca, Rute R; Alfaro-Núñez, Alonzo; Narula, Nitish; Liu, Liang; Burt, Dave; Ellegren, Hans; Edwards, Scott V; Stamatakis, Alexandros; Mindell, David P; Cracraft, Joel; Braun, Edward L; Warnow, Tandy; Jun, Wang; Gilbert, M Thomas Pius; Zhang, Guojie

2015-01-01

Determining the evolutionary relationships among the major lineages of extant birds has been one of the biggest challenges in systematic biology. To address this challenge, we assembled or collected the genomes of 48 avian species spanning most orders of birds, including all Neognathae and two of the five Palaeognathae orders. We used these genomes to construct a genome-scale avian phylogenetic tree and perform comparative genomic analyses. Here we present the datasets associated with the phylogenomic analyses, which include sequence alignment files consisting of nucleotides, amino acids, indels, and transposable elements, as well as tree files containing gene trees and species trees. Inferring an accurate phylogeny required generating: 1) A well annotated data set across species based on genome synteny; 2) Alignments with unaligned or incorrectly overaligned sequences filtered out; and 3) Diverse data sets, including genes and their inferred trees, indels, and transposable elements. Our total evidence nucleotide tree (TENT) data set (consisting of exons, introns, and UCEs) gave what we consider our most reliable species tree when using the concatenation-based ExaML algorithm or when using statistical binning with the coalescence-based MP-EST algorithm (which we refer to as MP-EST*). Other data sets, such as the coding sequence of some exons, revealed other properties of genome evolution, namely convergence. The Avian Phylogenomics Project is the largest vertebrate phylogenomics project to date that we are aware of. The sequence, alignment, and tree data are expected to accelerate analyses in phylogenomics and other related areas.
Mango: multiple alignment with N gapped oligos.

PubMed

Zhang, Zefeng; Lin, Hao; Li, Ming

2008-06-01

Multiple sequence alignment is a classical and challenging task. The problem is NP-hard. The full dynamic programming takes too much time. The progressive alignment heuristics adopted by most state-of-the-art works suffer from the "once a gap, always a gap" phenomenon. Is there a radically new way to do multiple sequence alignment? In this paper, we introduce a novel and orthogonal multiple sequence alignment method, using both multiple optimized spaced seeds and new algorithms to handle these seeds efficiently. Our new algorithm processes information of all sequences as a whole and tries to build the alignment vertically, avoiding problems caused by the popular progressive approaches. Because the optimized spaced seeds have proved significantly more sensitive than the consecutive k-mers, the new approach promises to be more accurate and reliable. To validate our new approach, we have implemented MANGO: Multiple Alignment with N Gapped Oligos. Experiments were carried out on large 16S RNA benchmarks, showing that MANGO compares favorably, in both accuracy and speed, against state-of-the-art multiple sequence alignment methods, including ClustalW 1.83, MUSCLE 3.6, MAFFT 5.861, ProbConsRNA 1.11, Dialign 2.2.1, DIALIGN-T 0.2.1, T-Coffee 4.85, POA 2.0, and Kalign 2.0. We have further demonstrated the scalability of MANGO on very large datasets of repeat elements. MANGO can be downloaded at http://www.bioinfo.org.cn/mango/ and is free for academic usage.
A Novel Partial Sequence Alignment Tool for Finding Large Deletions

PubMed Central

Aruk, Taner; Ustek, Duran; Kursun, Olcay

2012-01-01

Finding large deletions in genome sequences has become increasingly more useful in bioinformatics, such as in clinical research and diagnosis. Although there are a number of publically available next generation sequencing mapping and sequence alignment programs, these software packages do not correctly align fragments containing deletions larger than one kb. We present a fast alignment software package, BinaryPartialAlign, that can be used by wet lab scientists to find long structural variations in their experiments. For BinaryPartialAlign, we make use of the Smith-Waterman (SW) algorithm with a binary-search-based approach for alignment with large gaps that we called partial alignment. BinaryPartialAlign implementation is compared with other straight-forward applications of SW. Simulation results on mtDNA fragments demonstrate the effectiveness (runtime and accuracy) of the proposed method. PMID:22566777
Long Read Alignment with Parallel MapReduce Cloud Platform

PubMed Central

Al-Absi, Ahmed Abdulhakim; Kang, Dae-Ki

2015-01-01

Genomic sequence alignment is an important technique to decode genome sequences in bioinformatics. Next-Generation Sequencing technologies produce genomic data of longer reads. Cloud platforms are adopted to address the problems arising from storage and analysis of large genomic data. Existing genes sequencing tools for cloud platforms predominantly consider short read gene sequences and adopt the Hadoop MapReduce framework for computation. However, serial execution of map and reduce phases is a problem in such systems. Therefore, in this paper, we introduce Burrows-Wheeler Aligner's Smith-Waterman Alignment on Parallel MapReduce (BWASW-PMR) cloud platform for long sequence alignment. The proposed cloud platform adopts a widely accepted and accurate BWA-SW algorithm for long sequence alignment. A custom MapReduce platform is developed to overcome the drawbacks of the Hadoop framework. A parallel execution strategy of the MapReduce phases and optimization of Smith-Waterman algorithm are considered. Performance evaluation results exhibit an average speed-up of 6.7 considering BWASW-PMR compared with the state-of-the-art Bwasw-Cloud. An average reduction of 30% in the map phase makespan is reported across all experiments comparing BWASW-PMR with Bwasw-Cloud. Optimization of Smith-Waterman results in reducing the execution time by 91.8%. The experimental study proves the efficiency of BWASW-PMR for aligning long genomic sequences on cloud platforms. PMID:26839887
Long Read Alignment with Parallel MapReduce Cloud Platform.

PubMed

Al-Absi, Ahmed Abdulhakim; Kang, Dae-Ki

2015-01-01

Genomic sequence alignment is an important technique to decode genome sequences in bioinformatics. Next-Generation Sequencing technologies produce genomic data of longer reads. Cloud platforms are adopted to address the problems arising from storage and analysis of large genomic data. Existing genes sequencing tools for cloud platforms predominantly consider short read gene sequences and adopt the Hadoop MapReduce framework for computation. However, serial execution of map and reduce phases is a problem in such systems. Therefore, in this paper, we introduce Burrows-Wheeler Aligner's Smith-Waterman Alignment on Parallel MapReduce (BWASW-PMR) cloud platform for long sequence alignment. The proposed cloud platform adopts a widely accepted and accurate BWA-SW algorithm for long sequence alignment. A custom MapReduce platform is developed to overcome the drawbacks of the Hadoop framework. A parallel execution strategy of the MapReduce phases and optimization of Smith-Waterman algorithm are considered. Performance evaluation results exhibit an average speed-up of 6.7 considering BWASW-PMR compared with the state-of-the-art Bwasw-Cloud. An average reduction of 30% in the map phase makespan is reported across all experiments comparing BWASW-PMR with Bwasw-Cloud. Optimization of Smith-Waterman results in reducing the execution time by 91.8%. The experimental study proves the efficiency of BWASW-PMR for aligning long genomic sequences on cloud platforms.
Computational complexity of algorithms for sequence comparison, short-read assembly and genome alignment.

PubMed

Baichoo, Shakuntala; Ouzounis, Christos A

A multitude of algorithms for sequence comparison, short-read assembly and whole-genome alignment have been developed in the general context of molecular biology, to support technology development for high-throughput sequencing, numerous applications in genome biology and fundamental research on comparative genomics. The computational complexity of these algorithms has been previously reported in original research papers, yet this often neglected property has not been reviewed previously in a systematic manner and for a wider audience. We provide a review of space and time complexity of key sequence analysis algorithms and highlight their properties in a comprehensive manner, in order to identify potential opportunities for further research in algorithm or data structure optimization. The complexity aspect is poised to become pivotal as we will be facing challenges related to the continuous increase of genomic data on unprecedented scales and complexity in the foreseeable future, when robust biological simulation at the cell level and above becomes a reality. Copyright © 2017 Elsevier B.V. All rights reserved.
Iterative non-sequential protein structural alignment.

PubMed

Salem, Saeed; Zaki, Mohammed J; Bystroff, Christopher

2009-06-01

Structural similarity between proteins gives us insights into their evolutionary relationships when there is low sequence similarity. In this paper, we present a novel approach called SNAP for non-sequential pair-wise structural alignment. Starting from an initial alignment, our approach iterates over a two-step process consisting of a superposition step and an alignment step, until convergence. We propose a novel greedy algorithm to construct both sequential and non-sequential alignments. The quality of SNAP alignments were assessed by comparing against the manually curated reference alignments in the challenging SISY and RIPC datasets. Moreover, when applied to a dataset of 4410 protein pairs selected from the CATH database, SNAP produced longer alignments with lower rmsd than several state-of-the-art alignment methods. Classification of folds using SNAP alignments was both highly sensitive and highly selective. The SNAP software along with the datasets are available online at http://www.cs.rpi.edu/~zaki/software/SNAP.
An efficient multi-resolution GA approach to dental image alignment

NASA Astrophysics Data System (ADS)

Nassar, Diaa Eldin; Ogirala, Mythili; Adjeroh, Donald; Ammar, Hany

2006-02-01

Automating the process of postmortem identification of individuals using dental records is receiving an increased attention in forensic science, especially with the large volume of victims encountered in mass disasters. Dental radiograph alignment is a key step required for automating the dental identification process. In this paper, we address the problem of dental radiograph alignment using a Multi-Resolution Genetic Algorithm (MR-GA) approach. We use location and orientation information of edge points as features; we assume that affine transformations suffice to restore geometric discrepancies between two images of a tooth, we efficiently search the 6D space of affine parameters using GA progressively across multi-resolution image versions, and we use a Hausdorff distance measure to compute the similarity between a reference tooth and a query tooth subject to a possible alignment transform. Testing results based on 52 teeth-pair images suggest that our algorithm converges to reasonable solutions in more than 85% of the test cases, with most of the error in the remaining cases due to excessive misalignments.
The ConSurf-DB: pre-calculated evolutionary conservation profiles of protein structures.

PubMed

Goldenberg, Ofir; Erez, Elana; Nimrod, Guy; Ben-Tal, Nir

2009-01-01

ConSurf-DB is a repository for evolutionary conservation analysis of the proteins of known structures in the Protein Data Bank (PDB). Sequence homologues of each of the PDB entries were collected and aligned using standard methods. The evolutionary conservation of each amino acid position in the alignment was calculated using the Rate4Site algorithm, implemented in the ConSurf web server. The algorithm takes into account the phylogenetic relations between the aligned proteins and the stochastic nature of the evolutionary process explicitly. Rate4Site assigns a conservation level for each position in the multiple sequence alignment using an empirical Bayesian inference. Visual inspection of the conservation patterns on the 3D structure often enables the identification of key residues that comprise the functionally important regions of the protein. The repository is updated with the latest PDB entries on a monthly basis and will be rebuilt annually. ConSurf-DB is available online at http://consurfdb.tau.ac.il/
The ConSurf-DB: pre-calculated evolutionary conservation profiles of protein structures

PubMed Central

Goldenberg, Ofir; Erez, Elana; Nimrod, Guy; Ben-Tal, Nir

2009-01-01

ConSurf-DB is a repository for evolutionary conservation analysis of the proteins of known structures in the Protein Data Bank (PDB). Sequence homologues of each of the PDB entries were collected and aligned using standard methods. The evolutionary conservation of each amino acid position in the alignment was calculated using the Rate4Site algorithm, implemented in the ConSurf web server. The algorithm takes into account the phylogenetic relations between the aligned proteins and the stochastic nature of the evolutionary process explicitly. Rate4Site assigns a conservation level for each position in the multiple sequence alignment using an empirical Bayesian inference. Visual inspection of the conservation patterns on the 3D structure often enables the identification of key residues that comprise the functionally important regions of the protein. The repository is updated with the latest PDB entries on a monthly basis and will be rebuilt annually. ConSurf-DB is available online at http://consurfdb.tau.ac.il/ PMID:18971256
Automated Registration of Multimodal Optic Disc Images: Clinical Assessment of Alignment Accuracy.

PubMed

Ng, Wai Siene; Legg, Phil; Avadhanam, Venkat; Aye, Kyaw; Evans, Steffan H P; North, Rachel V; Marshall, Andrew D; Rosin, Paul; Morgan, James E

2016-04-01

To determine the accuracy of automated alignment algorithms for the registration of optic disc images obtained by 2 different modalities: fundus photography and scanning laser tomography. Images obtained with the Heidelberg Retina Tomograph II and paired photographic optic disc images of 135 eyes were analyzed. Three state-of-the-art automated registration techniques Regional Mutual Information, rigid Feature Neighbourhood Mutual Information (FNMI), and nonrigid FNMI (NRFNMI) were used to align these image pairs. Alignment of each composite picture was assessed on a 5-point grading scale: "Fail" (no alignment of vessels with no vessel contact), "Weak" (vessels have slight contact), "Good" (vessels with <50% contact), "Very Good" (vessels with >50% contact), and "Excellent" (complete alignment). Custom software generated an image mosaic in which the modalities were interleaved as a series of alternate 5×5-pixel blocks. These were graded independently by 3 clinically experienced observers. A total of 810 image pairs were assessed. All 3 registration techniques achieved a score of "Good" or better in >95% of the image sets. NRFNMI had the highest percentage of "Excellent" (mean: 99.6%; range, 95.2% to 99.6%), followed by Regional Mutual Information (mean: 81.6%; range, 86.3% to 78.5%) and FNMI (mean: 73.1%; range, 85.2% to 54.4%). Automated registration of optic disc images by different modalities is a feasible option for clinical application. All 3 methods provided useful levels of alignment, but the NRFNMI technique consistently outperformed the others and is recommended as a practical approach to the automated registration of multimodal disc images.

Chromaligner: a web server for chromatogram alignment.

PubMed

Wang, San-Yuan; Ho, Tsung-Jung; Kuo, Ching-Hua; Tseng, Yufeng J

2010-09-15

Chromaligner is a tool for chromatogram alignment to align retention time for chromatographic methods coupled to spectrophotometers such as high performance liquid chromatography and capillary electrophoresis for metabolomics works. Chromaligner resolves peak shifts by a constrained chromatogram alignment. For a collection of chromatograms and a set of defined peaks, Chromaligner aligns the chromatograms on defined peaks using correlation warping (COW). Chromaligner is faster than the original COW algorithm by k(2) times, where k is the number of defined peaks in a chromatogram. It also provides alignments based on known component peaks to reach the best results for further chemometric analysis. Chromaligner is freely accessible at http://cmdd.csie.ntu.edu.tw/~chromaligner.
PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences

PubMed Central

Mirarab, Siavash; Nguyen, Nam; Guo, Sheng; Wang, Li-San; Kim, Junhyong

2015-01-01

Abstract We introduce PASTA, a new multiple sequence alignment algorithm. PASTA uses a new technique to produce an alignment given a guide tree that enables it to be both highly scalable and very accurate. We present a study on biological and simulated data with up to 200,000 sequences, showing that PASTA produces highly accurate alignments, improving on the accuracy and scalability of the leading alignment methods (including SATé). We also show that trees estimated on PASTA alignments are highly accurate—slightly better than SATé trees, but with substantial improvements relative to other methods. Finally, PASTA is faster than SATé, highly parallelizable, and requires relatively little memory. PMID:25549288
Tensor Rank Preserving Discriminant Analysis for Facial Recognition.

PubMed

Tao, Dapeng; Guo, Yanan; Li, Yaotang; Gao, Xinbo

2017-10-12

Facial recognition, one of the basic topics in computer vision and pattern recognition, has received substantial attention in recent years. However, for those traditional facial recognition algorithms, the facial images are reshaped to a long vector, thereby losing part of the original spatial constraints of each pixel. In this paper, a new tensor-based feature extraction algorithm termed tensor rank preserving discriminant analysis (TRPDA) for facial image recognition is proposed; the proposed method involves two stages: in the first stage, the low-dimensional tensor subspace of the original input tensor samples was obtained; in the second stage, discriminative locality alignment was utilized to obtain the ultimate vector feature representation for subsequent facial recognition. On the one hand, the proposed TRPDA algorithm fully utilizes the natural structure of the input samples, and it applies an optimization criterion that can directly handle the tensor spectral analysis problem, thereby decreasing the computation cost compared those traditional tensor-based feature selection algorithms. On the other hand, the proposed TRPDA algorithm extracts feature by finding a tensor subspace that preserves most of the rank order information of the intra-class input samples. Experiments on the three facial databases are performed here to determine the effectiveness of the proposed TRPDA algorithm.
A 2D range Hausdorff approach to 3D facial recognition.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Koch, Mark William; Russ, Trina Denise; Little, Charles Quentin

2004-11-01

This paper presents a 3D facial recognition algorithm based on the Hausdorff distance metric. The standard 3D formulation of the Hausdorff matching algorithm has been modified to operate on a 2D range image, enabling a reduction in computation from O(N2) to O(N) without large storage requirements. The Hausdorff distance is known for its robustness to data outliers and inconsistent data between two data sets, making it a suitable choice for dealing with the inherent problems in many 3D datasets due to sensor noise and object self-occlusion. For optimal performance, the algorithm assumes a good initial alignment between probe and templatemore » datasets. However, to minimize the error between two faces, the alignment can be iteratively refined. Results from the algorithm are presented using 3D face images from the Face Recognition Grand Challenge database version 1.0.« less
A phase and frequency alignment protocol for 1H MRSI data of the prostate.

PubMed

Wright, Alan J; Buydens, Lutgarde M C; Heerschap, Arend

2012-05-01

(1)H MRSI of the prostate reveals relative metabolite levels that vary according to the presence or absence of tumour, providing a sensitive method for the identification of patients with cancer. Current interpretations of prostate data rely on quantification algorithms that fit model metabolite resonances to individual voxel spectra and calculate relative levels of metabolites, such as choline, creatine, citrate and polyamines. Statistical pattern recognition techniques can potentially improve the detection of prostate cancer, but these analyses are hampered by artefacts and sources of noise in the data, such as variations in phase and frequency of resonances. Phase and frequency variations may arise as a result of spatial field gradients or local physiological conditions affecting the frequency of resonances, in particular those of citrate. Thus, there are unique challenges in developing a peak alignment algorithm for these data. We have developed a frequency and phase correction algorithm for automatic alignment of the resonances in prostate MRSI spectra. We demonstrate, with a simulated dataset, that alignment can be achieved to a phase standard deviation of 0.095 rad and a frequency standard deviation of 0.68 Hz for the citrate resonances. Three parameters were used to assess the improvement in peak alignment in the MRSI data of five patients: the percentage of variance in all MRSI spectra explained by their first principal component; the signal-to-noise ratio of a spectrum formed by taking the median value of the entire set at each spectral point; and the mean cross-correlation between all pairs of spectra. These parameters showed a greater similarity between spectra in all five datasets and the simulated data, demonstrating improved alignment for phase and frequency in these spectra. This peak alignment program is expected to improve pattern recognition significantly, enabling accurate detection and localisation of prostate cancer with MRSI. Copyright © 2011 John Wiley & Sons, Ltd.
CUDA compatible GPU cards as efficient hardware accelerators for Smith-Waterman sequence alignment

PubMed Central

Manavski, Svetlin A; Valle, Giorgio

2008-01-01

Background Searching for similarities in protein and DNA databases has become a routine procedure in Molecular Biology. The Smith-Waterman algorithm has been available for more than 25 years. It is based on a dynamic programming approach that explores all the possible alignments between two sequences; as a result it returns the optimal local alignment. Unfortunately, the computational cost is very high, requiring a number of operations proportional to the product of the length of two sequences. Furthermore, the exponential growth of protein and DNA databases makes the Smith-Waterman algorithm unrealistic for searching similarities in large sets of sequences. For these reasons heuristic approaches such as those implemented in FASTA and BLAST tend to be preferred, allowing faster execution times at the cost of reduced sensitivity. The main motivation of our work is to exploit the huge computational power of commonly available graphic cards, to develop high performance solutions for sequence alignment. Results In this paper we present what we believe is the fastest solution of the exact Smith-Waterman algorithm running on commodity hardware. It is implemented in the recently released CUDA programming environment by NVidia. CUDA allows direct access to the hardware primitives of the last-generation Graphics Processing Units (GPU) G80. Speeds of more than 3.5 GCUPS (Giga Cell Updates Per Second) are achieved on a workstation running two GeForce 8800 GTX. Exhaustive tests have been done to compare our implementation to SSEARCH and BLAST, running on a 3 GHz Intel Pentium IV processor. Our solution was also compared to a recently published GPU implementation and to a Single Instruction Multiple Data (SIMD) solution. These tests show that our implementation performs from 2 to 30 times faster than any other previous attempt available on commodity hardware. Conclusions The results show that graphic cards are now sufficiently advanced to be used as efficient hardware accelerators for sequence alignment. Their performance is better than any alternative available on commodity hardware platforms. The solution presented in this paper allows large scale alignments to be performed at low cost, using the exact Smith-Waterman algorithm instead of the largely adopted heuristic approaches. PMID:18387198
Accelerated Profile HMM Searches

PubMed Central

Eddy, Sean R.

2011-01-01

Profile hidden Markov models (profile HMMs) and probabilistic inference methods have made important contributions to the theory of sequence database homology search. However, practical use of profile HMM methods has been hindered by the computational expense of existing software implementations. Here I describe an acceleration heuristic for profile HMMs, the “multiple segment Viterbi” (MSV) algorithm. The MSV algorithm computes an optimal sum of multiple ungapped local alignment segments using a striped vector-parallel approach previously described for fast Smith/Waterman alignment. MSV scores follow the same statistical distribution as gapped optimal local alignment scores, allowing rapid evaluation of significance of an MSV score and thus facilitating its use as a heuristic filter. I also describe a 20-fold acceleration of the standard profile HMM Forward/Backward algorithms using a method I call “sparse rescaling”. These methods are assembled in a pipeline in which high-scoring MSV hits are passed on for reanalysis with the full HMM Forward/Backward algorithm. This accelerated pipeline is implemented in the freely available HMMER3 software package. Performance benchmarks show that the use of the heuristic MSV filter sacrifices negligible sensitivity compared to unaccelerated profile HMM searches. HMMER3 is substantially more sensitive and 100- to 1000-fold faster than HMMER2. HMMER3 is now about as fast as BLAST for protein searches. PMID:22039361
Extracting DNA words based on the sequence features: non-uniform distribution and integrity.

PubMed

Li, Zhi; Cao, Hongyan; Cui, Yuehua; Zhang, Yanbo

2016-01-25

DNA sequence can be viewed as an unknown language with words as its functional units. Given that most sequence alignment algorithms such as the motif discovery algorithms depend on the quality of background information about sequences, it is necessary to develop an ab initio algorithm for extracting the "words" based only on the DNA sequences. We considered that non-uniform distribution and integrity were two important features of a word, based on which we developed an ab initio algorithm to extract "DNA words" that have potential functional meaning. A Kolmogorov-Smirnov test was used for consistency test of uniform distribution of DNA sequences, and the integrity was judged by the sequence and position alignment. Two random base sequences were adopted as negative control, and an English book was used as positive control to verify our algorithm. We applied our algorithm to the genomes of Saccharomyces cerevisiae and 10 strains of Escherichia coli to show the utility of the methods. The results provide strong evidences that the algorithm is a promising tool for ab initio building a DNA dictionary. Our method provides a fast way for large scale screening of important DNA elements and offers potential insights into the understanding of a genome.
DISCO: Distance and Spectrum Correlation Optimization Alignment for Two Dimensional Gas Chromatography Time-of-Flight Mass Spectrometry-based Metabolomics

PubMed Central

Wang, Bing; Fang, Aiqin; Heim, John; Bogdanov, Bogdan; Pugh, Scott; Libardoni, Mark; Zhang, Xiang

2010-01-01

A novel peak alignment algorithm using a distance and spectrum correlation optimization (DISCO) method has been developed for two-dimensional gas chromatography time-of-flight mass spectrometry (GC×GC/TOF-MS) based metabolomics. This algorithm uses the output of the instrument control software, ChromaTOF, as its input data. It detects and merges multiple peak entries of the same metabolite into one peak entry in each input peak list. After a z-score transformation of metabolite retention times, DISCO selects landmark peaks from all samples based on both two-dimensional retention times and mass spectrum similarity of fragment ions measured by Pearson’s correlation coefficient. A local linear fitting method is employed in the original two-dimensional retention time space to correct retention time shifts. A progressive retention time map searching method is used to align metabolite peaks in all samples together based on optimization of the Euclidean distance and mass spectrum similarity. The effectiveness of the DISCO algorithm is demonstrated using data sets acquired under different experiment conditions and a spiked-in experiment. PMID:20476746
Measuring the visual salience of alignments by their non-accidentalness.

PubMed

Blusseau, S; Carboni, A; Maiche, A; Morel, J M; Grompone von Gioi, R

2016-09-01

Quantitative approaches are part of the understanding of contour integration and the Gestalt law of good continuation. The present study introduces a new quantitative approach based on the a contrario theory, which formalizes the non-accidentalness principle for good continuation. This model yields an ideal observer algorithm, able to detect non-accidental alignments in Gabor patterns. More precisely, this parameterless algorithm associates with each candidate percept a measure, the Number of False Alarms (NFA), quantifying its degree of masking. To evaluate the approach, we compared this ideal observer with the human attentive performance on three experiments of straight contours detection in arrays of Gabor patches. The experiments showed a strong correlation between the detectability of the target stimuli and their degree of non-accidentalness, as measured by our model. What is more, the algorithm's detection curves were very similar to the ones of human subjects. This fact seems to validate our proposed measurement method as a convenient way to predict the visibility of alignments. This framework could be generalized to other Gestalts. Copyright © 2015 Elsevier Ltd. All rights reserved.
Spacecraft Alignment Determination and Control for Dual Spacecraft Precision Formation Flying

NASA Technical Reports Server (NTRS)

Calhoun, Philip; Novo-Gradac, Anne-Marie; Shah, Neerav

2017-01-01

Many proposed formation flying missions seek to advance the state of the art in spacecraft science imaging by utilizing precision dual spacecraft formation flying to enable a virtual space telescope. Using precision dual spacecraft alignment, very long focal lengths can be achieved by locating the optics on one spacecraft and the detector on the other. Proposed science missions include astrophysics concepts with spacecraft separations from 1000 km to 25,000 km, such as the Milli-Arc-Second Structure Imager (MASSIM) and the New Worlds Observer, and Heliophysics concepts for solar coronagraphs and X-ray imaging with smaller separations (50m-500m). All of these proposed missions require advances in guidance, navigation, and control (GNC) for precision formation flying. In particular, very precise astrometric alignment control and estimation is required for precise inertial pointing of the virtual space telescope to enable science imaging orders of magnitude better than can be achieved with conventional single spacecraft instruments. This work develops design architectures, algorithms, and performance analysis of proposed GNC systems for precision dual spacecraft astrometric alignment. These systems employ a variety of GNC sensors and actuators, including laser-based alignment and ranging systems, optical imaging sensors (e.g. guide star telescope), inertial measurement units (IMU), as well as microthruster and precision stabilized platforms. A comprehensive GNC performance analysis is given for Heliophysics dual spacecraft PFF imaging mission concept.
Spacecraft Alignment Determination and Control for Dual Spacecraft Precision Formation Flying

NASA Technical Reports Server (NTRS)

Calhoun, Philip C.; Novo-Gradac, Anne-Marie; Shah, Neerav

2017-01-01

Many proposed formation flying missions seek to advance the state of the art in spacecraft science imaging by utilizing precision dual spacecraft formation flying to enable a virtual space telescope. Using precision dual spacecraft alignment, very long focal lengths can be achieved by locating the optics on one spacecraft and the detector on the other. Proposed science missions include astrophysics concepts with spacecraft separations from 1000 km to 25,000 km, such as the Milli-Arc-Second Structure Imager (MASSIM) and the New Worlds Observer, and Heliophysics concepts for solar coronagraphs and X-ray imaging with smaller separations (50m 500m). All of these proposed missions require advances in guidance, navigation, and control (GNC) for precision formation flying. In particular, very precise astrometric alignment control and estimation is required for precise inertial pointing of the virtual space telescope to enable science imaging orders of magnitude better than can be achieved with conventional single spacecraft instruments. This work develops design architectures, algorithms, and performance analysis of proposed GNC systems for precision dual spacecraft astrometric alignment. These systems employ a variety of GNC sensors and actuators, including laser-based alignment and ranging systems, optical imaging sensors (e.g. guide star telescope), inertial measurement units (IMU), as well as micro-thruster and precision stabilized platforms. A comprehensive GNC performance analysis is given for Heliophysics dual spacecraft PFF imaging mission concept.
Optimal image alignment with random projections of manifolds: algorithm and geometric analysis.

PubMed

Kokiopoulou, Effrosyni; Kressner, Daniel; Frossard, Pascal

2011-06-01

This paper addresses the problem of image alignment based on random measurements. Image alignment consists of estimating the relative transformation between a query image and a reference image. We consider the specific problem where the query image is provided in compressed form in terms of linear measurements captured by a vision sensor. We cast the alignment problem as a manifold distance minimization problem in the linear subspace defined by the measurements. The transformation manifold that represents synthesis of shift, rotation, and isotropic scaling of the reference image can be given in closed form when the reference pattern is sparsely represented over a parametric dictionary. We show that the objective function can then be decomposed as the difference of two convex functions (DC) in the particular case where the dictionary is built on Gaussian functions. Thus, the optimization problem becomes a DC program, which in turn can be solved globally by a cutting plane method. The quality of the solution is typically affected by the number of random measurements and the condition number of the manifold that describes the transformations of the reference image. We show that the curvature, which is closely related to the condition number, remains bounded in our image alignment problem, which means that the relative transformation between two images can be determined optimally in a reduced subspace.
Heuristics for multiobjective multiple sequence alignment.

PubMed

Abbasi, Maryam; Paquete, Luís; Pereira, Francisco B

2016-07-15

Aligning multiple sequences arises in many tasks in Bioinformatics. However, the alignments produced by the current software packages are highly dependent on the parameters setting, such as the relative importance of opening gaps with respect to the increase of similarity. Choosing only one parameter setting may provide an undesirable bias in further steps of the analysis and give too simplistic interpretations. In this work, we reformulate multiple sequence alignment from a multiobjective point of view. The goal is to generate several sequence alignments that represent a trade-off between maximizing the substitution score and minimizing the number of indels/gaps in the sum-of-pairs score function. This trade-off gives to the practitioner further information about the similarity of the sequences, from which she could analyse and choose the most plausible alignment. We introduce several heuristic approaches, based on local search procedures, that compute a set of sequence alignments, which are representative of the trade-off between the two objectives (substitution score and indels). Several algorithm design options are discussed and analysed, with particular emphasis on the influence of the starting alignment and neighborhood search definitions on the overall performance. A perturbation technique is proposed to improve the local search, which provides a wide range of high-quality alignments. The proposed approach is tested experimentally on a wide range of instances. We performed several experiments with sequences obtained from the benchmark database BAliBASE 3.0. To evaluate the quality of the results, we calculate the hypervolume indicator of the set of score vectors returned by the algorithms. The results obtained allow us to identify reasonably good choices of parameters for our approach. Further, we compared our method in terms of correctly aligned pairs ratio and columns correctly aligned ratio with respect to reference alignments. Experimental results show that our approaches can obtain better results than TCoffee and Clustal Omega in terms of the first ratio.
GUIDANCE2: accurate detection of unreliable alignment regions accounting for the uncertainty of multiple parameters.

PubMed

Sela, Itamar; Ashkenazy, Haim; Katoh, Kazutaka; Pupko, Tal

2015-07-01

Inference of multiple sequence alignments (MSAs) is a critical part of phylogenetic and comparative genomics studies. However, from the same set of sequences different MSAs are often inferred, depending on the methodologies used and the assumed parameters. Much effort has recently been devoted to improving the ability to identify unreliable alignment regions. Detecting such unreliable regions was previously shown to be important for downstream analyses relying on MSAs, such as the detection of positive selection. Here we developed GUIDANCE2, a new integrative methodology that accounts for: (i) uncertainty in the process of indel formation, (ii) uncertainty in the assumed guide tree and (iii) co-optimal solutions in the pairwise alignments, used as building blocks in progressive alignment algorithms. We compared GUIDANCE2 with seven methodologies to detect unreliable MSA regions using extensive simulations and empirical benchmarks. We show that GUIDANCE2 outperforms all previously developed methodologies. Furthermore, GUIDANCE2 also provides a set of alternative MSAs which can be useful for downstream analyses. The novel algorithm is implemented as a web-server, available at: http://guidance.tau.ac.il. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Fluorescein thiocarbamyl amino acids as internal standards for migration time correction in capillary sieving electrophoresis

PubMed Central

Pugsley, Haley R.; Swearingen, Kristian E.; Dovichi, Norman J.

2009-01-01

A number of algorithms have been developed to correct for migration time drift in capillary electrophoresis. Those algorithms require identification of common components in each run. However, not all components may be present or resolved in separations of complex samples, which can confound attempts for alignment. This paper reports the use of fluorescein thiocarbamyl derivatives of amino acids as internal standards for alignment of 3-(2-furoyl)quinoline-2-carboxaldehyde (FQ)-labeled proteins in capillary sieving electrophoresis. The fluorescein thiocarbamyl derivative of aspartic acid migrates before FQ-labeled proteins and the fluorescein thiocarbamyl derivative of arginine migrates after the FQ-labeled proteins. These compounds were used as internal standards to correct for variations in migration time over a two-week period in the separation of a cellular homogenate. The experimental conditions were deliberately manipulated by varying electric field and sample preparation conditions. Three components of the homogenate were used to evaluate the alignment efficiency. Before alignment, the average relative standard deviation in migration time for these components was 13.3%. After alignment, the average relative standard deviation in migration time for these components was reduced to 0.5%. PMID:19249052
Efficient pairwise RNA structure prediction using probabilistic alignment constraints in Dynalign

PubMed Central

2007-01-01

Background Joint alignment and secondary structure prediction of two RNA sequences can significantly improve the accuracy of the structural predictions. Methods addressing this problem, however, are forced to employ constraints that reduce computation by restricting the alignments and/or structures (i.e. folds) that are permissible. In this paper, a new methodology is presented for the purpose of establishing alignment constraints based on nucleotide alignment and insertion posterior probabilities. Using a hidden Markov model, posterior probabilities of alignment and insertion are computed for all possible pairings of nucleotide positions from the two sequences. These alignment and insertion posterior probabilities are additively combined to obtain probabilities of co-incidence for nucleotide position pairs. A suitable alignment constraint is obtained by thresholding the co-incidence probabilities. The constraint is integrated with Dynalign, a free energy minimization algorithm for joint alignment and secondary structure prediction. The resulting method is benchmarked against the previous version of Dynalign and against other programs for pairwise RNA structure prediction. Results The proposed technique eliminates manual parameter selection in Dynalign and provides significant computational time savings in comparison to prior constraints in Dynalign while simultaneously providing a small improvement in the structural prediction accuracy. Savings are also realized in memory. In experiments over a 5S RNA dataset with average sequence length of approximately 120 nucleotides, the method reduces computation by a factor of 2. The method performs favorably in comparison to other programs for pairwise RNA structure prediction: yielding better accuracy, on average, and requiring significantly lesser computational resources. Conclusion Probabilistic analysis can be utilized in order to automate the determination of alignment constraints for pairwise RNA structure prediction methods in a principled fashion. These constraints can reduce the computational and memory requirements of these methods while maintaining or improving their accuracy of structural prediction. This extends the practical reach of these methods to longer length sequences. The revised Dynalign code is freely available for download. PMID:17445273
Second Language Writing Classification System Based on Word-Alignment Distribution

ERIC Educational Resources Information Center

Kotani, Katsunori; Yoshimi, Takehiko

2010-01-01

The present paper introduces an automatic classification system for assisting second language (L2) writing evaluation. This system, which classifies sentences written by L2 learners as either native speaker-like or learner-like sentences, is constructed by machine learning algorithms using word-alignment distributions as classification features…
Approximate static condensation algorithm for solving multi-material diffusion problems on meshes non-aligned with material interfaces

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kikinzon, Evgeny; Kuznetsov, Yuri; Lipnikov, Konstatin

In this study, we describe a new algorithm for solving multi-material diffusion problem when material interfaces are not aligned with the mesh. In this case interface reconstruction methods are used to construct approximate representation of interfaces between materials. They produce so-called multi-material cells, in which materials are represented by material polygons that contain only one material. The reconstructed interface is not continuous between cells. Finally, we suggest the new method for solving multi-material diffusion problems on such meshes and compare its performance with known homogenization methods.
Active angular alignment of gauge blocks in double-ended interferometers.

PubMed

Buchta, Zdeněk; Reřucha, Simon; Hucl, Václav; Cížek, Martin; Sarbort, Martin; Lazar, Josef; Cíp, Ondřej

2013-09-27

This paper presents a method implemented in a system for automatic contactless calibration of gauge blocks designed at ISI ASCR. The system combines low-coherence interferometry and laser interferometry, where the first identifies the gauge block sides position and the second one measures the gauge block length itself. A crucial part of the system is the algorithm for gauge block alignment to the measuring beam which is able to compensate the gauge block lateral and longitudinal tilt up to 0.141 mrad. The algorithm is also important for the gauge block position monitoring during its length measurement.

Active Angular Alignment of Gauge Blocks in Double-Ended Interferometers

PubMed Central

Buchta, Zdeněk; Řeřucha, Šimon; Hucl, Václav; Čížek, Martin; Šarbort, Martin; Lazar, Josef; Číp, Ondřej

2013-01-01

This paper presents a method implemented in a system for automatic contactless calibration of gauge blocks designed at ISI ASCR. The system combines low-coherence interferometry and laser interferometry, where the first identifies the gauge block sides position and the second one measures the gauge block length itself. A crucial part of the system is the algorithm for gauge block alignment to the measuring beam which is able to compensate the gauge block lateral and longitudinal tilt up to 0.141 mrad. The algorithm is also important for the gauge block position monitoring during its length measurement. PMID:24084107
Approximate static condensation algorithm for solving multi-material diffusion problems on meshes non-aligned with material interfaces

DOE PAGES

Kikinzon, Evgeny; Kuznetsov, Yuri; Lipnikov, Konstatin; ...

2017-07-08

In this study, we describe a new algorithm for solving multi-material diffusion problem when material interfaces are not aligned with the mesh. In this case interface reconstruction methods are used to construct approximate representation of interfaces between materials. They produce so-called multi-material cells, in which materials are represented by material polygons that contain only one material. The reconstructed interface is not continuous between cells. Finally, we suggest the new method for solving multi-material diffusion problems on such meshes and compare its performance with known homogenization methods.
Electron tomography simulator with realistic 3D phantom for evaluation of acquisition, alignment and reconstruction methods.

PubMed

Wan, Xiaohua; Katchalski, Tsvi; Churas, Christopher; Ghosh, Sreya; Phan, Sebastien; Lawrence, Albert; Hao, Yu; Zhou, Ziying; Chen, Ruijuan; Chen, Yu; Zhang, Fa; Ellisman, Mark H

2017-05-01

Because of the significance of electron microscope tomography in the investigation of biological structure at nanometer scales, ongoing improvement efforts have been continuous over recent years. This is particularly true in the case of software developments. Nevertheless, verification of improvements delivered by new algorithms and software remains difficult. Current analysis tools do not provide adaptable and consistent methods for quality assessment. This is particularly true with images of biological samples, due to image complexity, variability, low contrast and noise. We report an electron tomography (ET) simulator with accurate ray optics modeling of image formation that includes curvilinear trajectories through the sample, warping of the sample and noise. As a demonstration of the utility of our approach, we have concentrated on providing verification of the class of reconstruction methods applicable to wide field images of stained plastic-embedded samples. Accordingly, we have also constructed digital phantoms derived from serial block face scanning electron microscope images. These phantoms are also easily modified to include alignment features to test alignment algorithms. The combination of more realistic phantoms with more faithful simulations facilitates objective comparison of acquisition parameters, alignment and reconstruction algorithms and their range of applicability. With proper phantoms, this approach can also be modified to include more complex optical models, including distance-dependent blurring and phase contrast functions, such as may occur in cryotomography. Copyright © 2017 Elsevier Inc. All rights reserved.
Development and application of an algorithm to compute weighted multiple glycan alignments.

PubMed

Hosoda, Masae; Akune, Yukie; Aoki-Kinoshita, Kiyoko F

2017-05-01

A glycan consists of monosaccharides linked by glycosidic bonds, has branches and forms complex molecular structures. Databases have been developed to store large amounts of glycan-binding experiments, including glycan arrays with glycan-binding proteins. However, there are few bioinformatics techniques to analyze large amounts of data for glycans because there are few tools that can handle the complexity of glycan structures. Thus, we have developed the MCAW (Multiple Carbohydrate Alignment with Weights) tool that can align multiple glycan structures, to aid in the understanding of their function as binding recognition molecules. We have described in detail the first algorithm to perform multiple glycan alignments by modeling glycans as trees. To test our tool, we prepared several data sets, and as a result, we found that the glycan motif could be successfully aligned without any prior knowledge applied to the tool, and the known recognition binding sites of glycans could be aligned at a high rate amongst all our datasets tested. We thus claim that our tool is able to find meaningful glycan recognition and binding patterns using data obtained by glycan-binding experiments. The development and availability of an effective multiple glycan alignment tool opens possibilities for many other glycoinformatics analysis, making this work a big step towards furthering glycomics analysis. http://www.rings.t.soka.ac.jp. kkiyoko@soka.ac.jp. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.
Surveying alignment-free features for Ortholog detection in related yeast proteomes by using supervised big data classifiers.

PubMed

Galpert, Deborah; Fernández, Alberto; Herrera, Francisco; Antunes, Agostinho; Molina-Ruiz, Reinaldo; Agüero-Chapin, Guillermin

2018-05-03

The development of new ortholog detection algorithms and the improvement of existing ones are of major importance in functional genomics. We have previously introduced a successful supervised pairwise ortholog classification approach implemented in a big data platform that considered several pairwise protein features and the low ortholog pair ratios found between two annotated proteomes (Galpert, D et al., BioMed Research International, 2015). The supervised models were built and tested using a Saccharomycete yeast benchmark dataset proposed by Salichos and Rokas (2011). Despite several pairwise protein features being combined in a supervised big data approach; they all, to some extent were alignment-based features and the proposed algorithms were evaluated on a unique test set. Here, we aim to evaluate the impact of alignment-free features on the performance of supervised models implemented in the Spark big data platform for pairwise ortholog detection in several related yeast proteomes. The Spark Random Forest and Decision Trees with oversampling and undersampling techniques, and built with only alignment-based similarity measures or combined with several alignment-free pairwise protein features showed the highest classification performance for ortholog detection in three yeast proteome pairs. Although such supervised approaches outperformed traditional methods, there were no significant differences between the exclusive use of alignment-based similarity measures and their combination with alignment-free features, even within the twilight zone of the studied proteomes. Just when alignment-based and alignment-free features were combined in Spark Decision Trees with imbalance management, a higher success rate (98.71%) within the twilight zone could be achieved for a yeast proteome pair that underwent a whole genome duplication. The feature selection study showed that alignment-based features were top-ranked for the best classifiers while the runners-up were alignment-free features related to amino acid composition. The incorporation of alignment-free features in supervised big data models did not significantly improve ortholog detection in yeast proteomes regarding the classification qualities achieved with just alignment-based similarity measures. However, the similarity of their classification performance to that of traditional ortholog detection methods encourages the evaluation of other alignment-free protein pair descriptors in future research.
Spatio-temporal alignment of multiple sensors

NASA Astrophysics Data System (ADS)

Zhang, Tinghua; Ni, Guoqiang; Fan, Guihua; Sun, Huayan; Yang, Biao

2018-01-01

Aiming to achieve the spatio-temporal alignment of multi sensor on the same platform for space target observation, a joint spatio-temporal alignment method is proposed. To calibrate the parameters and measure the attitude of cameras, an astronomical calibration method is proposed based on star chart simulation and collinear invariant features of quadrilateral diagonal between the observed star chart. In order to satisfy a temporal correspondence and spatial alignment similarity simultaneously, the method based on the astronomical calibration and attitude measurement in this paper formulates the video alignment to fold the spatial and temporal alignment into a joint alignment framework. The advantage of this method is reinforced by exploiting the similarities and prior knowledge of velocity vector field between adjacent frames, which is calculated by the SIFT Flow algorithm. The proposed method provides the highest spatio-temporal alignment accuracy compared to the state-of-the-art methods on sequences recorded from multi sensor at different times.
Profile-Based LC-MS Data Alignment—A Bayesian Approach

PubMed Central

Tsai, Tsung-Heng; Tadesse, Mahlet G.; Wang, Yue; Ressom, Habtom W.

2014-01-01

A Bayesian alignment model (BAM) is proposed for alignment of liquid chromatography-mass spectrometry (LC-MS) data. BAM belongs to the category of profile-based approaches, which are composed of two major components: a prototype function and a set of mapping functions. Appropriate estimation of these functions is crucial for good alignment results. BAM uses Markov chain Monte Carlo (MCMC) methods to draw inference on the model parameters and improves on existing MCMC-based alignment methods through 1) the implementation of an efficient MCMC sampler and 2) an adaptive selection of knots. A block Metropolis-Hastings algorithm that mitigates the problem of the MCMC sampler getting stuck at local modes of the posterior distribution is used for the update of the mapping function coefficients. In addition, a stochastic search variable selection (SSVS) methodology is used to determine the number and positions of knots. We applied BAM to a simulated data set, an LC-MS proteomic data set, and two LC-MS metabolomic data sets, and compared its performance with the Bayesian hierarchical curve registration (BHCR) model, the dynamic time-warping (DTW) model, and the continuous profile model (CPM). The advantage of applying appropriate profile-based retention time correction prior to performing a feature-based approach is also demonstrated through the metabolomic data sets. PMID:23929872
Automatic classification of protein structures relying on similarities between alignments

PubMed Central

2012-01-01

Background Identification of protein structural cores requires isolation of sets of proteins all sharing a same subset of structural motifs. In the context of an ever growing number of available 3D protein structures, standard and automatic clustering algorithms require adaptations so as to allow for efficient identification of such sets of proteins. Results When considering a pair of 3D structures, they are stated as similar or not according to the local similarities of their matching substructures in a structural alignment. This binary relation can be represented in a graph of similarities where a node represents a 3D protein structure and an edge states that two 3D protein structures are similar. Therefore, classifying proteins into structural families can be viewed as a graph clustering task. Unfortunately, because such a graph encodes only pairwise similarity information, clustering algorithms may include in the same cluster a subset of 3D structures that do not share a common substructure. In order to overcome this drawback we first define a ternary similarity on a triple of 3D structures as a constraint to be satisfied by the graph of similarities. Such a ternary constraint takes into account similarities between pairwise alignments, so as to ensure that the three involved protein structures do have some common substructure. We propose hereunder a modification algorithm that eliminates edges from the original graph of similarities and gives a reduced graph in which no ternary constraints are violated. Our approach is then first to build a graph of similarities, then to reduce the graph according to the modification algorithm, and finally to apply to the reduced graph a standard graph clustering algorithm. Such method was used for classifying ASTRAL-40 non-redundant protein domains, identifying significant pairwise similarities with Yakusa, a program devised for rapid 3D structure alignments. Conclusions We show that filtering similarities prior to standard graph based clustering process by applying ternary similarity constraints i) improves the separation of proteins of different classes and consequently ii) improves the classification quality of standard graph based clustering algorithms according to the reference classification SCOP. PMID:22974051
biobambam: tools for read pair collation based algorithms on BAM files

PubMed Central

2014-01-01

Background Sequence alignment data is often ordered by coordinate (id of the reference sequence plus position on the sequence where the fragment was mapped) when stored in BAM files, as this simplifies the extraction of variants between the mapped data and the reference or of variants within the mapped data. In this order paired reads are usually separated in the file, which complicates some other applications like duplicate marking or conversion to the FastQ format which require to access the full information of the pairs. Results In this paper we introduce biobambam, a set of tools based on the efficient collation of alignments in BAM files by read name. The employed collation algorithm avoids time and space consuming sorting of alignments by read name where this is possible without using more than a specified amount of main memory. Using this algorithm tasks like duplicate marking in BAM files and conversion of BAM files to the FastQ format can be performed very efficiently with limited resources. We also make the collation algorithm available in the form of an API for other projects. This API is part of the libmaus package. Conclusions In comparison with previous approaches to problems involving the collation of alignments by read name like the BAM to FastQ or duplication marking utilities our approach can often perform an equivalent task more efficiently in terms of the required main memory and run-time. Our BAM to FastQ conversion is faster than all widely known alternatives including Picard and bamUtil. Our duplicate marking is about as fast as the closest competitor bamUtil for small data sets and faster than all known alternatives on large and complex data sets.
A stochastic tabu search algorithm to align physician schedule with patient flow.

PubMed

Niroumandrad, Nazgol; Lahrichi, Nadia

2018-06-01

In this study, we consider the pretreatment phase for cancer patients. This is defined as the period between the referral to a cancer center and the confirmation of the treatment plan. Physicians have been identified as bottlenecks in this process, and the goal is to determine a weekly cyclic schedule that improves the patient flow and shortens the pretreatment duration. High uncertainty is associated with the arrival day, profile and type of cancer of each patient. We also include physician satisfaction in the objective function. We present a MIP model for the problem and develop a tabu search algorithm, considering both deterministic and stochastic cases. Experiments show that our method compares very well to CPLEX under deterministic conditions. We describe the stochastic approach in detail and present a real application.
3D registration of depth data of porous surface coatings based on 3D phase correlation and the trimmed ICP algorithm

NASA Astrophysics Data System (ADS)

Loftfield, Nina; Kästner, Markus; Reithmeier, Eduard

2017-06-01

A critical factor of endoprostheses is the quality of the tribological pairing. The objective of this research project is to manufacture stochastically porous aluminum oxide surface coatings with high wear resistance and an active friction minimization. There are many experimental and computational techniques from mercury porosimetry to imaging methods for studying porous materials, however, the characterization of disordered pore networks is still a great challenge. To meet this challenge it is striven to gain a three dimensional high resolution reconstruction of the surface. In this work, the reconstruction is approached by repeatedly milling down the surface by a fixed decrement while measuring each layer using a confocal laser scanning microscope (CLSM). The so acquired depth data of the successive layers is then registered pairwise. Within this work a direct registration approach is deployed and implemented in two steps, a coarse and a fine alignment. The coarse alignment of the depth data is limited to a translational shift which occurs in horizontal direction due to placing the sample in turns under the CLSM and the milling machine and in vertical direction due to the milling process itself. The shift is determined by an approach utilizing 3D phase correlation. The fine alignment is implemented by the Trimmed Iterative Closest Point algorithm, matching the most likely common pixels roughly specified by an estimated overlap rate. With the presented two-step approach a proper 3D registration of the successive depth data of the layer is obtained.
Lord-Wingersky Algorithm Version 2.0 for Hierarchical Item Factor Models with Applications in Test Scoring, Scale Alignment, and Model Fit Testing. CRESST Report 830

ERIC Educational Resources Information Center

Cai, Li

2013-01-01

Lord and Wingersky's (1984) recursive algorithm for creating summed score based likelihoods and posteriors has a proven track record in unidimensional item response theory (IRT) applications. Extending the recursive algorithm to handle multidimensionality is relatively simple, especially with fixed quadrature because the recursions can be defined…
A multimedia retrieval framework based on semi-supervised ranking and relevance feedback.

PubMed

Yang, Yi; Nie, Feiping; Xu, Dong; Luo, Jiebo; Zhuang, Yueting; Pan, Yunhe

2012-04-01

We present a new framework for multimedia content analysis and retrieval which consists of two independent algorithms. First, we propose a new semi-supervised algorithm called ranking with Local Regression and Global Alignment (LRGA) to learn a robust Laplacian matrix for data ranking. In LRGA, for each data point, a local linear regression model is used to predict the ranking scores of its neighboring points. A unified objective function is then proposed to globally align the local models from all the data points so that an optimal ranking score can be assigned to each data point. Second, we propose a semi-supervised long-term Relevance Feedback (RF) algorithm to refine the multimedia data representation. The proposed long-term RF algorithm utilizes both the multimedia data distribution in multimedia feature space and the history RF information provided by users. A trace ratio optimization problem is then formulated and solved by an efficient algorithm. The algorithms have been applied to several content-based multimedia retrieval applications, including cross-media retrieval, image retrieval, and 3D motion/pose data retrieval. Comprehensive experiments on four data sets have demonstrated its advantages in precision, robustness, scalability, and computational efficiency.
Experimental Evaluation of a Deformable Registration Algorithm for Motion Correction in PET-CT Guided Biopsy.

PubMed

Khare, Rahul; Sala, Guillaume; Kinahan, Paul; Esposito, Giuseppe; Banovac, Filip; Cleary, Kevin; Enquobahrie, Andinet

2013-01-01

Positron emission tomography computed tomography (PET-CT) images are increasingly being used for guidance during percutaneous biopsy. However, due to the physics of image acquisition, PET-CT images are susceptible to problems due to respiratory and cardiac motion, leading to inaccurate tumor localization, shape distortion, and attenuation correction. To address these problems, we present a method for motion correction that relies on respiratory gated CT images aligned using a deformable registration algorithm. In this work, we use two deformable registration algorithms and two optimization approaches for registering the CT images obtained over the respiratory cycle. The two algorithms are the BSpline and the symmetric forces Demons registration. In the first optmization approach, CT images at each time point are registered to a single reference time point. In the second approach, deformation maps are obtained to align each CT time point with its adjacent time point. These deformations are then composed to find the deformation with respect to a reference time point. We evaluate these two algorithms and optimization approaches using respiratory gated CT images obtained from 7 patients. Our results show that overall the BSpline registration algorithm with the reference optimization approach gives the best results.
CUDASW++ 3.0: accelerating Smith-Waterman protein database search by coupling CPU and GPU SIMD instructions.

PubMed

Liu, Yongchao; Wirawan, Adrianto; Schmidt, Bertil

2013-04-04

The maximal sensitivity for local alignments makes the Smith-Waterman algorithm a popular choice for protein sequence database search based on pairwise alignment. However, the algorithm is compute-intensive due to a quadratic time complexity. Corresponding runtimes are further compounded by the rapid growth of sequence databases. We present CUDASW++ 3.0, a fast Smith-Waterman protein database search algorithm, which couples CPU and GPU SIMD instructions and carries out concurrent CPU and GPU computations. For the CPU computation, this algorithm employs SSE-based vector execution units as accelerators. For the GPU computation, we have investigated for the first time a GPU SIMD parallelization, which employs CUDA PTX SIMD video instructions to gain more data parallelism beyond the SIMT execution model. Moreover, sequence alignment workloads are automatically distributed over CPUs and GPUs based on their respective compute capabilities. Evaluation on the Swiss-Prot database shows that CUDASW++ 3.0 gains a performance improvement over CUDASW++ 2.0 up to 2.9 and 3.2, with a maximum performance of 119.0 and 185.6 GCUPS, on a single-GPU GeForce GTX 680 and a dual-GPU GeForce GTX 690 graphics card, respectively. In addition, our algorithm has demonstrated significant speedups over other top-performing tools: SWIPE and BLAST+. CUDASW++ 3.0 is written in CUDA C++ and PTX assembly languages, targeting GPUs based on the Kepler architecture. This algorithm obtains significant speedups over its predecessor: CUDASW++ 2.0, by benefiting from the use of CPU and GPU SIMD instructions as well as the concurrent execution on CPUs and GPUs. The source code and the simulated data are available at http://cudasw.sourceforge.net.
Optical derotator alignment using image-processing algorithm for tracking laser vibrometer measurements of rotating objects.

PubMed

Khalil, Hossam; Kim, Dongkyu; Jo, Youngjoon; Park, Kyihwan

2017-06-01

An optical component called a Dove prism is used to rotate the laser beam of a laser-scanning vibrometer (LSV). This is called a derotator and is used for measuring the vibration of rotating objects. The main advantage of a derotator is that it works independently from an LSV. However, this device requires very specific alignment, in which the axis of the Dove prism must coincide with the rotational axis of the object. If the derotator is misaligned with the rotating object, the results of the vibration measurement are imprecise, owing to the alteration of the laser beam on the surface of the rotating object. In this study, a method is proposed for aligning a derotator with a rotating object through an image-processing algorithm that obtains the trajectory of a landmark attached to the object. After the trajectory of the landmark is mathematically modeled, the amount of derotator misalignment with respect to the object is calculated. The accuracy of the proposed method for aligning the derotator with the rotating object is experimentally tested.
KISS for STRAP: user extensions for a protein alignment editor.

PubMed

Gille, Christoph; Lorenzen, Stephan; Michalsky, Elke; Frömmel, Cornelius

2003-12-12

The Structural Alignment Program STRAP is a comfortable comprehensive editor and analyzing tool for protein alignments. A wide range of functions related to protein sequences and protein structures are accessible with an intuitive graphical interface. Recent features include mapping of mutations and polymorphisms onto structures and production of high quality figures for publication. Here we address the general problem of multi-purpose program packages to keep up with the rapid development of bioinformatical methods and the demand for specific program functions. STRAP was remade implementing a novel design which aims at Keeping Interfaces in STRAP Simple (KISS). KISS renders STRAP extendable to bio-scientists as well as to bio-informaticians. Scientists with basic computer skills are capable of implementing statistical methods or embedding existing bioinformatical tools in STRAP themselves. For bio-informaticians STRAP may serve as an environment for rapid prototyping and testing of complex algorithms such as automatic alignment algorithms or phylogenetic methods. Further, STRAP can be applied as an interactive web applet to present data related to a particular protein family and as a teaching tool. JAVA-1.4 or higher. http://www.charite.de/bioinf/strap/
Darwin v. 2.0: an interpreted computer language for the biosciences.

PubMed

Gonnet, G H; Hallett, M T; Korostensky, C; Bernardin, L

2000-02-01

We announce the availability of the second release of Darwin v. 2.0, an interpreted computer language especially tailored to researchers in the biosciences. The system is a general tool applicable to a wide range of problems. This second release improves Darwin version 1.6 in several ways: it now contains (1) a larger set of libraries touching most of the classical problems from computational biology (pairwise alignment, all versus all alignments, tree construction, multiple sequence alignment), (2) an expanded set of general purpose algorithms (search algorithms for discrete problems, matrix decomposition routines, complex/long integer arithmetic operations), (3) an improved language with a cleaner syntax, (4) better on-line help, and (5) a number of fixes to user-reported bugs. Darwin is made available for most operating systems free of char ge from the Computational Biochemistry Research Group (CBRG), reachable at http://chrg.inf.ethz.ch. darwin@inf.ethz.ch
The practical evaluation of DNA barcode efficacy.

PubMed

Spouge, John L; Mariño-Ramírez, Leonardo

2012-01-01

This chapter describes a workflow for measuring the efficacy of a barcode in identifying species. First, assemble individual sequence databases corresponding to each barcode marker. A controlled collection of taxonomic data is preferable to GenBank data, because GenBank data can be problematic, particularly when comparing barcodes based on more than one marker. To ensure proper controls when evaluating species identification, specimens not having a sequence in every marker database should be discarded. Second, select a computer algorithm for assigning species to barcode sequences. No algorithm has yet improved notably on assigning a specimen to the species of its nearest neighbor within a barcode database. Because global sequence alignments (e.g., with the Needleman-Wunsch algorithm, or some related algorithm) examine entire barcode sequences, they generally produce better species assignments than local sequence alignments (e.g., with BLAST). No neighboring method (e.g., global sequence similarity, global sequence distance, or evolutionary distance based on a global alignment) has yet shown a notable superiority in identifying species. Finally, "the probability of correct identification" (PCI) provides an appropriate measurement of barcode efficacy. The overall PCI for a data set is the average of the species PCIs, taken over all species in the data set. This chapter states explicitly how to calculate PCI, how to estimate its statistical sampling error, and how to use data on PCR failure to set limits on how much improvements in PCR technology can improve species identification.
SATe-II: very fast and accurate simultaneous estimation of multiple sequence alignments and phylogenetic trees.

PubMed

Liu, Kevin; Warnow, Tandy J; Holder, Mark T; Nelesen, Serita M; Yu, Jiaye; Stamatakis, Alexandros P; Linder, C Randal

2012-01-01

Highly accurate estimation of phylogenetic trees for large data sets is difficult, in part because multiple sequence alignments must be accurate for phylogeny estimation methods to be accurate. Coestimation of alignments and trees has been attempted but currently only SATé estimates reasonably accurate trees and alignments for large data sets in practical time frames (Liu K., Raghavan S., Nelesen S., Linder C.R., Warnow T. 2009b. Rapid and accurate large-scale coestimation of sequence alignments and phylogenetic trees. Science. 324:1561-1564). Here, we present a modification to the original SATé algorithm that improves upon SATé (which we now call SATé-I) in terms of speed and of phylogenetic and alignment accuracy. SATé-II uses a different divide-and-conquer strategy than SATé-I and so produces smaller more closely related subsets than SATé-I; as a result, SATé-II produces more accurate alignments and trees, can analyze larger data sets, and runs more efficiently than SATé-I. Generally, SATé is a metamethod that takes an existing multiple sequence alignment method as an input parameter and boosts the quality of that alignment method. SATé-II-boosted alignment methods are significantly more accurate than their unboosted versions, and trees based upon these improved alignments are more accurate than trees based upon the original alignments. Because SATé-I used maximum likelihood (ML) methods that treat gaps as missing data to estimate trees and because we found a correlation between the quality of tree/alignment pairs and ML scores, we explored the degree to which SATé's performance depends on using ML with gaps treated as missing data to determine the best tree/alignment pair. We present two lines of evidence that using ML with gaps treated as missing data to optimize the alignment and tree produces very poor results. First, we show that the optimization problem where a set of unaligned DNA sequences is given and the output is the tree and alignment of those sequences that maximize likelihood under the Jukes-Cantor model is uninformative in the worst possible sense. For all inputs, all trees optimize the likelihood score. Second, we show that a greedy heuristic that uses GTR+Gamma ML to optimize the alignment and the tree can produce very poor alignments and trees. Therefore, the excellent performance of SATé-II and SATé-I is not because ML is used as an optimization criterion for choosing the best tree/alignment pair but rather due to the particular divide-and-conquer realignment techniques employed.

Automation of the targeting and reflective alignment concept

NASA Technical Reports Server (NTRS)

Redfield, Robin C.

1992-01-01

The automated alignment system, described herein, employs a reflective, passive (requiring no power) target and includes a PC-based imaging system and one camera mounted on a six degree of freedom robot manipulator. The system detects and corrects for manipulator misalignment in three translational and three rotational directions by employing the Targeting and Reflective Alignment Concept (TRAC), which simplifies alignment by decoupling translational and rotational alignment control. The concept uses information on the camera and the target's relative position based on video feedback from the camera. These relative positions are converted into alignment errors and minimized by motions of the robot. The system is robust to exogenous lighting by virtue of a subtraction algorithm which enables the camera to only see the target. These capabilities are realized with relatively minimal complexity and expense.
SU-E-J-218: Novel Validation Paradigm of MRI to CT Deformation of Prostate

DOE Office of Scientific and Technical Information (OSTI.GOV)

Padgett, K; University of Miami School of Medicine - Radiology, Miami, FL; Pirozzi, S

2015-06-15

Purpose: Deformable registration algorithms are inherently difficult to characterize in the multi-modality setting due to a significant differences in the characteristics of the different modalities (CT and MRI) as well as tissue deformations. We present a unique paradigm where this is overcome by utilizing a planning-MRI acquired within an hour of the planning-CT serving as a surrogate for quantifying MRI to CT deformation by eliminating the issues of multi-modality comparisons. Methods: For nine subjects, T2 fast-spin-echo images were acquired at two different time points, the first several weeks prior to planning (diagnostic-MRI) and the second on the same day asmore » the planning-CT (planning-MRI). Significant effort in patient positioning and bowel/bladder preparation was undertaken to minimize distortion of the prostate in all datasets. The diagnostic-MRI was rigidly and deformably aligned to the planning-CT utilizing a commercially available deformable registration algorithm synthesized from local registrations. Additionally, the quality of rigid alignment was ranked by an imaging physicist. The distances between corresponding anatomical landmarks on rigid and deformed registrations (diagnostic-MR to planning-CT) were evaluated. Results: It was discovered that in cases where the rigid registration was of acceptable quality the deformable registration didn’t improve the alignment, this was true of all metrics employed. If the analysis is separated into cases where the rigid alignment was ranked as unacceptable the deformable registration significantly improved the alignment, 4.62mm residual error in landmarks as compared to 5.72mm residual error in rigid alignments with a p-value of 0.0008. Conclusion: This paradigm provides an ideal testing ground for MR to CT deformable registration algorithms by allowing for inter-modality comparisons of multi-modality registrations. Consistent positioning, bowel and bladder preparation may Result in higher quality rigid registrations than typically achieved which limits the impact of deformable registrations. In this study cases where significant differences exist, deformable registrations provide significant value.« less
High-speed peak matching algorithm for retention time alignment of gas chromatographic data for chemometric analysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Johnson, Kevin J.; Wright, Bob W.; Jarman, Kristin H.

2003-05-09

A rapid retention time alignment algorithm was developed as a preprocessing utility to be used prior to chemometric analysis of large datasets of diesel fuel gas chromatographic profiles. Retention time variation from chromatogram-to-chromatogram has been a significant impediment against the use of chemometric techniques in the analysis of chromatographic data due to the inability of current multivariate techniques to correctly model information that shifts from variable to variable within a dataset. The algorithm developed is shown to increase the efficacy of pattern recognition methods applied to a set of diesel fuel chromatograms by retaining chemical selectivity while reducing chromatogram-to-chromatogram retentionmore » time variations and to do so on a time scale that makes analysis of large sets of chromatographic data practical.« less
Polynomial interpretation of multipole vectors

NASA Astrophysics Data System (ADS)

Katz, Gabriel; Weeks, Jeff

2004-09-01

Copi, Huterer, Starkman, and Schwarz introduced multipole vectors in a tensor context and used them to demonstrate that the first-year Wilkinson microwave anisotropy probe (WMAP) quadrupole and octopole planes align at roughly the 99.9% confidence level. In the present article, the language of polynomials provides a new and independent derivation of the multipole vector concept. Bézout’s theorem supports an elementary proof that the multipole vectors exist and are unique (up to rescaling). The constructive nature of the proof leads to a fast, practical algorithm for computing multipole vectors. We illustrate the algorithm by finding exact solutions for some simple toy examples and numerical solutions for the first-year WMAP quadrupole and octopole. We then apply our algorithm to Monte Carlo skies to independently reconfirm the estimate that the WMAP quadrupole and octopole planes align at the 99.9% level.
TotalReCaller: improved accuracy and performance via integrated alignment and base-calling.

PubMed

Menges, Fabian; Narzisi, Giuseppe; Mishra, Bud

2011-09-01

Currently, re-sequencing approaches use multiple modules serially to interpret raw sequencing data from next-generation sequencing platforms, while remaining oblivious to the genomic information until the final alignment step. Such approaches fail to exploit the full information from both raw sequencing data and the reference genome that can yield better quality sequence reads, SNP-calls, variant detection, as well as an alignment at the best possible location in the reference genome. Thus, there is a need for novel reference-guided bioinformatics algorithms for interpreting analog signals representing sequences of the bases ({A, C, G, T}), while simultaneously aligning possible sequence reads to a source reference genome whenever available. Here, we propose a new base-calling algorithm, TotalReCaller, to achieve improved performance. A linear error model for the raw intensity data and Burrows-Wheeler transform (BWT) based alignment are combined utilizing a Bayesian score function, which is then globally optimized over all possible genomic locations using an efficient branch-and-bound approach. The algorithm has been implemented in soft- and hardware [field-programmable gate array (FPGA)] to achieve real-time performance. Empirical results on real high-throughput Illumina data were used to evaluate TotalReCaller's performance relative to its peers-Bustard, BayesCall, Ibis and Rolexa-based on several criteria, particularly those important in clinical and scientific applications. Namely, it was evaluated for (i) its base-calling speed and throughput, (ii) its read accuracy and (iii) its specificity and sensitivity in variant calling. A software implementation of TotalReCaller as well as additional information, is available at: http://bioinformatics.nyu.edu/wordpress/projects/totalrecaller/ fabian.menges@nyu.edu.
Accuracy in breast shape alignment with 3D surface fitting algorithms.

PubMed

Riboldi, Marco; Gierga, David P; Chen, George T Y; Baroni, Guido

2009-04-01

Surface imaging is in use in radiotherapy clinical practice for patient setup optimization and monitoring. Breast alignment is accomplished by searching for a tentative spatial correspondence between the reference and daily surface shape models. In this study, the authors quantify whole breast shape alignment by relying on texture features digitized on 3D surface models. Texture feature localization was validated through repeated measurements in a silicone breast phantom, mounted on a high precision mechanical stage. Clinical investigations on breast shape alignment included 133 fractions in 18 patients treated with accelerated partial breast irradiation. The breast shape was detected with a 3D video based surface imaging system so that breathing was compensated. An in-house algorithm for breast alignment, based on surface fitting constrained by nipple matching (constrained surface fitting), was applied. Results were compared with a commercial software where no constraints are utilized (unconstrained surface fitting). Texture feature localization was validated within 2 mm in each anatomical direction. Clinical data show that unconstrained surface fitting achieves adequate accuracy in most cases, though nipple mismatch is considerably higher than residual surface distances (3.9 mm vs 0.6 mm on average). Outliers beyond 1 cm can be experienced as the result of a degenerate surface fit, where unconstrained surface fitting is not sufficient to establish spatial correspondence. In the constrained surface fitting algorithm, average surface mismatch within 1 mm was obtained when nipple position was forced to match in the [1.5; 5] mm range. In conclusion, optimal results can be obtained by trading off the desired overall surface congruence vs matching of selected landmarks (constraint). Constrained surface fitting is put forward to represent an improvement in setup accuracy for those applications where whole breast positional reproducibility is an issue.
GASP: Gapped Ancestral Sequence Prediction for proteins

PubMed Central

Edwards, Richard J; Shields, Denis C

2004-01-01

Background The prediction of ancestral protein sequences from multiple sequence alignments is useful for many bioinformatics analyses. Predicting ancestral sequences is not a simple procedure and relies on accurate alignments and phylogenies. Several algorithms exist based on Maximum Parsimony or Maximum Likelihood methods but many current implementations are unable to process residues with gaps, which may represent insertion/deletion (indel) events or sequence fragments. Results Here we present a new algorithm, GASP (Gapped Ancestral Sequence Prediction), for predicting ancestral sequences from phylogenetic trees and the corresponding multiple sequence alignments. Alignments may be of any size and contain gaps. GASP first assigns the positions of gaps in the phylogeny before using a likelihood-based approach centred on amino acid substitution matrices to assign ancestral amino acids. Important outgroup information is used by first working down from the tips of the tree to the root, using descendant data only to assign probabilities, and then working back up from the root to the tips using descendant and outgroup data to make predictions. GASP was tested on a number of simulated datasets based on real phylogenies. Prediction accuracy for ungapped data was similar to three alternative algorithms tested, with GASP performing better in some cases and worse in others. Adding simple insertions and deletions to the simulated data did not have a detrimental effect on GASP accuracy. Conclusions GASP (Gapped Ancestral Sequence Prediction) will predict ancestral sequences from multiple protein alignments of any size. Although not as accurate in all cases as some of the more sophisticated maximum likelihood approaches, it can process a wide range of input phylogenies and will predict ancestral sequences for gapped and ungapped residues alike. PMID:15350199
fRMSDPred: Predicting Local RMSD Between Structural Fragments Using Sequence Information

DTIC Science & Technology

2007-04-04

machine learning approaches for estimating the RMSD value of a pair of protein fragments. These estimated fragment-level RMSD values can be used to construct the alignment, assess the quality of an alignment, and identify high-quality alignment segments. We present algorithms to solve this fragment-level RMSD prediction problem using a supervised learning framework based on support vector regression and classification that incorporates protein profiles, predicted secondary structure, effective information encoding schemes, and novel second-order pairwise exponential kernel
GateKeeper: a new hardware architecture for accelerating pre-alignment in DNA short read mapping.

PubMed

Alser, Mohammed; Hassan, Hasan; Xin, Hongyi; Ergin, Oguz; Mutlu, Onur; Alkan, Can

2017-11-01

High throughput DNA sequencing (HTS) technologies generate an excessive number of small DNA segments -called short reads- that cause significant computational burden. To analyze the entire genome, each of the billions of short reads must be mapped to a reference genome based on the similarity between a read and 'candidate' locations in that reference genome. The similarity measurement, called alignment, formulated as an approximate string matching problem, is the computational bottleneck because: (i) it is implemented using quadratic-time dynamic programming algorithms and (ii) the majority of candidate locations in the reference genome do not align with a given read due to high dissimilarity. Calculating the alignment of such incorrect candidate locations consumes an overwhelming majority of a modern read mapper's execution time. Therefore, it is crucial to develop a fast and effective filter that can detect incorrect candidate locations and eliminate them before invoking computationally costly alignment algorithms. We propose GateKeeper, a new hardware accelerator that functions as a pre-alignment step that quickly filters out most incorrect candidate locations. GateKeeper is the first design to accelerate pre-alignment using Field-Programmable Gate Arrays (FPGAs), which can perform pre-alignment much faster than software. When implemented on a single FPGA chip, GateKeeper maintains high accuracy (on average >96%) while providing, on average, 90-fold and 130-fold speedup over the state-of-the-art software pre-alignment techniques, Adjacency Filter and Shifted Hamming Distance (SHD), respectively. The addition of GateKeeper as a pre-alignment step can reduce the verification time of the mrFAST mapper by a factor of 10. https://github.com/BilkentCompGen/GateKeeper. mohammedalser@bilkent.edu.tr or onur.mutlu@inf.ethz.ch or calkan@cs.bilkent.edu.tr. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
A survey and evaluations of histogram-based statistics in alignment-free sequence comparison.

PubMed

Luczak, Brian B; James, Benjamin T; Girgis, Hani Z

2017-12-06

Since the dawn of the bioinformatics field, sequence alignment scores have been the main method for comparing sequences. However, alignment algorithms are quadratic, requiring long execution time. As alternatives, scientists have developed tens of alignment-free statistics for measuring the similarity between two sequences. We surveyed tens of alignment-free k-mer statistics. Additionally, we evaluated 33 statistics and multiplicative combinations between the statistics and/or their squares. These statistics are calculated on two k-mer histograms representing two sequences. Our evaluations using global alignment scores revealed that the majority of the statistics are sensitive and capable of finding similar sequences to a query sequence. Therefore, any of these statistics can filter out dissimilar sequences quickly. Further, we observed that multiplicative combinations of the statistics are highly correlated with the identity score. Furthermore, combinations involving sequence length difference or Earth Mover's distance, which takes the length difference into account, are always among the highest correlated paired statistics with identity scores. Similarly, paired statistics including length difference or Earth Mover's distance are among the best performers in finding the K-closest sequences. Interestingly, similar performance can be obtained using histograms of shorter words, resulting in reducing the memory requirement and increasing the speed remarkably. Moreover, we found that simple single statistics are sufficient for processing next-generation sequencing reads and for applications relying on local alignment. Finally, we measured the time requirement of each statistic. The survey and the evaluations will help scientists with identifying efficient alternatives to the costly alignment algorithm, saving thousands of computational hours. The source code of the benchmarking tool is available as Supplementary Materials. © The Author 2017. Published by Oxford University Press.
A CAD System for Evaluating Footwear Fit

NASA Astrophysics Data System (ADS)

Savadkoohi, Bita Ture; de Amicis, Raffaele

With the great growth in footwear demand, the footwear manufacturing industry, for achieving commercial success, must be able to provide the footwear that fulfills consumer's requirement better than it's competitors. Accurate fitting for shoes is an important factor in comfort and functionality. Footwear fitter measurement have been using manual measurement for a long time, but the development of 3D acquisition devices and the advent of powerful 3D visualization and modeling techniques, automatically analyzing, searching and interpretation of the models have now made automatic determination of different foot dimensions feasible. In this paper, we proposed an approach for finding footwear fit within the shoe last data base. We first properly aligned the 3D models using "Weighted" Principle Component Analysis (WPCA). After solving the alignment problem we used an efficient algorithm for cutting the 3D model in order to find the footwear fit from shoe last data base.
Understanding the critical challenges of self-aligned octuple patterning

NASA Astrophysics Data System (ADS)

Yu, Ji; Xiao, Wei; Kang, Weiling; Chen, Yijian

2014-03-01

In this paper, we present a thorough investigation of self-aligned octuple patterning (SAOP) process characteristics, cost structure, integration challenges, and layout decomposition. The statistical characteristics of SAOP CD variations such as multi-modality are analyzed and contributions from various features to CDU and MTT (mean-to-target) budgets are estimated. The gap space is found to have the worst CDU+MTT performance and is used to determine the required overlay accuracy to ensure a satisfactory edge-placement yield of a cut process. Moreover, we propose a 5-mask positive-tone SAOP (pSAOP) process for memory FEOL patterning and a 3-mask negative-tone SAOP (nSAOP) process for logic BEOL patterning. The potential challenges of 2-D SAOP layout decomposition for BEOL applications are identified. Possible decomposition approaches are explored and the functionality of several developed algorithm is verified using 2-D layout examples from Open Cell Library.
Investigation of Nitride Morphology After Self-Aligned Contact Etch

NASA Technical Reports Server (NTRS)

Hwang, Helen H.; Keil, J.; Helmer, B. A.; Chien, T.; Gopaladasu, P.; Kim, J.; Shon, J.; Biegel, Bryan (Technical Monitor)

2001-01-01

Self-Aligned Contact (SAC) etch has emerged as a key enabling technology for the fabrication of very large-scale memory devices. However, this is also a very challenging technology to implement from an etch viewpoint. The issues that arise range from poor oxide etch selectivity to nitride to problems with post etch nitride surface morphology. Unfortunately, the mechanisms that drive nitride loss and surface behavior remain poorly understood. Using a simple langmuir site balance model, SAC nitride etch simulations have been performed and compared to actual etched results. This approach permits the study of various etch mechanisms that may play a role in determining nitride loss and surface morphology. Particle trajectories and fluxes are computed using Monte-Carlo techniques and initial data obtained from double Langmuir probe measurements. Etched surface advancement is implemented using a shock tracking algorithm. Sticking coefficients and etch yields are adjusted to obtain the best agreement between actual etched results and simulated profiles.
Hidden Markov models of biological primary sequence information.

PubMed Central

Baldi, P; Chauvin, Y; Hunkapiller, T; McClure, M A

1994-01-01

Hidden Markov model (HMM) techniques are used to model families of biological sequences. A smooth and convergent algorithm is introduced to iteratively adapt the transition and emission parameters of the models from the examples in a given family. The HMM approach is applied to three protein families: globins, immunoglobulins, and kinases. In all cases, the models derived capture the important statistical characteristics of the family and can be used for a number of tasks, including multiple alignments, motif detection, and classification. For K sequences of average length N, this approach yields an effective multiple-alignment algorithm which requires O(KN2) operations, linear in the number of sequences. PMID:8302831
Reconstruction software of the silicon tracker of DAMPE mission

NASA Astrophysics Data System (ADS)

Tykhonov, A.; Gallo, V.; Wu, X.; Zimmer, S.

2017-10-01

DAMPE is a satellite-borne experiment aimed to probe astroparticle physics in the GeV-TeV energy range. The Silicon tracker (STK) is one of the key components of DAMPE, which allows the reconstruction of trajectories (tracks) of detected particles. The non-negligible amount of material in the tracker poses a challenge to its reconstruction and alignment. In this paper we describe methods to address this challenge. We present the track reconstruction algorithm and give insight into the alignment algorithm. We also present our CAD-to-GDML converter, an in-house tool for implementing detector geometry in the software from the CAD drawings of the detector.
Automatic segmentation of mammogram and tomosynthesis images

NASA Astrophysics Data System (ADS)

Sargent, Dusty; Park, Sun Young

2016-03-01

Breast cancer is a one of the most common forms of cancer in terms of new cases and deaths both in the United States and worldwide. However, the survival rate with breast cancer is high if it is detected and treated before it spreads to other parts of the body. The most common screening methods for breast cancer are mammography and digital tomosynthesis, which involve acquiring X-ray images of the breasts that are interpreted by radiologists. The work described in this paper is aimed at optimizing the presentation of mammography and tomosynthesis images to the radiologist, thereby improving the early detection rate of breast cancer and the resulting patient outcomes. Breast cancer tissue has greater density than normal breast tissue, and appears as dense white image regions that are asymmetrical between the breasts. These irregularities are easily seen if the breast images are aligned and viewed side-by-side. However, since the breasts are imaged separately during mammography, the images may be poorly centered and aligned relative to each other, and may not properly focus on the tissue area. Similarly, although a full three dimensional reconstruction can be created from digital tomosynthesis images, the same centering and alignment issues can occur for digital tomosynthesis. Thus, a preprocessing algorithm that aligns the breasts for easy side-by-side comparison has the potential to greatly increase the speed and accuracy of mammogram reading. Likewise, the same preprocessing can improve the results of automatic tissue classification algorithms for mammography. In this paper, we present an automated segmentation algorithm for mammogram and tomosynthesis images that aims to improve the speed and accuracy of breast cancer screening by mitigating the above mentioned problems. Our algorithm uses information in the DICOM header to facilitate preprocessing, and incorporates anatomical region segmentation and contour analysis, along with a hidden Markov model (HMM) for processing the multi-frame tomosynthesis images. The output of the algorithm is a new set of images that have been processed to show only the diagnostically relevant region and align the breasts so that they can be easily compared side-by-side. Our method has been tested on approximately 750 images, including various examples of mammogram, tomosynthesis, and scanned images, and has correctly segmented the diagnostically relevant image region in 97% of cases.
A Palmprint Recognition Algorithm Using Phase-Only Correlation

NASA Astrophysics Data System (ADS)

Ito, Koichi; Aoki, Takafumi; Nakajima, Hiroshi; Kobayashi, Koji; Higuchi, Tatsuo

This paper presents a palmprint recognition algorithm using Phase-Only Correlation (POC). The use of phase components in 2D (two-dimensional) discrete Fourier transforms of palmprint images makes it possible to achieve highly robust image registration and matching. In the proposed algorithm, POC is used to align scaling, rotation and translation between two palmprint images, and evaluate similarity between them. Experimental evaluation using a palmprint image database clearly demonstrates efficient matching performance of the proposed algorithm.
Navigation Algorithms for the SeaWiFS Mission

NASA Technical Reports Server (NTRS)

Hooker, Stanford B. (Editor); Firestone, Elaine R. (Editor); Patt, Frederick S.; McClain, Charles R. (Technical Monitor)

2002-01-01

The navigation algorithms for the Sea-viewing Wide Field-of-view Sensor (SeaWiFS) were designed to meet the requirement of 1-pixel accuracy-a standard deviation (sigma) of 2. The objective has been to extract the best possible accuracy from the spacecraft telemetry and avoid the need for costly manual renavigation or geometric rectification. The requirement is addressed by postprocessing of both the Global Positioning System (GPS) receiver and Attitude Control System (ACS) data in the spacecraft telemetry stream. The navigation algorithms described are separated into four areas: orbit processing, attitude sensor processing, attitude determination, and final navigation processing. There has been substantial modification during the mission of the attitude determination and attitude sensor processing algorithms. For the former, the basic approach was completely changed during the first year of the mission, from a single-frame deterministic method to a Kalman smoother. This was done for several reasons: a) to improve the overall accuracy of the attitude determination, particularly near the sub-solar point; b) to reduce discontinuities; c) to support the single-ACS-string spacecraft operation that was started after the first mission year, which causes gaps in attitude sensor coverage; and d) to handle data quality problems (which became evident after launch) in the direct-broadcast data. The changes to the attitude sensor processing algorithms primarily involved the development of a model for the Earth horizon height, also needed for single-string operation; the incorporation of improved sensor calibration data; and improved data quality checking and smoothing to handle the data quality issues. The attitude sensor alignments have also been revised multiple times, generally in conjunction with the other changes. The orbit and final navigation processing algorithms have remained largely unchanged during the mission, aside from refinements to data quality checking. Although further improvements are certainly possible, future evolution of the algorithms is expected to be limited to refinements of the methods presented here, and no substantial changes are anticipated.
Self calibrating autoTRAC

NASA Technical Reports Server (NTRS)

Everett, Louis J.

1994-01-01

The work reported here demonstrates how to automatically compute the position and attitude of a targeting reflective alignment concept (TRAC) camera relative to the robot end effector. In the robotics literature this is known as the sensor registration problem. The registration problem is important to solve if TRAC images need to be related to robot position. Previously, when TRAC operated on the end of a robot arm, the camera had to be precisely located at the correct orientation and position. If this location is in error, then the robot may not be able to grapple an object even though the TRAC sensor indicates it should. In addition, if the camera is significantly far from the alignment it is expected to be at, TRAC may give incorrect feedback for the control of the robot. A simple example is if the robot operator thinks the camera is right side up but the camera is actually upside down, the camera feedback will tell the operator to move in an incorrect direction. The automatic calibration algorithm requires the operator to translate and rotate the robot arbitrary amounts along (about) two coordinate directions. After the motion, the algorithm determines the transformation matrix from the robot end effector to the camera image plane. This report discusses the TRAC sensor registration problem.
Customisation of the exome data analysis pipeline using a combinatorial approach.

PubMed

Pattnaik, Swetansu; Vaidyanathan, Srividya; Pooja, Durgad G; Deepak, Sa; Panda, Binay

2012-01-01

The advent of next generation sequencing (NGS) technologies have revolutionised the way biologists produce, analyse and interpret data. Although NGS platforms provide a cost-effective way to discover genome-wide variants from a single experiment, variants discovered by NGS need follow up validation due to the high error rates associated with various sequencing chemistries. Recently, whole exome sequencing has been proposed as an affordable option compared to whole genome runs but it still requires follow up validation of all the novel exomic variants. Customarily, a consensus approach is used to overcome the systematic errors inherent to the sequencing technology, alignment and post alignment variant detection algorithms. However, the aforementioned approach warrants the use of multiple sequencing chemistry, multiple alignment tools, multiple variant callers which may not be viable in terms of time and money for individual investigators with limited informatics know-how. Biologists often lack the requisite training to deal with the huge amount of data produced by NGS runs and face difficulty in choosing from the list of freely available analytical tools for NGS data analysis. Hence, there is a need to customise the NGS data analysis pipeline to preferentially retain true variants by minimising the incidence of false positives and make the choice of right analytical tools easier. To this end, we have sampled different freely available tools used at the alignment and post alignment stage suggesting the use of the most suitable combination determined by a simple framework of pre-existing metrics to create significant datasets.

In-flight alignment using H ∞ filter for strapdown INS on aircraft.

PubMed

Pei, Fu-Jun; Liu, Xuan; Zhu, Li

2014-01-01

In-flight alignment is an effective way to improve the accuracy and speed of initial alignment for strapdown inertial navigation system (INS). During the aircraft flight, strapdown INS alignment was disturbed by lineal and angular movements of the aircraft. To deal with the disturbances in dynamic initial alignment, a novel alignment method for SINS is investigated in this paper. In this method, an initial alignment error model of SINS in the inertial frame is established. The observability of the system is discussed by piece-wise constant system (PWCS) theory and observable degree is computed by the singular value decomposition (SVD) theory. It is demonstrated that the system is completely observable, and all the system state parameters can be estimated by optimal filter. Then a H ∞ filter was designed to resolve the uncertainty of measurement noise. The simulation results demonstrate that the proposed algorithm can reach a better accuracy under the dynamic disturbance condition.
Using structure to explore the sequence alignment space of remote homologs.

PubMed

Kuziemko, Andrew; Honig, Barry; Petrey, Donald

2011-10-01

Protein structure modeling by homology requires an accurate sequence alignment between the query protein and its structural template. However, sequence alignment methods based on dynamic programming (DP) are typically unable to generate accurate alignments for remote sequence homologs, thus limiting the applicability of modeling methods. A central problem is that the alignment that is "optimal" in terms of the DP score does not necessarily correspond to the alignment that produces the most accurate structural model. That is, the correct alignment based on structural superposition will generally have a lower score than the optimal alignment obtained from sequence. Variations of the DP algorithm have been developed that generate alternative alignments that are "suboptimal" in terms of the DP score, but these still encounter difficulties in detecting the correct structural alignment. We present here a new alternative sequence alignment method that relies heavily on the structure of the template. By initially aligning the query sequence to individual fragments in secondary structure elements and combining high-scoring fragments that pass basic tests for "modelability", we can generate accurate alignments within a small ensemble. Our results suggest that the set of sequences that can currently be modeled by homology can be greatly extended.
Two Simple and Efficient Algorithms to Compute the SP-Score Objective Function of a Multiple Sequence Alignment.

PubMed

Ranwez, Vincent

2016-01-01

Multiple sequence alignment (MSA) is a crucial step in many molecular analyses and many MSA tools have been developed. Most of them use a greedy approach to construct a first alignment that is then refined by optimizing the sum of pair score (SP-score). The SP-score estimation is thus a bottleneck for most MSA tools since it is repeatedly required and is time consuming. Given an alignment of n sequences and L sites, I introduce here optimized solutions reaching O(nL) time complexity for affine gap cost, instead of O(n2L), which are easy to implement.
Heuristic reusable dynamic programming: efficient updates of local sequence alignment.

PubMed

Hong, Changjin; Tewfik, Ahmed H

2009-01-01

Recomputation of the previously evaluated similarity results between biological sequences becomes inevitable when researchers realize errors in their sequenced data or when the researchers have to compare nearly similar sequences, e.g., in a family of proteins. We present an efficient scheme for updating local sequence alignments with an affine gap model. In principle, using the previous matching result between two amino acid sequences, we perform a forward-backward alignment to generate heuristic searching bands which are bounded by a set of suboptimal paths. Given a correctly updated sequence, we initially predict a new score of the alignment path for each contour to select the best candidates among them. Then, we run the Smith-Waterman algorithm in this confined space. Furthermore, our heuristic alignment for an updated sequence shows that it can be further accelerated by using reusable dynamic programming (rDP), our prior work. In this study, we successfully validate "relative node tolerance bound" (RNTB) in the pruned searching space. Furthermore, we improve the computational performance by quantifying the successful RNTB tolerance probability and switch to rDP on perturbation-resilient columns only. In our searching space derived by a threshold value of 90 percent of the optimal alignment score, we find that 98.3 percent of contours contain correctly updated paths. We also find that our method consumes only 25.36 percent of the runtime cost of sparse dynamic programming (sDP) method, and to only 2.55 percent of that of a normal dynamic programming with the Smith-Waterman algorithm.
Concurrent and Accurate Short Read Mapping on Multicore Processors.

PubMed

Martínez, Héctor; Tárraga, Joaquín; Medina, Ignacio; Barrachina, Sergio; Castillo, Maribel; Dopazo, Joaquín; Quintana-Ortí, Enrique S

2015-01-01

We introduce a parallel aligner with a work-flow organization for fast and accurate mapping of RNA sequences on servers equipped with multicore processors. Our software, HPG Aligner SA (HPG Aligner SA is an open-source application. The software is available at http://www.opencb.org, exploits a suffix array to rapidly map a large fraction of the RNA fragments (reads), as well as leverages the accuracy of the Smith-Waterman algorithm to deal with conflictive reads. The aligner is enhanced with a careful strategy to detect splice junctions based on an adaptive division of RNA reads into small segments (or seeds), which are then mapped onto a number of candidate alignment locations, providing crucial information for the successful alignment of the complete reads. The experimental results on a platform with Intel multicore technology report the parallel performance of HPG Aligner SA, on RNA reads of 100-400 nucleotides, which excels in execution time/sensitivity to state-of-the-art aligners such as TopHat 2+Bowtie 2, MapSplice, and STAR.
Dark Energy Camera for Blanco

DOE Office of Scientific and Technical Information (OSTI.GOV)

Binder, Gary A.; /Caltech /SLAC

2010-08-25

In order to make accurate measurements of dark energy, a system is needed to monitor the focus and alignment of the Dark Energy Camera (DECam) to be located on the Blanco 4m Telescope for the upcoming Dark Energy Survey. One new approach under development is to fit out-of-focus star images to a point spread function from which information about the focus and tilt of the camera can be obtained. As a first test of a new algorithm using this idea, simulated star images produced from a model of DECam in the optics software Zemax were fitted. Then, real images frommore » the Mosaic II imager currently installed on the Blanco telescope were used to investigate the algorithm's capabilities. A number of problems with the algorithm were found, and more work is needed to understand its limitations and improve its capabilities so it can reliably predict camera alignment and focus.« less
Reducing the number of templates for aligned-spin compact binary coalescence gravitational wave searches using metric-agnostic template nudging

NASA Astrophysics Data System (ADS)

Indik, Nathaniel; Fehrmann, Henning; Harke, Franz; Krishnan, Badri; Nielsen, Alex B.

2018-06-01

Efficient multidimensional template placement is crucial in computationally intensive matched-filtering searches for gravitational waves (GWs). Here, we implement the neighboring cell algorithm (NCA) to improve the detection volume of an existing compact binary coalescence (CBC) template bank. This algorithm has already been successfully applied for a binary millisecond pulsar search in data from the Fermi satellite. It repositions templates from overdense regions to underdense regions and reduces the number of templates that would have been required by a stochastic method to achieve the same detection volume. Our method is readily generalizable to other CBC parameter spaces. Here we apply this method to the aligned-single-spin neutron star-black hole binary coalescence inspiral-merger-ringdown gravitational wave parameter space. We show that the template nudging algorithm can attain the equivalent effectualness of the stochastic method with 12% fewer templates.
Successive approximation algorithm for beam-position-monitor-based LHC collimator alignment

NASA Astrophysics Data System (ADS)

Valentino, Gianluca; Nosych, Andriy A.; Bruce, Roderik; Gasior, Marek; Mirarchi, Daniele; Redaelli, Stefano; Salvachua, Belen; Wollmann, Daniel

2014-02-01

Collimators with embedded beam position monitor (BPM) button electrodes will be installed in the Large Hadron Collider (LHC) during the current long shutdown period. For the subsequent operation, BPMs will allow the collimator jaws to be kept centered around the beam orbit. In this manner, a better beam cleaning efficiency and machine protection can be provided at unprecedented higher beam energies and intensities. A collimator alignment algorithm is proposed to center the jaws automatically around the beam. The algorithm is based on successive approximation and takes into account a correction of the nonlinear BPM sensitivity to beam displacement and an asymmetry of the electronic channels processing the BPM electrode signals. A software implementation was tested with a prototype collimator in the Super Proton Synchrotron. This paper presents results of the tests along with some considerations for eventual operation in the LHC.
A New Polar Transfer Alignment Algorithm with the Aid of a Star Sensor and Based on an Adaptive Unscented Kalman Filter.

PubMed

Cheng, Jianhua; Wang, Tongda; Wang, Lu; Wang, Zhenmin

2017-10-23

Because of the harsh polar environment, the master strapdown inertial navigation system (SINS) has low accuracy and the system model information becomes abnormal. In this case, existing polar transfer alignment (TA) algorithms which use the measurement information provided by master SINS would lose their effectiveness. In this paper, a new polar TA algorithm with the aid of a star sensor and based on an adaptive unscented Kalman filter (AUKF) is proposed to deal with the problems. Since the measurement information provided by master SINS is inaccurate, the accurate information provided by the star sensor is chosen as the measurement. With the compensation of lever-arm effect and the model of star sensor, the nonlinear navigation equations are derived. Combined with the attitude matching method, the filter models for polar TA are designed. An AUKF is introduced to solve the abnormal information of system model. Then, the AUKF is used to estimate the states of TA. Results have demonstrated that the performance of the new polar TA algorithm is better than the state-of-the-art polar TA algorithms. Therefore, the new polar TA algorithm proposed in this paper is effectively to ensure and improve the accuracy of TA in the harsh polar environment.
A New Polar Transfer Alignment Algorithm with the Aid of a Star Sensor and Based on an Adaptive Unscented Kalman Filter

PubMed Central

Cheng, Jianhua; Wang, Tongda; Wang, Lu; Wang, Zhenmin

2017-01-01

Because of the harsh polar environment, the master strapdown inertial navigation system (SINS) has low accuracy and the system model information becomes abnormal. In this case, existing polar transfer alignment (TA) algorithms which use the measurement information provided by master SINS would lose their effectiveness. In this paper, a new polar TA algorithm with the aid of a star sensor and based on an adaptive unscented Kalman filter (AUKF) is proposed to deal with the problems. Since the measurement information provided by master SINS is inaccurate, the accurate information provided by the star sensor is chosen as the measurement. With the compensation of lever-arm effect and the model of star sensor, the nonlinear navigation equations are derived. Combined with the attitude matching method, the filter models for polar TA are designed. An AUKF is introduced to solve the abnormal information of system model. Then, the AUKF is used to estimate the states of TA. Results have demonstrated that the performance of the new polar TA algorithm is better than the state-of-the-art polar TA algorithms. Therefore, the new polar TA algorithm proposed in this paper is effectively to ensure and improve the accuracy of TA in the harsh polar environment. PMID:29065521
Modified compensation algorithm of lever-arm effect and flexural deformation for polar shipborne transfer alignment based on improved adaptive Kalman filter

NASA Astrophysics Data System (ADS)

Wang, Tongda; Cheng, Jianhua; Guan, Dongxue; Kang, Yingyao; Zhang, Wei

2017-09-01

Due to the lever-arm effect and flexural deformation in the practical application of transfer alignment (TA), the TA performance is decreased. The existing polar TA algorithm only compensates a fixed lever-arm without considering the dynamic lever-arm caused by flexural deformation; traditional non-polar TA algorithms also have some limitations. Thus, the performance of existing compensation algorithms is unsatisfactory. In this paper, a modified compensation algorithm of the lever-arm effect and flexural deformation is proposed to promote the accuracy and speed of the polar TA. On the basis of a dynamic lever-arm model and a noise compensation method for flexural deformation, polar TA equations are derived in grid frames. Based on the velocity-plus-attitude matching method, the filter models of polar TA are designed. An adaptive Kalman filter (AKF) is improved to promote the robustness and accuracy of the system, and then applied to the estimation of the misalignment angles. Simulation and experiment results have demonstrated that the modified compensation algorithm based on the improved AKF for polar TA can effectively compensate the lever-arm effect and flexural deformation, and then improve the accuracy and speed of TA in the polar region.
Ontology Matching with Semantic Verification.

PubMed

Jean-Mary, Yves R; Shironoshita, E Patrick; Kabuka, Mansur R

2009-09-01

ASMOV (Automated Semantic Matching of Ontologies with Verification) is a novel algorithm that uses lexical and structural characteristics of two ontologies to iteratively calculate a similarity measure between them, derives an alignment, and then verifies it to ensure that it does not contain semantic inconsistencies. In this paper, we describe the ASMOV algorithm, and then present experimental results that measure its accuracy using the OAEI 2008 tests, and that evaluate its use with two different thesauri: WordNet, and the Unified Medical Language System (UMLS). These results show the increased accuracy obtained by combining lexical, structural and extensional matchers with semantic verification, and demonstrate the advantage of using a domain-specific thesaurus for the alignment of specialized ontologies.
Knowledge-Guided Docking of WW Domain Proteins and Flexible Ligands

NASA Astrophysics Data System (ADS)

Lu, Haiyun; Li, Hao; Banu Bte Sm Rashid, Shamima; Leow, Wee Kheng; Liou, Yih-Cherng

Studies of interactions between protein domains and ligands are important in many aspects such as cellular signaling. We present a knowledge-guided approach for docking protein domains and flexible ligands. The approach is applied to the WW domain, a small protein module mediating signaling complexes which have been implicated in diseases such as muscular dystrophy and Liddle’s syndrome. The first stage of the approach employs a substring search for two binding grooves of WW domains and possible binding motifs of peptide ligands based on known features. The second stage aligns the ligand’s peptide backbone to the two binding grooves using a quasi-Newton constrained optimization algorithm. The backbone-aligned ligands produced serve as good starting points to the third stage which uses any flexible docking algorithm to perform the docking. The experimental results demonstrate that the backbone alignment method in the second stage performs better than conventional rigid superposition given two binding constraints. It is also shown that using the backbone-aligned ligands as initial configurations improves the flexible docking in the third stage. The presented approach can also be applied to other protein domains that involve binding of flexible ligand to two or more binding sites.
Cellular and Nuclear Alignment Analysis for Determining Epithelial Cell Chirality

PubMed Central

Raymond, Michael J.; Ray, Poulomi; Kaur, Gurleen; Singh, Ajay V.; Wan, Leo Q.

2015-01-01

Left-right (LR) asymmetry is a biologically conserved property in living organisms that can be observed in the asymmetrical arrangement of organs and tissues and in tissue morphogenesis, such as the directional looping of the gastrointestinal tract and heart. The expression of LR asymmetry in embryonic tissues can be appreciated in biased cell alignment. Previously an in vitro chirality assay was reported by patterning multiple cells on microscale defined geometries and quantified the cell phenotype–dependent LR asymmetry, or cell chirality. However, morphology and chirality of individual cells on micropatterned surfaces has not been well characterized. Here, a Python-based algorithm was developed to identify and quantify immunofluorescence stained individual epithelial cells on multicellular patterns. This approach not only produces results similar to the image intensity gradient-based method reported previously, but also can capture properties of single cells such as area and aspect ratio. We also found that cell nuclei exhibited biased alignment. Around 35% cells were misaligned and were typically smaller and less elongated. This new imaging analysis approach is an effective tool for measuring single cell chirality inside multicellular structures and can potentially help unveil biophysical mechanisms underlying cellular chiral bias both in vitro and in vivo. PMID:26294010
Image navigation and registration for the geostationary lightning mapper (GLM)

NASA Astrophysics Data System (ADS)

van Bezooijen, Roel W. H.; Demroff, Howard; Burton, Gregory; Chu, Donald; Yang, Shu S.

2016-10-01

The Geostationary Lightning Mappers (GLM) for the Geostationary Operational Environmental Satellite (GOES) GOES-R series will, for the first time, provide hemispherical lightning information 24 hours a day from longitudes of 75 and 137 degrees west. The first GLM of a series of four is planned for launch in November, 2016. Observation of lightning patterns by GLM holds promise to improve tornado warning lead times to greater than 20 minutes while halving the present false alarm rates. In addition, GLM will improve airline traffic flow management, and provide climatology data allowing us to understand the Earth's evolving climate. The paper describes the method used for translating the pixel position of a lightning event to its corresponding geodetic longitude and latitude, using the J2000 attitude of the GLM mount frame reported by the spacecraft, the position of the spacecraft, and the alignment of the GLM coordinate frame relative to its mount frame. Because the latter alignment will experience seasonal variation, this alignment is determined daily using GLM background images collected over the previous 7 days. The process involves identification of coastlines in the background images and determination of the alignment change necessary to match the detected coastline with the coastline predicted using the GSHHS database. Registration is achieved using a variation of the Lucas-Kanade algorithm where we added a dither and average technique to improve performance significantly. An innovative water mask technique was conceived to enable self-contained detection of clear coastline sections usable for registration. Extensive simulations using accurate visible images from GOES13 and GOES15 have been used to demonstrate the performance of the coastline registration method, the results of which are presented in the paper.
Application of Quaternion in improving the quality of global sequence alignment scores for an ambiguous sequence target in Streptococcus pneumoniae DNA

NASA Astrophysics Data System (ADS)

Lestari, D.; Bustamam, A.; Novianti, T.; Ardaneswari, G.

2017-07-01

DNA sequence can be defined as a succession of letters, representing the order of nucleotides within DNA, using a permutation of four DNA base codes including adenine (A), guanine (G), cytosine (C), and thymine (T). The precise code of the sequences is determined using DNA sequencing methods and technologies, which have been developed since the 1970s and currently become highly developed, advanced and highly throughput sequencing technologies. So far, DNA sequencing has greatly accelerated biological and medical research and discovery. However, in some cases DNA sequencing could produce any ambiguous and not clear enough sequencing results that make them quite difficult to be determined whether these codes are A, T, G, or C. To solve these problems, in this study we can introduce other representation of DNA codes namely Quaternion Q = (PA, PT, PG, PC), where PA, PT, PG, PC are the probability of A, T, G, C bases that could appear in Q and PA + PT + PG + PC = 1. Furthermore, using Quaternion representations we are able to construct the improved scoring matrix for global sequence alignment processes, by applying a dot product method. Moreover, this scoring matrix produces better and higher quality of the match and mismatch score between two DNA base codes. In implementation, we applied the Needleman-Wunsch global sequence alignment algorithm using Octave, to analyze our target sequence which contains some ambiguous sequence data. The subject sequences are the DNA sequences of Streptococcus pneumoniae families obtained from the Genebank, meanwhile the target DNA sequence are received from our collaborator database. As the results we found the Quaternion representations improve the quality of the sequence alignment score and we can conclude that DNA sequence target has maximum similarity with Streptococcus pneumoniae.
BatMis: a fast algorithm for k-mismatch mapping.

PubMed

Tennakoon, Chandana; Purbojati, Rikky W; Sung, Wing-Kin

2012-08-15

Second-generation sequencing (SGS) generates millions of reads that need to be aligned to a reference genome allowing errors. Although current aligners can efficiently map reads allowing a small number of mismatches, they are not well suited for handling a large number of mismatches. The efficiency of aligners can be improved using various heuristics, but the sensitivity and accuracy of the alignments are sacrificed. In this article, we introduce Basic Alignment tool for Mismatches (BatMis)--an efficient method to align short reads to a reference allowing k mismatches. BatMis is a Burrows-Wheeler transformation based aligner that uses a seed and extend approach, and it is an exact method. Benchmark tests show that BatMis performs better than competing aligners in solving the k-mismatch problem. Furthermore, it can compete favorably even when compared with the heuristic modes of the other aligners. BatMis is a useful alternative for applications where fast k-mismatch mappings, unique mappings or multiple mappings of SGS data are required. BatMis is written in C/C++ and is freely available from http://code.google.com/p/batmis/
ChimeRScope: a novel alignment-free algorithm for fusion transcript prediction using paired-end RNA-Seq data

PubMed Central

Li, You; Heavican, Tayla B.; Vellichirammal, Neetha N.; Iqbal, Javeed

2017-01-01

Abstract The RNA-Seq technology has revolutionized transcriptome characterization not only by accurately quantifying gene expression, but also by the identification of novel transcripts like chimeric fusion transcripts. The ‘fusion’ or ‘chimeric’ transcripts have improved the diagnosis and prognosis of several tumors, and have led to the development of novel therapeutic regimen. The fusion transcript detection is currently accomplished by several software packages, primarily relying on sequence alignment algorithms. The alignment of sequencing reads from fusion transcript loci in cancer genomes can be highly challenging due to the incorrect mapping induced by genomic alterations, thereby limiting the performance of alignment-based fusion transcript detection methods. Here, we developed a novel alignment-free method, ChimeRScope that accurately predicts fusion transcripts based on the gene fingerprint (as k-mers) profiles of the RNA-Seq paired-end reads. Results on published datasets and in-house cancer cell line datasets followed by experimental validations demonstrate that ChimeRScope consistently outperforms other popular methods irrespective of the read lengths and sequencing depth. More importantly, results on our in-house datasets show that ChimeRScope is a better tool that is capable of identifying novel fusion transcripts with potential oncogenic functions. ChimeRScope is accessible as a standalone software at (https://github.com/ChimeRScope/ChimeRScope/wiki) or via the Galaxy web-interface at (https://galaxy.unmc.edu/). PMID:28472320
Robust group-wise rigid registration of point sets using t-mixture model

NASA Astrophysics Data System (ADS)

Ravikumar, Nishant; Gooya, Ali; Frangi, Alejandro F.; Taylor, Zeike A.

2016-03-01

A probabilistic framework for robust, group-wise rigid alignment of point-sets using a mixture of Students t-distribution especially when the point sets are of varying lengths, are corrupted by an unknown degree of outliers or in the presence of missing data. Medical images (in particular magnetic resonance (MR) images), their segmentations and consequently point-sets generated from these are highly susceptible to corruption by outliers. This poses a problem for robust correspondence estimation and accurate alignment of shapes, necessary for training statistical shape models (SSMs). To address these issues, this study proposes to use a t-mixture model (TMM), to approximate the underlying joint probability density of a group of similar shapes and align them to a common reference frame. The heavy-tailed nature of t-distributions provides a more robust registration framework in comparison to state of the art algorithms. Significant reduction in alignment errors is achieved in the presence of outliers, using the proposed TMM-based group-wise rigid registration method, in comparison to its Gaussian mixture model (GMM) counterparts. The proposed TMM-framework is compared with a group-wise variant of the well-known Coherent Point Drift (CPD) algorithm and two other group-wise methods using GMMs, using both synthetic and real data sets. Rigid alignment errors for groups of shapes are quantified using the Hausdorff distance (HD) and quadratic surface distance (QSD) metrics.
Embedding strategies for effective use of information from multiple sequence alignments.

PubMed Central

Henikoff, S.; Henikoff, J. G.

1997-01-01

We describe a new strategy for utilizing multiple sequence alignment information to detect distant relationships in searches of sequence databases. A single sequence representing a protein family is enriched by replacing conserved regions with position-specific scoring matrices (PSSMs) or consensus residues derived from multiple alignments of family members. In comprehensive tests of these and other family representations, PSSM-embedded queries produced the best results overall when used with a special version of the Smith-Waterman searching algorithm. Moreover, embedding consensus residues instead of PSSMs improved performance with readily available single sequence query searching programs, such as BLAST and FASTA. Embedding PSSMs or consensus residues into a representative sequence improves searching performance by extracting multiple alignment information from motif regions while retaining single sequence information where alignment is uncertain. PMID:9070452

Attitude sensor alignment calibration for the solar maximum mission

NASA Technical Reports Server (NTRS)

Pitone, Daniel S.; Shuster, Malcolm D.

1990-01-01

An earlier heuristic study of the fine attitude sensors for the Solar Maximum Mission (SMM) revealed a temperature dependence of the alignment about the yaw axis of the pair of fixed-head star trackers relative to the fine pointing Sun sensor. Here, new sensor alignment algorithms which better quantify the dependence of the alignments on the temperature are developed and applied to the SMM data. Comparison with the results from the previous study reveals the limitations of the heuristic approach. In addition, some of the basic assumptions made in the prelaunch analysis of the alignments of the SMM are examined. The results of this work have important consequences for future missions with stringent attitude requirements and where misalignment variations due to variations in the temperature will be significant.
Intra-operative registration for image enhanced endoscopic sinus surgery using photo-consistency.

PubMed

Chen, Min Si; Gonzalez, Gerardo; Lapeer, Rudy

2007-01-01

The purpose of this paper is to present an intensity based algorithm for aligning 2D endoscopic images with virtual images generated from pre-operative 3D data. The proposed algorithm uses photo-consistency as the measurement of similarity between images, provided the illumination is independent from the viewing direction.
Feature-based three-dimensional registration for repetitive geometry in machine vision

PubMed Central

Gong, Yuanzheng; Seibel, Eric J.

2016-01-01

As an important step in three-dimensional (3D) machine vision, 3D registration is a process of aligning two or multiple 3D point clouds that are collected from different perspectives together into a complete one. The most popular approach to register point clouds is to minimize the difference between these point clouds iteratively by Iterative Closest Point (ICP) algorithm. However, ICP does not work well for repetitive geometries. To solve this problem, a feature-based 3D registration algorithm is proposed to align the point clouds that are generated by vision-based 3D reconstruction. By utilizing texture information of the object and the robustness of image features, 3D correspondences can be retrieved so that the 3D registration of two point clouds is to solve a rigid transformation. The comparison of our method and different ICP algorithms demonstrates that our proposed algorithm is more accurate, efficient and robust for repetitive geometry registration. Moreover, this method can also be used to solve high depth uncertainty problem caused by little camera baseline in vision-based 3D reconstruction. PMID:28286703
SnapDock—template-based docking by Geometric Hashing

PubMed Central

Estrin, Michael; Wolfson, Haim J.

2017-01-01

Abstract Motivation: A highly efficient template-based protein–protein docking algorithm, nicknamed SnapDock, is presented. It employs a Geometric Hashing-based structural alignment scheme to align the target proteins to the interfaces of non-redundant protein–protein interface libraries. Docking of a pair of proteins utilizing the 22 600 interface PIFACE library is performed in < 2 min on the average. A flexible version of the algorithm allowing hinge motion in one of the proteins is presented as well. Results: To evaluate the performance of the algorithm a blind re-modelling of 3547 PDB complexes, which have been uploaded after the PIFACE publication has been performed with success ratio of about 35%. Interestingly, a similar experiment with the template free PatchDock docking algorithm yielded a success rate of about 23% with roughly 1/3 of the solutions different from those of SnapDock. Consequently, the combination of the two methods gave a 42% success ratio. Availability and implementation: A web server of the application is under development. Contact: michaelestrin@gmail.com or wolfson@tau.ac.il PMID:28881968
QuickProbs—A Fast Multiple Sequence Alignment Algorithm Designed for Graphics Processors

PubMed Central

Gudyś, Adam; Deorowicz, Sebastian

2014-01-01

Multiple sequence alignment is a crucial task in a number of biological analyses like secondary structure prediction, domain searching, phylogeny, etc. MSAProbs is currently the most accurate alignment algorithm, but its effectiveness is obtained at the expense of computational time. In the paper we present QuickProbs, the variant of MSAProbs customised for graphics processors. We selected the two most time consuming stages of MSAProbs to be redesigned for GPU execution: the posterior matrices calculation and the consistency transformation. Experiments on three popular benchmarks (BAliBASE, PREFAB, OXBench-X) on quad-core PC equipped with high-end graphics card show QuickProbs to be 5.7 to 9.7 times faster than original CPU-parallel MSAProbs. Additional tests performed on several protein families from Pfam database give overall speed-up of 6.7. Compared to other algorithms like MAFFT, MUSCLE, or ClustalW, QuickProbs proved to be much more accurate at similar speed. Additionally we introduce a tuned variant of QuickProbs which is significantly more accurate on sets of distantly related sequences than MSAProbs without exceeding its computation time. The GPU part of QuickProbs was implemented in OpenCL, thus the package is suitable for graphics processors produced by all major vendors. PMID:24586435
Search algorithm complexity modeling with application to image alignment and matching

NASA Astrophysics Data System (ADS)

DelMarco, Stephen

2014-05-01

Search algorithm complexity modeling, in the form of penetration rate estimation, provides a useful way to estimate search efficiency in application domains which involve searching over a hypothesis space of reference templates or models, as in model-based object recognition, automatic target recognition, and biometric recognition. The penetration rate quantifies the expected portion of the database that must be searched, and is useful for estimating search algorithm computational requirements. In this paper we perform mathematical modeling to derive general equations for penetration rate estimates that are applicable to a wide range of recognition problems. We extend previous penetration rate analyses to use more general probabilistic modeling assumptions. In particular we provide penetration rate equations within the framework of a model-based image alignment application domain in which a prioritized hierarchical grid search is used to rank subspace bins based on matching probability. We derive general equations, and provide special cases based on simplifying assumptions. We show how previously-derived penetration rate equations are special cases of the general formulation. We apply the analysis to model-based logo image alignment in which a hierarchical grid search is used over a geometric misalignment transform hypothesis space. We present numerical results validating the modeling assumptions and derived formulation.
Group sparse multiview patch alignment framework with view consistency for image classification.

PubMed

Gui, Jie; Tao, Dacheng; Sun, Zhenan; Luo, Yong; You, Xinge; Tang, Yuan Yan

2014-07-01

No single feature can satisfactorily characterize the semantic concepts of an image. Multiview learning aims to unify different kinds of features to produce a consensual and efficient representation. This paper redefines part optimization in the patch alignment framework (PAF) and develops a group sparse multiview patch alignment framework (GSM-PAF). The new part optimization considers not only the complementary properties of different views, but also view consistency. In particular, view consistency models the correlations between all possible combinations of any two kinds of view. In contrast to conventional dimensionality reduction algorithms that perform feature extraction and feature selection independently, GSM-PAF enjoys joint feature extraction and feature selection by exploiting l(2,1)-norm on the projection matrix to achieve row sparsity, which leads to the simultaneous selection of relevant features and learning transformation, and thus makes the algorithm more discriminative. Experiments on two real-world image data sets demonstrate the effectiveness of GSM-PAF for image classification.
In-Flight Alignment Using H ∞ Filter for Strapdown INS on Aircraft

PubMed Central

Pei, Fu-Jun; Liu, Xuan; Zhu, Li

2014-01-01

In-flight alignment is an effective way to improve the accuracy and speed of initial alignment for strapdown inertial navigation system (INS). During the aircraft flight, strapdown INS alignment was disturbed by lineal and angular movements of the aircraft. To deal with the disturbances in dynamic initial alignment, a novel alignment method for SINS is investigated in this paper. In this method, an initial alignment error model of SINS in the inertial frame is established. The observability of the system is discussed by piece-wise constant system (PWCS) theory and observable degree is computed by the singular value decomposition (SVD) theory. It is demonstrated that the system is completely observable, and all the system state parameters can be estimated by optimal filter. Then a H ∞ filter was designed to resolve the uncertainty of measurement noise. The simulation results demonstrate that the proposed algorithm can reach a better accuracy under the dynamic disturbance condition. PMID:24511300
Progressive Stereo Locking (PSL): A Residual Dipolar Coupling Based Force Field Method for Determining the Relative Configuration of Natural Products and Other Small Molecules.

PubMed

Cornilescu, Gabriel; Ramos Alvarenga, René F; Wyche, Thomas P; Bugni, Tim S; Gil, Roberto R; Cornilescu, Claudia C; Westler, William M; Markley, John L; Schwieters, Charles D

2017-08-18

Establishing the relative configuration of a bioactive natural product represents the most challenging part in determining its structure. Residual dipolar couplings (RDCs) are sensitive probes of the relative spatial orientation of internuclear vectors. We adapted a force field structure calculation methodology to allow free sampling of both R and S configurations of the stereocenters of interest. The algorithm uses a floating alignment tensor in a simulated annealing protocol to identify the conformations and configurations that best fit experimental RDC and distance restraints (from NOE and J-coupling data). A unique configuration (for rigid molecules) or a very small number of configurations (for less rigid molecules) of the structural models having the lowest chiral angle energies and reasonable magnitudes of the alignment tensor are provided as the best predictions of the unknown configuration. For highly flexible molecules, the progressive locking of their stereocenters into their statistically dominant R or S state dramatically reduces the number of possible relative configurations. The result is verified by checking that the same configuration is obtained by initiating the locking from different regions of the molecule. For all molecules tested having known configurations (with conformations ranging from mostly rigid to highly flexible), the method accurately determined the correct configuration.
Tyre-road grip coefficient assessment - Part II: online estimation using instrumented vehicle, extended Kalman filter, and neural network

NASA Astrophysics Data System (ADS)

Luque, Pablo; Mántaras, Daniel A.; Fidalgo, Eloy; Álvarez, Javier; Riva, Paolo; Girón, Pablo; Compadre, Diego; Ferran, Jordi

2013-12-01

The main objective of this work is to determine the limit of safe driving conditions by identifying the maximal friction coefficient in a real vehicle. The study will focus on finding a method to determine this limit before reaching the skid, which is valuable information in the context of traffic safety. Since it is not possible to measure the friction coefficient directly, it will be estimated using the appropriate tools in order to get the most accurate information. A real vehicle is instrumented to collect information of general kinematics and steering tie-rod forces. A real-time algorithm is developed to estimate forces and aligning torque in the tyres using an extended Kalman filter and neural networks techniques. The methodology is based on determining the aligning torque; this variable allows evaluation of the behaviour of the tyre. It transmits interesting information from the tyre-road contact and can be used to predict the maximal tyre grip and safety margin. The maximal grip coefficient is estimated according to a knowledge base, extracted from computer simulation of a high detailed three-dimensional model, using Adams® software. The proposed methodology is validated and applied to real driving conditions, in which maximal grip and safety margin are properly estimated.
SW#db: GPU-Accelerated Exact Sequence Similarity Database Search.

PubMed

Korpar, Matija; Šošić, Martin; Blažeka, Dino; Šikić, Mile

2015-01-01

In recent years we have witnessed a growth in sequencing yield, the number of samples sequenced, and as a result-the growth of publicly maintained sequence databases. The increase of data present all around has put high requirements on protein similarity search algorithms with two ever-opposite goals: how to keep the running times acceptable while maintaining a high-enough level of sensitivity. The most time consuming step of similarity search are the local alignments between query and database sequences. This step is usually performed using exact local alignment algorithms such as Smith-Waterman. Due to its quadratic time complexity, alignments of a query to the whole database are usually too slow. Therefore, the majority of the protein similarity search methods prior to doing the exact local alignment apply heuristics to reduce the number of possible candidate sequences in the database. However, there is still a need for the alignment of a query sequence to a reduced database. In this paper we present the SW#db tool and a library for fast exact similarity search. Although its running times, as a standalone tool, are comparable to the running times of BLAST, it is primarily intended to be used for exact local alignment phase in which the database of sequences has already been reduced. It uses both GPU and CPU parallelization and was 4-5 times faster than SSEARCH, 6-25 times faster than CUDASW++ and more than 20 times faster than SSW at the time of writing, using multiple queries on Swiss-prot and Uniref90 databases.
Collaborative Beamfocusing Radio (COBRA)

NASA Astrophysics Data System (ADS)

Rode, Jeremy P.; Hsu, Mark J.; Smith, David; Husain, Anis

2013-05-01

A Ziva team has recently demonstrated a novel technique called Collaborative Beamfocusing Radios (COBRA) which enables an ad-hoc collection of distributed commercial off-the-shelf software defined radios to coherently align and beamform to a remote radio. COBRA promises to operate even in high multipath and non-line-of-sight environments as well as mobile applications without resorting to computationally expensive closed loop techniques that are currently unable to operate with significant movement. COBRA exploits two key technologies to achieve coherent beamforming. The first is Time Reversal (TR) which compensates for multipath and automatically discovers the optimal spatio-temporal matched filter to enable peak signal gains (up to 20 dB) and diffraction-limited focusing at the intended receiver in NLOS and severe multipath environments. The second is time-aligned buffering which enables TR to synchronize distributed transmitters into a collaborative array. This time alignment algorithm avoids causality violations through the use of reciprocal buffering. Preserving spatio-temporal reciprocity through the TR capture and retransmission process achieves coherent alignment across multiple radios at ~GHz carriers using only standard quartz-oscillators. COBRA has been demonstrated in the lab, aligning two off-the-shelf software defined radios over-the-air to an accuracy of better than 2 degrees of carrier alignment at 450 MHz. The COBRA algorithms are lightweight, with computation in 5 ms on a smartphone class microprocessor. COBRA also has low start-up latency, achieving high accuracy from a cold-start in 30 ms. The COBRA technique opens up a large number of new capabilities in communications, and electronic warfare including selective spatial jamming, geolocation and anti-geolocation.
SPHERE: SPherical Harmonic Elastic REgistration of HARDI Data

PubMed Central

Yap, Pew-Thian; Chen, Yasheng; An, Hongyu; Yang, Yang; Gilmore, John H.; Lin, Weili

2010-01-01

In contrast to the more common Diffusion Tensor Imaging (DTI), High Angular Resolution Diffusion Imaging (HARDI) allows superior delineation of angular microstructures of brain white matter, and makes possible multiple-fiber modeling of each voxel for better characterization of brain connectivity. However, the complex orientation information afforded by HARDI makes registration of HARDI images more complicated than scalar images. In particular, the question of how much orientation information is needed for satisfactory alignment has not been sufficiently addressed. Low order orientation representation is generally more robust than high order representation, although the latter provides more information for correct alignment of fiber pathways. However, high order representation, when naïvely utilized, might not necessarily be conducive to improving registration accuracy since similar structures with significant orientation differences prior to proper alignment might be mistakenly taken as non-matching structures. We present in this paper a HARDI registration algorithm, called SPherical Harmonic Elastic REgistration (SPHERE), which in a principled means hierarchically extracts orientation information from HARDI data for structural alignment. The image volumes are first registered using robust, relatively direction invariant features derived from the Orientation Distribution Function (ODF), and the alignment is then further refined using spherical harmonic (SH) representation with gradually increasing orders. This progression from non-directional, single-directional to multi-directional representation provides a systematic means of extracting directional information given by diffusion-weighted imaging. Coupled with a template-subject-consistent soft-correspondence-matching scheme, this approach allows robust and accurate alignment of HARDI data. Experimental results show marked increase in accuracy over a state-of-the-art DTI registration algorithm. PMID:21147231
The protein structure prediction problem could be solved using the current PDB library

PubMed Central

Zhang, Yang; Skolnick, Jeffrey

2005-01-01

For single-domain proteins, we examine the completeness of the structures in the current Protein Data Bank (PDB) library for use in full-length model construction of unknown sequences. To address this issue, we employ a comprehensive benchmark set of 1,489 medium-size proteins that cover the PDB at the level of 35% sequence identity and identify templates by structure alignment. With homologous proteins excluded, we can always find similar folds to native with an average rms deviation (RMSD) from native of 2.5 Å with ≈82% alignment coverage. These template structures often contain a significant number of insertions/deletions. The tasser algorithm was applied to build full-length models, where continuous fragments are excised from the top-scoring templates and reassembled under the guide of an optimized force field, which includes consensus restraints taken from the templates and knowledge-based statistical potentials. For almost all targets (except for 2/1,489), the resultant full-length models have an RMSD to native below 6 Å (97% of them below 4 Å). On average, the RMSD of full-length models is 2.25 Å, with aligned regions improved from 2.5 Å to 1.88 Å, comparable with the accuracy of low-resolution experimental structures. Furthermore, starting from state-of-the-art structural alignments, we demonstrate a methodology that can consistently bring template-based alignments closer to native. These results are highly suggestive that the protein-folding problem can in principle be solved based on the current PDB library by developing efficient fold recognition algorithms that can recover such initial alignments. PMID:15653774
Global Network Alignment in the Context of Aging.

PubMed

Faisal, Fazle Elahi; Zhao, Han; Milenkovic, Tijana

2015-01-01

Analogous to sequence alignment, network alignment (NA) can be used to transfer biological knowledge across species between conserved network regions. NA faces two algorithmic challenges: 1) Which cost function to use to capture "similarities" between nodes in different networks? 2) Which alignment strategy to use to rapidly identify "high-scoring" alignments from all possible alignments? We "break down" existing state-of-the-art methods that use both different cost functions and different alignment strategies to evaluate each combination of their cost functions and alignment strategies. We find that a combination of the cost function of one method and the alignment strategy of another method beats the existing methods. Hence, we propose this combination as a novel superior NA method. Then, since human aging is hard to study experimentally due to long lifespan, we use NA to transfer aging-related knowledge from well annotated model species to poorly annotated human. By doing so, we produce novel human aging-related knowledge, which complements currently available knowledge about aging that has been obtained mainly by sequence alignment. We demonstrate significant similarity between topological and functional properties of our novel predictions and those of known aging-related genes. We are the first to use NA to learn more about aging.
libgapmis: extending short-read alignments

PubMed Central

2013-01-01

Background A wide variety of short-read alignment programmes have been published recently to tackle the problem of mapping millions of short reads to a reference genome, focusing on different aspects of the procedure such as time and memory efficiency, sensitivity, and accuracy. These tools allow for a small number of mismatches in the alignment; however, their ability to allow for gaps varies greatly, with many performing poorly or not allowing them at all. The seed-and-extend strategy is applied in most short-read alignment programmes. After aligning a substring of the reference sequence against the high-quality prefix of a short read--the seed--an important problem is to find the best possible alignment between a substring of the reference sequence succeeding and the remaining suffix of low quality of the read--extend. The fact that the reads are rather short and that the gap occurrence frequency observed in various studies is rather low suggest that aligning (parts of) those reads with a single gap is in fact desirable. Results In this article, we present libgapmis, a library for extending pairwise short-read alignments. Apart from the standard CPU version, it includes ultrafast SSE- and GPU-based implementations. libgapmis is based on an algorithm computing a modified version of the traditional dynamic-programming matrix for sequence alignment. Extensive experimental results demonstrate that the functions of the CPU version provided in this library accelerate the computations by a factor of 20 compared to other programmes. The analogous SSE- and GPU-based implementations accelerate the computations by a factor of 6 and 11, respectively, compared to the CPU version. The library also provides the user the flexibility to split the read into fragments, based on the observed gap occurrence frequency and the length of the read, thereby allowing for a variable, but bounded, number of gaps in the alignment. Conclusions We present libgapmis, a library for extending pairwise short-read alignments. We show that libgapmis is better-suited and more efficient than existing algorithms for this task. The importance of our contribution is underlined by the fact that the provided functions may be seamlessly integrated into any short-read alignment pipeline. The open-source code of libgapmis is available at http://www.exelixis-lab.org/gapmis. PMID:24564250
Chemometric Analysis of Gas Chromatography – Mass Spectrometry Data using Fast Retention Time Alignment via a Total Ion Current Shift Function

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nadeau, Jeremy S.; Wright, Bob W.; Synovec, Robert E.

2010-04-15

A critical comparison of methods for correcting severely retention time shifted gas chromatography-mass spectrometry (GC-MS) data is presented. The method reported herein is an adaptation to the Piecewise Alignment Algorithm to quickly align severely shifted one-dimensional (1D) total ion current (TIC) data, then applying these shifts to broadly align all mass channels throughout the separation, referred to as a TIC shift function (SF). The maximum shift varied from (-) 5 s in the beginning of the chromatographic separation to (+) 20 s toward the end of the separation, equivalent to a maximum shift of over 5 peak widths. Implementing themore » TIC shift function (TIC SF) prior to Fisher Ratio (F-Ratio) feature selection and then principal component analysis (PCA) was found to be a viable approach to classify complex chromatograms, that in this study were obtained from GC-MS separations of three gasoline samples serving as complex test mixtures, referred to as types C, M and S. The reported alignment algorithm via the TIC SF approach corrects for large dynamic shifting in the data as well as subtle peak-to-peak shifts. The benefits of the overall TIC SF alignment and feature selection approach were quantified using the degree-of-class separation (DCS) metric of the PCA scores plots using the type C and M samples, since they were the most similar, and thus the most challenging samples to properly classify. The DCS values showed an increase from an initial value of essentially zero for the unaligned GC-TIC data to a value of 7.9 following alignment; however, the DCS was unchanged by feature selection using F-Ratios for the GC-TIC data. The full mass spectral data provided an increase to a final DCS of 13.7 after alignment and two-dimensional (2D) F-Ratio feature selection.« less
A distributed system for fast alignment of next-generation sequencing data.

PubMed

Srimani, Jaydeep K; Wu, Po-Yen; Phan, John H; Wang, May D

2010-12-01

We developed a scalable distributed computing system using the Berkeley Open Interface for Network Computing (BOINC) to align next-generation sequencing (NGS) data quickly and accurately. NGS technology is emerging as a promising platform for gene expression analysis due to its high sensitivity compared to traditional genomic microarray technology. However, despite the benefits, NGS datasets can be prohibitively large, requiring significant computing resources to obtain sequence alignment results. Moreover, as the data and alignment algorithms become more prevalent, it will become necessary to examine the effect of the multitude of alignment parameters on various NGS systems. We validate the distributed software system by (1) computing simple timing results to show the speed-up gained by using multiple computers, (2) optimizing alignment parameters using simulated NGS data, and (3) computing NGS expression levels for a single biological sample using optimal parameters and comparing these expression levels to that of a microarray sample. Results indicate that the distributed alignment system achieves approximately a linear speed-up and correctly distributes sequence data to and gathers alignment results from multiple compute clients.
k-merSNP discovery: Software for alignment-and reference-free scalable SNP discovery, phylogenetics, and annotation for hundreds of microbial genomes

DOE Office of Scientific and Technical Information (OSTI.GOV)

With the flood of whole genome finished and draft microbial sequences, we need faster, more scalable bioinformatics tools for sequence comparison. An algorithm is described to find single nucleotide polymorphisms (SNPs) in whole genome data. It scales to hundreds of bacterial or viral genomes, and can be used for finished and/or draft genomes available as unassembled contigs or raw, unassembled reads. The method is fast to compute, finding SNPs and building a SNP phylogeny in minutes to hours, depending on the size and diversity of the input sequences. The SNP-based trees that result are consistent with known taxonomy and treesmore » determined in other studies. The approach we describe can handle many gigabases of sequence in a single run. The algorithm is based on k-mer analysis.« less
Memory-efficient dynamic programming backtrace and pairwise local sequence alignment.

PubMed

Newberg, Lee A

2008-08-15

A backtrace through a dynamic programming algorithm's intermediate results in search of an optimal path, or to sample paths according to an implied probability distribution, or as the second stage of a forward-backward algorithm, is a task of fundamental importance in computational biology. When there is insufficient space to store all intermediate results in high-speed memory (e.g. cache) existing approaches store selected stages of the computation, and recompute missing values from these checkpoints on an as-needed basis. Here we present an optimal checkpointing strategy, and demonstrate its utility with pairwise local sequence alignment of sequences of length 10,000. Sample C++-code for optimal backtrace is available in the Supplementary Materials. Supplementary data is available at Bioinformatics online.

Whole brain diffeomorphic metric mapping via integration of sulcal and gyral curves, cortical surfaces, and images

PubMed Central

Du, Jia; Younes, Laurent; Qiu, Anqi

2011-01-01

This paper introduces a novel large deformation diffeomorphic metric mapping algorithm for whole brain registration where sulcal and gyral curves, cortical surfaces, and intensity images are simultaneously carried from one subject to another through a flow of diffeomorphisms. To the best of our knowledge, this is the first time that the diffeomorphic metric from one brain to another is derived in a shape space of intensity images and point sets (such as curves and surfaces) in a unified manner. We describe the Euler–Lagrange equation associated with this algorithm with respect to momentum, a linear transformation of the velocity vector field of the diffeomorphic flow. The numerical implementation for solving this variational problem, which involves large-scale kernel convolution in an irregular grid, is made feasible by introducing a class of computationally friendly kernels. We apply this algorithm to align magnetic resonance brain data. Our whole brain mapping results show that our algorithm outperforms the image-based LDDMM algorithm in terms of the mapping accuracy of gyral/sulcal curves, sulcal regions, and cortical and subcortical segmentation. Moreover, our algorithm provides better whole brain alignment than combined volumetric and surface registration (Postelnicu et al., 2009) and hierarchical attribute matching mechanism for elastic registration (HAMMER) (Shen and Davatzikos, 2002) in terms of cortical and subcortical volume segmentation. PMID:21281722
A Secure Alignment Algorithm for Mapping Short Reads to Human Genome.

PubMed

Zhao, Yongan; Wang, Xiaofeng; Tang, Haixu

2018-05-09

The elastic and inexpensive computing resources such as clouds have been recognized as a useful solution to analyzing massive human genomic data (e.g., acquired by using next-generation sequencers) in biomedical researches. However, outsourcing human genome computation to public or commercial clouds was hindered due to privacy concerns: even a small number of human genome sequences contain sufficient information for identifying the donor of the genomic data. This issue cannot be directly addressed by existing security and cryptographic techniques (such as homomorphic encryption), because they are too heavyweight to carry out practical genome computation tasks on massive data. In this article, we present a secure algorithm to accomplish the read mapping, one of the most basic tasks in human genomic data analysis based on a hybrid cloud computing model. Comparing with the existing approaches, our algorithm delegates most computation to the public cloud, while only performing encryption and decryption on the private cloud, and thus makes the maximum use of the computing resource of the public cloud. Furthermore, our algorithm reports similar results as the nonsecure read mapping algorithms, including the alignment between reads and the reference genome, which can be directly used in the downstream analysis such as the inference of genomic variations. We implemented the algorithm in C++ and Python on a hybrid cloud system, in which the public cloud uses an Apache Spark system.
Increasing feasibility of the field-programmable gate array implementation of an iterative image registration using a kernel-warping algorithm

NASA Astrophysics Data System (ADS)

Nguyen, An Hung; Guillemette, Thomas; Lambert, Andrew J.; Pickering, Mark R.; Garratt, Matthew A.

2017-09-01

Image registration is a fundamental image processing technique. It is used to spatially align two or more images that have been captured at different times, from different sensors, or from different viewpoints. There have been many algorithms proposed for this task. The most common of these being the well-known Lucas-Kanade (LK) and Horn-Schunck approaches. However, the main limitation of these approaches is the computational complexity required to implement the large number of iterations necessary for successful alignment of the images. Previously, a multi-pass image interpolation algorithm (MP-I2A) was developed to considerably reduce the number of iterations required for successful registration compared with the LK algorithm. This paper develops a kernel-warping algorithm (KWA), a modified version of the MP-I2A, which requires fewer iterations to successfully register two images and less memory space for the field-programmable gate array (FPGA) implementation than the MP-I2A. These reductions increase feasibility of the implementation of the proposed algorithm on FPGAs with very limited memory space and other hardware resources. A two-FPGA system rather than single FPGA system is successfully developed to implement the KWA in order to compensate insufficiency of hardware resources supported by one FPGA, and increase parallel processing ability and scalability of the system.
Multispectra CWT-based algorithm (MCWT) in mass spectra for peak extraction.

PubMed

Hsueh, Huey-Miin; Kuo, Hsun-Chih; Tsai, Chen-An

2008-01-01

An important objective in mass spectrometry (MS) is to identify a set of biomarkers that can be used to potentially distinguish patients between distinct treatments (or conditions) from tens or hundreds of spectra. A common two-step approach involving peak extraction and quantification is employed to identify the features of scientific interest. The selected features are then used for further investigation to understand underlying biological mechanism of individual protein or for development of genomic biomarkers to early diagnosis. However, the use of inadequate or ineffective peak detection and peak alignment algorithms in peak extraction step may lead to a high rate of false positives. Also, it is crucial to reduce the false positive rate in detecting biomarkers from ten or hundreds of spectra. Here a new procedure is introduced for feature extraction in mass spectrometry data that extends the continuous wavelet transform-based (CWT-based) algorithm to multiple spectra. The proposed multispectra CWT-based algorithm (MCWT) not only can perform peak detection for multiple spectra but also carry out peak alignment at the same time. The author' MCWT algorithm constructs a reference, which integrates information of multiple raw spectra, for feature extraction. The algorithm is applied to a SELDI-TOF mass spectra data set provided by CAMDA 2006 with known polypeptide m/z positions. This new approach is easy to implement and it outperforms the existing peak extraction method from the Bioconductor PROcess package.
Comparison of Metabolic Pathways in Escherichia coli by Using Genetic Algorithms.

PubMed

Ortegon, Patricia; Poot-Hernández, Augusto C; Perez-Rueda, Ernesto; Rodriguez-Vazquez, Katya

2015-01-01

In order to understand how cellular metabolism has taken its modern form, the conservation and variations between metabolic pathways were evaluated by using a genetic algorithm (GA). The GA approach considered information on the complete metabolism of the bacterium Escherichia coli K-12, as deposited in the KEGG database, and the enzymes belonging to a particular pathway were transformed into enzymatic step sequences by using the breadth-first search algorithm. These sequences represent contiguous enzymes linked to each other, based on their catalytic activities as they are encoded in the Enzyme Commission numbers. In a posterior step, these sequences were compared using a GA in an all-against-all (pairwise comparisons) approach. Individual reactions were chosen based on their measure of fitness to act as parents of offspring, which constitute the new generation. The sequences compared were used to construct a similarity matrix (of fitness values) that was then considered to be clustered by using a k-medoids algorithm. A total of 34 clusters of conserved reactions were obtained, and their sequences were finally aligned with a multiple-sequence alignment GA optimized to align all the reaction sequences included in each group or cluster. From these comparisons, maps associated with the metabolism of similar compounds also contained similar enzymatic step sequences, reinforcing the Patchwork Model for the evolution of metabolism in E. coli K-12, an observation that can be expanded to other organisms, for which there is metabolism information. Finally, our mapping of these reactions is discussed, with illustrations from a particular case.
Comparison of Metabolic Pathways in Escherichia coli by Using Genetic Algorithms

PubMed Central

Ortegon, Patricia; Poot-Hernández, Augusto C.; Perez-Rueda, Ernesto; Rodriguez-Vazquez, Katya

2015-01-01

In order to understand how cellular metabolism has taken its modern form, the conservation and variations between metabolic pathways were evaluated by using a genetic algorithm (GA). The GA approach considered information on the complete metabolism of the bacterium Escherichia coli K-12, as deposited in the KEGG database, and the enzymes belonging to a particular pathway were transformed into enzymatic step sequences by using the breadth-first search algorithm. These sequences represent contiguous enzymes linked to each other, based on their catalytic activities as they are encoded in the Enzyme Commission numbers. In a posterior step, these sequences were compared using a GA in an all-against-all (pairwise comparisons) approach. Individual reactions were chosen based on their measure of fitness to act as parents of offspring, which constitute the new generation. The sequences compared were used to construct a similarity matrix (of fitness values) that was then considered to be clustered by using a k-medoids algorithm. A total of 34 clusters of conserved reactions were obtained, and their sequences were finally aligned with a multiple-sequence alignment GA optimized to align all the reaction sequences included in each group or cluster. From these comparisons, maps associated with the metabolism of similar compounds also contained similar enzymatic step sequences, reinforcing the Patchwork Model for the evolution of metabolism in E. coli K-12, an observation that can be expanded to other organisms, for which there is metabolism information. Finally, our mapping of these reactions is discussed, with illustrations from a particular case. PMID:25973143
A Demons algorithm for image registration with locally adaptive regularization.

PubMed

Cahill, Nathan D; Noble, J Alison; Hawkes, David J

2009-01-01

Thirion's Demons is a popular algorithm for nonrigid image registration because of its linear computational complexity and ease of implementation. It approximately solves the diffusion registration problem by successively estimating force vectors that drive the deformation toward alignment and smoothing the force vectors by Gaussian convolution. In this article, we show how the Demons algorithm can be generalized to allow image-driven locally adaptive regularization in a manner that preserves both the linear complexity and ease of implementation of the original Demons algorithm. We show that the proposed algorithm exhibits lower target registration error and requires less computational effort than the original Demons algorithm on the registration of serial chest CT scans of patients with lung nodules.
RaptorX server: a resource for template-based protein structure modeling.

PubMed

Källberg, Morten; Margaryan, Gohar; Wang, Sheng; Ma, Jianzhu; Xu, Jinbo

2014-01-01

Assigning functional properties to a newly discovered protein is a key challenge in modern biology. To this end, computational modeling of the three-dimensional atomic arrangement of the amino acid chain is often crucial in determining the role of the protein in biological processes. We present a community-wide web-based protocol, RaptorX server ( http://raptorx.uchicago.edu ), for automated protein secondary structure prediction, template-based tertiary structure modeling, and probabilistic alignment sampling.Given a target sequence, RaptorX server is able to detect even remotely related template sequences by means of a novel nonlinear context-specific alignment potential and probabilistic consistency algorithm. Using the protocol presented here it is thus possible to obtain high-quality structural models for many target protein sequences when only distantly related protein domains have experimentally solved structures. At present, RaptorX server can perform secondary and tertiary structure prediction of a 200 amino acid target sequence in approximately 30 min.
YAHA: fast and flexible long-read alignment with optimal breakpoint detection.

PubMed

Faust, Gregory G; Hall, Ira M

2012-10-01

With improved short-read assembly algorithms and the recent development of long-read sequencers, split mapping will soon be the preferred method for structural variant (SV) detection. Yet, current alignment tools are not well suited for this. We present YAHA, a fast and flexible hash-based aligner. YAHA is as fast and accurate as BWA-SW at finding the single best alignment per query and is dramatically faster and more sensitive than both SSAHA2 and MegaBLAST at finding all possible alignments. Unlike other aligners that report all, or one, alignment per query, or that use simple heuristics to select alignments, YAHA uses a directed acyclic graph to find the optimal set of alignments that cover a query using a biologically relevant breakpoint penalty. YAHA can also report multiple mappings per defined segment of the query. We show that YAHA detects more breakpoints in less time than BWA-SW across all SV classes, and especially excels at complex SVs comprising multiple breakpoints. YAHA is currently supported on 64-bit Linux systems. Binaries and sample data are freely available for download from http://faculty.virginia.edu/irahall/YAHA. imh4y@virginia.edu.
STELLAR: fast and exact local alignments

PubMed Central

2011-01-01

Background Large-scale comparison of genomic sequences requires reliable tools for the search of local alignments. Practical local aligners are in general fast, but heuristic, and hence sometimes miss significant matches. Results We present here the local pairwise aligner STELLAR that has full sensitivity for ε-alignments, i.e. guarantees to report all local alignments of a given minimal length and maximal error rate. The aligner is composed of two steps, filtering and verification. We apply the SWIFT algorithm for lossless filtering, and have developed a new verification strategy that we prove to be exact. Our results on simulated and real genomic data confirm and quantify the conjecture that heuristic tools like BLAST or BLAT miss a large percentage of significant local alignments. Conclusions STELLAR is very practical and fast on very long sequences which makes it a suitable new tool for finding local alignments between genomic sequences under the edit distance model. Binaries are freely available for Linux, Windows, and Mac OS X at http://www.seqan.de/projects/stellar. The source code is freely distributed with the SeqAn C++ library version 1.3 and later at http://www.seqan.de. PMID:22151882
Data preprocessing method for liquid chromatography-mass spectrometry based metabolomics.

PubMed

Wei, Xiaoli; Shi, Xue; Kim, Seongho; Zhang, Li; Patrick, Jeffrey S; Binkley, Joe; McClain, Craig; Zhang, Xiang

2012-09-18

A set of data preprocessing algorithms for peak detection and peak list alignment are reported for analysis of liquid chromatography-mass spectrometry (LC-MS)-based metabolomics data. For spectrum deconvolution, peak picking is achieved at the selected ion chromatogram (XIC) level. To estimate and remove the noise in XICs, each XIC is first segmented into several peak groups based on the continuity of scan number, and the noise level is estimated by all the XIC signals, except the regions potentially with presence of metabolite ion peaks. After removing noise, the peaks of molecular ions are detected using both the first and the second derivatives, followed by an efficient exponentially modified Gaussian-based peak deconvolution method for peak fitting. A two-stage alignment algorithm is also developed, where the retention times of all peaks are first transferred into the z-score domain and the peaks are aligned based on the measure of their mixture scores after retention time correction using a partial linear regression. Analysis of a set of spike-in LC-MS data from three groups of samples containing 16 metabolite standards mixed with metabolite extract from mouse livers demonstrates that the developed data preprocessing method performs better than two of the existing popular data analysis packages, MZmine2.6 and XCMS(2), for peak picking, peak list alignment, and quantification.
A Data Pre-processing Method for Liquid Chromatography Mass Spectrometry-based Metabolomics

PubMed Central

Wei, Xiaoli; Shi, Xue; Kim, Seongho; Zhang, Li; Patrick, Jeffrey S.; Binkley, Joe; McClain, Craig; Zhang, Xiang

2012-01-01

A set of data pre-processing algorithms for peak detection and peak list alignment are reported for analysis of LC-MS based metabolomics data. For spectrum deconvolution, peak picking is achieved at selected ion chromatogram (XIC) level. To estimate and remove the noise in XICs, each XIC is first segmented into several peak groups based on the continuity of scan number, and the noise level is estimated by all the XIC signals, except the regions potentially with presence of metabolite ion peaks. After removing noise, the peaks of molecular ions are detected using both the first and the second derivatives, followed by an efficient exponentially modified Gaussian-based peak deconvolution method for peak fitting. A two-stage alignment algorithm is also developed, where the retention times of all peaks are first transferred into z-score domain and the peaks are aligned based on the measure of their mixture scores after retention time correction using a partial linear regression. Analysis of a set of spike-in LC-MS data from three groups of samples containing 16 metabolite standards mixed with metabolite extract from mouse livers, demonstrates that the developed data pre-processing methods performs better than two of the existing popular data analysis packages, MZmine2.6 and XCMS2, for peak picking, peak list alignment and quantification. PMID:22931487
Automatic alignment of pre- and post-interventional liver CT images for assessment of radiofrequency ablation

NASA Astrophysics Data System (ADS)

Rieder, Christian; Wirtz, Stefan; Strehlow, Jan; Zidowitz, Stephan; Bruners, Philipp; Isfort, Peter; Mahnken, Andreas H.; Peitgen, Heinz-Otto

2012-02-01

Image-guided radiofrequency ablation (RFA) is becoming a standard procedure for minimally invasive tumor treatment in clinical practice. To verify the treatment success of the therapy, reliable post-interventional assessment of the ablation zone (coagulation) is essential. Typically, pre- and post-interventional CT images have to be aligned to compare the shape, size, and position of tumor and coagulation zone. In this work, we present an automatic workflow for masking liver tissue, enabling a rigid registration algorithm to perform at least as accurate as experienced medical experts. To minimize the effect of global liver deformations, the registration is computed in a local region of interest around the pre-interventional lesion and post-interventional coagulation necrosis. A registration mask excluding lesions and neighboring organs is calculated to prevent the registration algorithm from matching both lesion shapes instead of the surrounding liver anatomy. As an initial registration step, the centers of gravity from both lesions are aligned automatically. The subsequent rigid registration method is based on the Local Cross Correlation (LCC) similarity measure and Newton-type optimization. To assess the accuracy of our method, 41 RFA cases are registered and compared with the manually aligned cases from four medical experts. Furthermore, the registration results are compared with ground truth transformations based on averaged anatomical landmark pairs. In the evaluation, we show that our method allows to automatic alignment of the data sets with equal accuracy as medical experts, but requiring significancy less time consumption and variability.
Objective evaluation of linear and nonlinear tomosynthetic reconstruction algorithms

NASA Astrophysics Data System (ADS)

Webber, Richard L.; Hemler, Paul F.; Lavery, John E.

2000-04-01

This investigation objectively tests five different tomosynthetic reconstruction methods involving three different digital sensors, each used in a different radiologic application: chest, breast, and pelvis, respectively. The common task was to simulate a specific representative projection for each application by summation of appropriately shifted tomosynthetically generated slices produced by using the five algorithms. These algorithms were, respectively, (1) conventional back projection, (2) iteratively deconvoluted back projection, (3) a nonlinear algorithm similar to back projection, except that the minimum value from all of the component projections for each pixel is computed instead of the average value, (4) a similar algorithm wherein the maximum value was computed instead of the minimum value, and (5) the same type of algorithm except that the median value was computed. Using these five algorithms, we obtained data from each sensor-tissue combination, yielding three factorially distributed series of contiguous tomosynthetic slices. The respective slice stacks then were aligned orthogonally and averaged to yield an approximation of a single orthogonal projection radiograph of the complete (unsliced) tissue thickness. Resulting images were histogram equalized, and actual projection control images were subtracted from their tomosynthetically synthesized counterparts. Standard deviations of the resulting histograms were recorded as inverse figures of merit (FOMs). Visual rankings of image differences by five human observers of a subset (breast data only) also were performed to determine whether their subjective observations correlated with homologous FOMs. Nonparametric statistical analysis of these data demonstrated significant differences (P > 0.05) between reconstruction algorithms. The nonlinear minimization reconstruction method nearly always outperformed the other methods tested. Observer rankings were similar to those measured objectively.
An integrated workflow for robust alignment and simplified quantitative analysis of NMR spectrometry data.

PubMed

Vu, Trung N; Valkenborg, Dirk; Smets, Koen; Verwaest, Kim A; Dommisse, Roger; Lemière, Filip; Verschoren, Alain; Goethals, Bart; Laukens, Kris

2011-10-20

Nuclear magnetic resonance spectroscopy (NMR) is a powerful technique to reveal and compare quantitative metabolic profiles of biological tissues. However, chemical and physical sample variations make the analysis of the data challenging, and typically require the application of a number of preprocessing steps prior to data interpretation. For example, noise reduction, normalization, baseline correction, peak picking, spectrum alignment and statistical analysis are indispensable components in any NMR analysis pipeline. We introduce a novel suite of informatics tools for the quantitative analysis of NMR metabolomic profile data. The core of the processing cascade is a novel peak alignment algorithm, called hierarchical Cluster-based Peak Alignment (CluPA). The algorithm aligns a target spectrum to the reference spectrum in a top-down fashion by building a hierarchical cluster tree from peak lists of reference and target spectra and then dividing the spectra into smaller segments based on the most distant clusters of the tree. To reduce the computational time to estimate the spectral misalignment, the method makes use of Fast Fourier Transformation (FFT) cross-correlation. Since the method returns a high-quality alignment, we can propose a simple methodology to study the variability of the NMR spectra. For each aligned NMR data point the ratio of the between-group and within-group sum of squares (BW-ratio) is calculated to quantify the difference in variability between and within predefined groups of NMR spectra. This differential analysis is related to the calculation of the F-statistic or a one-way ANOVA, but without distributional assumptions. Statistical inference based on the BW-ratio is achieved by bootstrapping the null distribution from the experimental data. The workflow performance was evaluated using a previously published dataset. Correlation maps, spectral and grey scale plots show clear improvements in comparison to other methods, and the down-to-earth quantitative analysis works well for the CluPA-aligned spectra. The whole workflow is embedded into a modular and statistically sound framework that is implemented as an R package called "speaq" ("spectrum alignment and quantitation"), which is freely available from http://code.google.com/p/speaq/.
rasbhari: Optimizing Spaced Seeds for Database Searching, Read Mapping and Alignment-Free Sequence Comparison.

PubMed

Hahn, Lars; Leimeister, Chris-André; Ounit, Rachid; Lonardi, Stefano; Morgenstern, Burkhard

2016-10-01

Many algorithms for sequence analysis rely on word matching or word statistics. Often, these approaches can be improved if binary patterns representing match and don't-care positions are used as a filter, such that only those positions of words are considered that correspond to the match positions of the patterns. The performance of these approaches, however, depends on the underlying patterns. Herein, we show that the overlap complexity of a pattern set that was introduced by Ilie and Ilie is closely related to the variance of the number of matches between two evolutionarily related sequences with respect to this pattern set. We propose a modified hill-climbing algorithm to optimize pattern sets for database searching, read mapping and alignment-free sequence comparison of nucleic-acid sequences; our implementation of this algorithm is called rasbhari. Depending on the application at hand, rasbhari can either minimize the overlap complexity of pattern sets, maximize their sensitivity in database searching or minimize the variance of the number of pattern-based matches in alignment-free sequence comparison. We show that, for database searching, rasbhari generates pattern sets with slightly higher sensitivity than existing approaches. In our Spaced Words approach to alignment-free sequence comparison, pattern sets calculated with rasbhari led to more accurate estimates of phylogenetic distances than the randomly generated pattern sets that we previously used. Finally, we used rasbhari to generate patterns for short read classification with CLARK-S. Here too, the sensitivity of the results could be improved, compared to the default patterns of the program. We integrated rasbhari into Spaced Words; the source code of rasbhari is freely available at http://rasbhari.gobics.de/.
Assessment of Homology Templates and an Anesthetic Binding Site within the γ-Aminobutyric Acid Receptor

PubMed Central

Bertaccini, Edward J.; Yoluk, Ozge; Lindahl, Erik R.; Trudell, James R.

2013-01-01

Background Anesthetics mediate portions of their activity via modulation of the γ-aminobutyric acid receptor (GABAaR). While its molecular structure remains unknown, significant progress has been made towards understanding its interactions with anesthetics via molecular modeling. Methods The structure of the torpedo acetylcholine receptor (nAChRα), the structures of the α4 and β2 subunits of the human nAChR, the structures of the eukaryotic glutamate-gated chloride channel (GluCl), and the prokaryotic pH sensing channels, from Gloeobacter violaceus and Erwinia chrysanthemi, were aligned with the SAlign and 3DMA algorithms. A multiple sequence alignment from these structures and those of the GABAaR was performed with ClustalW. The Modeler and Rosetta algorithms independently created three-dimensional constructs of the GABAaR from the GluCl template. The CDocker algorithm docked a congeneric series of propofol derivatives into the binding pocket and scored calculated binding affinities for correlation with known GABAaR potentiation EC50’s. Results Multiple structure alignments of templates revealed a clear consensus of residue locations relevant to anesthetic effects except for torpedo nAChR. Within the GABAaR models generated from GluCl, the residues notable for modulating anesthetic action within transmembrane segments 1, 2, and 3 converged on the intersubunit interface between alpha and beta subunits. Docking scores of a propofol derivative series into this binding site showed strong linear correlation with GABAaR potentiation EC50. Conclusion Consensus structural alignment based on homologous templates revealed an intersubunit anesthetic binding cavity within the transmembrane domain of the GABAaR, which showed correlation of ligand docking scores with experimentally measured GABAaR potentiation. PMID:23770602
Assessment of homology templates and an anesthetic binding site within the γ-aminobutyric acid receptor.

PubMed

Bertaccini, Edward J; Yoluk, Ozge; Lindahl, Erik R; Trudell, James R

2013-11-01

Anesthetics mediate portions of their activity via modulation of the γ-aminobutyric acid receptor (GABAaR). Although its molecular structure remains unknown, significant progress has been made toward understanding its interactions with anesthetics via molecular modeling. The structure of the torpedo acetylcholine receptor (nAChRα), the structures of the α4 and β2 subunits of the human nAChR, the structures of the eukaryotic glutamate-gated chloride channel (GluCl), and the prokaryotic pH-sensing channels, from Gloeobacter violaceus and Erwinia chrysanthemi, were aligned with the SAlign and 3DMA algorithms. A multiple sequence alignment from these structures and those of the GABAaR was performed with ClustalW. The Modeler and Rosetta algorithms independently created three-dimensional constructs of the GABAaR from the GluCl template. The CDocker algorithm docked a congeneric series of propofol derivatives into the binding pocket and scored calculated binding affinities for correlation with known GABAaR potentiation EC50s. Multiple structure alignments of templates revealed a clear consensus of residue locations relevant to anesthetic effects except for torpedo nAChR. Within the GABAaR models generated from GluCl, the residues notable for modulating anesthetic action within transmembrane segments 1, 2, and 3 converged on the intersubunit interface between α and β subunits. Docking scores of a propofol derivative series into this binding site showed strong linear correlation with GABAaR potentiation EC50. Consensus structural alignment based on homologous templates revealed an intersubunit anesthetic binding cavity within the transmembrane domain of the GABAaR, which showed a correlation of ligand docking scores with experimentally measured GABAaR potentiation.
A Dimensionally Aligned Signal Projection for Classification of Unintended Radiated Emissions

DOE PAGES

Vann, Jason Michael; Karnowski, Thomas P.; Kerekes, Ryan; ...

2017-04-24

Characterization of unintended radiated emissions (URE) from electronic devices plays an important role in many research areas from electromagnetic interference to nonintrusive load monitoring to information system security. URE can provide insights for applications ranging from load disaggregation and energy efficiency to condition-based maintenance of equipment-based upon detected fault conditions. URE characterization often requires subject matter expertise to tailor transforms and feature extractors for the specific electrical devices of interest. We present a novel approach, named dimensionally aligned signal projection (DASP), for projecting aligned signal characteristics that are inherent to the physical implementation of many commercial electronic devices. These projectionsmore » minimize the need for an intimate understanding of the underlying physical circuitry and significantly reduce the number of features required for signal classification. We present three possible DASP algorithms that leverage frequency harmonics, modulation alignments, and frequency peak spacings, along with a two-dimensional image manipulation method for statistical feature extraction. To demonstrate the ability of DASP to generate relevant features from URE, we measured the conducted URE from 14 residential electronic devices using a 2 MS/s collection system. Furthermore, a linear discriminant analysis classifier was trained using DASP generated features and was blind tested resulting in a greater than 90% classification accuracy for each of the DASP algorithms and an accuracy of 99.1% when DASP features are used in combination. Furthermore, we show that a rank reduced feature set of the combined DASP algorithms provides a 98.9% classification accuracy with only three features and outperforms a set of spectral features in terms of general classification as well as applicability across a broad number of devices.« less
A Dimensionally Aligned Signal Projection for Classification of Unintended Radiated Emissions

DOE Office of Scientific and Technical Information (OSTI.GOV)

Vann, Jason Michael; Karnowski, Thomas P.; Kerekes, Ryan

Characterization of unintended radiated emissions (URE) from electronic devices plays an important role in many research areas from electromagnetic interference to nonintrusive load monitoring to information system security. URE can provide insights for applications ranging from load disaggregation and energy efficiency to condition-based maintenance of equipment-based upon detected fault conditions. URE characterization often requires subject matter expertise to tailor transforms and feature extractors for the specific electrical devices of interest. We present a novel approach, named dimensionally aligned signal projection (DASP), for projecting aligned signal characteristics that are inherent to the physical implementation of many commercial electronic devices. These projectionsmore » minimize the need for an intimate understanding of the underlying physical circuitry and significantly reduce the number of features required for signal classification. We present three possible DASP algorithms that leverage frequency harmonics, modulation alignments, and frequency peak spacings, along with a two-dimensional image manipulation method for statistical feature extraction. To demonstrate the ability of DASP to generate relevant features from URE, we measured the conducted URE from 14 residential electronic devices using a 2 MS/s collection system. Furthermore, a linear discriminant analysis classifier was trained using DASP generated features and was blind tested resulting in a greater than 90% classification accuracy for each of the DASP algorithms and an accuracy of 99.1% when DASP features are used in combination. Furthermore, we show that a rank reduced feature set of the combined DASP algorithms provides a 98.9% classification accuracy with only three features and outperforms a set of spectral features in terms of general classification as well as applicability across a broad number of devices.« less

Bioinformatics algorithm based on a parallel implementation of a machine learning approach using transducers

NASA Astrophysics Data System (ADS)

Roche-Lima, Abiel; Thulasiram, Ruppa K.

2012-02-01

Finite automata, in which each transition is augmented with an output label in addition to the familiar input label, are considered finite-state transducers. Transducers have been used to analyze some fundamental issues in bioinformatics. Weighted finite-state transducers have been proposed to pairwise alignments of DNA and protein sequences; as well as to develop kernels for computational biology. Machine learning algorithms for conditional transducers have been implemented and used for DNA sequence analysis. Transducer learning algorithms are based on conditional probability computation. It is calculated by using techniques, such as pair-database creation, normalization (with Maximum-Likelihood normalization) and parameters optimization (with Expectation-Maximization - EM). These techniques are intrinsically costly for computation, even worse when are applied to bioinformatics, because the databases sizes are large. In this work, we describe a parallel implementation of an algorithm to learn conditional transducers using these techniques. The algorithm is oriented to bioinformatics applications, such as alignments, phylogenetic trees, and other genome evolution studies. Indeed, several experiences were developed using the parallel and sequential algorithm on Westgrid (specifically, on the Breeze cluster). As results, we obtain that our parallel algorithm is scalable, because execution times are reduced considerably when the data size parameter is increased. Another experience is developed by changing precision parameter. In this case, we obtain smaller execution times using the parallel algorithm. Finally, number of threads used to execute the parallel algorithm on the Breezy cluster is changed. In this last experience, we obtain as result that speedup is considerably increased when more threads are used; however there is a convergence for number of threads equal to or greater than 16.
A 2D eye gaze estimation system with low-resolution webcam images

NASA Astrophysics Data System (ADS)

Ince, Ibrahim Furkan; Kim, Jin Woo

2011-12-01

In this article, a low-cost system for 2D eye gaze estimation with low-resolution webcam images is presented. Two algorithms are proposed for this purpose, one for the eye-ball detection with stable approximate pupil-center and the other one for the eye movements' direction detection. Eyeball is detected using deformable angular integral search by minimum intensity (DAISMI) algorithm. Deformable template-based 2D gaze estimation (DTBGE) algorithm is employed as a noise filter for deciding the stable movement decisions. While DTBGE employs binary images, DAISMI employs gray-scale images. Right and left eye estimates are evaluated separately. DAISMI finds the stable approximate pupil-center location by calculating the mass-center of eyeball border vertices to be employed for initial deformable template alignment. DTBGE starts running with initial alignment and updates the template alignment with resulting eye movements and eyeball size frame by frame. The horizontal and vertical deviation of eye movements through eyeball size is considered as if it is directly proportional with the deviation of cursor movements in a certain screen size and resolution. The core advantage of the system is that it does not employ the real pupil-center as a reference point for gaze estimation which is more reliable against corneal reflection. Visual angle accuracy is used for the evaluation and benchmarking of the system. Effectiveness of the proposed system is presented and experimental results are shown.
Score distributions of gapped multiple sequence alignments down to the low-probability tail

NASA Astrophysics Data System (ADS)

Fieth, Pascal; Hartmann, Alexander K.

2016-08-01

Assessing the significance of alignment scores of optimally aligned DNA or amino acid sequences can be achieved via the knowledge of the score distribution of random sequences. But this requires obtaining the distribution in the biologically relevant high-scoring region, where the probabilities are exponentially small. For gapless local alignments of infinitely long sequences this distribution is known analytically to follow a Gumbel distribution. Distributions for gapped local alignments and global alignments of finite lengths can only be obtained numerically. To obtain result for the small-probability region, specific statistical mechanics-based rare-event algorithms can be applied. In previous studies, this was achieved for pairwise alignments. They showed that, contrary to results from previous simple sampling studies, strong deviations from the Gumbel distribution occur in case of finite sequence lengths. Here we extend the studies to multiple sequence alignments with gaps, which are much more relevant for practical applications in molecular biology. We study the distributions of scores over a large range of the support, reaching probabilities as small as 10-160, for global and local (sum-of-pair scores) multiple alignments. We find that even after suitable rescaling, eliminating the sequence-length dependence, the distributions for multiple alignment differ from the pairwise alignment case. Furthermore, we also show that the previously discussed Gaussian correction to the Gumbel distribution needs to be refined, also for the case of pairwise alignments.
ASPIC: a novel method to predict the exon-intron structure of a gene that is optimally compatible to a set of transcript sequences.

PubMed

Bonizzoni, Paola; Rizzi, Raffaella; Pesole, Graziano

2005-10-05

Currently available methods to predict splice sites are mainly based on the independent and progressive alignment of transcript data (mostly ESTs) to the genomic sequence. Apart from often being computationally expensive, this approach is vulnerable to several problems--hence the need to develop novel strategies. We propose a method, based on a novel multiple genome-EST alignment algorithm, for the detection of splice sites. To avoid limitations of splice sites prediction (mainly, over-predictions) due to independent single EST alignments to the genomic sequence our approach performs a multiple alignment of transcript data to the genomic sequence based on the combined analysis of all available data. We recast the problem of predicting constitutive and alternative splicing as an optimization problem, where the optimal multiple transcript alignment minimizes the number of exons and hence of splice site observations. We have implemented a splice site predictor based on this algorithm in the software tool ASPIC (Alternative Splicing PredICtion). It is distinguished from other methods based on BLAST-like tools by the incorporation of entirely new ad hoc procedures for accurate and computationally efficient transcript alignment and adopts dynamic programming for the refinement of intron boundaries. ASPIC also provides the minimal set of non-mergeable transcript isoforms compatible with the detected splicing events. The ASPIC web resource is dynamically interconnected with the Ensembl and Unigene databases and also implements an upload facility. Extensive bench marking shows that ASPIC outperforms other existing methods in the detection of novel splicing isoforms and in the minimization of over-predictions. ASPIC also requires a lower computation time for processing a single gene and an EST cluster. The ASPIC web resource is available at http://aspic.algo.disco.unimib.it/aspic-devel/.
Flexible, fast and accurate sequence alignment profiling on GPGPU with PaSWAS.

PubMed

Warris, Sven; Yalcin, Feyruz; Jackson, Katherine J L; Nap, Jan Peter

2015-01-01

To obtain large-scale sequence alignments in a fast and flexible way is an important step in the analyses of next generation sequencing data. Applications based on the Smith-Waterman (SW) algorithm are often either not fast enough, limited to dedicated tasks or not sufficiently accurate due to statistical issues. Current SW implementations that run on graphics hardware do not report the alignment details necessary for further analysis. With the Parallel SW Alignment Software (PaSWAS) it is possible (a) to have easy access to the computational power of NVIDIA-based general purpose graphics processing units (GPGPUs) to perform high-speed sequence alignments, and (b) retrieve relevant information such as score, number of gaps and mismatches. The software reports multiple hits per alignment. The added value of the new SW implementation is demonstrated with two test cases: (1) tag recovery in next generation sequence data and (2) isotype assignment within an immunoglobulin 454 sequence data set. Both cases show the usability and versatility of the new parallel Smith-Waterman implementation.
Cross contrast multi-channel image registration using image synthesis for MR brain images.

PubMed

Chen, Min; Carass, Aaron; Jog, Amod; Lee, Junghoon; Roy, Snehashis; Prince, Jerry L

2017-02-01

Multi-modal deformable registration is important for many medical image analysis tasks such as atlas alignment, image fusion, and distortion correction. Whereas a conventional method would register images with different modalities using modality independent features or information theoretic metrics such as mutual information, this paper presents a new framework that addresses the problem using a two-channel registration algorithm capable of using mono-modal similarity measures such as sum of squared differences or cross-correlation. To make it possible to use these same-modality measures, image synthesis is used to create proxy images for the opposite modality as well as intensity-normalized images from each of the two available images. The new deformable registration framework was evaluated by performing intra-subject deformation recovery, intra-subject boundary alignment, and inter-subject label transfer experiments using multi-contrast magnetic resonance brain imaging data. Three different multi-channel registration algorithms were evaluated, revealing that the framework is robust to the multi-channel deformable registration algorithm that is used. With a single exception, all results demonstrated improvements when compared against single channel registrations using the same algorithm with mutual information. Copyright © 2016 Elsevier B.V. All rights reserved.
Fast online and index-based algorithms for approximate search of RNA sequence-structure patterns

PubMed Central

2013-01-01

Background It is well known that the search for homologous RNAs is more effective if both sequence and structure information is incorporated into the search. However, current tools for searching with RNA sequence-structure patterns cannot fully handle mutations occurring on both these levels or are simply not fast enough for searching large sequence databases because of the high computational costs of the underlying sequence-structure alignment problem. Results We present new fast index-based and online algorithms for approximate matching of RNA sequence-structure patterns supporting a full set of edit operations on single bases and base pairs. Our methods efficiently compute semi-global alignments of structural RNA patterns and substrings of the target sequence whose costs satisfy a user-defined sequence-structure edit distance threshold. For this purpose, we introduce a new computing scheme to optimally reuse the entries of the required dynamic programming matrices for all substrings and combine it with a technique for avoiding the alignment computation of non-matching substrings. Our new index-based methods exploit suffix arrays preprocessed from the target database and achieve running times that are sublinear in the size of the searched sequences. To support the description of RNA molecules that fold into complex secondary structures with multiple ordered sequence-structure patterns, we use fast algorithms for the local or global chaining of approximate sequence-structure pattern matches. The chaining step removes spurious matches from the set of intermediate results, in particular of patterns with little specificity. In benchmark experiments on the Rfam database, our improved online algorithm is faster than the best previous method by up to factor 45. Our best new index-based algorithm achieves a speedup of factor 560. Conclusions The presented methods achieve considerable speedups compared to the best previous method. This, together with the expected sublinear running time of the presented index-based algorithms, allows for the first time approximate matching of RNA sequence-structure patterns in large sequence databases. Beyond the algorithmic contributions, we provide with RaligNAtor a robust and well documented open-source software package implementing the algorithms presented in this manuscript. The RaligNAtor software is available at http://www.zbh.uni-hamburg.de/ralignator. PMID:23865810
MUSCLE: multiple sequence alignment with high accuracy and high throughput.

PubMed

Edgar, Robert C

2004-01-01

We describe MUSCLE, a new computer program for creating multiple alignments of protein sequences. Elements of the algorithm include fast distance estimation using kmer counting, progressive alignment using a new profile function we call the log-expectation score, and refinement using tree-dependent restricted partitioning. The speed and accuracy of MUSCLE are compared with T-Coffee, MAFFT and CLUSTALW on four test sets of reference alignments: BAliBASE, SABmark, SMART and a new benchmark, PREFAB. MUSCLE achieves the highest, or joint highest, rank in accuracy on each of these sets. Without refinement, MUSCLE achieves average accuracy statistically indistinguishable from T-Coffee and MAFFT, and is the fastest of the tested methods for large numbers of sequences, aligning 5000 sequences of average length 350 in 7 min on a current desktop computer. The MUSCLE program, source code and PREFAB test data are freely available at http://www.drive5. com/muscle.
Feature-based Alignment of Volumetric Multi-modal Images

PubMed Central

Toews, Matthew; Zöllei, Lilla; Wells, William M.

2014-01-01

This paper proposes a method for aligning image volumes acquired from different imaging modalities (e.g. MR, CT) based on 3D scale-invariant image features. A novel method for encoding invariant feature geometry and appearance is developed, based on the assumption of locally linear intensity relationships, providing a solution to poor repeatability of feature detection in different image modalities. The encoding method is incorporated into a probabilistic feature-based model for multi-modal image alignment. The model parameters are estimated via a group-wise alignment algorithm, that iteratively alternates between estimating a feature-based model from feature data, then realigning feature data to the model, converging to a stable alignment solution with few pre-processing or pre-alignment requirements. The resulting model can be used to align multi-modal image data with the benefits of invariant feature correspondence: globally optimal solutions, high efficiency and low memory usage. The method is tested on the difficult RIRE data set of CT, T1, T2, PD and MP-RAGE brain images of subjects exhibiting significant inter-subject variability due to pathology. PMID:24683955
Exact calculation of distributions on integers, with application to sequence alignment.

PubMed

Newberg, Lee A; Lawrence, Charles E

2009-01-01

Computational biology is replete with high-dimensional discrete prediction and inference problems. Dynamic programming recursions can be applied to several of the most important of these, including sequence alignment, RNA secondary-structure prediction, phylogenetic inference, and motif finding. In these problems, attention is frequently focused on some scalar quantity of interest, a score, such as an alignment score or the free energy of an RNA secondary structure. In many cases, score is naturally defined on integers, such as a count of the number of pairing differences between two sequence alignments, or else an integer score has been adopted for computational reasons, such as in the test of significance of motif scores. The probability distribution of the score under an appropriate probabilistic model is of interest, such as in tests of significance of motif scores, or in calculation of Bayesian confidence limits around an alignment. Here we present three algorithms for calculating the exact distribution of a score of this type; then, in the context of pairwise local sequence alignments, we apply the approach so as to find the alignment score distribution and Bayesian confidence limits.
BlockLogo: visualization of peptide and sequence motif conservation

PubMed Central

Olsen, Lars Rønn; Kudahl, Ulrich Johan; Simon, Christian; Sun, Jing; Schönbach, Christian; Reinherz, Ellis L.; Zhang, Guang Lan; Brusic, Vladimir

2013-01-01

BlockLogo is a web-server application for visualization of protein and nucleotide fragments, continuous protein sequence motifs, and discontinuous sequence motifs using calculation of block entropy from multiple sequence alignments. The user input consists of a multiple sequence alignment, selection of motif positions, type of sequence, and output format definition. The output has BlockLogo along with the sequence logo, and a table of motif frequencies. We deployed BlockLogo as an online application and have demonstrated its utility through examples that show visualization of T-cell epitopes and B-cell epitopes (both continuous and discontinuous). Our additional example shows a visualization and analysis of structural motifs that determine specificity of peptide binding to HLA-DR molecules. The BlockLogo server also employs selected experimentally validated prediction algorithms to enable on-the-fly prediction of MHC binding affinity to 15 common HLA class I and class II alleles as well as visual analysis of discontinuous epitopes from multiple sequence alignments. It enables the visualization and analysis of structural and functional motifs that are usually described as regular expressions. It provides a compact view of discontinuous motifs composed of distant positions within biological sequences. BlockLogo is available at: http://research4.dfci.harvard.edu/cvc/blocklogo/ and http://methilab.bu.edu/blocklogo/ PMID:24001880
Fusion of range camera and photogrammetry: a systematic procedure for improving 3-D models metric accuracy.

PubMed

Guidi, G; Beraldin, J A; Ciofi, S; Atzeni, C

2003-01-01

The generation of three-dimensional (3-D) digital models produced by optical technologies in some cases involves metric errors. This happens when small high-resolution 3-D images are assembled together in order to model a large object. In some applications, as for example 3-D modeling of Cultural Heritage, the problem of metric accuracy is a major issue and no methods are currently available for enhancing it. The authors present a procedure by which the metric reliability of the 3-D model, obtained through iterative alignments of many range maps, can be guaranteed to a known acceptable level. The goal is the integration of the 3-D range camera system with a close range digital photogrammetry technique. The basic idea is to generate a global coordinate system determined by the digital photogrammetric procedure, measuring the spatial coordinates of optical targets placed around the object to be modeled. Such coordinates, set as reference points, allow the proper rigid motion of few key range maps, including a portion of the targets, in the global reference system defined by photogrammetry. The other 3-D images are normally aligned around these locked images with usual iterative algorithms. Experimental results on an anthropomorphic test object, comparing the conventional and the proposed alignment method, are finally reported.
FoldMiner and LOCK 2: protein structure comparison and motif discovery on the web.

PubMed

Shapiro, Jessica; Brutlag, Douglas

2004-07-01

The FoldMiner web server (http://foldminer.stanford.edu/) provides remote access to methods for protein structure alignment and unsupervised motif discovery. FoldMiner is unique among such algorithms in that it improves both the motif definition and the sensitivity of a structural similarity search by combining the search and motif discovery methods and using information from each process to enhance the other. In a typical run, a query structure is aligned to all structures in one of several databases of single domain targets in order to identify its structural neighbors and to discover a motif that is the basis for the similarity among the query and statistically significant targets. This process is fully automated, but options for manual refinement of the results are available as well. The server uses the Chime plugin and customized controls to allow for visualization of the motif and of structural superpositions. In addition, we provide an interface to the LOCK 2 algorithm for rapid alignments of a query structure to smaller numbers of user-specified targets.
[Motion control of moving mirror based on fixed-mirror adjustment in FTIR spectrometer].

PubMed

Li, Zhong-bing; Xu, Xian-ze; Le, Yi; Xu, Feng-qiu; Li, Jun-wei

2012-08-01

The performance of the uniform motion of the moving mirror, which is the only constant motion part in FTIR spectrometer, and the performance of the alignment of the fixed mirror play a key role in FTIR spectrometer, and affect the interference effect and the quality of the spectrogram and may restrict the precision and resolution of the instrument directly. The present article focuses on the research on the uniform motion of the moving mirror and the alignment of the fixed mirror. In order to improve the FTIR spectrometer, the maglev support system was designed for the moving mirror and the phase detection technology was adopted to adjust the tilt angle between the moving mirror and the fixed mirror. This paper also introduces an improved fuzzy PID control algorithm to get the accurate speed of the moving mirror and realize the control strategy from both hardware design and algorithm. The results show that the development of the moving mirror motion control system gets sufficient accuracy and real-time, which can ensure the uniform motion of the moving mirror and the alignment of the fixed mirror.
EvoDB: a database of evolutionary rate profiles, associated protein domains and phylogenetic trees for PFAM-A

PubMed Central

Ndhlovu, Andrew; Durand, Pierre M.; Hazelhurst, Scott

2015-01-01

The evolutionary rate at codon sites across protein-coding nucleotide sequences represents a valuable tier of information for aligning sequences, inferring homology and constructing phylogenetic profiles. However, a comprehensive resource for cataloguing the evolutionary rate at codon sites and their corresponding nucleotide and protein domain sequence alignments has not been developed. To address this gap in knowledge, EvoDB (an Evolutionary rates DataBase) was compiled. Nucleotide sequences and their corresponding protein domain data including the associated seed alignments from the PFAM-A (protein family) database were used to estimate evolutionary rate (ω = dN/dS) profiles at codon sites for each entry. EvoDB contains 98.83% of the gapped nucleotide sequence alignments and 97.1% of the evolutionary rate profiles for the corresponding information in PFAM-A. As the identification of codon sites under positive selection and their position in a sequence profile is usually the most sought after information for molecular evolutionary biologists, evolutionary rate profiles were determined under the M2a model using the CODEML algorithm in the PAML (Phylogenetic Analysis by Maximum Likelihood) suite of software. Validation of nucleotide sequences against amino acid data was implemented to ensure high data quality. EvoDB is a catalogue of the evolutionary rate profiles and provides the corresponding phylogenetic trees, PFAM-A alignments and annotated accession identifier data. In addition, the database can be explored and queried using known evolutionary rate profiles to identify domains under similar evolutionary constraints and pressures. EvoDB is a resource for evolutionary, phylogenetic studies and presents a tier of information untapped by current databases. Database URL: http://www.bioinf.wits.ac.za/software/fire/evodb PMID:26140928
EvoDB: a database of evolutionary rate profiles, associated protein domains and phylogenetic trees for PFAM-A.

PubMed

Ndhlovu, Andrew; Durand, Pierre M; Hazelhurst, Scott

2015-01-01

The evolutionary rate at codon sites across protein-coding nucleotide sequences represents a valuable tier of information for aligning sequences, inferring homology and constructing phylogenetic profiles. However, a comprehensive resource for cataloguing the evolutionary rate at codon sites and their corresponding nucleotide and protein domain sequence alignments has not been developed. To address this gap in knowledge, EvoDB (an Evolutionary rates DataBase) was compiled. Nucleotide sequences and their corresponding protein domain data including the associated seed alignments from the PFAM-A (protein family) database were used to estimate evolutionary rate (ω = dN/dS) profiles at codon sites for each entry. EvoDB contains 98.83% of the gapped nucleotide sequence alignments and 97.1% of the evolutionary rate profiles for the corresponding information in PFAM-A. As the identification of codon sites under positive selection and their position in a sequence profile is usually the most sought after information for molecular evolutionary biologists, evolutionary rate profiles were determined under the M2a model using the CODEML algorithm in the PAML (Phylogenetic Analysis by Maximum Likelihood) suite of software. Validation of nucleotide sequences against amino acid data was implemented to ensure high data quality. EvoDB is a catalogue of the evolutionary rate profiles and provides the corresponding phylogenetic trees, PFAM-A alignments and annotated accession identifier data. In addition, the database can be explored and queried using known evolutionary rate profiles to identify domains under similar evolutionary constraints and pressures. EvoDB is a resource for evolutionary, phylogenetic studies and presents a tier of information untapped by current databases. © The Author(s) 2015. Published by Oxford University Press.
An automatic approach for 3D registration of CT scans

NASA Astrophysics Data System (ADS)

Hu, Yang; Saber, Eli; Dianat, Sohail; Vantaram, Sreenath Rao; Abhyankar, Vishwas

2012-03-01

CT (Computed tomography) is a widely employed imaging modality in the medical field. Normally, a volume of CT scans is prescribed by a doctor when a specific region of the body (typically neck to groin) is suspected of being abnormal. The doctors are required to make professional diagnoses based upon the obtained datasets. In this paper, we propose an automatic registration algorithm that helps healthcare personnel to automatically align corresponding scans from 'Study' to 'Atlas'. The proposed algorithm is capable of aligning both 'Atlas' and 'Study' into the same resolution through 3D interpolation. After retrieving the scanned slice volume in the 'Study' and the corresponding volume in the original 'Atlas' dataset, a 3D cross correlation method is used to identify and register various body parts.
Cover song identification by sequence alignment algorithms

NASA Astrophysics Data System (ADS)

Wang, Chih-Li; Zhong, Qian; Wang, Szu-Ying; Roychowdhury, Vwani

2011-10-01

Content-based music analysis has drawn much attention due to the rapidly growing digital music market. This paper describes a method that can be used to effectively identify cover songs. A cover song is a song that preserves only the crucial melody of its reference song but different in some other acoustic properties. Hence, the beat/chroma-synchronous chromagram, which is insensitive to the variation of the timber or rhythm of songs but sensitive to the melody, is chosen. The key transposition is achieved by cyclically shifting the chromatic domain of the chromagram. By using the Hidden Markov Model (HMM) to obtain the time sequences of songs, the system is made even more robust. Similar structure or length between the cover songs and its reference are not necessary by the Smith-Waterman Alignment Algorithm.
Building a LiDAR point cloud simulator: Testing algorithms for high resolution topographic change

NASA Astrophysics Data System (ADS)

Carrea, Dario; Abellán, Antonio; Derron, Marc-Henri; Jaboyedoff, Michel

2014-05-01

Terrestrial laser technique (TLS) is becoming a common tool in Geosciences, with clear applications ranging from the generation of a high resolution 3D models to the monitoring of unstable slopes and the quantification of morphological changes. Nevertheless, like every measurement techniques, TLS still has some limitations that are not clearly understood and affect the accuracy of the dataset (point cloud). A challenge in LiDAR research is to understand the influence of instrumental parameters on measurement errors during LiDAR acquisition. Indeed, different critical parameters interact with the scans quality at different ranges: the existence of shadow areas, the spatial resolution (point density), and the diameter of the laser beam, the incidence angle and the single point accuracy. The objective of this study is to test the main limitations of different algorithms usually applied on point cloud data treatment, from alignment to monitoring. To this end, we built in MATLAB(c) environment a LiDAR point cloud simulator able to recreate the multiple sources of errors related to instrumental settings that we normally observe in real datasets. In a first step we characterized the error from single laser pulse by modelling the influence of range and incidence angle on single point data accuracy. In a second step, we simulated the scanning part of the system in order to analyze the shifting and angular error effects. Other parameters have been added to the point cloud simulator, such as point spacing, acquisition window, etc., in order to create point clouds of simple and/or complex geometries. We tested the influence of point density and vitiating point of view on the Iterative Closest Point (ICP) alignment and also in some deformation tracking algorithm with same point cloud geometry, in order to determine alignment and deformation detection threshold. We also generated a series of high resolution point clouds in order to model small changes on different environments (erosion, landslide monitoring, etc) and we then tested the use of filtering techniques using 3D moving windows along the space and time, which considerably reduces data scattering due to the benefits of data redundancy. In conclusion, the simulator allowed us to improve our different algorithms and to understand how instrumental error affects final results. And also, improve the methodology of scans acquisition to find the best compromise between point density, positioning and acquisition time with the best accuracy possible to characterize the topographic change.
CoMFA analyses of C-2 position salvinorin A analogs at the kappa-opioid receptor provides insights into epimer selectivity.

PubMed

McGovern, Donna L; Mosier, Philip D; Roth, Bryan L; Westkaemper, Richard B

2010-04-01

The highly potent and kappa-opioid (KOP) receptor-selective hallucinogen Salvinorin A and selected analogs have been analyzed using the 3D quantitative structure-affinity relationship technique Comparative Molecular Field Analysis (CoMFA) in an effort to derive a statistically significant and predictive model of salvinorin affinity at the KOP receptor and to provide additional statistical support for the validity of previously proposed structure-based interaction models. Two CoMFA models of Salvinorin A analogs substituted at the C-2 position are presented. Separate models were developed based on the radioligand used in the kappa-opioid binding assay, [(3)H]diprenorphine or [(125)I]6 beta-iodo-3,14-dihydroxy-17-cyclopropylmethyl-4,5 alpha-epoxymorphinan ([(125)I]IOXY). For each dataset, three methods of alignment were employed: a receptor-docked alignment derived from the structure-based docking algorithm GOLD, another from the ligand-based alignment algorithm FlexS, and a rigid realignment of the poses from the receptor-docked alignment. The receptor-docked alignment produced statistically superior results compared to either the FlexS alignment or the realignment in both datasets. The [(125)I]IOXY set (Model 1) and [(3)H]diprenorphine set (Model 2) gave q(2) values of 0.592 and 0.620, respectively, using the receptor-docked alignment, and both models produced similar CoMFA contour maps that reflected the stereoelectronic features of the receptor model from which they were derived. Each model gave significantly predictive CoMFA statistics (Model 1 PSET r(2)=0.833; Model 2 PSET r(2)=0.813). Based on the CoMFA contour maps, a binding mode was proposed for amine-containing Salvinorin A analogs that provides a rationale for the observation that the beta-epimers (R-configuration) of protonated amines at the C-2 position have a higher affinity than the corresponding alpha-epimers (S-configuration). (c) 2010. Published by Elsevier Inc.

Alignment methods: strategies, challenges, benchmarking, and comparative overview.

PubMed

Löytynoja, Ari

2012-01-01

Comparative evolutionary analyses of molecular sequences are solely based on the identities and differences detected between homologous characters. Errors in this homology statement, that is errors in the alignment of the sequences, are likely to lead to errors in the downstream analyses. Sequence alignment and phylogenetic inference are tightly connected and many popular alignment programs use the phylogeny to divide the alignment problem into smaller tasks. They then neglect the phylogenetic tree, however, and produce alignments that are not evolutionarily meaningful. The use of phylogeny-aware methods reduces the error but the resulting alignments, with evolutionarily correct representation of homology, can challenge the existing practices and methods for viewing and visualising the sequences. The inter-dependency of alignment and phylogeny can be resolved by joint estimation of the two; methods based on statistical models allow for inferring the alignment parameters from the data and correctly take into account the uncertainty of the solution but remain computationally challenging. Widely used alignment methods are based on heuristic algorithms and unlikely to find globally optimal solutions. The whole concept of one correct alignment for the sequences is questionable, however, as there typically exist vast numbers of alternative, roughly equally good alignments that should also be considered. This uncertainty is hidden by many popular alignment programs and is rarely correctly taken into account in the downstream analyses. The quest for finding and improving the alignment solution is complicated by the lack of suitable measures of alignment goodness. The difficulty of comparing alternative solutions also affects benchmarks of alignment methods and the results strongly depend on the measure used. As the effects of alignment error cannot be predicted, comparing the alignments' performance in downstream analyses is recommended.
Minimap2: pairwise alignment for nucleotide sequences.

PubMed

Li, Heng

2018-05-10

Recent advances in sequencing technologies promise ultra-long reads of ∼100 kilo bases (kb) in average, full-length mRNA or cDNA reads in high throughput and genomic contigs over 100 mega bases (Mb) in length. Existing alignment programs are unable or inefficient to process such data at scale, which presses for the development of new alignment algorithms. Minimap2 is a general-purpose alignment program to map DNA or long mRNA sequences against a large reference database. It works with accurate short reads of ≥ 100bp in length, ≥1kb genomic reads at error rate ∼15%, full-length noisy Direct RNA or cDNA reads, and assembly contigs or closely related full chromosomes of hundreds of megabases in length. Minimap2 does split-read alignment, employs concave gap cost for long insertions and deletions (INDELs) and introduces new heuristics to reduce spurious alignments. It is 3-4 times as fast as mainstream short-read mappers at comparable accuracy, and is ≥30 times faster than long-read genomic or cDNA mappers at higher accuracy, surpassing most aligners specialized in one type of alignment. https://github.com/lh3/minimap2. hengli@broadinstitute.org.
Local alignment of two-base encoded DNA sequence

PubMed Central

Homer, Nils; Merriman, Barry; Nelson, Stanley F

2009-01-01

Background DNA sequence comparison is based on optimal local alignment of two sequences using a similarity score. However, some new DNA sequencing technologies do not directly measure the base sequence, but rather an encoded form, such as the two-base encoding considered here. In order to compare such data to a reference sequence, the data must be decoded into sequence. The decoding is deterministic, but the possibility of measurement errors requires searching among all possible error modes and resulting alignments to achieve an optimal balance of fewer errors versus greater sequence similarity. Results We present an extension of the standard dynamic programming method for local alignment, which simultaneously decodes the data and performs the alignment, maximizing a similarity score based on a weighted combination of errors and edits, and allowing an affine gap penalty. We also present simulations that demonstrate the performance characteristics of our two base encoded alignment method and contrast those with standard DNA sequence alignment under the same conditions. Conclusion The new local alignment algorithm for two-base encoded data has substantial power to properly detect and correct measurement errors while identifying underlying sequence variants, and facilitating genome re-sequencing efforts based on this form of sequence data. PMID:19508732
Parametric inference for biological sequence analysis.

PubMed

Pachter, Lior; Sturmfels, Bernd

2004-11-16

One of the major successes in computational biology has been the unification, by using the graphical model formalism, of a multitude of algorithms for annotating and comparing biological sequences. Graphical models that have been applied to these problems include hidden Markov models for annotation, tree models for phylogenetics, and pair hidden Markov models for alignment. A single algorithm, the sum-product algorithm, solves many of the inference problems that are associated with different statistical models. This article introduces the polytope propagation algorithm for computing the Newton polytope of an observation from a graphical model. This algorithm is a geometric version of the sum-product algorithm and is used to analyze the parametric behavior of maximum a posteriori inference calculations for graphical models.
Computer vision applications for coronagraphic optical alignment and image processing.

PubMed

Savransky, Dmitry; Thomas, Sandrine J; Poyneer, Lisa A; Macintosh, Bruce A

2013-05-10

Modern coronagraphic systems require very precise alignment between optical components and can benefit greatly from automated image processing. We discuss three techniques commonly employed in the fields of computer vision and image analysis as applied to the Gemini Planet Imager, a new facility instrument for the Gemini South Observatory. We describe how feature extraction and clustering methods can be used to aid in automated system alignment tasks, and also present a search algorithm for finding regular features in science images used for calibration and data processing. Along with discussions of each technique, we present our specific implementation and show results of each one in operation.
Pairwise alignment of chromatograms using an extended Fisher-Rao metric.

PubMed

Wallace, W E; Srivastava, A; Telu, K H; Simón-Manso, Y

2014-09-02

A conceptually new approach for aligning chromatograms is introduced and applied to examples of metabolite identification in human blood plasma by liquid chromatography-mass spectrometry (LC-MS). A square-root representation of the chromatogram's derivative coupled with an extended Fisher-Rao metric enables the computation of relative differences between chromatograms. Minimization of these differences using a common dynamic programming algorithm brings the chromatograms into alignment. Application to a complex sample, National Institute of Standards and Technology (NIST) Standard Reference Material 1950, Metabolites in Human Plasma, analyzed by two different LC-MS methods having significantly different ranges of elution time is described. Published by Elsevier B.V.
Real-time calibration and alignment of the LHCb RICH detectors

NASA Astrophysics Data System (ADS)

HE, Jibo

2017-12-01

In 2015, the LHCb experiment established a new and unique software trigger strategy with the purpose of increasing the purity of the signal events by applying the same algorithms online and offline. To achieve this, real-time calibration and alignment of all LHCb sub-systems is needed to provide vertexing, tracking, and particle identification of the best possible quality. The calibration of the refractive index of the RICH radiators, the calibration of the Hybrid Photon Detector image, and the alignment of the RICH mirror system, are reported in this contribution. The stability of the RICH performance and the particle identification performance are also discussed.
BigFoot: Bayesian alignment and phylogenetic footprinting with MCMC

PubMed Central

Satija, Rahul; Novák, Ádám; Miklós, István; Lyngsø, Rune; Hein, Jotun

2009-01-01

Background We have previously combined statistical alignment and phylogenetic footprinting to detect conserved functional elements without assuming a fixed alignment. Considering a probability-weighted distribution of alignments removes sensitivity to alignment errors, properly accommodates regions of alignment uncertainty, and increases the accuracy of functional element prediction. Our method utilized standard dynamic programming hidden markov model algorithms to analyze up to four sequences. Results We present a novel approach, implemented in the software package BigFoot, for performing phylogenetic footprinting on greater numbers of sequences. We have developed a Markov chain Monte Carlo (MCMC) approach which samples both sequence alignments and locations of slowly evolving regions. We implement our method as an extension of the existing StatAlign software package and test it on well-annotated regions controlling the expression of the even-skipped gene in Drosophila and the α-globin gene in vertebrates. The results exhibit how adding additional sequences to the analysis has the potential to improve the accuracy of functional predictions, and demonstrate how BigFoot outperforms existing alignment-based phylogenetic footprinting techniques. Conclusion BigFoot extends a combined alignment and phylogenetic footprinting approach to analyze larger amounts of sequence data using MCMC. Our approach is robust to alignment error and uncertainty and can be applied to a variety of biological datasets. The source code and documentation are publicly available for download from PMID:19715598
BigFoot: Bayesian alignment and phylogenetic footprinting with MCMC.

PubMed

Satija, Rahul; Novák, Adám; Miklós, István; Lyngsø, Rune; Hein, Jotun

2009-08-28

We have previously combined statistical alignment and phylogenetic footprinting to detect conserved functional elements without assuming a fixed alignment. Considering a probability-weighted distribution of alignments removes sensitivity to alignment errors, properly accommodates regions of alignment uncertainty, and increases the accuracy of functional element prediction. Our method utilized standard dynamic programming hidden markov model algorithms to analyze up to four sequences. We present a novel approach, implemented in the software package BigFoot, for performing phylogenetic footprinting on greater numbers of sequences. We have developed a Markov chain Monte Carlo (MCMC) approach which samples both sequence alignments and locations of slowly evolving regions. We implement our method as an extension of the existing StatAlign software package and test it on well-annotated regions controlling the expression of the even-skipped gene in Drosophila and the alpha-globin gene in vertebrates. The results exhibit how adding additional sequences to the analysis has the potential to improve the accuracy of functional predictions, and demonstrate how BigFoot outperforms existing alignment-based phylogenetic footprinting techniques. BigFoot extends a combined alignment and phylogenetic footprinting approach to analyze larger amounts of sequence data using MCMC. Our approach is robust to alignment error and uncertainty and can be applied to a variety of biological datasets. The source code and documentation are publicly available for download from http://www.stats.ox.ac.uk/~satija/BigFoot/
AlexSys: a knowledge-based expert system for multiple sequence alignment construction and analysis

PubMed Central

Aniba, Mohamed Radhouene; Poch, Olivier; Marchler-Bauer, Aron; Thompson, Julie Dawn

2010-01-01

Multiple sequence alignment (MSA) is a cornerstone of modern molecular biology and represents a unique means of investigating the patterns of conservation and diversity in complex biological systems. Many different algorithms have been developed to construct MSAs, but previous studies have shown that no single aligner consistently outperforms the rest. This has led to the development of a number of ‘meta-methods’ that systematically run several aligners and merge the output into one single solution. Although these methods generally produce more accurate alignments, they are inefficient because all the aligners need to be run first and the choice of the best solution is made a posteriori. Here, we describe the development of a new expert system, AlexSys, for the multiple alignment of protein sequences. AlexSys incorporates an intelligent inference engine to automatically select an appropriate aligner a priori, depending only on the nature of the input sequences. The inference engine was trained on a large set of reference multiple alignments, using a novel machine learning approach. Applying AlexSys to a test set of 178 alignments, we show that the expert system represents a good compromise between alignment quality and running time, making it suitable for high throughput projects. AlexSys is freely available from http://alnitak.u-strasbg.fr/∼aniba/alexsys. PMID:20530533
CUFID-query: accurate network querying through random walk based network flow estimation.

PubMed

Jeong, Hyundoo; Qian, Xiaoning; Yoon, Byung-Jun

2017-12-28

Functional modules in biological networks consist of numerous biomolecules and their complicated interactions. Recent studies have shown that biomolecules in a functional module tend to have similar interaction patterns and that such modules are often conserved across biological networks of different species. As a result, such conserved functional modules can be identified through comparative analysis of biological networks. In this work, we propose a novel network querying algorithm based on the CUFID (Comparative network analysis Using the steady-state network Flow to IDentify orthologous proteins) framework combined with an efficient seed-and-extension approach. The proposed algorithm, CUFID-query, can accurately detect conserved functional modules as small subnetworks in the target network that are expected to perform similar functions to the given query functional module. The CUFID framework was recently developed for probabilistic pairwise global comparison of biological networks, and it has been applied to pairwise global network alignment, where the framework was shown to yield accurate network alignment results. In the proposed CUFID-query algorithm, we adopt the CUFID framework and extend it for local network alignment, specifically to solve network querying problems. First, in the seed selection phase, the proposed method utilizes the CUFID framework to compare the query and the target networks and to predict the probabilistic node-to-node correspondence between the networks. Next, the algorithm selects and greedily extends the seed in the target network by iteratively adding nodes that have frequent interactions with other nodes in the seed network, in a way that the conductance of the extended network is maximally reduced. Finally, CUFID-query removes irrelevant nodes from the querying results based on the personalized PageRank vector for the induced network that includes the fully extended network and its neighboring nodes. Through extensive performance evaluation based on biological networks with known functional modules, we show that CUFID-query outperforms the existing state-of-the-art algorithms in terms of prediction accuracy and biological significance of the predictions.
AlignerBoost: A Generalized Software Toolkit for Boosting Next-Gen Sequencing Mapping Accuracy Using a Bayesian-Based Mapping Quality Framework.

PubMed

Zheng, Qi; Grice, Elizabeth A

2016-10-01

Accurate mapping of next-generation sequencing (NGS) reads to reference genomes is crucial for almost all NGS applications and downstream analyses. Various repetitive elements in human and other higher eukaryotic genomes contribute in large part to ambiguously (non-uniquely) mapped reads. Most available NGS aligners attempt to address this by either removing all non-uniquely mapping reads, or reporting one random or "best" hit based on simple heuristics. Accurate estimation of the mapping quality of NGS reads is therefore critical albeit completely lacking at present. Here we developed a generalized software toolkit "AlignerBoost", which utilizes a Bayesian-based framework to accurately estimate mapping quality of ambiguously mapped NGS reads. We tested AlignerBoost with both simulated and real DNA-seq and RNA-seq datasets at various thresholds. In most cases, but especially for reads falling within repetitive regions, AlignerBoost dramatically increases the mapping precision of modern NGS aligners without significantly compromising the sensitivity even without mapping quality filters. When using higher mapping quality cutoffs, AlignerBoost achieves a much lower false mapping rate while exhibiting comparable or higher sensitivity compared to the aligner default modes, therefore significantly boosting the detection power of NGS aligners even using extreme thresholds. AlignerBoost is also SNP-aware, and higher quality alignments can be achieved if provided with known SNPs. AlignerBoost's algorithm is computationally efficient, and can process one million alignments within 30 seconds on a typical desktop computer. AlignerBoost is implemented as a uniform Java application and is freely available at https://github.com/Grice-Lab/AlignerBoost.
Detecting chaos, determining the dimensions of tori and predicting slow diffusion in Fermi-Pasta-Ulam lattices by the Generalized Alignment Index method

NASA Astrophysics Data System (ADS)

Skokos, C.; Bountis, T.; Antonopoulos, C.

2008-12-01

The recently introduced GALI method is used for rapidly detecting chaos, determining the dimensionality of regular motion and predicting slow diffusion in multi-dimensional Hamiltonian systems. We propose an efficient computation of the GALIk indices, which represent volume elements of k randomly chosen deviation vectors from a given orbit, based on the Singular Value Decomposition (SVD) algorithm. We obtain theoretically and verify numerically asymptotic estimates of GALIs long-time behavior in the case of regular orbits lying on low-dimensional tori. The GALIk indices are applied to rapidly detect chaotic oscillations, identify low-dimensional tori of Fermi-Pasta-Ulam (FPU) lattices at low energies and predict weak diffusion away from quasiperiodic motion, long before it is actually observed in the oscillations.
Analysis of Parallel Algorithms on SMP Node and Cluster of Workstations Using Parallel Programming Models with New Tile-based Method for Large Biological Datasets.

PubMed

Shrimankar, D D; Sathe, S R

2016-01-01

Sequence alignment is an important tool for describing the relationships between DNA sequences. Many sequence alignment algorithms exist, differing in efficiency, in their models of the sequences, and in the relationship between sequences. The focus of this study is to obtain an optimal alignment between two sequences of biological data, particularly DNA sequences. The algorithm is discussed with particular emphasis on time, speedup, and efficiency optimizations. Parallel programming presents a number of critical challenges to application developers. Today's supercomputer often consists of clusters of SMP nodes. Programming paradigms such as OpenMP and MPI are used to write parallel codes for such architectures. However, the OpenMP programs cannot be scaled for more than a single SMP node. However, programs written in MPI can have more than single SMP nodes. But such a programming paradigm has an overhead of internode communication. In this work, we explore the tradeoffs between using OpenMP and MPI. We demonstrate that the communication overhead incurs significantly even in OpenMP loop execution and increases with the number of cores participating. We also demonstrate a communication model to approximate the overhead from communication in OpenMP loops. Our results are astonishing and interesting to a large variety of input data files. We have developed our own load balancing and cache optimization technique for message passing model. Our experimental results show that our own developed techniques give optimum performance of our parallel algorithm for various sizes of input parameter, such as sequence size and tile size, on a wide variety of multicore architectures.
Analysis of Parallel Algorithms on SMP Node and Cluster of Workstations Using Parallel Programming Models with New Tile-based Method for Large Biological Datasets

PubMed Central

Shrimankar, D. D.; Sathe, S. R.

2016-01-01

Sequence alignment is an important tool for describing the relationships between DNA sequences. Many sequence alignment algorithms exist, differing in efficiency, in their models of the sequences, and in the relationship between sequences. The focus of this study is to obtain an optimal alignment between two sequences of biological data, particularly DNA sequences. The algorithm is discussed with particular emphasis on time, speedup, and efficiency optimizations. Parallel programming presents a number of critical challenges to application developers. Today’s supercomputer often consists of clusters of SMP nodes. Programming paradigms such as OpenMP and MPI are used to write parallel codes for such architectures. However, the OpenMP programs cannot be scaled for more than a single SMP node. However, programs written in MPI can have more than single SMP nodes. But such a programming paradigm has an overhead of internode communication. In this work, we explore the tradeoffs between using OpenMP and MPI. We demonstrate that the communication overhead incurs significantly even in OpenMP loop execution and increases with the number of cores participating. We also demonstrate a communication model to approximate the overhead from communication in OpenMP loops. Our results are astonishing and interesting to a large variety of input data files. We have developed our own load balancing and cache optimization technique for message passing model. Our experimental results show that our own developed techniques give optimum performance of our parallel algorithm for various sizes of input parameter, such as sequence size and tile size, on a wide variety of multicore architectures. PMID:27932868
W-curve alignments for HIV-1 genomic comparisons.

PubMed

Cork, Douglas J; Lembark, Steven; Tovanabutra, Sodsai; Robb, Merlin L; Kim, Jerome H

2010-06-01

The W-curve was originally developed as a graphical visualization technique for viewing DNA and RNA sequences. Its ability to render features of DNA also makes it suitable for computational studies. Its main advantage in this area is utilizing a single-pass algorithm for comparing the sequences. Avoiding recursion during sequence alignments offers advantages for speed and in-process resources. The graphical technique also allows for multiple models of comparison to be used depending on the nucleotide patterns embedded in similar whole genomic sequences. The W-curve approach allows us to compare large numbers of samples quickly. We are currently tuning the algorithm to accommodate quirks specific to HIV-1 genomic sequences so that it can be used to aid in diagnostic and vaccine efforts. Tracking the molecular evolution of the virus has been greatly hampered by gap associated problems predominantly embedded within the envelope gene of the virus. Gaps and hypermutation of the virus slow conventional string based alignments of the whole genome. This paper describes the W-curve algorithm itself, and how we have adapted it for comparison of similar HIV-1 genomes. A treebuilding method is developed with the W-curve that utilizes a novel Cylindrical Coordinate distance method and gap analysis method. HIV-1 C2-V5 env sequence regions from a Mother/Infant cohort study are used in the comparison. The output distance matrix and neighbor results produced by the W-curve are functionally equivalent to those from Clustal for C2-V5 sequences in the mother/infant pairs infected with CRF01_AE. Significant potential exists for utilizing this method in place of conventional string based alignment of HIV-1 genomes, such as Clustal X. With W-curve heuristic alignment, it may be possible to obtain clinically useful results in a short time-short enough to affect clinical choices for acute treatment. A description of the W-curve generation process, including a comparison technique of aligning extremes of the curves to effectively phase-shift them past the HIV-1 gap problem, is presented. Besides yielding similar neighbor-joining phenogram topologies, most Mother and Infant C2-V5 sequences in the cohort pairs geometrically map closest to each other, indicating that W-curve heuristics overcame any gap problem.
LC-MSsim – a simulation software for liquid chromatography mass spectrometry data

PubMed Central

Schulz-Trieglaff, Ole; Pfeifer, Nico; Gröpl, Clemens; Kohlbacher, Oliver; Reinert, Knut

2008-01-01

Background Mass Spectrometry coupled to Liquid Chromatography (LC-MS) is commonly used to analyze the protein content of biological samples in large scale studies. The data resulting from an LC-MS experiment is huge, highly complex and noisy. Accordingly, it has sparked new developments in Bioinformatics, especially in the fields of algorithm development, statistics and software engineering. In a quantitative label-free mass spectrometry experiment, crucial steps are the detection of peptide features in the mass spectra and the alignment of samples by correcting for shifts in retention time. At the moment, it is difficult to compare the plethora of algorithms for these tasks. So far, curated benchmark data exists only for peptide identification algorithms but no data that represents a ground truth for the evaluation of feature detection, alignment and filtering algorithms. Results We present LC-MSsim, a simulation software for LC-ESI-MS experiments. It simulates ESI spectra on the MS level. It reads a list of proteins from a FASTA file and digests the protein mixture using a user-defined enzyme. The software creates an LC-MS data set using a predictor for the retention time of the peptides and a model for peak shapes and elution profiles of the mass spectral peaks. Our software also offers the possibility to add contaminants, to change the background noise level and includes a model for the detectability of peptides in mass spectra. After the simulation, LC-MSsim writes the simulated data to mzData, a public XML format. The software also stores the positions (monoisotopic m/z and retention time) and ion counts of the simulated ions in separate files. Conclusion LC-MSsim generates simulated LC-MS data sets and incorporates models for peak shapes and contaminations. Algorithm developers can match the results of feature detection and alignment algorithms against the simulated ion lists and meaningful error rates can be computed. We anticipate that LC-MSsim will be useful to the wider community to perform benchmark studies and comparisons between computational tools. PMID:18842122
Parallel algorithms for large-scale biological sequence alignment on Xeon-Phi based clusters.

PubMed

Lan, Haidong; Chan, Yuandong; Xu, Kai; Schmidt, Bertil; Peng, Shaoliang; Liu, Weiguo

2016-07-19

Computing alignments between two or more sequences are common operations frequently performed in computational molecular biology. The continuing growth of biological sequence databases establishes the need for their efficient parallel implementation on modern accelerators. This paper presents new approaches to high performance biological sequence database scanning with the Smith-Waterman algorithm and the first stage of progressive multiple sequence alignment based on the ClustalW heuristic on a Xeon Phi-based compute cluster. Our approach uses a three-level parallelization scheme to take full advantage of the compute power available on this type of architecture; i.e. cluster-level data parallelism, thread-level coarse-grained parallelism, and vector-level fine-grained parallelism. Furthermore, we re-organize the sequence datasets and use Xeon Phi shuffle operations to improve I/O efficiency. Evaluations show that our method achieves a peak overall performance up to 220 GCUPS for scanning real protein sequence databanks on a single node consisting of two Intel E5-2620 CPUs and two Intel Xeon Phi 7110P cards. It also exhibits good scalability in terms of sequence length and size, and number of compute nodes for both database scanning and multiple sequence alignment. Furthermore, the achieved performance is highly competitive in comparison to optimized Xeon Phi and GPU implementations. Our implementation is available at https://github.com/turbo0628/LSDBS-mpi .
Statistical alignment: computational properties, homology testing and goodness-of-fit.

PubMed

Hein, J; Wiuf, C; Knudsen, B; Møller, M B; Wibling, G

2000-09-08

The model of insertions and deletions in biological sequences, first formulated by Thorne, Kishino, and Felsenstein in 1991 (the TKF91 model), provides a basis for performing alignment within a statistical framework. Here we investigate this model.Firstly, we show how to accelerate the statistical alignment algorithms several orders of magnitude. The main innovations are to confine likelihood calculations to a band close to the similarity based alignment, to get good initial guesses of the evolutionary parameters and to apply an efficient numerical optimisation algorithm for finding the maximum likelihood estimate. In addition, the recursions originally presented by Thorne, Kishino and Felsenstein can be simplified. Two proteins, about 1500 amino acids long, can be analysed with this method in less than five seconds on a fast desktop computer, which makes this method practical for actual data analysis.Secondly, we propose a new homology test based on this model, where homology means that an ancestor to a sequence pair can be found finitely far back in time. This test has statistical advantages relative to the traditional shuffle test for proteins.Finally, we describe a goodness-of-fit test, that allows testing the proposed insertion-deletion (indel) process inherent to this model and find that real sequences (here globins) probably experience indels longer than one, contrary to what is assumed by the model. Copyright 2000 Academic Press.
Estimation Filter for Alignment of the Spitzer Space Telescope

NASA Technical Reports Server (NTRS)

Bayard, David

2007-01-01

A document presents a summary of an onboard estimation algorithm now being used to calibrate the alignment of the Spitzer Space Telescope (formerly known as the Space Infrared Telescope Facility). The algorithm, denoted the S2P calibration filter, recursively generates estimates of the alignment angles between a telescope reference frame and a star-tracker reference frame. At several discrete times during the day, the filter accepts, as input, attitude estimates from the star tracker and observations taken by the Pointing Control Reference Sensor (a sensor in the field of view of the telescope). The output of the filter is a calibrated quaternion that represents the best current mean-square estimate of the alignment angles between the telescope and the star tracker. The S2P calibration filter incorporates a Kalman filter that tracks six states - two for each of three orthogonal coordinate axes. Although, in principle, one state per axis is sufficient, the use of two states per axis makes it possible to model both short- and long-term behaviors. Specifically, the filter properly models transient learning, characteristic times and bounds of thermomechanical drift, and long-term steady-state statistics, whether calibration measurements are taken frequently or infrequently. These properties ensure that the S2P filter performance is optimal over a broad range of flight conditions, and can be confidently run autonomously over several years of in-flight operation without human intervention.

Classification of biosensor time series using dynamic time warping: applications in screening cancer cells with characteristic biomarkers.

PubMed

Rai, Shesh N; Trainor, Patrick J; Khosravi, Farhad; Kloecker, Goetz; Panchapakesan, Balaji

2016-01-01

The development of biosensors that produce time series data will facilitate improvements in biomedical diagnostics and in personalized medicine. The time series produced by these devices often contains characteristic features arising from biochemical interactions between the sample and the sensor. To use such characteristic features for determining sample class, similarity-based classifiers can be utilized. However, the construction of such classifiers is complicated by the variability in the time domains of such series that renders the traditional distance metrics such as Euclidean distance ineffective in distinguishing between biological variance and time domain variance. The dynamic time warping (DTW) algorithm is a sequence alignment algorithm that can be used to align two or more series to facilitate quantifying similarity. In this article, we evaluated the performance of DTW distance-based similarity classifiers for classifying time series that mimics electrical signals produced by nanotube biosensors. Simulation studies demonstrated the positive performance of such classifiers in discriminating between time series containing characteristic features that are obscured by noise in the intensity and time domains. We then applied a DTW distance-based k -nearest neighbors classifier to distinguish the presence/absence of mesenchymal biomarker in cancer cells in buffy coats in a blinded test. Using a train-test approach, we find that the classifier had high sensitivity (90.9%) and specificity (81.8%) in differentiating between EpCAM-positive MCF7 cells spiked in buffy coats and those in plain buffy coats.
Optimal network alignment with graphlet degree vectors.

PubMed

Milenković, Tijana; Ng, Weng Leong; Hayes, Wayne; Przulj, Natasa

2010-06-30

Important biological information is encoded in the topology of biological networks. Comparative analyses of biological networks are proving to be valuable, as they can lead to transfer of knowledge between species and give deeper insights into biological function, disease, and evolution. We introduce a new method that uses the Hungarian algorithm to produce optimal global alignment between two networks using any cost function. We design a cost function based solely on network topology and use it in our network alignment. Our method can be applied to any two networks, not just biological ones, since it is based only on network topology. We use our new method to align protein-protein interaction networks of two eukaryotic species and demonstrate that our alignment exposes large and topologically complex regions of network similarity. At the same time, our alignment is biologically valid, since many of the aligned protein pairs perform the same biological function. From the alignment, we predict function of yet unannotated proteins, many of which we validate in the literature. Also, we apply our method to find topological similarities between metabolic networks of different species and build phylogenetic trees based on our network alignment score. The phylogenetic trees obtained in this way bear a striking resemblance to the ones obtained by sequence alignments. Our method detects topologically similar regions in large networks that are statistically significant. It does this independent of protein sequence or any other information external to network topology.
Alignment of multiradiation isocenters for megavoltage photon beam

PubMed Central

Zhang, Yin; Ding, Kai; Cowan, Garth; Tryggestad, Erik; Armour, Elwood

2015-01-01

The accurate measurement of the linear accelerator (linac) radiation isocenter is critical, especially for stereotactic treatment. Traditional quality assurance (QA) procedure focuses on the measurement of single radiation isocenter, usually of 6 megavoltage (MV) photon beams. Single radiation isocenter is also commonly assumed in treatment planning systems (TPS). Due to different flattening filters and bending magnet and steering parameters, the radiation isocenter of one energy mode can deviate from another if no special effort was devoted. We present the first experience of the multiradiation isocenters alignment on an Elekta linac, as well as its corresponding QA procedure and clinical impact. An 8 mm ball‐bearing (BB) phantom was placed at the 6 MV radiation isocenter using an Elekta isocenter search algorithm, based on portal images. The 3D radiation isocenter shifts of other photon energy modes relative to the 6 MV were determined. Beam profile scanning for different field sizes was used as an independent method to determine the 2D multiradiation isocenters alignment. To quantify the impact of radiation isocenter offset on targeting accuracy, the 10 MV radiation isocenter was manually offset from that for 6 MV by adjusting the bending magnet current. Because our table isocenter was mechanically aligned to the 6 MV radiation isocenter, the deviation of the table isocentric rotation from the "shifted" 10 MV radiation isocenter after bending magnet adjustment was assessed. Winston‐Lutz test was also performed to confirm the overall radiation isocenter positioning accuracy for all photon energies. The portal image method showed the radiation isocenter of the 10 MV flattening filter‐free mode deviated from others before beam parameter adjustment. After the adjustment, the deviation was greatly improved from 0.96 to 0.35 mm relative to the 6 MV radiation isocenter. The same finding was confirmed by the profile‐scanning method. The maximum deviation of the table isocentric rotation from the 10 MV radiation isocenter was observed to linearly increase with the offset between 6 and 10 MV radiation isocenter; 1 mm radiation isocenter offset can translate to almost 2 mm maximum deviation of the table isocentric rotation from the 10 MV radiation isocenter. The alignment of the multiradiation isocenters is particularly important for high‐precision radiotherapy. Our study provides the medical physics community with a quantitative measure of the multiradiation isocenters alignment. A routine QA method should be considered, to examine the radiation isocenters alignment during the linac acceptance. PACS number: 87.55.Qr, 87.56.bd, 87.56.Fc PMID:26699586
Statistical discovery of site inter-dependencies in sub-molecular hierarchical protein structuring

PubMed Central

2012-01-01

Background Much progress has been made in understanding the 3D structure of proteins using methods such as NMR and X-ray crystallography. The resulting 3D structures are extremely informative, but do not always reveal which sites and residues within the structure are of special importance. Recently, there are indications that multiple-residue, sub-domain structural relationships within the larger 3D consensus structure of a protein can be inferred from the analysis of the multiple sequence alignment data of a protein family. These intra-dependent clusters of associated sites are used to indicate hierarchical inter-residue relationships within the 3D structure. To reveal the patterns of associations among individual amino acids or sub-domain components within the structure, we apply a k-modes attribute (aligned site) clustering algorithm to the ubiquitin and transthyretin families in order to discover associations among groups of sites within the multiple sequence alignment. We then observe what these associations imply within the 3D structure of these two protein families. Results The k-modes site clustering algorithm we developed maximizes the intra-group interdependencies based on a normalized mutual information measure. The clusters formed correspond to sub-structural components or binding and interface locations. Applying this data-directed method to the ubiquitin and transthyretin protein family multiple sequence alignments as a test bed, we located numerous interesting associations of interdependent sites. These clusters were then arranged into cluster tree diagrams which revealed four structural sub-domains within the single domain structure of ubiquitin and a single large sub-domain within transthyretin associated with the interface among transthyretin monomers. In addition, several clusters of mutually interdependent sites were discovered for each protein family, each of which appear to play an important role in the molecular structure and/or function. Conclusions Our results demonstrate that the method we present here using a k-modes site clustering algorithm based on interdependency evaluation among sites obtained from a sequence alignment of homologous proteins can provide significant insights into the complex, hierarchical inter-residue structural relationships within the 3D structure of a protein family. PMID:22793672
Statistical discovery of site inter-dependencies in sub-molecular hierarchical protein structuring.

PubMed

Durston, Kirk K; Chiu, David Ky; Wong, Andrew Kc; Li, Gary Cl

2012-07-13

Much progress has been made in understanding the 3D structure of proteins using methods such as NMR and X-ray crystallography. The resulting 3D structures are extremely informative, but do not always reveal which sites and residues within the structure are of special importance. Recently, there are indications that multiple-residue, sub-domain structural relationships within the larger 3D consensus structure of a protein can be inferred from the analysis of the multiple sequence alignment data of a protein family. These intra-dependent clusters of associated sites are used to indicate hierarchical inter-residue relationships within the 3D structure. To reveal the patterns of associations among individual amino acids or sub-domain components within the structure, we apply a k-modes attribute (aligned site) clustering algorithm to the ubiquitin and transthyretin families in order to discover associations among groups of sites within the multiple sequence alignment. We then observe what these associations imply within the 3D structure of these two protein families. The k-modes site clustering algorithm we developed maximizes the intra-group interdependencies based on a normalized mutual information measure. The clusters formed correspond to sub-structural components or binding and interface locations. Applying this data-directed method to the ubiquitin and transthyretin protein family multiple sequence alignments as a test bed, we located numerous interesting associations of interdependent sites. These clusters were then arranged into cluster tree diagrams which revealed four structural sub-domains within the single domain structure of ubiquitin and a single large sub-domain within transthyretin associated with the interface among transthyretin monomers. In addition, several clusters of mutually interdependent sites were discovered for each protein family, each of which appear to play an important role in the molecular structure and/or function. Our results demonstrate that the method we present here using a k-modes site clustering algorithm based on interdependency evaluation among sites obtained from a sequence alignment of homologous proteins can provide significant insights into the complex, hierarchical inter-residue structural relationships within the 3D structure of a protein family.
Towards a Unified Framework for Pose, Expression, and Occlusion Tolerant Automatic Facial Alignment.

PubMed

Seshadri, Keshav; Savvides, Marios

2016-10-01

We propose a facial alignment algorithm that is able to jointly deal with the presence of facial pose variation, partial occlusion of the face, and varying illumination and expressions. Our approach proceeds from sparse to dense landmarking steps using a set of specific models trained to best account for the shape and texture variation manifested by facial landmarks and facial shapes across pose and various expressions. We also propose the use of a novel l1-regularized least squares approach that we incorporate into our shape model, which is an improvement over the shape model used by several prior Active Shape Model (ASM) based facial landmark localization algorithms. Our approach is compared against several state-of-the-art methods on many challenging test datasets and exhibits a higher fitting accuracy on all of them.
muBLASTP: database-indexed protein sequence search on multicore CPUs.

PubMed

Zhang, Jing; Misra, Sanchit; Wang, Hao; Feng, Wu-Chun

2016-11-04

The Basic Local Alignment Search Tool (BLAST) is a fundamental program in the life sciences that searches databases for sequences that are most similar to a query sequence. Currently, the BLAST algorithm utilizes a query-indexed approach. Although many approaches suggest that sequence search with a database index can achieve much higher throughput (e.g., BLAT, SSAHA, and CAFE), they cannot deliver the same level of sensitivity as the query-indexed BLAST, i.e., NCBI BLAST, or they can only support nucleotide sequence search, e.g., MegaBLAST. Due to different challenges and characteristics between query indexing and database indexing, the existing techniques for query-indexed search cannot be used into database indexed search. muBLASTP, a novel database-indexed BLAST for protein sequence search, delivers identical hits returned to NCBI BLAST. On Intel Haswell multicore CPUs, for a single query, the single-threaded muBLASTP achieves up to a 4.41-fold speedup for alignment stages, and up to a 1.75-fold end-to-end speedup over single-threaded NCBI BLAST. For a batch of queries, the multithreaded muBLASTP achieves up to a 5.7-fold speedups for alignment stages, and up to a 4.56-fold end-to-end speedup over multithreaded NCBI BLAST. With a newly designed index structure for protein database and associated optimizations in BLASTP algorithm, we re-factored BLASTP algorithm for modern multicore processors that achieves much higher throughput with acceptable memory footprint for the database index.
Fast-SG: an alignment-free algorithm for hybrid assembly.

PubMed

Di Genova, Alex; Ruz, Gonzalo A; Sagot, Marie-France; Maass, Alejandro

2018-05-01

Long-read sequencing technologies are the ultimate solution for genome repeats, allowing near reference-level reconstructions of large genomes. However, long-read de novo assembly pipelines are computationally intense and require a considerable amount of coverage, thereby hindering their broad application to the assembly of large genomes. Alternatively, hybrid assembly methods that combine short- and long-read sequencing technologies can reduce the time and cost required to produce de novo assemblies of large genomes. Here, we propose a new method, called Fast-SG, that uses a new ultrafast alignment-free algorithm specifically designed for constructing a scaffolding graph using light-weight data structures. Fast-SG can construct the graph from either short or long reads. This allows the reuse of efficient algorithms designed for short-read data and permits the definition of novel modular hybrid assembly pipelines. Using comprehensive standard datasets and benchmarks, we show how Fast-SG outperforms the state-of-the-art short-read aligners when building the scaffoldinggraph and can be used to extract linking information from either raw or error-corrected long reads. We also show how a hybrid assembly approach using Fast-SG with shallow long-read coverage (5X) and moderate computational resources can produce long-range and accurate reconstructions of the genomes of Arabidopsis thaliana (Ler-0) and human (NA12878). Fast-SG opens a door to achieve accurate hybrid long-range reconstructions of large genomes with low effort, high portability, and low cost.
Pattern-Recognition System for Approaching a Known Target

NASA Technical Reports Server (NTRS)

Huntsberger, Terrance; Cheng, Yang

2008-01-01

A closed-loop pattern-recognition system is designed to provide guidance for maneuvering a small exploratory robotic vehicle (rover) on Mars to return to a landed spacecraft to deliver soil and rock samples that the spacecraft would subsequently bring back to Earth. The system could be adapted to terrestrial use in guiding mobile robots to approach known structures that humans could not approach safely, for such purposes as reconnaissance in military or law-enforcement applications, terrestrial scientific exploration, and removal of explosive or other hazardous items. The system has been demonstrated in experiments in which the Field Integrated Design and Operations (FIDO) rover (a prototype Mars rover equipped with a video camera for guidance) is made to return to a mockup of Mars-lander spacecraft. The FIDO rover camera autonomously acquires an image of the lander from a distance of 125 m in an outdoor environment. Then under guidance by an algorithm that performs fusion of multiple line and texture features in digitized images acquired by the camera, the rover traverses the intervening terrain, using features derived from images of the lander truss structure. Then by use of precise pattern matching for determining the position and orientation of the rover relative to the lander, the rover aligns itself with the bottom of ramps extending from the lander, in preparation for climbing the ramps to deliver samples to the lander. The most innovative aspect of the system is a set of pattern-recognition algorithms that govern a three-phase visual-guidance sequence for approaching the lander. During the first phase, a multifeature fusion algorithm integrates the outputs of a horizontal-line-detection algorithm and a wavelet-transform-based visual-area-of-interest algorithm for detecting the lander from a significant distance. The horizontal-line-detection algorithm is used to determine candidate lander locations based on detection of a horizontal deck that is part of the lander.
Labview Implementation of Image Processing and Phasing Control for the SIBOA Segmented Mirror Testbed

NASA Technical Reports Server (NTRS)

Partridge, James D.

2002-01-01

'NASA is preparing to launch the Next Generation Space Telescope (NGST). This telescope will be larger than the Hubble Space Telescope, be launched on an Atlas missile rather than the Space Shuttle, have a segmented primary mirror, and be placed in a higher orbit. All these differences pose significant challenges.' This effort addresses the challenge of implementing an algorithm for aligning the segments of the primary mirror during the initial deployment that was designed by Philip Olivier and members of SOMTC (Space Optics Manufacturing Technology Center). The implementation was to be performed on the SIBOA (Systematic Image Based Optical Alignment) test bed. Unfortunately, hardware/software aspect concerning SIBOA and an extended time period for algorithm development prevented testing before the end of the study period. Properties of the digital camera were studied and understood, resulting in the current ability of selecting optimal settings regarding saturation. The study was successful in manually capturing several images of two stacked segments with various relative phases. These images can be used to calibrate the algorithm for future implementation. Currently the system is ready for testing.
Dense depth maps from correspondences derived from perceived motion

NASA Astrophysics Data System (ADS)

Kirby, Richard; Whitaker, Ross

2017-01-01

Many computer vision applications require finding corresponding points between images and using the corresponding points to estimate disparity. Today's correspondence finding algorithms primarily use image features or pixel intensities common between image pairs. Some 3-D computer vision applications, however, do not produce the desired results using correspondences derived from image features or pixel intensities. Two examples are the multimodal camera rig and the center region of a coaxial camera rig. We present an image correspondence finding technique that aligns pairs of image sequences using optical flow fields. The optical flow fields provide information about the structure and motion of the scene, which are not available in still images but can be used in image alignment. We apply the technique to a dual focal length stereo camera rig consisting of a visible light-infrared camera pair and to a coaxial camera rig. We test our method on real image sequences and compare our results with the state-of-the-art multimodal and structure from motion (SfM) algorithms. Our method produces more accurate depth and scene velocity reconstruction estimates than the state-of-the-art multimodal and SfM algorithms.
Historian: accurate reconstruction of ancestral sequences and evolutionary rates.

PubMed

Holmes, Ian H

2017-04-15

Reconstruction of ancestral sequence histories, and estimation of parameters like indel rates, are improved by using explicit evolutionary models and summing over uncertain alignments. The previous best tool for this purpose (according to simulation benchmarks) was ProtPal, but this tool was too slow for practical use. Historian combines an efficient reimplementation of the ProtPal algorithm with performance-improving heuristics from other alignment tools. Simulation results on fidelity of rate estimation via ancestral reconstruction, along with evaluations on the structurally informed alignment dataset BAliBase 3.0, recommend Historian over other alignment tools for evolutionary applications. Historian is available at https://github.com/evoldoers/historian under the Creative Commons Attribution 3.0 US license. ihholmes+historian@gmail.com. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
Practical, computer-aided registration of multiple, three-dimensional, magnetic-resonance observations of the human brain

DOE Office of Scientific and Technical Information (OSTI.GOV)

Diegert, C.; Sanders, J.A.; Orrison, W.W. Jr.

1992-12-31

Researchers working with MR observations generally agree that far more information is available in a volume (3D) observation than is considered for diagnosis. The key to the new alignment method is in basing it on available information on surfaces. Using the skin surface is effective a robust algorithm can reliably extract this surface from almost any scan of the head, and a human operator`s exquisite sensitivity to facial features is allows him to manually align skin surfaces with precision. Following the definitions, we report on a preliminary experiment where we align three MR observations taken during a single MR examination,more » each weighting arterial, venous, and tissue features. When accurately aligned, a neurosurgeon can use these features as anatomical landmarks for planning and executing interventional procedures.« less
AlignerBoost: A Generalized Software Toolkit for Boosting Next-Gen Sequencing Mapping Accuracy Using a Bayesian-Based Mapping Quality Framework

PubMed Central

Zheng, Qi; Grice, Elizabeth A.

2016-01-01

Accurate mapping of next-generation sequencing (NGS) reads to reference genomes is crucial for almost all NGS applications and downstream analyses. Various repetitive elements in human and other higher eukaryotic genomes contribute in large part to ambiguously (non-uniquely) mapped reads. Most available NGS aligners attempt to address this by either removing all non-uniquely mapping reads, or reporting one random or "best" hit based on simple heuristics. Accurate estimation of the mapping quality of NGS reads is therefore critical albeit completely lacking at present. Here we developed a generalized software toolkit "AlignerBoost", which utilizes a Bayesian-based framework to accurately estimate mapping quality of ambiguously mapped NGS reads. We tested AlignerBoost with both simulated and real DNA-seq and RNA-seq datasets at various thresholds. In most cases, but especially for reads falling within repetitive regions, AlignerBoost dramatically increases the mapping precision of modern NGS aligners without significantly compromising the sensitivity even without mapping quality filters. When using higher mapping quality cutoffs, AlignerBoost achieves a much lower false mapping rate while exhibiting comparable or higher sensitivity compared to the aligner default modes, therefore significantly boosting the detection power of NGS aligners even using extreme thresholds. AlignerBoost is also SNP-aware, and higher quality alignments can be achieved if provided with known SNPs. AlignerBoost’s algorithm is computationally efficient, and can process one million alignments within 30 seconds on a typical desktop computer. AlignerBoost is implemented as a uniform Java application and is freely available at https://github.com/Grice-Lab/AlignerBoost. PMID:27706155
Genetic algorithm prediction of two-dimensional group-IV dioxides for dielectrics

NASA Astrophysics Data System (ADS)

Singh, Arunima K.; Revard, Benjamin C.; Ramanathan, Rohit; Ashton, Michael; Tavazza, Francesca; Hennig, Richard G.

2017-04-01

Two-dimensional (2D) materials present a new class of materials whose structures and properties can differ from their bulk counterparts. We perform a genetic algorithm structure search using density-functional theory to identify low-energy structures of 2D group-IV dioxides A O2 (A =Si , Ge, Sn, Pb). We find that 2D SiO2 is most stable in the experimentally determined bi-tetrahedral structure, while 2D SnO2 and PbO2 are most stable in the 1 T structure. For 2D GeO2, the genetic algorithm finds a new low-energy 2D structure with monoclinic symmetry. Each system exhibits 2D structures with formation energies ranging from 26 to 151 meV/atom, below those of certain already synthesized 2D materials. The phonon spectra confirm their dynamic stability. Using the HSE06 hybrid functional, we determine that the 2D dioxides are insulators or semiconductors, with a direct band gap of 7.2 eV at Γ for 2D SiO2, and indirect band gaps of 4.8-2.7 eV for the other dioxides. To guide future applications of these 2D materials in nanoelectronic devices, we determine their band-edge alignment with graphene, phosphorene, and single-layer BN and MoS2. An assessment of the dielectric properties and electrochemical stability of the 2D group-IV dioxides shows that 2D GeO2 and SnO2 are particularly promising candidates for gate oxides and 2D SnO2 also as a protective layer in heterostructure nanoelectronic devices.
GIRAF: a method for fast search and flexible alignment of ligand binding interfaces in proteins at atomic resolution

PubMed Central

Kinjo, Akira R.; Nakamura, Haruki

2012-01-01

Comparison and classification of protein structures are fundamental means to understand protein functions. Due to the computational difficulty and the ever-increasing amount of structural data, however, it is in general not feasible to perform exhaustive all-against-all structure comparisons necessary for comprehensive classifications. To efficiently handle such situations, we have previously proposed a method, now called GIRAF. We herein describe further improvements in the GIRAF protein structure search and alignment method. The GIRAF method achieves extremely efficient search of similar structures of ligand binding sites of proteins by exploiting database indexing of structural features of local coordinate frames. In addition, it produces refined atom-wise alignments by iterative applications of the Hungarian method to the bipartite graph defined for a pair of superimposed structures. By combining the refined alignments based on different local coordinate frames, it is made possible to align structures involving domain movements. We provide detailed accounts for the database design, the search and alignment algorithms as well as some benchmark results. PMID:27493524
Parallel seed-based approach to multiple protein structure similarities detection

DOE PAGES

Chapuis, Guillaume; Le Boudic-Jamin, Mathilde; Andonov, Rumen; ...

2015-01-01

Finding similarities between protein structures is a crucial task in molecular biology. Most of the existing tools require proteins to be aligned in order-preserving way and only find single alignments even when multiple similar regions exist. We propose a new seed-based approach that discovers multiple pairs of similar regions. Its computational complexity is polynomial and it comes with a quality guarantee—the returned alignments have both root mean squared deviations (coordinate-based as well as internal-distances based) lower than a given threshold, if such exist. We do not require the alignments to be order preserving (i.e., we consider nonsequential alignments), which makesmore » our algorithm suitable for detecting similar domains when comparing multidomain proteins as well as to detect structural repetitions within a single protein. Because the search space for nonsequential alignments is much larger than for sequential ones, the computational burden is addressed by extensive use of parallel computing techniques: a coarse-grain level parallelism making use of available CPU cores for computation and a fine-grain level parallelism exploiting bit-level concurrency as well as vector instructions.« less
RNA-Seq Alignment to Individualized Genomes Improves Transcript Abundance Estimates in Multiparent Populations

PubMed Central

Munger, Steven C.; Raghupathy, Narayanan; Choi, Kwangbom; Simons, Allen K.; Gatti, Daniel M.; Hinerfeld, Douglas A.; Svenson, Karen L.; Keller, Mark P.; Attie, Alan D.; Hibbs, Matthew A.; Graber, Joel H.; Chesler, Elissa J.; Churchill, Gary A.

2014-01-01

Massively parallel RNA sequencing (RNA-seq) has yielded a wealth of new insights into transcriptional regulation. A first step in the analysis of RNA-seq data is the alignment of short sequence reads to a common reference genome or transcriptome. Genetic variants that distinguish individual genomes from the reference sequence can cause reads to be misaligned, resulting in biased estimates of transcript abundance. Fine-tuning of read alignment algorithms does not correct this problem. We have developed Seqnature software to construct individualized diploid genomes and transcriptomes for multiparent populations and have implemented a complete analysis pipeline that incorporates other existing software tools. We demonstrate in simulated and real data sets that alignment to individualized transcriptomes increases read mapping accuracy, improves estimation of transcript abundance, and enables the direct estimation of allele-specific expression. Moreover, when applied to expression QTL mapping we find that our individualized alignment strategy corrects false-positive linkage signals and unmasks hidden associations. We recommend the use of individualized diploid genomes over reference sequence alignment for all applications of high-throughput sequencing technology in genetically diverse populations. PMID:25236449
GraphCrunch 2: Software tool for network modeling, alignment and clustering.

PubMed

Kuchaiev, Oleksii; Stevanović, Aleksandar; Hayes, Wayne; Pržulj, Nataša

2011-01-19

Recent advancements in experimental biotechnology have produced large amounts of protein-protein interaction (PPI) data. The topology of PPI networks is believed to have a strong link to their function. Hence, the abundance of PPI data for many organisms stimulates the development of computational techniques for the modeling, comparison, alignment, and clustering of networks. In addition, finding representative models for PPI networks will improve our understanding of the cell just as a model of gravity has helped us understand planetary motion. To decide if a model is representative, we need quantitative comparisons of model networks to real ones. However, exact network comparison is computationally intractable and therefore several heuristics have been used instead. Some of these heuristics are easily computable "network properties," such as the degree distribution, or the clustering coefficient. An important special case of network comparison is the network alignment problem. Analogous to sequence alignment, this problem asks to find the "best" mapping between regions in two networks. It is expected that network alignment might have as strong an impact on our understanding of biology as sequence alignment has had. Topology-based clustering of nodes in PPI networks is another example of an important network analysis problem that can uncover relationships between interaction patterns and phenotype. We introduce the GraphCrunch 2 software tool, which addresses these problems. It is a significant extension of GraphCrunch which implements the most popular random network models and compares them with the data networks with respect to many network properties. Also, GraphCrunch 2 implements the GRAph ALigner algorithm ("GRAAL") for purely topological network alignment. GRAAL can align any pair of networks and exposes large, dense, contiguous regions of topological and functional similarities far larger than any other existing tool. Finally, GraphCruch 2 implements an algorithm for clustering nodes within a network based solely on their topological similarities. Using GraphCrunch 2, we demonstrate that eukaryotic and viral PPI networks may belong to different graph model families and show that topology-based clustering can reveal important functional similarities between proteins within yeast and human PPI networks. GraphCrunch 2 is a software tool that implements the latest research on biological network analysis. It parallelizes computationally intensive tasks to fully utilize the potential of modern multi-core CPUs. It is open-source and freely available for research use. It runs under the Windows and Linux platforms.
Coherent lidar design and performance verification

NASA Technical Reports Server (NTRS)

Frehlich, Rod

1993-01-01

The verification of LAWS beam alignment in space can be achieved by a measurement of heterodyne efficiency using the surface return. The crucial element is a direct detection signal that can be identified for each surface return. This should be satisfied for LAWS but will not be satisfied for descoped LAWS. The performance of algorithms for velocity estimation can be described with two basic parameters: the number of coherently detected photo-electrons per estimate and the number of independent signal samples per estimate. The average error of spectral domain velocity estimation algorithms are bounded by a new periodogram Cramer-Rao Bound. Comparison of the periodogram CRB with the exact CRB indicates a factor of two improvement in velocity accuracy is possible using non-spectral domain estimators. This improvement has been demonstrated with a maximum-likelihood estimator. The comparison of velocity estimation algorithms for 2 and 10 micron coherent lidar was performed by assuming all the system design parameters are fixed and the signal statistics are dominated by a 1 m/s rms wind fluctuation over the range gate. The beam alignment requirements for 2 micron are much more severe than for a 10 micron lidar. The effects of the random backscattered field on estimating the alignment error is a major problem for space based lidar operation, especially if the heterodyne efficiency cannot be estimated. For LAWS, the biggest science payoff would result from a short transmitted pulse, on the order of 0.5 microseconds instead of 3 microseconds. The numerically errors for simulation of laser propagation in the atmosphere have been determined as a joint project with the University of California, San Diego. Useful scaling laws were obtained for Kolmogorov atmospheric refractive turbulence and an atmospheric refractive turbulence characterized with an inner scale. This permits verification of the simulation procedure which is essential for the evaluation of the effects of refractive turbulence on coherent Doppler lidar systems. The analysis of 2 micron Doppler lidar data from Coherent Technologies, Inc. (CTI) has demonstrated many of the advantages of doppler lidar measurements of boundary layer winds. The effects of wind shear and wind turbulence over the pulse volume are probably the dominant source of the reduced performance. The effects of wind shear and wind turbulence on the statistical description of doppler lidar data has been derived and calculated.

A Kalman Filter for SINS Self-Alignment Based on Vector Observation.

PubMed

Xu, Xiang; Xu, Xiaosu; Zhang, Tao; Li, Yao; Tong, Jinwu

2017-01-29

In this paper, a self-alignment method for strapdown inertial navigation systems based on the q -method is studied. In addition, an improved method based on integrating gravitational apparent motion to form apparent velocity is designed, which can reduce the random noises of the observation vectors. For further analysis, a novel self-alignment method using a Kalman filter based on adaptive filter technology is proposed, which transforms the self-alignment procedure into an attitude estimation using the observation vectors. In the proposed method, a linear psuedo-measurement equation is adopted by employing the transfer method between the quaternion and the observation vectors. Analysis and simulation indicate that the accuracy of the self-alignment is improved. Meanwhile, to improve the convergence rate of the proposed method, a new method based on parameter recognition and a reconstruction algorithm for apparent gravitation is devised, which can reduce the influence of the random noises of the observation vectors. Simulations and turntable tests are carried out, and the results indicate that the proposed method can acquire sound alignment results with lower standard variances, and can obtain higher alignment accuracy and a faster convergence rate.
Evaluation of Eight Methods for Aligning Orientation of Two Coordinate Systems.

PubMed

Mecheri, Hakim; Robert-Lachaine, Xavier; Larue, Christian; Plamondon, André

2016-08-01

The aim of this study was to evaluate eight methods for aligning the orientation of two different local coordinate systems. Alignment is very important when combining two different systems of motion analysis. Two of the methods were developed specifically for biomechanical studies, and because there have been at least three decades of algorithm development in robotics, it was decided to include six methods from this field. To compare these methods, an Xsens sensor and two Optotrak clusters were attached to a Plexiglas plate. The first optical marker cluster was fixed on the sensor and 20 trials were recorded. The error of alignment was calculated for each trial, and the mean, the standard deviation, and the maximum values of this error over all trials were reported. One-way repeated measures analysis of variance revealed that the alignment error differed significantly across the eight methods. Post-hoc tests showed that the alignment error from the methods based on angular velocities was significantly lower than for the other methods. The method using angular velocities performed the best, with an average error of 0.17 ± 0.08 deg. We therefore recommend this method, which is easy to perform and provides accurate alignment.
Global Alignment of Pairwise Protein Interaction Networks for Maximal Common Conserved Patterns

DOE PAGES

Tian, Wenhong; Samatova, Nagiza F.

2013-01-01

A number of tools for the alignment of protein-protein interaction (PPI) networks have laid the foundation for PPI network analysis. Most of alignment tools focus on finding conserved interaction regions across the PPI networks through either local or global mapping of similar sequences. Researchers are still trying to improve the speed, scalability, and accuracy of network alignment. In view of this, we introduce a connected-components based fast algorithm, HopeMap, for network alignment. Observing that the size of true orthologs across species is small comparing to the total number of proteins in all species, we take a different approach based onmore » a precompiled list of homologs identified by KO terms. Applying this approach to S. cerevisiae (yeast) and D. melanogaster (fly), E. coli K12 and S. typhimurium , E. coli K12 and C. crescenttus , we analyze all clusters identified in the alignment. The results are evaluated through up-to-date known gene annotations, gene ontology (GO), and KEGG ortholog groups (KO). Comparing to existing tools, our approach is fast with linear computational cost, highly accurate in terms of KO and GO terms specificity and sensitivity, and can be extended to multiple alignments easily.« less
A Kalman Filter for SINS Self-Alignment Based on Vector Observation

PubMed Central

Xu, Xiang; Xu, Xiaosu; Zhang, Tao; Li, Yao; Tong, Jinwu

2017-01-01

In this paper, a self-alignment method for strapdown inertial navigation systems based on the q-method is studied. In addition, an improved method based on integrating gravitational apparent motion to form apparent velocity is designed, which can reduce the random noises of the observation vectors. For further analysis, a novel self-alignment method using a Kalman filter based on adaptive filter technology is proposed, which transforms the self-alignment procedure into an attitude estimation using the observation vectors. In the proposed method, a linear psuedo-measurement equation is adopted by employing the transfer method between the quaternion and the observation vectors. Analysis and simulation indicate that the accuracy of the self-alignment is improved. Meanwhile, to improve the convergence rate of the proposed method, a new method based on parameter recognition and a reconstruction algorithm for apparent gravitation is devised, which can reduce the influence of the random noises of the observation vectors. Simulations and turntable tests are carried out, and the results indicate that the proposed method can acquire sound alignment results with lower standard variances, and can obtain higher alignment accuracy and a faster convergence rate. PMID:28146059
Quantification of Cardiomyocyte Alignment from Three-Dimensional (3D) Confocal Microscopy of Engineered Tissue.

PubMed

Kowalski, William J; Yuan, Fangping; Nakane, Takeichiro; Masumoto, Hidetoshi; Dwenger, Marc; Ye, Fei; Tinney, Joseph P; Keller, Bradley B

2017-08-01

Biological tissues have complex, three-dimensional (3D) organizations of cells and matrix factors that provide the architecture necessary to meet morphogenic and functional demands. Disordered cell alignment is associated with congenital heart disease, cardiomyopathy, and neurodegenerative diseases and repairing or replacing these tissues using engineered constructs may improve regenerative capacity. However, optimizing cell alignment within engineered tissues requires quantitative 3D data on cell orientations and both efficient and validated processing algorithms. We developed an automated method to measure local 3D orientations based on structure tensor analysis and incorporated an adaptive subregion size to account for multiple scales. Our method calculates the statistical concentration parameter, κ, to quantify alignment, as well as the traditional orientational order parameter. We validated our method using synthetic images and accurately measured principal axis and concentration. We then applied our method to confocal stacks of cleared, whole-mount engineered cardiac tissues generated from human-induced pluripotent stem cells or embryonic chick cardiac cells and quantified cardiomyocyte alignment. We found significant differences in alignment based on cellular composition and tissue geometry. These results from our synthetic images and confocal data demonstrate the efficiency and accuracy of our method to measure alignment in 3D tissues.
Detection of Unexpected High Correlations between Balance Calibration Loads and Load Residuals

NASA Technical Reports Server (NTRS)

Ulbrich, N.; Volden, T.

2014-01-01

An algorithm was developed for the assessment of strain-gage balance calibration data that makes it possible to systematically investigate potential sources of unexpected high correlations between calibration load residuals and applied calibration loads. The algorithm investigates correlations on a load series by load series basis. The linear correlation coefficient is used to quantify the correlations. It is computed for all possible pairs of calibration load residuals and applied calibration loads that can be constructed for the given balance calibration data set. An unexpected high correlation between a load residual and a load is detected if three conditions are met: (i) the absolute value of the correlation coefficient of a residual/load pair exceeds 0.95; (ii) the maximum of the absolute values of the residuals of a load series exceeds 0.25 % of the load capacity; (iii) the load component of the load series is intentionally applied. Data from a baseline calibration of a six-component force balance is used to illustrate the application of the detection algorithm to a real-world data set. This analysis also showed that the detection algorithm can identify load alignment errors as long as repeat load series are contained in the balance calibration data set that do not suffer from load alignment problems.
An Online Tilt Estimation and Compensation Algorithm for a Small Satellite Camera

NASA Astrophysics Data System (ADS)

Lee, Da-Hyun; Hwang, Jai-hyuk

2018-04-01

In the case of a satellite camera designed to execute an Earth observation mission, even after a pre-launch precision alignment process has been carried out, misalignment will occur due to external factors during the launch and in the operating environment. In particular, for high-resolution satellite cameras, which require submicron accuracy for alignment between optical components, misalignment is a major cause of image quality degradation. To compensate for this, most high-resolution satellite cameras undergo a precise realignment process called refocusing before and during the operation process. However, conventional Earth observation satellites only execute refocusing upon de-space. Thus, in this paper, an online tilt estimation and compensation algorithm that can be utilized after de-space correction is executed. Although the sensitivity of the optical performance degradation due to the misalignment is highest in de-space, the MTF can be additionally increased by correcting tilt after refocusing. The algorithm proposed in this research can be used to estimate the amount of tilt that occurs by taking star images, and it can also be used to carry out automatic tilt corrections by employing a compensation mechanism that gives angular motion to the secondary mirror. Crucially, this algorithm is developed using an online processing system so that it can operate without communication with the ground.
A Stochastic Kinematic Model of Class Averaging in Single-Particle Electron Microscopy

PubMed Central

Park, Wooram; Midgett, Charles R.; Madden, Dean R.; Chirikjian, Gregory S.

2011-01-01

Single-particle electron microscopy is an experimental technique that is used to determine the 3D structure of biological macromolecules and the complexes that they form. In general, image processing techniques and reconstruction algorithms are applied to micrographs, which are two-dimensional (2D) images taken by electron microscopes. Each of these planar images can be thought of as a projection of the macromolecular structure of interest from an a priori unknown direction. A class is defined as a collection of projection images with a high degree of similarity, presumably resulting from taking projections along similar directions. In practice, micrographs are very noisy and those in each class are aligned and averaged in order to reduce the background noise. Errors in the alignment process are inevitable due to noise in the electron micrographs. This error results in blurry averaged images. In this paper, we investigate how blurring parameters are related to the properties of the background noise in the case when the alignment is achieved by matching the mass centers and the principal axes of the experimental images. We observe that the background noise in micrographs can be treated as Gaussian. Using the mean and variance of the background Gaussian noise, we derive equations for the mean and variance of translational and rotational misalignments in the class averaging process. This defines a Gaussian probability density on the Euclidean motion group of the plane. Our formulation is validated by convolving the derived blurring function representing the stochasticity of the image alignments with the underlying noiseless projection and comparing with the original blurry image. PMID:21660125
Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions.

PubMed

Krissinel, E; Henrick, K

2004-12-01

The present paper describes the SSM algorithm of protein structure comparison in three dimensions, which includes an original procedure of matching graphs built on the protein's secondary-structure elements, followed by an iterative three-dimensional alignment of protein backbone Calpha atoms. The SSM results are compared with those obtained from other protein comparison servers, and the advantages and disadvantages of different scores that are used for structure recognition are discussed. A new score, balancing the r.m.s.d. and alignment length Nalign, is proposed. It is found that different servers agree reasonably well on the new score, while showing considerable differences in r.m.s.d. and Nalign.
Optimized smith waterman processor design for breast cancer early diagnosis

NASA Astrophysics Data System (ADS)

Nurdin, D. S.; Isa, M. N.; Ismail, R. C.; Ahmad, M. I.

2017-09-01

This paper presents an optimized design of Processing Element (PE) of Systolic Array (SA) which implements affine gap penalty Smith Waterman (SW) algorithm on the Xilinx Virtex-6 XC6VLX75T Field Programmable Gate Array (FPGA) for Deoxyribonucleic Acid (DNA) sequence alignment. The PE optimization aims to reduce PE logic resources to increase number of PEs in FPGA for higher degree of parallelism during alignment matrix computations. This is useful for aligning long DNA-based disease sequence such as Breast Cancer (BC) for early diagnosis. The optimized PE architecture has the smallest PE area with 15 slices in a PE and 776 PEs implemented in the Virtex - 6 FPGA.
Unsupervised learning of discriminative edge measures for vehicle matching between nonoverlapping cameras.

PubMed

Shan, Ying; Sawhney, Harpreet S; Kumar, Rakesh

2008-04-01

This paper proposes a novel unsupervised algorithm learning discriminative features in the context of matching road vehicles between two non-overlapping cameras. The matching problem is formulated as a same-different classification problem, which aims to compute the probability of vehicle images from two distinct cameras being from the same vehicle or different vehicle(s). We employ a novel measurement vector that consists of three independent edge-based measures and their associated robust measures computed from a pair of aligned vehicle edge maps. The weight of each measure is determined by an unsupervised learning algorithm that optimally separates the same-different classes in the combined measurement space. This is achieved with a weak classification algorithm that automatically collects representative samples from same-different classes, followed by a more discriminative classifier based on Fisher' s Linear Discriminants and Gibbs Sampling. The robustness of the match measures and the use of unsupervised discriminant analysis in the classification ensures that the proposed method performs consistently in the presence of missing/false features, temporally and spatially changing illumination conditions, and systematic misalignment caused by different camera configurations. Extensive experiments based on real data of over 200 vehicles at different times of day demonstrate promising results.
Site-directed protein recombination as a shortest-path problem.

PubMed

Endelman, Jeffrey B; Silberg, Jonathan J; Wang, Zhen-Gang; Arnold, Frances H

2004-07-01

Protein function can be tuned using laboratory evolution, in which one rapidly searches through a library of proteins for the properties of interest. In site-directed recombination, n crossovers are chosen in an alignment of p parents to define a set of p(n + 1) peptide fragments. These fragments are then assembled combinatorially to create a library of p(n+1) proteins. We have developed a computational algorithm to enrich these libraries in folded proteins while maintaining an appropriate level of diversity for evolution. For a given set of parents, our algorithm selects crossovers that minimize the average energy of the library, subject to constraints on the length of each fragment. This problem is equivalent to finding the shortest path between nodes in a network, for which the global minimum can be found efficiently. Our algorithm has a running time of O(N(3)p(2) + N(2)n) for a protein of length N. Adjusting the constraints on fragment length generates a set of optimized libraries with varying degrees of diversity. By comparing these optima for different sets of parents, we rapidly determine which parents yield the lowest energy libraries.
Iterative Most-Likely Point Registration (IMLP): A Robust Algorithm for Computing Optimal Shape Alignment

PubMed Central

Billings, Seth D.; Boctor, Emad M.; Taylor, Russell H.

2015-01-01

We present a probabilistic registration algorithm that robustly solves the problem of rigid-body alignment between two shapes with high accuracy, by aptly modeling measurement noise in each shape, whether isotropic or anisotropic. For point-cloud shapes, the probabilistic framework additionally enables modeling locally-linear surface regions in the vicinity of each point to further improve registration accuracy. The proposed Iterative Most-Likely Point (IMLP) algorithm is formed as a variant of the popular Iterative Closest Point (ICP) algorithm, which iterates between point-correspondence and point-registration steps. IMLP’s probabilistic framework is used to incorporate a generalized noise model into both the correspondence and the registration phases of the algorithm, hence its name as a most-likely point method rather than a closest-point method. To efficiently compute the most-likely correspondences, we devise a novel search strategy based on a principal direction (PD)-tree search. We also propose a new approach to solve the generalized total-least-squares (GTLS) sub-problem of the registration phase, wherein the point correspondences are registered under a generalized noise model. Our GTLS approach has improved accuracy, efficiency, and stability compared to prior methods presented for this problem and offers a straightforward implementation using standard least squares. We evaluate the performance of IMLP relative to a large number of prior algorithms including ICP, a robust variant on ICP, Generalized ICP (GICP), and Coherent Point Drift (CPD), as well as drawing close comparison with the prior anisotropic registration methods of GTLS-ICP and A-ICP. The performance of IMLP is shown to be superior with respect to these algorithms over a wide range of noise conditions, outliers, and misalignments using both mesh and point-cloud representations of various shapes. PMID:25748700
BAYESIAN PROTEIN STRUCTURE ALIGNMENT.

PubMed

Rodriguez, Abel; Schmidler, Scott C

The analysis of the three-dimensional structure of proteins is an important topic in molecular biochemistry. Structure plays a critical role in defining the function of proteins and is more strongly conserved than amino acid sequence over evolutionary timescales. A key challenge is the identification and evaluation of structural similarity between proteins; such analysis can aid in understanding the role of newly discovered proteins and help elucidate evolutionary relationships between organisms. Computational biologists have developed many clever algorithmic techniques for comparing protein structures, however, all are based on heuristic optimization criteria, making statistical interpretation somewhat difficult. Here we present a fully probabilistic framework for pairwise structural alignment of proteins. Our approach has several advantages, including the ability to capture alignment uncertainty and to estimate key "gap" parameters which critically affect the quality of the alignment. We show that several existing alignment methods arise as maximum a posteriori estimates under specific choices of prior distributions and error models. Our probabilistic framework is also easily extended to incorporate additional information, which we demonstrate by including primary sequence information to generate simultaneous sequence-structure alignments that can resolve ambiguities obtained using structure alone. This combined model also provides a natural approach for the difficult task of estimating evolutionary distance based on structural alignments. The model is illustrated by comparison with well-established methods on several challenging protein alignment examples.
Protein Identification Using Top-Down Spectra*

PubMed Central

Liu, Xiaowen; Sirotkin, Yakov; Shen, Yufeng; Anderson, Gordon; Tsai, Yihsuan S.; Ting, Ying S.; Goodlett, David R.; Smith, Richard D.; Bafna, Vineet; Pevzner, Pavel A.

2012-01-01

In the last two years, because of advances in protein separation and mass spectrometry, top-down mass spectrometry moved from analyzing single proteins to analyzing complex samples and identifying hundreds and even thousands of proteins. However, computational tools for database search of top-down spectra against protein databases are still in their infancy. We describe MS-Align+, a fast algorithm for top-down protein identification based on spectral alignment that enables searches for unexpected post-translational modifications. We also propose a method for evaluating statistical significance of top-down protein identifications and further benchmark various software tools on two top-down data sets from Saccharomyces cerevisiae and Salmonella typhimurium. We demonstrate that MS-Align+ significantly increases the number of identified spectra as compared with MASCOT and OMSSA on both data sets. Although MS-Align+ and ProSightPC have similar performance on the Salmonella typhimurium data set, MS-Align+ outperforms ProSightPC on the (more complex) Saccharomyces cerevisiae data set. PMID:22027200
Mirroring co-evolving trees in the light of their topologies.

PubMed

Hajirasouliha, Iman; Schönhuth, Alexander; de Juan, David; Valencia, Alfonso; Sahinalp, S Cenk

2012-05-01

Determining the interaction partners among protein/domain families poses hard computational problems, in particular in the presence of paralogous proteins. Available approaches aim to identify interaction partners among protein/domain families through maximizing the similarity between trimmed versions of their phylogenetic trees. Since maximization of any natural similarity score is computationally difficult, many approaches employ heuristics to evaluate the distance matrices corresponding to the tree topologies in question. In this article, we devise an efficient deterministic algorithm which directly maximizes the similarity between two leaf labeled trees with edge lengths, obtaining a score-optimal alignment of the two trees in question. Our algorithm is significantly faster than those methods based on distance matrix comparison: 1 min on a single processor versus 730 h on a supercomputer. Furthermore, we outperform the current state-of-the-art exhaustive search approach in terms of precision, while incurring acceptable losses in recall. A C implementation of the method demonstrated in this article is available at http://compbio.cs.sfu.ca/mirrort.htm
Ancient DNA sequence revealed by error-correcting codes.

PubMed

Brandão, Marcelo M; Spoladore, Larissa; Faria, Luzinete C B; Rocha, Andréa S L; Silva-Filho, Marcio C; Palazzo, Reginaldo

2015-07-10

A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code.
Ancient DNA sequence revealed by error-correcting codes

PubMed Central

Brandão, Marcelo M.; Spoladore, Larissa; Faria, Luzinete C. B.; Rocha, Andréa S. L.; Silva-Filho, Marcio C.; Palazzo, Reginaldo

2015-01-01

A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code. PMID:26159228
Biologically inspired EM image alignment and neural reconstruction.

PubMed

Knowles-Barley, Seymour; Butcher, Nancy J; Meinertzhagen, Ian A; Armstrong, J Douglas

2011-08-15

Three-dimensional reconstruction of consecutive serial-section transmission electron microscopy (ssTEM) images of neural tissue currently requires many hours of manual tracing and annotation. Several computational techniques have already been applied to ssTEM images to facilitate 3D reconstruction and ease this burden. Here, we present an alternative computational approach for ssTEM image analysis. We have used biologically inspired receptive fields as a basis for a ridge detection algorithm to identify cell membranes, synaptic contacts and mitochondria. Detected line segments are used to improve alignment between consecutive images and we have joined small segments of membrane into cell surfaces using a dynamic programming algorithm similar to the Needleman-Wunsch and Smith-Waterman DNA sequence alignment procedures. A shortest path-based approach has been used to close edges and achieve image segmentation. Partial reconstructions were automatically generated and used as a basis for semi-automatic reconstruction of neural tissue. The accuracy of partial reconstructions was evaluated and 96% of membrane could be identified at the cost of 13% false positive detections. An open-source reference implementation is available in the Supplementary information. seymour.kb@ed.ac.uk; douglas.armstrong@ed.ac.uk Supplementary data are available at Bioinformatics online.
Retention-error patterns in complex alphanumeric serial-recall tasks.

PubMed

Mathy, Fabien; Varré, Jean-Stéphane

2013-01-01

We propose a new method based on an algorithm usually dedicated to DNA sequence alignment in order to both reliably score short-term memory performance on immediate serial-recall tasks and analyse retention-error patterns. There can be considerable confusion on how performance on immediate serial list recall tasks is scored, especially when the to-be-remembered items are sampled with replacement. We discuss the utility of sequence-alignment algorithms to compare the stimuli to the participants' responses. The idea is that deletion, substitution, translocation, and insertion errors, which are typical in DNA, are also typical putative errors in short-term memory (respectively omission, confusion, permutation, and intrusion errors). We analyse four data sets in which alphanumeric lists included a few (or many) repetitions. After examining the method on two simple data sets, we show that sequence alignment offers 1) a compelling method for measuring capacity in terms of chunks when many regularities are introduced in the material (third data set) and 2) a reliable estimator of individual differences in short-term memory capacity. This study illustrates the difficulty of arriving at a good measure of short-term memory performance, and also attempts to characterise the primary factors underpinning remembering and forgetting.

Minimal-effort planning of active alignment processes for beam-shaping optics

NASA Astrophysics Data System (ADS)

Haag, Sebastian; Schranner, Matthias; Müller, Tobias; Zontar, Daniel; Schlette, Christian; Losch, Daniel; Brecher, Christian; Roßmann, Jürgen

2015-03-01

In science and industry, the alignment of beam-shaping optics is usually a manual procedure. Many industrial applications utilizing beam-shaping optical systems require more scalable production solutions and therefore effort has been invested in research regarding the automation of optics assembly. In previous works, the authors and other researchers have proven the feasibility of automated alignment of beam-shaping optics such as collimation lenses or homogenization optics. Nevertheless, the planning efforts as well as additional knowledge from the fields of automation and control required for such alignment processes are immense. This paper presents a novel approach of planning active alignment processes of beam-shaping optics with the focus of minimizing the planning efforts for active alignment. The approach utilizes optical simulation and the genetic programming paradigm from computer science for automatically extracting features from a simulated data basis with a high correlation coefficient regarding the individual degrees of freedom of alignment. The strategy is capable of finding active alignment strategies that can be executed by an automated assembly system. The paper presents a tool making the algorithm available to end-users and it discusses the results of planning the active alignment of the well-known assembly of a fast-axis collimator. The paper concludes with an outlook on the transferability to other use cases such as application specific intensity distributions which will benefit from reduced planning efforts.
Automated side-chain model building and sequence assignment by template matching.

PubMed

Terwilliger, Thomas C

2003-01-01

An algorithm is described for automated building of side chains in an electron-density map once a main-chain model is built and for alignment of the protein sequence to the map. The procedure is based on a comparison of electron density at the expected side-chain positions with electron-density templates. The templates are constructed from average amino-acid side-chain densities in 574 refined protein structures. For each contiguous segment of main chain, a matrix with entries corresponding to an estimate of the probability that each of the 20 amino acids is located at each position of the main-chain model is obtained. The probability that this segment corresponds to each possible alignment with the sequence of the protein is estimated using a Bayesian approach and high-confidence matches are kept. Once side-chain identities are determined, the most probable rotamer for each side chain is built into the model. The automated procedure has been implemented in the RESOLVE software. Combined with automated main-chain model building, the procedure produces a preliminary model suitable for refinement and extension by an experienced crystallographer.
Focus: a robust workflow for one-dimensional NMR spectral analysis.

PubMed

Alonso, Arnald; Rodríguez, Miguel A; Vinaixa, Maria; Tortosa, Raül; Correig, Xavier; Julià, Antonio; Marsal, Sara

2014-01-21

One-dimensional (1)H NMR represents one of the most commonly used analytical techniques in metabolomic studies. The increase in the number of samples analyzed as well as the technical improvements involving instrumentation and spectral acquisition demand increasingly accurate and efficient high-throughput data processing workflows. We present FOCUS, an integrated and innovative methodology that provides a complete data analysis workflow for one-dimensional NMR-based metabolomics. This tool will allow users to easily obtain a NMR peak feature matrix ready for chemometric analysis as well as metabolite identification scores for each peak that greatly simplify the biological interpretation of the results. The algorithm development has been focused on solving the critical difficulties that appear at each data processing step and that can dramatically affect the quality of the results. As well as method integration, simplicity has been one of the main objectives in FOCUS development, requiring very little user input to perform accurate peak alignment, peak picking, and metabolite identification. The new spectral alignment algorithm, RUNAS, allows peak alignment with no need of a reference spectrum, and therefore, it reduces the bias introduced by other alignment approaches. Spectral alignment has been tested against previous methodologies obtaining substantial improvements in the case of moderate or highly unaligned spectra. Metabolite identification has also been significantly improved, using the positional and correlation peak patterns in contrast to a reference metabolite panel. Furthermore, the complete workflow has been tested using NMR data sets from 60 human urine samples and 120 aqueous liver extracts, reaching a successful identification of 42 metabolites from the two data sets. The open-source software implementation of this methodology is available at http://www.urr.cat/FOCUS.
Optimized graph-based mosaicking for virtual microscopy

NASA Astrophysics Data System (ADS)

Steckhan, Dirk G.; Wittenberg, Thomas

2009-02-01

Virtual microscopy has the potential to partially replace traditional microscopy. For virtualization, the slide is scanned once by a fully automatized robotic microscope and saved digitally. Typically, such a scan results in several hundreds to thousands of fields of view. Since robotic stages have positioning errors, these fields of view have to be registered locally and globally in an additional step. In this work we propose a new global mosaicking method for the creation of virtual slides based on sub-pixel exact phase correlation for local alignment in combination with Prim's minimum spanning tree algorithm for global alignment. Our algorithm allows for a robust reproduction of the original slide even in the presence of views with little to no information content. This makes it especially suitable for the mosaicking of cervical smears. These smears often exhibit large empty areas, which do not contain enough information for common stitching approaches.
Free-viewpoint video of human actors using multiple handheld Kinects.

PubMed

Ye, Genzhi; Liu, Yebin; Deng, Yue; Hasler, Nils; Ji, Xiangyang; Dai, Qionghai; Theobalt, Christian

2013-10-01

We present an algorithm for creating free-viewpoint video of interacting humans using three handheld Kinect cameras. Our method reconstructs deforming surface geometry and temporal varying texture of humans through estimation of human poses and camera poses for every time step of the RGBZ video. Skeletal configurations and camera poses are found by solving a joint energy minimization problem, which optimizes the alignment of RGBZ data from all cameras, as well as the alignment of human shape templates to the Kinect data. The energy function is based on a combination of geometric correspondence finding, implicit scene segmentation, and correspondence finding using image features. Finally, texture recovery is achieved through jointly optimization on spatio-temporal RGB data using matrix completion. As opposed to previous methods, our algorithm succeeds on free-viewpoint video of human actors under general uncontrolled indoor scenes with potentially dynamic background, and it succeeds even if the cameras are moving.
Consistency functional map propagation for repetitive patterns

NASA Astrophysics Data System (ADS)

Wang, Hao

2017-09-01

Repetitive patterns appear frequently in both man-made and natural environments. Automatically and robustly detecting such patterns from an image is a challenging problem. We study repetitive pattern alignment by embedding segmentation cue with a functional map model. However, this model cannot tackle the repetitive patterns directly due to the large photometric and geometric variations. Thus, a consistency functional map propagation (CFMP) algorithm that extends the functional map with dynamic propagation is proposed to address this issue. This propagation model is acquired in two steps. The first one aligns the patterns from a local region, transferring segmentation functions among patterns. It can be cast as an L norm optimization problem. The latter step updates the template segmentation for the next round of pattern discovery by merging the transferred segmentation functions. Extensive experiments and comparative analyses have demonstrated an encouraging performance of the proposed algorithm in detection and segmentation of repetitive patterns.
Genetic Algorithm Phase Retrieval for the Systematic Image-Based Optical Alignment Testbed

NASA Technical Reports Server (NTRS)

Taylor, Jaime; Rakoczy, John; Steincamp, James

2003-01-01

Phase retrieval requires calculation of the real-valued phase of the pupil fimction from the image intensity distribution and characteristics of an optical system. Genetic 'algorithms were used to solve two one-dimensional phase retrieval problem. A GA successfully estimated the coefficients of a polynomial expansion of the phase when the number of coefficients was correctly specified. A GA also successfully estimated the multiple p h e s of a segmented optical system analogous to the seven-mirror Systematic Image-Based Optical Alignment (SIBOA) testbed located at NASA s Marshall Space Flight Center. The SIBOA testbed was developed to investigate phase retrieval techniques. Tiphilt and piston motions of the mirrors accomplish phase corrections. A constant phase over each mirror can be achieved by an independent tip/tilt correction: the phase Conection term can then be factored out of the Discrete Fourier Tranform (DFT), greatly reducing computations.
AGScan: a pluggable microarray image quantification software based on the ImageJ library.

PubMed

Cathelin, R; Lopez, F; Klopp, Ch

2007-01-15

Many different programs are available to analyze microarray images. Most programs are commercial packages, some are free. In the latter group only few propose automatic grid alignment and batch mode. More often than not a program implements only one quantification algorithm. AGScan is an open source program that works on all major platforms. It is based on the ImageJ library [Rasband (1997-2006)] and offers a plug-in extension system to add new functions to manipulate images, align grid and quantify spots. It is appropriate for daily laboratory use and also as a framework for new algorithms. The program is freely distributed under X11 Licence. The install instructions can be found in the user manual. The software can be downloaded from http://mulcyber.toulouse.inra.fr/projects/agscan/. The questions and plug-ins can be sent to the contact listed below.
ProBiS-database: precalculated binding site similarities and local pairwise alignments of PDB structures.

PubMed

Konc, Janez; Cesnik, Tomo; Konc, Joanna Trykowska; Penca, Matej; Janežič, Dušanka

2012-02-27

ProBiS-Database is a searchable repository of precalculated local structural alignments in proteins detected by the ProBiS algorithm in the Protein Data Bank. Identification of functionally important binding regions of the protein is facilitated by structural similarity scores mapped to the query protein structure. PDB structures that have been aligned with a query protein may be rapidly retrieved from the ProBiS-Database, which is thus able to generate hypotheses concerning the roles of uncharacterized proteins. Presented with uncharacterized protein structure, ProBiS-Database can discern relationships between such a query protein and other better known proteins in the PDB. Fast access and a user-friendly graphical interface promote easy exploration of this database of over 420 million local structural alignments. The ProBiS-Database is updated weekly and is freely available online at http://probis.cmm.ki.si/database.
Directed branch growth in aligned nanowire arrays.

PubMed

Beaudry, Allan L; LaForge, Joshua M; Tucker, Ryan T; Sorge, Jason B; Adamski, Nicholas L; Li, Peng; Taschuk, Michael T; Brett, Michael J

2014-01-01

Branch growth is directed along two, three, or four in-plane directions in vertically aligned nanowire arrays using vapor-liquid-solid glancing angle deposition (VLS-GLAD) flux engineering. In this work, a dynamically controlled collimated vapor flux guides branch placement during the self-catalyzed epitaxial growth of branched indium tin oxide nanowire arrays. The flux is positioned to grow branches on select nanowire facets, enabling fabrication of aligned nanotree arrays with L-, T-, or X-branching. In addition, a flux motion algorithm is designed to selectively elongate branches along one in-plane axis. Nanotrees are found to be aligned across large areas by X-ray diffraction pole figure analysis and through branch length and orientation measurements collected over 140 μm(2) from scanning electron microscopy images for each array. The pathway to guided assembly of nanowire architectures with controlled interconnectivity in three-dimensions using VLS-GLAD is discussed.
GATA: A graphic alignment tool for comparative sequenceanalysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nix, David A.; Eisen, Michael B.

2005-01-01

Several problems exist with current methods used to align DNA sequences for comparative sequence analysis. Most dynamic programming algorithms assume that conserved sequence elements are collinear. This assumption appears valid when comparing orthologous protein coding sequences. Functional constraints on proteins provide strong selective pressure against sequence inversions, and minimize sequence duplications and feature shuffling. For non-coding sequences this collinearity assumption is often invalid. For example, enhancers contain clusters of transcription factor binding sites that change in number, orientation, and spacing during evolution yet the enhancer retains its activity. Dotplot analysis is often used to estimate non-coding sequence relatedness. Yet dotmore » plots do not actually align sequences and thus cannot account well for base insertions or deletions. Moreover, they lack an adequate statistical framework for comparing sequence relatedness and are limited to pairwise comparisons. Lastly, dot plots and dynamic programming text outputs fail to provide an intuitive means for visualizing DNA alignments.« less
RBT-GA: a novel metaheuristic for solving the Multiple Sequence Alignment problem.

PubMed

Taheri, Javid; Zomaya, Albert Y

2009-07-07

Multiple Sequence Alignment (MSA) has always been an active area of research in Bioinformatics. MSA is mainly focused on discovering biologically meaningful relationships among different sequences or proteins in order to investigate the underlying main characteristics/functions. This information is also used to generate phylogenetic trees. This paper presents a novel approach, namely RBT-GA, to solve the MSA problem using a hybrid solution methodology combining the Rubber Band Technique (RBT) and the Genetic Algorithm (GA) metaheuristic. RBT is inspired by the behavior of an elastic Rubber Band (RB) on a plate with several poles, which is analogues to locations in the input sequences that could potentially be biologically related. A GA attempts to mimic the evolutionary processes of life in order to locate optimal solutions in an often very complex landscape. RBT-GA is a population based optimization algorithm designed to find the optimal alignment for a set of input protein sequences. In this novel technique, each alignment answer is modeled as a chromosome consisting of several poles in the RBT framework. These poles resemble locations in the input sequences that are most likely to be correlated and/or biologically related. A GA-based optimization process improves these chromosomes gradually yielding a set of mostly optimal answers for the MSA problem. RBT-GA is tested with one of the well-known benchmarks suites (BALiBASE 2.0) in this area. The obtained results show that the superiority of the proposed technique even in the case of formidable sequences.
Automatic construction of patient-specific finite-element mesh of the spine from IVDs and vertebra segmentations

NASA Astrophysics Data System (ADS)

Castro-Mateos, Isaac; Pozo, Jose M.; Lazary, Aron; Frangi, Alejandro F.

2016-03-01

Computational medicine aims at developing patient-specific models to help physicians in the diagnosis and treatment selection for patients. The spine, and other skeletal structures, is an articulated object, composed of rigid bones (vertebrae) and non-rigid parts (intervertebral discs (IVD), ligaments and muscles). These components are usually extracted from different image modalities, involving patient repositioning. In the case of the spine, these models require the segmentation of IVDs from MR and vertebrae from CT. In the literature, there exists a vast selection of segmentations methods, but there is a lack of approaches to align the vertebrae and IVDs. This paper presents a method to create patient-specific finite element meshes for biomechanical simulations, integrating rigid and non-rigid parts of articulated objects. First, the different parts are aligned in a complete surface model. Vertebrae extracted from CT are rigidly repositioned in between the IVDs, initially using the IVDs location and then refining the alignment using the MR image with a rigid active shape model algorithm. Finally, a mesh morphing algorithm, based on B-splines, is employed to map a template finite-element (volumetric) mesh to the patient-specific surface mesh. This morphing reduces possible misalignments and guarantees the convexity of the model elements. Results show that the accuracy of the method to align vertebrae into MR, together with IVDs, is similar to that of the human observers. Thus, this method is a step forward towards the automation of patient-specific finite element models for biomechanical simulations.
An algorithm for longitudinal registration of PET/CT images acquired during neoadjuvant chemotherapy in breast cancer: preliminary results.

PubMed

Li, Xia; Abramson, Richard G; Arlinghaus, Lori R; Chakravarthy, Anuradha Bapsi; Abramson, Vandana; Mayer, Ingrid; Farley, Jaime; Delbeke, Dominique; Yankeelov, Thomas E

2012-11-16

By providing estimates of tumor glucose metabolism, 18F-fluorodeoxyglucose positron emission tomography (FDG-PET) can potentially characterize the response of breast tumors to treatment. To assess therapy response, serial measurements of FDG-PET parameters (derived from static and/or dynamic images) can be obtained at different time points during the course of treatment. However, most studies track the changes in average parameter values obtained from the whole tumor, thereby discarding all spatial information manifested in tumor heterogeneity. Here, we propose a method whereby serially acquired FDG-PET breast data sets can be spatially co-registered to enable the spatial comparison of parameter maps at the voxel level. The goal is to optimally register normal tissues while simultaneously preventing tumor distortion. In order to accomplish this, we constructed a PET support device to enable PET/CT imaging of the breasts of ten patients in the prone position and applied a mutual information-based rigid body registration followed by a non-rigid registration. The non-rigid registration algorithm extended the adaptive bases algorithm (ABA) by incorporating a tumor volume-preserving constraint, which computed the Jacobian determinant over the tumor regions as outlined on the PET/CT images, into the cost function. We tested this approach on ten breast cancer patients undergoing neoadjuvant chemotherapy. By both qualitative and quantitative evaluation, our constrained algorithm yielded significantly less tumor distortion than the unconstrained algorithm: considering the tumor volume determined from standard uptake value maps, the post-registration median tumor volume changes, and the 25th and 75th quantiles were 3.42% (0%, 13.39%) and 16.93% (9.21%, 49.93%) for the constrained and unconstrained algorithms, respectively (p = 0.002), while the bending energy (a measure of the smoothness of the deformation) was 0.0015 (0.0005, 0.012) and 0.017 (0.005, 0.044), respectively (p = 0.005). The results indicate that the constrained ABA algorithm can accurately align prone breast FDG-PET images acquired at different time points while keeping the tumor from being substantially compressed or distorted. NCT00474604.
Alignment algorithms and per-particle CTF correction for single particle cryo-electron tomography.

PubMed

Galaz-Montoya, Jesús G; Hecksel, Corey W; Baldwin, Philip R; Wang, Eryu; Weaver, Scott C; Schmid, Michael F; Ludtke, Steven J; Chiu, Wah

2016-06-01

Single particle cryo-electron tomography (cryoSPT) extracts features from cryo-electron tomograms, followed by 3D classification, alignment and averaging to generate improved 3D density maps of such features. Robust methods to correct for the contrast transfer function (CTF) of the electron microscope are necessary for cryoSPT to reach its resolution potential. Many factors can make CTF correction for cryoSPT challenging, such as lack of eucentricity of the specimen stage, inherent low dose per image, specimen charging, beam-induced specimen motions, and defocus gradients resulting both from specimen tilting and from unpredictable ice thickness variations. Current CTF correction methods for cryoET make at least one of the following assumptions: that the defocus at the center of the image is the same across the images of a tiltseries, that the particles all lie at the same Z-height in the embedding ice, and/or that the specimen, the cryo-electron microscopy (cryoEM) grid and/or the carbon support are flat. These experimental conditions are not always met. We have developed a CTF correction algorithm for cryoSPT without making any of the aforementioned assumptions. We also introduce speed and accuracy improvements and a higher degree of automation to the subtomogram averaging algorithms available in EMAN2. Using motion-corrected images of isolated virus particles as a benchmark specimen, recorded with a DE20 direct detection camera, we show that our CTF correction and subtomogram alignment routines can yield subtomogram averages close to 4/5 Nyquist frequency of the detector under our experimental conditions. Copyright © 2016 Elsevier Inc. All rights reserved.
Alignment Algorithms and Per-Particle CTF Correction for Single Particle Cryo-Electron Tomography

PubMed Central

Galaz-Montoya, Jesús G.; Hecksel, Corey W.; Baldwin, Philip R.; Wang, Eryu; Weaver, Scott C.; Schmid, Michael F.; Ludtke, Steven J.; Chiu, Wah

2016-01-01

Single particle cryo-electron tomography (cryoSPT) extracts features from cryo-electron tomograms, followed by 3D classification, alignment and averaging to generate improved 3D density maps of such features. Robust methods to correct for the contrast transfer function (CTF) of the electron microscope are necessary for cryoSPT to reach its resolution potential. Many factors can make CTF correction for cryoSPT challenging, such as lack of eucentricity of the specimen stage, inherent low dose per image, specimen charging, beam-induced specimen motions, and defocus gradients resulting both from specimen tilting and from unpredictable ice thickness variations. Current CTF correction methods for cryoET make at least one of the following assumptions: that the defocus at the center of the image is the same across the images of a tiltseries, that the particles all lie at the same Z-height in the embedding ice, and/or that the specimen grid and carbon support are flat. These experimental conditions are not always met. We have developed a CTF correction algorithm for cryoSPT without making any of the aforementioned assumptions. We also introduce speed and accuracy improvements and a higher degree of automation to the subtomogram averaging algorithms available in EMAN2. Using motion-corrected images of isolated virus particles as a benchmark specimen, recorded with a DE20 direct detection camera, we show that our CTF correction and subtomogram alignment routines can yield subtomogram averages close to 4/5 Nyquist frequency of the detector under our experimental conditions. PMID:27016284
HBLAST: Parallelised sequence similarity--A Hadoop MapReducable basic local alignment search tool.

PubMed

O'Driscoll, Aisling; Belogrudov, Vladislav; Carroll, John; Kropp, Kai; Walsh, Paul; Ghazal, Peter; Sleator, Roy D

2015-04-01

The recent exponential growth of genomic databases has resulted in the common task of sequence alignment becoming one of the major bottlenecks in the field of computational biology. It is typical for these large datasets and complex computations to require cost prohibitive High Performance Computing (HPC) to function. As such, parallelised solutions have been proposed but many exhibit scalability limitations and are incapable of effectively processing "Big Data" - the name attributed to datasets that are extremely large, complex and require rapid processing. The Hadoop framework, comprised of distributed storage and a parallelised programming framework known as MapReduce, is specifically designed to work with such datasets but it is not trivial to efficiently redesign and implement bioinformatics algorithms according to this paradigm. The parallelisation strategy of "divide and conquer" for alignment algorithms can be applied to both data sets and input query sequences. However, scalability is still an issue due to memory constraints or large databases, with very large database segmentation leading to additional performance decline. Herein, we present Hadoop Blast (HBlast), a parallelised BLAST algorithm that proposes a flexible method to partition both databases and input query sequences using "virtual partitioning". HBlast presents improved scalability over existing solutions and well balanced computational work load while keeping database segmentation and recompilation to a minimum. Enhanced BLAST search performance on cheap memory constrained hardware has significant implications for in field clinical diagnostic testing; enabling faster and more accurate identification of pathogenic DNA in human blood or tissue samples. Copyright © 2015 Elsevier Inc. All rights reserved.
Genetically improved BarraCUDA.

PubMed

Langdon, W B; Lam, Brian Yee Hong

2017-01-01

BarraCUDA is an open source C program which uses the BWA algorithm in parallel with nVidia CUDA to align short next generation DNA sequences against a reference genome. Recently its source code was optimised using "Genetic Improvement". The genetically improved (GI) code is up to three times faster on short paired end reads from The 1000 Genomes Project and 60% more accurate on a short BioPlanet.com GCAT alignment benchmark. GPGPU BarraCUDA running on a single K80 Tesla GPU can align short paired end nextGen sequences up to ten times faster than bwa on a 12 core server. The speed up was such that the GI version was adopted and has been regularly downloaded from SourceForge for more than 12 months.
mRAISE: an alternative algorithmic approach to ligand-based virtual screening

NASA Astrophysics Data System (ADS)

von Behren, Mathias M.; Bietz, Stefan; Nittinger, Eva; Rarey, Matthias

2016-08-01

Ligand-based virtual screening is a well established method to find new lead molecules in todays drug discovery process. In order to be applicable in day to day practice, such methods have to face multiple challenges. The most important part is the reliability of the results, which can be shown and compared in retrospective studies. Furthermore, in the case of 3D methods, they need to provide biologically relevant molecular alignments of the ligands, that can be further investigated by a medicinal chemist. Last but not least, they have to be able to screen large databases in reasonable time. Many algorithms for ligand-based virtual screening have been proposed in the past, most of them based on pairwise comparisons. Here, a new method is introduced called mRAISE. Based on structural alignments, it uses a descriptor-based bitmap search engine (RAISE) to achieve efficiency. Alignments created on the fly by the search engine get evaluated with an independent shape-based scoring function also used for ranking of compounds. The correct ranking as well as the alignment quality of the method are evaluated and compared to other state of the art methods. On the commonly used Directory of Useful Decoys dataset mRAISE achieves an average area under the ROC curve of 0.76, an average enrichment factor at 1 % of 20.2 and an average hit rate at 1 % of 55.5. With these results, mRAISE is always among the top performing methods with available data for comparison. To access the quality of the alignments calculated by ligand-based virtual screening methods, we introduce a new dataset containing 180 prealigned ligands for 11 diverse targets. Within the top ten ranked conformations, the alignment closest to X-ray structure calculated with mRAISE has a root-mean-square deviation of less than 2.0 Å for 80.8 % of alignment pairs and achieves a median of less than 2.0 Å for eight of the 11 cases. The dataset used to rate the quality of the calculated alignments is freely available at http://www.zbh.uni-hamburg.de/mraise-dataset.html. The table of all PDB codes contained in the ensembles can be found in the supplementary material. The software tool mRAISE is freely available for evaluation purposes and academic use (see http://www.zbh.uni-hamburg.de/raise).
Self-learning computers for surgical planning and prediction of postoperative alignment.

PubMed

Lafage, Renaud; Pesenti, Sébastien; Lafage, Virginie; Schwab, Frank J

2018-02-01

In past decades, the role of sagittal alignment has been widely demonstrated in the setting of spinal conditions. As several parameters can be affected, identifying the driver of the deformity is the cornerstone of a successful treatment approach. Despite the importance of restoring sagittal alignment for optimizing outcome, this task remains challenging. Self-learning computers and optimized algorithms are of great interest in spine surgery as in that they facilitate better planning and prediction of postoperative alignment. Nowadays, computer-assisted tools are part of surgeons' daily practice; however, the use of such tools remains to be time-consuming. NARRATIVE REVIEW AND RESULTS: Computer-assisted methods for the prediction of postoperative alignment consist of a three step analysis: identification of anatomical landmark, definition of alignment objectives, and simulation of surgery. Recently, complex rules for the prediction of alignment have been proposed. Even though this kind of work leads to more personalized objectives, the number of parameters involved renders it difficult for clinical use, stressing the importance of developing computer-assisted tools. The evolution of our current technology, including machine learning and other types of advanced algorithms, will provide powerful tools that could be useful in improving surgical outcomes and alignment prediction. These tools can combine different types of advanced technologies, such as image recognition and shape modeling, and using this technique, computer-assisted methods are able to predict spinal shape. The development of powerful computer-assisted methods involves the integration of several sources of information such as radiographic parameters (X-rays, MRI, CT scan, etc.), demographic information, and unusual non-osseous parameters (muscle quality, proprioception, gait analysis data). In using a larger set of data, these methods will aim to mimic what is actually done by spine surgeons, leading to real tailor-made solutions. Integrating newer technology can change the current way of planning/simulating surgery. The use of powerful computer-assisted tools that are able to integrate several parameters and learn from experience can change the traditional way of selecting treatment pathways and counseling patients. However, there is still much work to be done to reach a desired level as noted in other orthopedic fields, such as hip surgery. Many of these tools already exist in non-medical fields and their adaptation to spine surgery is of considerable interest.

Ab-initio conformational epitope structure prediction using genetic algorithm and SVM for vaccine design.

PubMed

Moghram, Basem Ameen; Nabil, Emad; Badr, Amr

2018-01-01

T-cell epitope structure identification is a significant challenging immunoinformatic problem within epitope-based vaccine design. Epitopes or antigenic peptides are a set of amino acids that bind with the Major Histocompatibility Complex (MHC) molecules. The aim of this process is presented by Antigen Presenting Cells to be inspected by T-cells. MHC-molecule-binding epitopes are responsible for triggering the immune response to antigens. The epitope's three-dimensional (3D) molecular structure (i.e., tertiary structure) reflects its proper function. Therefore, the identification of MHC class-II epitopes structure is a significant step towards epitope-based vaccine design and understanding of the immune system. In this paper, we propose a new technique using a Genetic Algorithm for Predicting the Epitope Structure (GAPES), to predict the structure of MHC class-II epitopes based on their sequence. The proposed Elitist-based genetic algorithm for predicting the epitope's tertiary structure is based on Ab-Initio Empirical Conformational Energy Program for Peptides (ECEPP) Force Field Model. The developed secondary structure prediction technique relies on Ramachandran Plot. We used two alignment algorithms: the ROSS alignment and TM-Score alignment. We applied four different alignment approaches to calculate the similarity scores of the dataset under test. We utilized the support vector machine (SVM) classifier as an evaluation of the prediction performance. The prediction accuracy and the Area Under Receiver Operating Characteristic (ROC) Curve (AUC) were calculated as measures of performance. The calculations are performed on twelve similarity-reduced datasets of the Immune Epitope Data Base (IEDB) and a large dataset of peptide-binding affinities to HLA-DRB1*0101. The results showed that GAPES was reliable and very accurate. We achieved an average prediction accuracy of 93.50% and an average AUC of 0.974 in the IEDB dataset. Also, we achieved an accuracy of 95.125% and an AUC of 0.987 on the HLA-DRB1*0101 allele of the Wang benchmark dataset. The results indicate that the proposed prediction technique "GAPES" is a promising technique that will help researchers and scientists to predict the protein structure and it will assist them in the intelligent design of new epitope-based vaccines. Copyright © 2017 Elsevier B.V. All rights reserved.
Integrative network alignment reveals large regions of global network similarity in yeast and human.

PubMed

Kuchaiev, Oleksii; Przulj, Natasa

2011-05-15

High-throughput methods for detecting molecular interactions have produced large sets of biological network data with much more yet to come. Analogous to sequence alignment, efficient and reliable network alignment methods are expected to improve our understanding of biological systems. Unlike sequence alignment, network alignment is computationally intractable. Hence, devising efficient network alignment heuristics is currently a foremost challenge in computational biology. We introduce a novel network alignment algorithm, called Matching-based Integrative GRAph ALigner (MI-GRAAL), which can integrate any number and type of similarity measures between network nodes (e.g. proteins), including, but not limited to, any topological network similarity measure, sequence similarity, functional similarity and structural similarity. Hence, we resolve the ties in similarity measures and find a combination of similarity measures yielding the largest contiguous (i.e. connected) and biologically sound alignments. MI-GRAAL exposes the largest functional, connected regions of protein-protein interaction (PPI) network similarity to date: surprisingly, it reveals that 77.7% of proteins in the baker's yeast high-confidence PPI network participate in such a subnetwork that is fully contained in the human high-confidence PPI network. This is the first demonstration that species as diverse as yeast and human contain so large, continuous regions of global network similarity. We apply MI-GRAAL's alignments to predict functions of un-annotated proteins in yeast, human and bacteria validating our predictions in the literature. Furthermore, using network alignment scores for PPI networks of different herpes viruses, we reconstruct their phylogenetic relationship. This is the first time that phylogeny is exactly reconstructed from purely topological alignments of PPI networks. Supplementary files and MI-GRAAL executables: http://bio-nets.doc.ic.ac.uk/MI-GRAAL/.
PARTS: Probabilistic Alignment for RNA joinT Secondary structure prediction

PubMed Central

Harmanci, Arif Ozgun; Sharma, Gaurav; Mathews, David H.

2008-01-01

A novel method is presented for joint prediction of alignment and common secondary structures of two RNA sequences. The joint consideration of common secondary structures and alignment is accomplished by structural alignment over a search space defined by the newly introduced motif called matched helical regions. The matched helical region formulation generalizes previously employed constraints for structural alignment and thereby better accommodates the structural variability within RNA families. A probabilistic model based on pseudo free energies obtained from precomputed base pairing and alignment probabilities is utilized for scoring structural alignments. Maximum a posteriori (MAP) common secondary structures, sequence alignment and joint posterior probabilities of base pairing are obtained from the model via a dynamic programming algorithm called PARTS. The advantage of the more general structural alignment of PARTS is seen in secondary structure predictions for the RNase P family. For this family, the PARTS MAP predictions of secondary structures and alignment perform significantly better than prior methods that utilize a more restrictive structural alignment model. For the tRNA and 5S rRNA families, the richer structural alignment model of PARTS does not offer a benefit and the method therefore performs comparably with existing alternatives. For all RNA families studied, the posterior probability estimates obtained from PARTS offer an improvement over posterior probability estimates from a single sequence prediction. When considering the base pairings predicted over a threshold value of confidence, the combination of sensitivity and positive predictive value is superior for PARTS than for the single sequence prediction. PARTS source code is available for download under the GNU public license at http://rna.urmc.rochester.edu. PMID:18304945
Avoiding hard chromatographic segmentation: A moving window approach for the automated resolution of gas chromatography-mass spectrometry-based metabolomics signals by multivariate methods.

PubMed

Domingo-Almenara, Xavier; Perera, Alexandre; Brezmes, Jesus

2016-11-25

Gas chromatography-mass spectrometry (GC-MS) produces large and complex datasets characterized by co-eluted compounds and at trace levels, and with a distinct compound ion-redundancy as a result of the high fragmentation by the electron impact ionization. Compounds in GC-MS can be resolved by taking advantage of the multivariate nature of GC-MS data by applying multivariate resolution methods. However, multivariate methods have to be applied in small regions of the chromatogram, and therefore chromatograms are segmented prior to the application of the algorithms. The automation of this segmentation process is a challenging task as it implies separating between informative data and noise from the chromatogram. This study demonstrates the capabilities of independent component analysis-orthogonal signal deconvolution (ICA-OSD) and multivariate curve resolution-alternating least squares (MCR-ALS) with an overlapping moving window implementation to avoid the typical hard chromatographic segmentation. Also, after being resolved, compounds are aligned across samples by an automated alignment algorithm. We evaluated the proposed methods through a quantitative analysis of GC-qTOF MS data from 25 serum samples. The quantitative performance of both moving window ICA-OSD and MCR-ALS-based implementations was compared with the quantification of 33 compounds by the XCMS package. Results shown that most of the R 2 coefficients of determination exhibited a high correlation (R 2 >0.90) in both ICA-OSD and MCR-ALS moving window-based approaches. Copyright © 2016 Elsevier B.V. All rights reserved.
Markov random field based automatic image alignment for electron tomography.

PubMed

Amat, Fernando; Moussavi, Farshid; Comolli, Luis R; Elidan, Gal; Downing, Kenneth H; Horowitz, Mark

2008-03-01

We present a method for automatic full-precision alignment of the images in a tomographic tilt series. Full-precision automatic alignment of cryo electron microscopy images has remained a difficult challenge to date, due to the limited electron dose and low image contrast. These facts lead to poor signal to noise ratio (SNR) in the images, which causes automatic feature trackers to generate errors, even with high contrast gold particles as fiducial features. To enable fully automatic alignment for full-precision reconstructions, we frame the problem probabilistically as finding the most likely particle tracks given a set of noisy images, using contextual information to make the solution more robust to the noise in each image. To solve this maximum likelihood problem, we use Markov Random Fields (MRF) to establish the correspondence of features in alignment and robust optimization for projection model estimation. The resulting algorithm, called Robust Alignment and Projection Estimation for Tomographic Reconstruction, or RAPTOR, has not needed any manual intervention for the difficult datasets we have tried, and has provided sub-pixel alignment that is as good as the manual approach by an expert user. We are able to automatically map complete and partial marker trajectories and thus obtain highly accurate image alignment. Our method has been applied to challenging cryo electron tomographic datasets with low SNR from intact bacterial cells, as well as several plastic section and X-ray datasets.
Alignment of sensor arrays in optical instruments using a geometric approach.

PubMed

Sawyer, Travis W

2018-02-01

Alignment of sensor arrays in optical instruments is critical to maximize the instrument's performance. While many commercial systems use standardized mounting threads for alignment, custom systems require specialized equipment and alignment procedures. These alignment procedures can be time-consuming, dependent on operator experience, and have low repeatability. Furthermore, each alignment solution must be considered on a case-by-case basis, leading to additional time and resource cost. Here I present a method to align a sensor array using geometric analysis. By imaging a grid pattern of dots, I show that it is possible to calculate the misalignment for a sensor in five degrees of freedom simultaneously. I first test the approach by simulating different cases of misalignment using Zemax before applying the method to experimentally acquired data of sensor misalignment for an echelle spectrograph. The results show that the algorithm effectively quantifies misalignment in five degrees of freedom for an F/5 imaging system, accurate to within ±0.87 deg in rotation and ±0.86 μm in translation. Furthermore, the results suggest that the method can also be applied to non-imaging systems with a small penalty to precision. This general approach can potentially improve the alignment of sensor arrays in custom instruments by offering an accurate, quantitative approach to calculating misalignment in five degrees of freedom simultaneously.
Fast and accurate reference-free alignment of subtomograms.

PubMed

Chen, Yuxiang; Pfeffer, Stefan; Hrabe, Thomas; Schuller, Jan Michael; Förster, Friedrich

2013-06-01

In cryoelectron tomography alignment and averaging of subtomograms, each dnepicting the same macromolecule, improves the resolution compared to the individual subtomogram. Major challenges of subtomogram alignment are noise enhancement due to overfitting, the bias of an initial reference in the iterative alignment process, and the computational cost of processing increasingly large amounts of data. Here, we propose an efficient and accurate alignment algorithm via a generalized convolution theorem, which allows computation of a constrained correlation function using spherical harmonics. This formulation increases computational speed of rotational matching dramatically compared to rotation search in Cartesian space without sacrificing accuracy in contrast to other spherical harmonic based approaches. Using this sampling method, a reference-free alignment procedure is proposed to tackle reference bias and overfitting, which also includes contrast transfer function correction by Wiener filtering. Application of the method to simulated data allowed us to obtain resolutions near the ground truth. For two experimental datasets, ribosomes from yeast lysate and purified 20S proteasomes, we achieved reconstructions of approximately 20Å and 16Å, respectively. The software is ready-to-use and made public to the community. Copyright © 2013 Elsevier Inc. All rights reserved.
Floating shock fitting via Lagrangian adaptive meshes

NASA Technical Reports Server (NTRS)

Vanrosendale, John

1995-01-01

In recent work we have formulated a new approach to compressible flow simulation, combining the advantages of shock-fitting and shock-capturing. Using a cell-centered on Roe scheme discretization on unstructured meshes, we warp the mesh while marching to steady state, so that mesh edges align with shocks and other discontinuities. This new algorithm, the Shock-fitting Lagrangian Adaptive Method (SLAM), is, in effect, a reliable shock-capturing algorithm which yields shock-fitted accuracy at convergence.
Use of multiresolution wavelet feature pyramids for automatic registration of multisensor imagery

NASA Technical Reports Server (NTRS)

Zavorin, Ilya; Le Moigne, Jacqueline

2005-01-01

The problem of image registration, or the alignment of two or more images representing the same scene or object, has to be addressed in various disciplines that employ digital imaging. In the area of remote sensing, just like in medical imaging or computer vision, it is necessary to design robust, fast, and widely applicable algorithms that would allow automatic registration of images generated by various imaging platforms at the same or different times and that would provide subpixel accuracy. One of the main issues that needs to be addressed when developing a registration algorithm is what type of information should be extracted from the images being registered, to be used in the search for the geometric transformation that best aligns them. The main objective of this paper is to evaluate several wavelet pyramids that may be used both for invariant feature extraction and for representing images at multiple spatial resolutions to accelerate registration. We find that the bandpass wavelets obtained from the steerable pyramid due to Simoncelli performs best in terms of accuracy and consistency, while the low-pass wavelets obtained from the same pyramid give the best results in terms of the radius of convergence. Based on these findings, we propose a modification of a gradient-based registration algorithm that has recently been developed for medical data. We test the modified algorithm on several sets of real and synthetic satellite imagery.
Use of Multi-Resolution Wavelet Feature Pyramids for Automatic Registration of Multi-Sensor Imagery

NASA Technical Reports Server (NTRS)

Zavorin, Ilya; LeMoigne, Jacqueline

2003-01-01

The problem of image registration, or alignment of two or more images representing the same scene or object, has to be addressed in various disciplines that employ digital imaging. In the area of remote sensing, just like in medical imaging or computer vision, it is necessary to design robust, fast and widely applicable algorithms that would allow automatic registration of images generated by various imaging platforms at the same or different times, and that would provide sub-pixel accuracy. One of the main issues that needs to be addressed when developing a registration algorithm is what type of information should be extracted from the images being registered, to be used in the search for the geometric transformation that best aligns them. The main objective of this paper is to evaluate several wavelet pyramids that may be used both for invariant feature extraction and for representing images at multiple spatial resolutions to accelerate registration. We find that the band-pass wavelets obtained from the Steerable Pyramid due to Simoncelli perform better than two types of low-pass pyramids when the images being registered have relatively small amount of nonlinear radiometric variations between them. Based on these findings, we propose a modification of a gradient-based registration algorithm that has recently been developed for medical data. We test the modified algorithm on several sets of real and synthetic satellite imagery.
Strawberry: Fast and accurate genome-guided transcript reconstruction and quantification from RNA-Seq.

PubMed

Liu, Ruolin; Dickerson, Julie

2017-11-01

We propose a novel method and software tool, Strawberry, for transcript reconstruction and quantification from RNA-Seq data under the guidance of genome alignment and independent of gene annotation. Strawberry consists of two modules: assembly and quantification. The novelty of Strawberry is that the two modules use different optimization frameworks but utilize the same data graph structure, which allows a highly efficient, expandable and accurate algorithm for dealing large data. The assembly module parses aligned reads into splicing graphs, and uses network flow algorithms to select the most likely transcripts. The quantification module uses a latent class model to assign read counts from the nodes of splicing graphs to transcripts. Strawberry simultaneously estimates the transcript abundances and corrects for sequencing bias through an EM algorithm. Based on simulations, Strawberry outperforms Cufflinks and StringTie in terms of both assembly and quantification accuracies. Under the evaluation of a real data set, the estimated transcript expression by Strawberry has the highest correlation with Nanostring probe counts, an independent experiment measure for transcript expression. Strawberry is written in C++14, and is available as open source software at https://github.com/ruolin/strawberry under the MIT license.
Cloud-Coffee: implementation of a parallel consistency-based multiple alignment algorithm in the T-Coffee package and its benchmarking on the Amazon Elastic-Cloud.

PubMed

Di Tommaso, Paolo; Orobitg, Miquel; Guirado, Fernando; Cores, Fernado; Espinosa, Toni; Notredame, Cedric

2010-08-01

We present the first parallel implementation of the T-Coffee consistency-based multiple aligner. We benchmark it on the Amazon Elastic Cloud (EC2) and show that the parallelization procedure is reasonably effective. We also conclude that for a web server with moderate usage (10K hits/month) the cloud provides a cost-effective alternative to in-house deployment. T-Coffee is a freeware open source package available from http://www.tcoffee.org/homepage.html
Iterative Magnetometer Calibration

NASA Technical Reports Server (NTRS)

Sedlak, Joseph

2006-01-01

This paper presents an iterative method for three-axis magnetometer (TAM) calibration that makes use of three existing utilities recently incorporated into the attitude ground support system used at NASA's Goddard Space Flight Center. The method combines attitude-independent and attitude-dependent calibration algorithms with a new spinning spacecraft Kalman filter to solve for biases, scale factors, nonorthogonal corrections to the alignment, and the orthogonal sensor alignment. The method is particularly well-suited to spin-stabilized spacecraft, but may also be useful for three-axis stabilized missions given sufficient data to provide observability.
Parametric and non-parametric masking of randomness in sequence alignments can be improved and leads to better resolved trees.

PubMed

Kück, Patrick; Meusemann, Karen; Dambach, Johannes; Thormann, Birthe; von Reumont, Björn M; Wägele, Johann W; Misof, Bernhard

2010-03-31

Methods of alignment masking, which refers to the technique of excluding alignment blocks prior to tree reconstructions, have been successful in improving the signal-to-noise ratio in sequence alignments. However, the lack of formally well defined methods to identify randomness in sequence alignments has prevented a routine application of alignment masking. In this study, we compared the effects on tree reconstructions of the most commonly used profiling method (GBLOCKS) which uses a predefined set of rules in combination with alignment masking, with a new profiling approach (ALISCORE) based on Monte Carlo resampling within a sliding window, using different data sets and alignment methods. While the GBLOCKS approach excludes variable sections above a certain threshold which choice is left arbitrary, the ALISCORE algorithm is free of a priori rating of parameter space and therefore more objective. ALISCORE was successfully extended to amino acids using a proportional model and empirical substitution matrices to score randomness in multiple sequence alignments. A complex bootstrap resampling leads to an even distribution of scores of randomly similar sequences to assess randomness of the observed sequence similarity. Testing performance on real data, both masking methods, GBLOCKS and ALISCORE, helped to improve tree resolution. The sliding window approach was less sensitive to different alignments of identical data sets and performed equally well on all data sets. Concurrently, ALISCORE is capable of dealing with different substitution patterns and heterogeneous base composition. ALISCORE and the most relaxed GBLOCKS gap parameter setting performed best on all data sets. Correspondingly, Neighbor-Net analyses showed the most decrease in conflict. Alignment masking improves signal-to-noise ratio in multiple sequence alignments prior to phylogenetic reconstruction. Given the robust performance of alignment profiling, alignment masking should routinely be used to improve tree reconstructions. Parametric methods of alignment profiling can be easily extended to more complex likelihood based models of sequence evolution which opens the possibility of further improvements.
Efficient Spatiotemporal Clutter Rejection and Nonlinear Filtering-based Dim Resolved and Unresolved Object Tracking Algorithms

NASA Astrophysics Data System (ADS)

Tartakovsky, A.; Tong, M.; Brown, A. P.; Agh, C.

2013-09-01

We develop efficient spatiotemporal image processing algorithms for rejection of non-stationary clutter and tracking of multiple dim objects using non-linear track-before-detect methods. For clutter suppression, we include an innovative image alignment (registration) algorithm. The images are assumed to contain elements of the same scene, but taken at different angles, from different locations, and at different times, with substantial clutter non-stationarity. These challenges are typical for space-based and surface-based IR/EO moving sensors, e.g., highly elliptical orbit or low earth orbit scenarios. The algorithm assumes that the images are related via a planar homography, also known as the projective transformation. The parameters are estimated in an iterative manner, at each step adjusting the parameter vector so as to achieve improved alignment of the images. Operating in the parameter space rather than in the coordinate space is a new idea, which makes the algorithm more robust with respect to noise as well as to large inter-frame disturbances, while operating at real-time rates. For dim object tracking, we include new advancements to a particle non-linear filtering-based track-before-detect (TrbD) algorithm. The new TrbD algorithm includes both real-time full image search for resolved objects not yet in track and joint super-resolution and tracking of individual objects in closely spaced object (CSO) clusters. The real-time full image search provides near-optimal detection and tracking of multiple extremely dim, maneuvering objects/clusters. The super-resolution and tracking CSO TrbD algorithm provides efficient near-optimal estimation of the number of unresolved objects in a CSO cluster, as well as the locations, velocities, accelerations, and intensities of the individual objects. We demonstrate that the algorithm is able to accurately estimate the number of CSO objects and their locations when the initial uncertainty on the number of objects is large. We demonstrate performance of the TrbD algorithm both for satellite-based and surface-based EO/IR surveillance scenarios.
Calculating Least Risk Paths in 3d Indoor Space

NASA Astrophysics Data System (ADS)

Vanclooster, A.; De Maeyer, Ph.; Fack, V.; Van de Weghe, N.

2013-08-01

Over the last couple of years, research on indoor environments has gained a fresh impetus; more specifically applications that support navigation and wayfinding have become one of the booming industries. Indoor navigation research currently covers the technological aspect of indoor positioning and the modelling of indoor space. The algorithmic development to support navigation has so far been left mostly untouched, as most applications mainly rely on adapting Dijkstra's shortest path algorithm to an indoor network. However, alternative algorithms for outdoor navigation have been proposed adding a more cognitive notion to the calculated paths and as such adhering to the natural wayfinding behaviour (e.g. simplest paths, least risk paths). These algorithms are currently restricted to outdoor applications. The need for indoor cognitive algorithms is highlighted by a more challenged navigation and orientation due to the specific indoor structure (e.g. fragmentation, less visibility, confined areas…). As such, the clarity and easiness of route instructions is of paramount importance when distributing indoor routes. A shortest or fastest path indoors not necessarily aligns with the cognitive mapping of the building. Therefore, the aim of this research is to extend those richer cognitive algorithms to three-dimensional indoor environments. More specifically for this paper, we will focus on the application of the least risk path algorithm of Grum (2005) to an indoor space. The algorithm as proposed by Grum (2005) is duplicated and tested in a complex multi-storey building. The results of several least risk path calculations are compared to the shortest paths in indoor environments in terms of total length, improvement in route description complexity and number of turns. Several scenarios are tested in this comparison: paths covering a single floor, paths crossing several building wings and/or floors. Adjustments to the algorithm are proposed to be more aligned to the specific structure of indoor environments (e.g. no turn restrictions, restricted usage of rooms, vertical movement) and common wayfinding strategies indoors. In a later stage, other cognitive algorithms will be implemented and tested in both an indoor and combined indoor-outdoor setting, in an effort to improve the overall user experience during navigation in indoor environments.
Thoracic limb alignment in healthy labrador retrievers: evaluation of standing versus recumbent frontal plane radiography.

PubMed

Goodrich, Zachary J; Norby, Bo; Eichelberger, Bunita M; Friedeck, Wade O; Callis, Hollye N; Hulse, Don A; Kerwin, Sharon C; Fox, Derek B; Saunders, W Brian

2014-10-01

To report thoracic limb alignment values in healthy dogs; to determine if limb alignment values are significantly different when obtained from standing versus recumbent radiographic projections. Prospective cross-sectional study. Labrador Retrievers (n = 45) >15 months of age. Standing and recumbent radiographs were obtained and limb montages were randomized before analysis by a single investigator blinded to dog, limb, and limb position. Twelve limb alignment values were determined using the CORA methodology. Measurements were performed in triplicate and intra-observer variability was evaluated by intra-class correlation coefficient (ICC). Limb alignment values were reported as mean ± SD and 95% confidence intervals. Linear mixed models were used to determine if significant associations existed between limb alignment values and limb, limb position, gender, age, weight, and body condition score. There were significant differences in standing and recumbent limb alignment values for all values except elbow mechanical axis deviation (eMAD). Limb, gender, age, body weight, and body condition score had no effect. ICC values ranged from 0.522 to 0.758, indicating moderate to substantial agreement for repeated measurements by a single investigator. Limb alignment values are significantly different when determined from standing versus recumbent radiographs in healthy Labrador Retrievers. © Copyright 2014 by The American College of Veterinary Surgeons.
Summarizing Audiovisual Contents of a Video Program

NASA Astrophysics Data System (ADS)

Gong, Yihong

2003-12-01

In this paper, we focus on video programs that are intended to disseminate information and knowledge such as news, documentaries, seminars, etc, and present an audiovisual summarization system that summarizes the audio and visual contents of the given video separately, and then integrating the two summaries with a partial alignment. The audio summary is created by selecting spoken sentences that best present the main content of the audio speech while the visual summary is created by eliminating duplicates/redundancies and preserving visually rich contents in the image stream. The alignment operation aims to synchronize each spoken sentence in the audio summary with its corresponding speaker's face and to preserve the rich content in the visual summary. A Bipartite Graph-based audiovisual alignment algorithm is developed to efficiently find the best alignment solution that satisfies these alignment requirements. With the proposed system, we strive to produce a video summary that: (1) provides a natural visual and audio content overview, and (2) maximizes the coverage for both audio and visual contents of the original video without having to sacrifice either of them.
Calibration Transfer in LIBS and Raman Spectroscopy for Planetary Applications

NASA Astrophysics Data System (ADS)

Dyar, M. D.; Thomas, B. F.; Parente, M.; Gemp, I.; Mullen, T. H.

2017-12-01

Planetary scientists rely on spectral libraries and instrument reproducibility to interpret results from missions. Major investments have been made into assembling libraries, but they often naively assume that spectra of single crystals versus powders and from varying instruments will be the same. Calibration transfer (CT) seeks to algorithmically resolve discrepancies among datasets from different instruments or conditions. It offers the ability to align suites of spectra with a small number of common samples, allowing better models to be built with combined data sets. LIBS and Raman data present different challenges for CT. Quantitative geochemical analyses by LIBS spectroscopy are limited by lack of consistency among repeated laser shots and across instruments. Many different factors affect the presence/absence of emission lines and their intensities, such as laser power/plasma temperature, angle of incidence, detector sensitivity/resolution. To overcome these, models in which disparate datasets are projected into a joint low-dimensional subspace where all data can be aligned before quantitative analysis, such as Correlation Analysis for Domain Adaptation (CADA), have proven very effective. They require some overlap between the populations of spectra to be aligned. For example, prediction of SiO2 on 80 samples from two different LIBS labs show errors of ±16-29 wt.% when the training and test sets have no overlap, and ±4.94 wt% SiO2 when CADA is used. Uncorrected Earth-Mars spectral differences are likely to cause errors with the same order of magnitude. As with other types of reflectance spectroscopy, Raman data are plagued by differences among single crystal/powder samples and laser wavelength that affect peak intensities, and by spectral offsets from instruments with varying resolution and wavenumber alignment schemes. These problems persist even within the archetypal RRUFF database. Pre-processing transformation functions such as optimized baseline removal, normalization, squashing, and smoothing improve mineral matching accuracy. Alignment methods can record shifts between corresponding peaks from the same mineral from pairs of instruments. By considering many pairs of minerals, corrections at each energy increment can be determined, creating a transfer function to align the data.
Coherent Lidar Design and Performance Verification

NASA Technical Reports Server (NTRS)

Frehlich, Rod

1996-01-01

This final report summarizes the investigative results from the 3 complete years of funding and corresponding publications are listed. The first year saw the verification of beam alignment for coherent Doppler lidar in space by using the surface return. The second year saw the analysis and computerized simulation of using heterodyne efficiency as an absolute measure of performance of coherent Doppler lidar. A new method was proposed to determine the estimation error for Doppler lidar wind measurements without the need for an independent wind measurement. Coherent Doppler lidar signal covariance, including wind shear and turbulence, was derived and calculated for typical atmospheric conditions. The effects of wind turbulence defined by Kolmogorov spatial statistics were investigated theoretically and with simulations. The third year saw the performance of coherent Doppler lidar in the weak signal regime determined by computer simulations using the best velocity estimators. Improved algorithms for extracting the performance of velocity estimators with wind turbulence included were also produced.

Testing and Calibration of Phase Plates for JWST Optical Simulator

NASA Technical Reports Server (NTRS)

Gong, Qian; Chu, Jenny; Tournois, Severine; Eichhorn, William; Kubalak, David

2011-01-01

Three phase plates were designed to simulate the JWST segmented primary mirror wavefront at three on-orbit alignment stages: coarse phasing, intermediate phasing, and fine phasing. The purpose is to verify JWST's on-orbit wavefront sensing capability. Amongst the three stages, coarse alignment is defined to have piston error between adjacent segments being 30 m to 300 m, intermediate being 0.4 m to 10 m, and fine is below 0.4 m. The phase plates were made of fused silica, and were assembled in JWST Optical Simulator (OSIM). The piston difference was realized by the thickness difference of two adjacent segments. The two important parameters to phase plates are piston and wavefront errors. Dispersed Fringe Sensor (DFS) method was used for initial coarse piston evaluation, which is the emphasis of this paper. Point Diffraction Interferometer (PDI) is used for fine piston and wavefront error. In order to remove piston's 2 pi uncertainty with PDI, three laser wavelengths, 640nm, 660nm, and 780nm, are used for the measurement. The DHS test setup, analysis algorithm and results are presented. The phase plate design concept and its application (i.e. verifying the JWST on-orbit alignment algorithm) are described. The layout of JWST OSIM and the function of phase plates in OSIM are also addressed briefly.
TeachEnG: a Teaching Engine for Genomics.

PubMed

Kim, Minji; Kim, Yeonsung; Qian, Lei; Song, Jun S

2017-10-15

Bioinformatics is a rapidly growing field that has emerged from the synergy of computer science, statistics and biology. Given the interdisciplinary nature of bioinformatics, many students from diverse fields struggle with grasping bioinformatic concepts only from classroom lectures. Interactive tools for helping students reinforce their learning would be thus desirable. Here, we present an interactive online educational tool called TeachEnG (acronym for Teaching Engine for Genomics) for reinforcing key concepts in sequence alignment and phylogenetic tree reconstruction. Our instructional games allow students to align sequences by hand, fill out the dynamic programming matrix in the Needleman-Wunsch global sequence alignment algorithm, and reconstruct phylogenetic trees via the maximum parsimony, Unweighted Pair Group Method with Arithmetic mean (UPGMA) and Neighbor-Joining algorithms. With an easily accessible interface and instant visual feedback, TeachEnG will help promote active learning in bioinformatics. TeachEnG is freely available at http://teacheng.illinois.edu. The source code is available from https://github.com/KnowEnG/TeachEnG under the Artistic License 2.0. It is written in JavaScript and compatible with Firefox, Safari, Chrome and Microsoft Edge. songj@illinois.edu. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
Tracking cells in Life Cell Imaging videos using topological alignments.

PubMed

Mosig, Axel; Jäger, Stefan; Wang, Chaofeng; Nath, Sumit; Ersoy, Ilker; Palaniappan, Kannap-pan; Chen, Su-Shing

2009-07-16

With the increasing availability of live cell imaging technology, tracking cells and other moving objects in live cell videos has become a major challenge for bioimage informatics. An inherent problem for most cell tracking algorithms is over- or under-segmentation of cells - many algorithms tend to recognize one cell as several cells or vice versa. We propose to approach this problem through so-called topological alignments, which we apply to address the problem of linking segmentations of two consecutive frames in the video sequence. Starting from the output of a conventional segmentation procedure, we align pairs of consecutive frames through assigning sets of segments in one frame to sets of segments in the next frame. We achieve this through finding maximum weighted solutions to a generalized "bipartite matching" between two hierarchies of segments, where we derive weights from relative overlap scores of convex hulls of sets of segments. For solving the matching task, we rely on an integer linear program. Practical experiments demonstrate that the matching task can be solved efficiently in practice, and that our method is both effective and useful for tracking cells in data sets derived from a so-called Large Scale Digital Cell Analysis System (LSDCAS). The source code of the implementation is available for download from http://www.picb.ac.cn/patterns/Software/topaln.
Dynamical differences of hemoglobin and the ionotropic glutamate receptor in different states revealed by a new dynamics alignment method.

PubMed

Tobi, Dror

2017-08-01

A new algorithm for comparison of protein dynamics is presented. Compared protein structures are superposed and their modes of motions are calculated using the anisotropic network model. The obtained modes are aligned using the dynamic programming algorithm of Needleman and Wunsch, commonly used for sequence alignment. Dynamical comparison of hemoglobin in the T and R2 states reveals that the dynamics of the allosteric effector 2,3-bisphosphoglycerate binding site is different in the two states. These differences can contribute to the selectivity of the effector to the T state. Similar comparison of the ionotropic glutamate receptor in the kainate+(R,R)-2b and ZK bound states reveals that the kainate+(R,R)-2b bound states slow modes describe upward motions of ligand binding domain and the transmembrane domain regions. Such motions may lead to the opening of the receptor. The upper lobes of the LBDs of the ZK bound state have a smaller interface with the amino terminal domains above them and have a better ability to move together. The present study exemplifies the use of dynamics comparison as a tool to study protein function. Proteins 2017; 85:1507-1517. © 2014 Wiley Periodicals, Inc. © 2017 Wiley Periodicals, Inc.
Acceptance test of a commercially available software for automatic image registration of computed tomography (CT), magnetic resonance imaging (MRI) and 99mTc-methoxyisobutylisonitrile (MIBI) single-photon emission computed tomography (SPECT) brain images.

PubMed

Loi, Gianfranco; Dominietto, Marco; Manfredda, Irene; Mones, Eleonora; Carriero, Alessandro; Inglese, Eugenio; Krengli, Marco; Brambilla, Marco

2008-09-01

This note describes a method to characterize the performances of image fusion software (Syntegra) with respect to accuracy and robustness. Computed tomography (CT), magnetic resonance imaging (MRI), and single-photon emission computed tomography (SPECT) studies were acquired from two phantoms and 10 patients. Image registration was performed independently by two couples composed of one radiotherapist and one physicist by means of superposition of anatomic landmarks. Each couple performed jointly and saved the registration. The two solutions were averaged to obtain the gold standard registration. A new set of estimators was defined to identify translation and rotation errors in the coordinate axes, independently from point position in image field of view (FOV). Algorithms evaluated were local correlation (LC) for CT-MRI, normalized mutual information (MI) for CT-MRI, and CT-SPECT registrations. To evaluate accuracy, estimator values were compared to limiting values for the algorithms employed, both in phantoms and in patients. To evaluate robustness, different alignments between images taken from a sample patient were produced and registration errors determined. LC algorithm resulted accurate in CT-MRI registrations in phantoms, but exceeded limiting values in 3 of 10 patients. MI algorithm resulted accurate in CT-MRI and CT-SPECT registrations in phantoms; limiting values were exceeded in one case in CT-MRI and never reached in CT-SPECT registrations. Thus, the evaluation of robustness was restricted to the algorithm of MI both for CT-MRI and CT-SPECT registrations. The algorithm of MI proved to be robust: limiting values were not exceeded with translation perturbations up to 2.5 cm, rotation perturbations up to 10 degrees and roto-translational perturbation up to 3 cm and 5 degrees.
Consensus generation and variant detection by Celera Assembler.

PubMed

Denisov, Gennady; Walenz, Brian; Halpern, Aaron L; Miller, Jason; Axelrod, Nelson; Levy, Samuel; Sutton, Granger

2008-04-15

We present an algorithm to identify allelic variation given a Whole Genome Shotgun (WGS) assembly of haploid sequences, and to produce a set of haploid consensus sequences rather than a single consensus sequence. Existing WGS assemblers take a column-by-column approach to consensus generation, and produce a single consensus sequence which can be inconsistent with the underlying haploid alleles, and inconsistent with any of the aligned sequence reads. Our new algorithm uses a dynamic windowing approach. It detects alleles by simultaneously processing the portions of aligned reads spanning a region of sequence variation, assigns reads to their respective alleles, phases adjacent variant alleles and generates a consensus sequence corresponding to each confirmed allele. This algorithm was used to produce the first diploid genome sequence of an individual human. It can also be applied to assemblies of multiple diploid individuals and hybrid assemblies of multiple haploid organisms. Being applied to the individual human genome assembly, the new algorithm detects exactly two confirmed alleles and reports two consensus sequences in 98.98% of the total number 2,033311 detected regions of sequence variation. In 33,269 out of 460,373 detected regions of size >1 bp, it fixes the constructed errors of a mosaic haploid representation of a diploid locus as produced by the original Celera Assembler consensus algorithm. Using an optimized procedure calibrated against 1 506 344 known SNPs, it detects 438 814 new heterozygous SNPs with false positive rate 12%. The open source code is available at: http://wgs-assembler.cvs.sourceforge.net/wgs-assembler/
NEP: web server for epitope prediction based on antibody neutralization of viral strains with diverse sequences

PubMed Central

Chuang, Gwo-Yu; Liou, David; Kwong, Peter D.; Georgiev, Ivelin S.

2014-01-01

Delineation of the antigenic site, or epitope, recognized by an antibody can provide clues about functional vulnerabilities and resistance mechanisms, and can therefore guide antibody optimization and epitope-based vaccine design. Previously, we developed an algorithm for antibody-epitope prediction based on antibody neutralization of viral strains with diverse sequences and validated the algorithm on a set of broadly neutralizing HIV-1 antibodies. Here we describe the implementation of this algorithm, NEP (Neutralization-based Epitope Prediction), as a web-based server. The users must supply as input: (i) an alignment of antigen sequences of diverse viral strains; (ii) neutralization data for the antibody of interest against the same set of antigen sequences; and (iii) (optional) a structure of the unbound antigen, for enhanced prediction accuracy. The prediction results can be downloaded or viewed interactively on the antigen structure (if supplied) from the web browser using a JSmol applet. Since neutralization experiments are typically performed as one of the first steps in the characterization of an antibody to determine its breadth and potency, the NEP server can be used to predict antibody-epitope information at no additional experimental costs. NEP can be accessed on the internet at http://exon.niaid.nih.gov/nep. PMID:24782517
Improving transmission efficiency of large sequence alignment/map (SAM) files.

PubMed

Sakib, Muhammad Nazmus; Tang, Jijun; Zheng, W Jim; Huang, Chin-Tser

2011-01-01

Research in bioinformatics primarily involves collection and analysis of a large volume of genomic data. Naturally, it demands efficient storage and transfer of this huge amount of data. In recent years, some research has been done to find efficient compression algorithms to reduce the size of various sequencing data. One way to improve the transmission time of large files is to apply a maximum lossless compression on them. In this paper, we present SAMZIP, a specialized encoding scheme, for sequence alignment data in SAM (Sequence Alignment/Map) format, which improves the compression ratio of existing compression tools available. In order to achieve this, we exploit the prior knowledge of the file format and specifications. Our experimental results show that our encoding scheme improves compression ratio, thereby reducing overall transmission time significantly.
RBT-GA: a novel metaheuristic for solving the multiple sequence alignment problem

PubMed Central

Taheri, Javid; Zomaya, Albert Y

2009-01-01

Background Multiple Sequence Alignment (MSA) has always been an active area of research in Bioinformatics. MSA is mainly focused on discovering biologically meaningful relationships among different sequences or proteins in order to investigate the underlying main characteristics/functions. This information is also used to generate phylogenetic trees. Results This paper presents a novel approach, namely RBT-GA, to solve the MSA problem using a hybrid solution methodology combining the Rubber Band Technique (RBT) and the Genetic Algorithm (GA) metaheuristic. RBT is inspired by the behavior of an elastic Rubber Band (RB) on a plate with several poles, which is analogues to locations in the input sequences that could potentially be biologically related. A GA attempts to mimic the evolutionary processes of life in order to locate optimal solutions in an often very complex landscape. RBT-GA is a population based optimization algorithm designed to find the optimal alignment for a set of input protein sequences. In this novel technique, each alignment answer is modeled as a chromosome consisting of several poles in the RBT framework. These poles resemble locations in the input sequences that are most likely to be correlated and/or biologically related. A GA-based optimization process improves these chromosomes gradually yielding a set of mostly optimal answers for the MSA problem. Conclusion RBT-GA is tested with one of the well-known benchmarks suites (BALiBASE 2.0) in this area. The obtained results show that the superiority of the proposed technique even in the case of formidable sequences. PMID:19594869
TRIC: an automated alignment strategy for reproducible protein quantification in targeted proteomics

PubMed Central

Röst, Hannes L.; Liu, Yansheng; D’Agostino, Giuseppe; Zanella, Matteo; Navarro, Pedro; Rosenberger, George; Collins, Ben C.; Gillet, Ludovic; Testa, Giuseppe; Malmström, Lars; Aebersold, Ruedi

2016-01-01

Large scale, quantitative proteomic studies have become essential for the analysis of clinical cohorts, large perturbation experiments and systems biology studies. While next-generation mass spectrometric techniques such as SWATH-MS have substantially increased throughput and reproducibility, ensuring consistent quantification of thousands of peptide analytes across multiple LC-MS/MS runs remains a challenging and laborious manual process. To produce highly consistent and quantitatively accurate proteomics data matrices in an automated fashion, we have developed the TRIC software which utilizes fragment ion data to perform cross-run alignment, consistent peak-picking and quantification for high throughput targeted proteomics. TRIC uses a graph-based alignment strategy based on non-linear retention time correction to integrate peak elution information from all LC-MS/MS runs acquired in a study. When compared to state-of-the-art SWATH-MS data analysis, the algorithm was able to reduce the identification error by more than 3-fold at constant recall, while correcting for highly non-linear chromatographic effects. On a pulsed-SILAC experiment performed on human induced pluripotent stem (iPS) cells, TRIC was able to automatically align and quantify thousands of light and heavy isotopic peak groups and substantially increased the quantitative completeness and biological information in the data, providing insights into protein dynamics of iPS cells. Overall, this study demonstrates the importance of consistent quantification in highly challenging experimental setups, and proposes an algorithm to automate this task, constituting the last missing piece in a pipeline for automated analysis of massively parallel targeted proteomics datasets. PMID:27479329
AllergenFP: allergenicity prediction by descriptor fingerprints.

PubMed

Dimitrov, Ivan; Naneva, Lyudmila; Doytchinova, Irini; Bangov, Ivan

2014-03-15

Allergenicity, like antigenicity and immunogenicity, is a property encoded linearly and non-linearly, and therefore the alignment-based approaches are not able to identify this property unambiguously. A novel alignment-free descriptor-based fingerprint approach is presented here and applied to identify allergens and non-allergens. The approach was implemented into a four step algorithm. Initially, the protein sequences are described by amino acid principal properties as hydrophobicity, size, relative abundance, helix and β-strand forming propensities. Then, the generated strings of different length are converted into vectors with equal length by auto- and cross-covariance (ACC). The vectors were transformed into binary fingerprints and compared in terms of Tanimoto coefficient. The approach was applied to a set of 2427 known allergens and 2427 non-allergens and identified correctly 88% of them with Matthews correlation coefficient of 0.759. The descriptor fingerprint approach presented here is universal. It could be applied for any classification problem in computational biology. The set of E-descriptors is able to capture the main structural and physicochemical properties of amino acids building the proteins. The ACC transformation overcomes the main problem in the alignment-based comparative studies arising from the different length of the aligned protein sequences. The conversion of protein ACC values into binary descriptor fingerprints allows similarity search and classification. The algorithm described in the present study was implemented in a specially designed Web site, named AllergenFP (FP stands for FingerPrint). AllergenFP is written in Python, with GIU in HTML. It is freely accessible at http://ddg-pharmfac.net/Allergen FP. idoytchinova@pharmfac.net or ivanbangov@shu-bg.net.
An Algorithm to Identify and Localize Suitable Dock Locations from 3-D LiDAR Scans

DTIC Science & Technology

2013-05-10

Locations from 3-D LiDAR Scans 5b. GRANT NUMBER 5c. PROGRAM ELEMENT NUMBER 6. AUTHOR(S) Graves, Mitchell Robert 5d. PROJECT NUMBER...Ranging ( LiDAR ) scans. A LiDAR sensor is a sensor that collects range images from a rotating array of vertically aligned lasers. Our solution leverages...Algorithm, Dock, Locations, Point Clouds, LiDAR , Identify 16. SECURITY CLASSIFICATION OF: 17. LIMITATION OF ABSTRACT 18. NUMBER OF PAGES 19a
Interactive collision detection for deformable models using streaming AABBs.

PubMed

Zhang, Xinyu; Kim, Young J

2007-01-01

We present an interactive and accurate collision detection algorithm for deformable, polygonal objects based on the streaming computational model. Our algorithm can detect all possible pairwise primitive-level intersections between two severely deforming models at highly interactive rates. In our streaming computational model, we consider a set of axis aligned bounding boxes (AABBs) that bound each of the given deformable objects as an input stream and perform massively-parallel pairwise, overlapping tests onto the incoming streams. As a result, we are able to prevent performance stalls in the streaming pipeline that can be caused by expensive indexing mechanism required by bounding volume hierarchy-based streaming algorithms. At runtime, as the underlying models deform over time, we employ a novel, streaming algorithm to update the geometric changes in the AABB streams. Moreover, in order to get only the computed result (i.e., collision results between AABBs) without reading back the entire output streams, we propose a streaming en/decoding strategy that can be performed in a hierarchical fashion. After determining overlapped AABBs, we perform a primitive-level (e.g., triangle) intersection checking on a serial computational model such as CPUs. We implemented the entire pipeline of our algorithm using off-the-shelf graphics processors (GPUs), such as nVIDIA GeForce 7800 GTX, for streaming computations, and Intel Dual Core 3.4G processors for serial computations. We benchmarked our algorithm with different models of varying complexities, ranging from 15K up to 50K triangles, under various deformation motions, and the timings were obtained as 30 approximately 100 FPS depending on the complexity of models and their relative configurations. Finally, we made comparisons with a well-known GPU-based collision detection algorithm, CULLIDE [4] and observed about three times performance improvement over the earlier approach. We also made comparisons with a SW-based AABB culling algorithm [2] and observed about two times improvement.
Vicinal light inspection of translucent materials

DOEpatents

Burns, Geroge R [Albuquerque, NM; Yang, Pin [Albuquerque, NM

2010-01-19

The present invention includes methods and apparatus for inspecting vicinally illuminated non-patterned areas of translucent materials. An initial image of the material is received. A second image is received following a relative translation between the material being inspected and a device generating the images. Each vicinally illuminated image includes a portion having optimal illumination, that can be extracted and stored in a composite image of the non-patterned area. The composite image includes aligned portions of the extracted image portions, and provides a composite having optimal illumination over a non-patterned area of the material to be inspected. The composite image can be processed by enhancement and object detection algorithms, to determine the presence of, and characterize any inhomogeneities present in the material.
Matt: local flexibility aids protein multiple structure alignment.

PubMed

Menke, Matthew; Berger, Bonnie; Cowen, Lenore

2008-01-01

Even when there is agreement on what measure a protein multiple structure alignment should be optimizing, finding the optimal alignment is computationally prohibitive. One approach used by many previous methods is aligned fragment pair chaining, where short structural fragments from all the proteins are aligned against each other optimally, and the final alignment chains these together in geometrically consistent ways. Ye and Godzik have recently suggested that adding geometric flexibility may help better model protein structures in a variety of contexts. We introduce the program Matt (Multiple Alignment with Translations and Twists), an aligned fragment pair chaining algorithm that, in intermediate steps, allows local flexibility between fragments: small translations and rotations are temporarily allowed to bring sets of aligned fragments closer, even if they are physically impossible under rigid body transformations. After a dynamic programming assembly guided by these "bent" alignments, geometric consistency is restored in the final step before the alignment is output. Matt is tested against other recent multiple protein structure alignment programs on the popular Homstrad and SABmark benchmark datasets. Matt's global performance is competitive with the other programs on Homstrad, but outperforms the other programs on SABmark, a benchmark of multiple structure alignments of proteins with more distant homology. On both datasets, Matt demonstrates an ability to better align the ends of alpha-helices and beta-strands, an important characteristic of any structure alignment program intended to help construct a structural template library for threading approaches to the inverse protein-folding problem. The related question of whether Matt alignments can be used to distinguish distantly homologous structure pairs from pairs of proteins that are not homologous is also considered. For this purpose, a p-value score based on the length of the common core and average root mean squared deviation (RMSD) of Matt alignments is shown to largely separate decoys from homologous protein structures in the SABmark benchmark dataset. We postulate that Matt's strong performance comes from its ability to model proteins in different conformational states and, perhaps even more important, its ability to model backbone distortions in more distantly related proteins.
Fine Guidance Sensing for Coronagraphic Observatories

NASA Technical Reports Server (NTRS)

Brugarolas, Paul; Alexander, James W.; Trauger, John T.; Moody, Dwight C.

2011-01-01

Three options have been developed for Fine Guidance Sensing (FGS) for coronagraphic observatories using a Fine Guidance Camera within a coronagraphic instrument. Coronagraphic observatories require very fine precision pointing in order to image faint objects at very small distances from a target star. The Fine Guidance Camera measures the direction to the target star. The first option, referred to as Spot, was to collect all of the light reflected from a coronagraph occulter onto a focal plane, producing an Airy-type point spread function (PSF). This would allow almost all of the starlight from the central star to be used for centroiding. The second approach, referred to as Punctured Disk, collects the light that bypasses a central obscuration, producing a PSF with a punctured central disk. The final approach, referred to as Lyot, collects light after passing through the occulter at the Lyot stop. The study includes generation of representative images for each option by the science team, followed by an engineering evaluation of a centroiding or a photometric algorithm for each option. After the alignment of the coronagraph to the fine guidance system, a "nulling" point on the FGS focal point is determined by calibration. This alignment is implemented by a fine alignment mechanism that is part of the fine guidance camera selection mirror. If the star images meet the modeling assumptions, and the star "centroid" can be driven to that nulling point, the contrast for the coronagraph will be maximized.
Fast 3D registration of multimodality tibial images with significant structural mismatch

NASA Astrophysics Data System (ADS)

Rajapakse, C. S.; Wald, M. J.; Magland, J.; Zhang, X. H.; Liu, X. S.; Guo, X. E.; Wehrli, F. W.

2009-02-01

Recently, micro-magnetic resonance imaging (μMRI) in conjunction with micro-finite element analysis has shown great potential in estimating mechanical properties - stiffness and elastic moduli - of bone in patients at risk of osteoporosis. Due to limited spatial resolution and signal-to-noise ratio achievable in vivo, the validity of estimated properties is often established by comparison to those derived from high-resolution micro-CT (μCT) images of cadaveric specimens. For accurate comparison of mechanical parameters derived from μMR and μCT images, analyzed 3D volumes have to be closely matched. The alignment of the micro structure (and the cortex) is often hampered by the fundamental differences of μMR and μCT images and variations in marrow content and cortical bone thickness. Here we present an intensity cross-correlation based registration algorithm coupled with segmentation for registering 3D tibial specimen images acquired by μMRI and μCT in the context of finite-element modeling to assess the bone's mechanical constants. The algorithm first generates three translational and three rotational parameters required to align segmented μMR and CT images from sub regions with high micro-structural similarities. These transformation parameters are then used to register the grayscale μMR and μCT images, which include both the cortex and trabecular bone. The intensity crosscorrelation maximization based registration algorithm described here is suitable for 3D rigid-body image registration applications where through-plane rotations are known to be relatively small. The close alignment of the resulting images is demonstrated quantitatively based on a voxel-overlap measure and qualitatively using visual inspection of the micro structure.
Classification of G-protein coupled receptors based on a rich generation of convolutional neural network, N-gram transformation and multiple sequence alignments.

PubMed

Li, Man; Ling, Cheng; Xu, Qi; Gao, Jingyang

2018-02-01

Sequence classification is crucial in predicting the function of newly discovered sequences. In recent years, the prediction of the incremental large-scale and diversity of sequences has heavily relied on the involvement of machine-learning algorithms. To improve prediction accuracy, these algorithms must confront the key challenge of extracting valuable features. In this work, we propose a feature-enhanced protein classification approach, considering the rich generation of multiple sequence alignment algorithms, N-gram probabilistic language model and the deep learning technique. The essence behind the proposed method is that if each group of sequences can be represented by one feature sequence, composed of homologous sites, there should be less loss when the sequence is rebuilt, when a more relevant sequence is added to the group. On the basis of this consideration, the prediction becomes whether a query sequence belonging to a group of sequences can be transferred to calculate the probability that the new feature sequence evolves from the original one. The proposed work focuses on the hierarchical classification of G-protein Coupled Receptors (GPCRs), which begins by extracting the feature sequences from the multiple sequence alignment results of the GPCRs sub-subfamilies. The N-gram model is then applied to construct the input vectors. Finally, these vectors are imported into a convolutional neural network to make a prediction. The experimental results elucidate that the proposed method provides significant performance improvements. The classification error rate of the proposed method is reduced by at least 4.67% (family level I) and 5.75% (family Level II), in comparison with the current state-of-the-art methods. The implementation program of the proposed work is freely available at: https://github.com/alanFchina/CNN .
Desktop aligner for fabrication of multilayer microfluidic devices.

PubMed

Li, Xiang; Yu, Zeta Tak For; Geraldo, Dalton; Weng, Shinuo; Alve, Nitesh; Dun, Wu; Kini, Akshay; Patel, Karan; Shu, Roberto; Zhang, Feng; Li, Gang; Jin, Qinghui; Fu, Jianping

2015-07-01

Multilayer assembly is a commonly used technique to construct multilayer polydimethylsiloxane (PDMS)-based microfluidic devices with complex 3D architecture and connectivity for large-scale microfluidic integration. Accurate alignment of structure features on different PDMS layers before their permanent bonding is critical in determining the yield and quality of assembled multilayer microfluidic devices. Herein, we report a custom-built desktop aligner capable of both local and global alignments of PDMS layers covering a broad size range. Two digital microscopes were incorporated into the aligner design to allow accurate global alignment of PDMS structures up to 4 in. in diameter. Both local and global alignment accuracies of the desktop aligner were determined to be about 20 μm cm(-1). To demonstrate its utility for fabrication of integrated multilayer PDMS microfluidic devices, we applied the desktop aligner to achieve accurate alignment of different functional PDMS layers in multilayer microfluidics including an organs-on-chips device as well as a microfluidic device integrated with vertical passages connecting channels located in different PDMS layers. Owing to its convenient operation, high accuracy, low cost, light weight, and portability, the desktop aligner is useful for microfluidic researchers to achieve rapid and accurate alignment for generating multilayer PDMS microfluidic devices.
Desktop aligner for fabrication of multilayer microfluidic devices

PubMed Central

Li, Xiang; Yu, Zeta Tak For; Geraldo, Dalton; Weng, Shinuo; Alve, Nitesh; Dun, Wu; Kini, Akshay; Patel, Karan; Shu, Roberto; Zhang, Feng; Li, Gang; Jin, Qinghui; Fu, Jianping

2015-01-01

Multilayer assembly is a commonly used technique to construct multilayer polydimethylsiloxane (PDMS)-based microfluidic devices with complex 3D architecture and connectivity for large-scale microfluidic integration. Accurate alignment of structure features on different PDMS layers before their permanent bonding is critical in determining the yield and quality of assembled multilayer microfluidic devices. Herein, we report a custom-built desktop aligner capable of both local and global alignments of PDMS layers covering a broad size range. Two digital microscopes were incorporated into the aligner design to allow accurate global alignment of PDMS structures up to 4 in. in diameter. Both local and global alignment accuracies of the desktop aligner were determined to be about 20 μm cm−1. To demonstrate its utility for fabrication of integrated multilayer PDMS microfluidic devices, we applied the desktop aligner to achieve accurate alignment of different functional PDMS layers in multilayer microfluidics including an organs-on-chips device as well as a microfluidic device integrated with vertical passages connecting channels located in different PDMS layers. Owing to its convenient operation, high accuracy, low cost, light weight, and portability, the desktop aligner is useful for microfluidic researchers to achieve rapid and accurate alignment for generating multilayer PDMS microfluidic devices. PMID:26233409

Some links on this page may take you to non-federal websites. Their policies may differ from this site.