Sample records for multiple alignment methods

  1. Simultaneous phylogeny reconstruction and multiple sequence alignment

    PubMed Central

    Yue, Feng; Shi, Jian; Tang, Jijun

    2009-01-01

    Background A phylogeny is the evolutionary history of a group of organisms. To date, sequence data is still the most used data type for phylogenetic reconstruction. Before any sequences can be used for phylogeny reconstruction, they must be aligned, and the quality of the multiple sequence alignment has been shown to affect the quality of the inferred phylogeny. At the same time, all the current multiple sequence alignment programs use a guide tree to produce the alignment and experiments showed that good guide trees can significantly improve the multiple alignment quality. Results We devise a new algorithm to simultaneously align multiple sequences and search for the phylogenetic tree that leads to the best alignment. We also implemented the algorithm as a C program package, which can handle both DNA and protein data and can take simple cost model as well as complex substitution matrices, such as PAM250 or BLOSUM62. The performance of the new method are compared with those from other popular multiple sequence alignment tools, including the widely used programs such as ClustalW and T-Coffee. Experimental results suggest that this method has good performance in terms of both phylogeny accuracy and alignment quality. Conclusion We present an algorithm to align multiple sequences and reconstruct the phylogenies that minimize the alignment score, which is based on an efficient algorithm to solve the median problems for three sequences. Our extensive experiments suggest that this method is very promising and can produce high quality phylogenies and alignments. PMID:19208110

  2. Implied alignment: a synapomorphy-based multiple-sequence alignment method and its use in cladogram search

    NASA Technical Reports Server (NTRS)

    Wheeler, Ward C.

    2003-01-01

    A method to align sequence data based on parsimonious synapomorphy schemes generated by direct optimization (DO; earlier termed optimization alignment) is proposed. DO directly diagnoses sequence data on cladograms without an intervening multiple-alignment step, thereby creating topology-specific, dynamic homology statements. Hence, no multiple-alignment is required to generate cladograms. Unlike general and globally optimal multiple-alignment procedures, the method described here, implied alignment (IA), takes these dynamic homologies and traces them back through a single cladogram, linking the unaligned sequence positions in the terminal taxa via DO transformation series. These "lines of correspondence" link ancestor-descendent states and, when displayed as linearly arrayed columns without hypothetical ancestors, are largely indistinguishable from standard multiple alignment. Since this method is based on synapomorphy, the treatment of certain classes of insertion-deletion (indel) events may be different from that of other alignment procedures. As with all alignment methods, results are dependent on parameter assumptions such as indel cost and transversion:transition ratios. Such an IA could be used as a basis for phylogenetic search, but this would be questionable since the homologies derived from the implied alignment depend on its natal cladogram and any variance, between DO and IA + Search, due to heuristic approach. The utility of this procedure in heuristic cladogram searches using DO and the improvement of heuristic cladogram cost calculations are discussed. c2003 The Willi Hennig Society. Published by Elsevier Science (USA). All rights reserved.

  3. Multiple DNA and protein sequence alignment on a workstation and a supercomputer.

    PubMed

    Tajima, K

    1988-11-01

    This paper describes a multiple alignment method using a workstation and supercomputer. The method is based on the alignment of a set of aligned sequences with the new sequence, and uses a recursive procedure of such alignment. The alignment is executed in a reasonable computation time on diverse levels from a workstation to a supercomputer, from the viewpoint of alignment results and computational speed by parallel processing. The application of the algorithm is illustrated by several examples of multiple alignment of 12 amino acid and DNA sequences of HIV (human immunodeficiency virus) env genes. Colour graphic programs on a workstation and parallel processing on a supercomputer are discussed.

  4. MANGO: a new approach to multiple sequence alignment.

    PubMed

    Zhang, Zefeng; Lin, Hao; Li, Ming

    2007-01-01

    Multiple sequence alignment is a classical and challenging task for biological sequence analysis. The problem is NP-hard. The full dynamic programming takes too much time. The progressive alignment heuristics adopted by most state of the art multiple sequence alignment programs suffer from the 'once a gap, always a gap' phenomenon. Is there a radically new way to do multiple sequence alignment? This paper introduces a novel and orthogonal multiple sequence alignment method, using multiple optimized spaced seeds and new algorithms to handle these seeds efficiently. Our new algorithm processes information of all sequences as a whole, avoiding problems caused by the popular progressive approaches. Because the optimized spaced seeds are provably significantly more sensitive than the consecutive k-mers, the new approach promises to be more accurate and reliable. To validate our new approach, we have implemented MANGO: Multiple Alignment with N Gapped Oligos. Experiments were carried out on large 16S RNA benchmarks showing that MANGO compares favorably, in both accuracy and speed, against state-of-art multiple sequence alignment methods, including ClustalW 1.83, MUSCLE 3.6, MAFFT 5.861, Prob-ConsRNA 1.11, Dialign 2.2.1, DIALIGN-T 0.2.1, T-Coffee 4.85, POA 2.0 and Kalign 2.0.

  5. Multiple network alignment via multiMAGNA+.

    PubMed

    Vijayan, Vipin; Milenkovic, Tijana

    2017-08-21

    Network alignment (NA) aims to find a node mapping that identifies topologically or functionally similar network regions between molecular networks of different species. Analogous to genomic sequence alignment, NA can be used to transfer biological knowledge from well- to poorly-studied species between aligned network regions. Pairwise NA (PNA) finds similar regions between two networks while multiple NA (MNA) can align more than two networks. We focus on MNA. Existing MNA methods aim to maximize total similarity over all aligned nodes (node conservation). Then, they evaluate alignment quality by measuring the amount of conserved edges, but only after the alignment is constructed. Directly optimizing edge conservation during alignment construction in addition to node conservation may result in superior alignments. Thus, we present a novel MNA method called multiMAGNA++ that can achieve this. Indeed, multiMAGNA++ outperforms or is on par with existing MNA methods, while often completing faster than existing methods. That is, multiMAGNA++ scales well to larger network data and can be parallelized effectively. During method evaluation, we also introduce new MNA quality measures to allow for more fair MNA method comparison compared to the existing alignment quality measures. MultiMAGNA++ code is available on the method's web page at http://nd.edu/~cone/multiMAGNA++/.

  6. COACH: profile-profile alignment of protein families using hidden Markov models.

    PubMed

    Edgar, Robert C; Sjölander, Kimmen

    2004-05-22

    Alignments of two multiple-sequence alignments, or statistical models of such alignments (profiles), have important applications in computational biology. The increased amount of information in a profile versus a single sequence can lead to more accurate alignments and more sensitive homolog detection in database searches. Several profile-profile alignment methods have been proposed and have been shown to improve sensitivity and alignment quality compared with sequence-sequence methods (such as BLAST) and profile-sequence methods (e.g. PSI-BLAST). Here we present a new approach to profile-profile alignment we call Comparison of Alignments by Constructing Hidden Markov Models (HMMs) (COACH). COACH aligns two multiple sequence alignments by constructing a profile HMM from one alignment and aligning the other to that HMM. We compare the alignment accuracy of COACH with two recently published methods: Yona and Levitt's prof_sim and Sadreyev and Grishin's COMPASS. On two sets of reference alignments selected from the FSSP database, we find that COACH is able, on average, to produce alignments giving the best coverage or the fewest errors, depending on the chosen parameter settings. COACH is freely available from www.drive5.com/lobster

  7. Progressive structure-based alignment of homologous proteins: Adopting sequence comparison strategies.

    PubMed

    Joseph, Agnel Praveen; Srinivasan, Narayanaswamy; de Brevern, Alexandre G

    2012-09-01

    Comparison of multiple protein structures has a broad range of applications in the analysis of protein structure, function and evolution. Multiple structure alignment tools (MSTAs) are necessary to obtain a simultaneous comparison of a family of related folds. In this study, we have developed a method for multiple structure comparison largely based on sequence alignment techniques. A widely used Structural Alphabet named Protein Blocks (PBs) was used to transform the information on 3D protein backbone conformation as a 1D sequence string. A progressive alignment strategy similar to CLUSTALW was adopted for multiple PB sequence alignment (mulPBA). Highly similar stretches identified by the pairwise alignments are given higher weights during the alignment. The residue equivalences from PB based alignments are used to obtain a three dimensional fit of the structures followed by an iterative refinement of the structural superposition. Systematic comparisons using benchmark datasets of MSTAs underlines that the alignment quality is better than MULTIPROT, MUSTANG and the alignments in HOMSTRAD, in more than 85% of the cases. Comparison with other rigid-body and flexible MSTAs also indicate that mulPBA alignments are superior to most of the rigid-body MSTAs and highly comparable to the flexible alignment methods. Copyright © 2012 Elsevier Masson SAS. All rights reserved.

  8. Fast alignment-free sequence comparison using spaced-word frequencies.

    PubMed

    Leimeister, Chris-Andre; Boden, Marcus; Horwege, Sebastian; Lindner, Sebastian; Morgenstern, Burkhard

    2014-07-15

    Alignment-free methods for sequence comparison are increasingly used for genome analysis and phylogeny reconstruction; they circumvent various difficulties of traditional alignment-based approaches. In particular, alignment-free methods are much faster than pairwise or multiple alignments. They are, however, less accurate than methods based on sequence alignment. Most alignment-free approaches work by comparing the word composition of sequences. A well-known problem with these methods is that neighbouring word matches are far from independent. To reduce the statistical dependency between adjacent word matches, we propose to use 'spaced words', defined by patterns of 'match' and 'don't care' positions, for alignment-free sequence comparison. We describe a fast implementation of this approach using recursive hashing and bit operations, and we show that further improvements can be achieved by using multiple patterns instead of single patterns. To evaluate our approach, we use spaced-word frequencies as a basis for fast phylogeny reconstruction. Using real-world and simulated sequence data, we demonstrate that our multiple-pattern approach produces better phylogenies than approaches relying on contiguous words. Our program is freely available at http://spaced.gobics.de/. © The Author 2014. Published by Oxford University Press.

  9. PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences.

    PubMed

    Mirarab, Siavash; Nguyen, Nam; Guo, Sheng; Wang, Li-San; Kim, Junhyong; Warnow, Tandy

    2015-05-01

    We introduce PASTA, a new multiple sequence alignment algorithm. PASTA uses a new technique to produce an alignment given a guide tree that enables it to be both highly scalable and very accurate. We present a study on biological and simulated data with up to 200,000 sequences, showing that PASTA produces highly accurate alignments, improving on the accuracy and scalability of the leading alignment methods (including SATé). We also show that trees estimated on PASTA alignments are highly accurate--slightly better than SATé trees, but with substantial improvements relative to other methods. Finally, PASTA is faster than SATé, highly parallelizable, and requires relatively little memory.

  10. Is multiple-sequence alignment required for accurate inference of phylogeny?

    PubMed

    Höhl, Michael; Ragan, Mark A

    2007-04-01

    The process of inferring phylogenetic trees from molecular sequences almost always starts with a multiple alignment of these sequences but can also be based on methods that do not involve multiple sequence alignment. Very little is known about the accuracy with which such alignment-free methods recover the correct phylogeny or about the potential for increasing their accuracy. We conducted a large-scale comparison of ten alignment-free methods, among them one new approach that does not calculate distances and a faster variant of our pattern-based approach; all distance-based alignment-free methods are freely available from http://www.bioinformatics.org.au (as Python package decaf+py). We show that most methods exhibit a higher overall reconstruction accuracy in the presence of high among-site rate variation. Under all conditions that we considered, variants of the pattern-based approach were significantly better than the other alignment-free methods. The new pattern-based variant achieved a speed-up of an order of magnitude in the distance calculation step, accompanied by a small loss of tree reconstruction accuracy. A method of Bayesian inference from k-mers did not improve on classical alignment-free (and distance-based) methods but may still offer other advantages due to its Bayesian nature. We found the optimal word length k of word-based methods to be stable across various data sets, and we provide parameter ranges for two different alphabets. The influence of these alphabets was analyzed to reveal a trade-off in reconstruction accuracy between long and short branches. We have mapped the phylogenetic accuracy for many alignment-free methods, among them several recently introduced ones, and increased our understanding of their behavior in response to biologically important parameters. In all experiments, the pattern-based approach emerged as superior, at the expense of higher resource consumption. Nonetheless, no alignment-free method that we examined recovers the correct phylogeny as accurately as does an approach based on maximum-likelihood distance estimates of multiply aligned sequences.

  11. Mango: multiple alignment with N gapped oligos.

    PubMed

    Zhang, Zefeng; Lin, Hao; Li, Ming

    2008-06-01

    Multiple sequence alignment is a classical and challenging task. The problem is NP-hard. The full dynamic programming takes too much time. The progressive alignment heuristics adopted by most state-of-the-art works suffer from the "once a gap, always a gap" phenomenon. Is there a radically new way to do multiple sequence alignment? In this paper, we introduce a novel and orthogonal multiple sequence alignment method, using both multiple optimized spaced seeds and new algorithms to handle these seeds efficiently. Our new algorithm processes information of all sequences as a whole and tries to build the alignment vertically, avoiding problems caused by the popular progressive approaches. Because the optimized spaced seeds have proved significantly more sensitive than the consecutive k-mers, the new approach promises to be more accurate and reliable. To validate our new approach, we have implemented MANGO: Multiple Alignment with N Gapped Oligos. Experiments were carried out on large 16S RNA benchmarks, showing that MANGO compares favorably, in both accuracy and speed, against state-of-the-art multiple sequence alignment methods, including ClustalW 1.83, MUSCLE 3.6, MAFFT 5.861, ProbConsRNA 1.11, Dialign 2.2.1, DIALIGN-T 0.2.1, T-Coffee 4.85, POA 2.0, and Kalign 2.0. We have further demonstrated the scalability of MANGO on very large datasets of repeat elements. MANGO can be downloaded at http://www.bioinfo.org.cn/mango/ and is free for academic usage.

  12. A method of alignment masking for refining the phylogenetic signal of multiple sequence alignments.

    PubMed

    Rajan, Vaibhav

    2013-03-01

    Inaccurate inference of positional homologies in multiple sequence alignments and systematic errors introduced by alignment heuristics obfuscate phylogenetic inference. Alignment masking, the elimination of phylogenetically uninformative or misleading sites from an alignment before phylogenetic analysis, is a common practice in phylogenetic analysis. Although masking is often done manually, automated methods are necessary to handle the much larger data sets being prepared today. In this study, we introduce the concept of subsplits and demonstrate their use in extracting phylogenetic signal from alignments. We design a clustering approach for alignment masking where each cluster contains similar columns-similarity being defined on the basis of compatible subsplits; our approach then identifies noisy clusters and eliminates them. Trees inferred from the columns in the retained clusters are found to be topologically closer to the reference trees. We test our method on numerous standard benchmarks (both synthetic and biological data sets) and compare its performance with other methods of alignment masking. We find that our method can eliminate sites more accurately than other methods, particularly on divergent data, and can improve the topologies of the inferred trees in likelihood-based analyses. Software available upon request from the author.

  13. PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences

    PubMed Central

    Mirarab, Siavash; Nguyen, Nam; Guo, Sheng; Wang, Li-San; Kim, Junhyong

    2015-01-01

    Abstract We introduce PASTA, a new multiple sequence alignment algorithm. PASTA uses a new technique to produce an alignment given a guide tree that enables it to be both highly scalable and very accurate. We present a study on biological and simulated data with up to 200,000 sequences, showing that PASTA produces highly accurate alignments, improving on the accuracy and scalability of the leading alignment methods (including SATé). We also show that trees estimated on PASTA alignments are highly accurate—slightly better than SATé trees, but with substantial improvements relative to other methods. Finally, PASTA is faster than SATé, highly parallelizable, and requires relatively little memory. PMID:25549288

  14. Protein contact prediction by integrating deep multiple sequence alignments, coevolution and machine learning.

    PubMed

    Adhikari, Badri; Hou, Jie; Cheng, Jianlin

    2018-03-01

    In this study, we report the evaluation of the residue-residue contacts predicted by our three different methods in the CASP12 experiment, focusing on studying the impact of multiple sequence alignment, residue coevolution, and machine learning on contact prediction. The first method (MULTICOM-NOVEL) uses only traditional features (sequence profile, secondary structure, and solvent accessibility) with deep learning to predict contacts and serves as a baseline. The second method (MULTICOM-CONSTRUCT) uses our new alignment algorithm to generate deep multiple sequence alignment to derive coevolution-based features, which are integrated by a neural network method to predict contacts. The third method (MULTICOM-CLUSTER) is a consensus combination of the predictions of the first two methods. We evaluated our methods on 94 CASP12 domains. On a subset of 38 free-modeling domains, our methods achieved an average precision of up to 41.7% for top L/5 long-range contact predictions. The comparison of the three methods shows that the quality and effective depth of multiple sequence alignments, coevolution-based features, and machine learning integration of coevolution-based features and traditional features drive the quality of predicted protein contacts. On the full CASP12 dataset, the coevolution-based features alone can improve the average precision from 28.4% to 41.6%, and the machine learning integration of all the features further raises the precision to 56.3%, when top L/5 predicted long-range contacts are evaluated. And the correlation between the precision of contact prediction and the logarithm of the number of effective sequences in alignments is 0.66. © 2017 Wiley Periodicals, Inc.

  15. Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment

    PubMed Central

    2013-01-01

    Background Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as progressive alignment, whose run time grows with the square of the number and the length of the aligned sequences and requires significant computational resources. In this work, we present a method to efficiently discover regions of high similarity across multiple sequences without performing expensive sequence alignment. The method is based on approximating edit distance between segments of sequences using p-mer frequency counts. Then, efficient high-throughput data stream clustering is used to group highly similar segments into so called quasi-alignments. Quasi-alignments have numerous applications such as identifying species and their taxonomic class from sequences, comparing sequences for similarities, and, as in this paper, discovering conserved regions across related sequences. Results In this paper, we show that quasi-alignments can be used to discover highly similar segments across multiple sequences from related or different genomes efficiently and accurately. Experiments on a large number of unaligned 16S rRNA sequences obtained from the Greengenes database show that the method is able to identify conserved regions which agree with known hypervariable regions in 16S rRNA. Furthermore, the experiments show that the proposed method scales well for large data sets with a run time that grows only linearly with the number and length of sequences, whereas for existing multiple sequence alignment heuristics the run time grows super-linearly. Conclusion Quasi-alignment-based algorithms can detect highly similar regions and conserved areas across multiple sequences. Since the run time is linear and the sequences are converted into a compact clustering model, we are able to identify conserved regions fast or even interactively using a standard PC. Our method has many potential applications such as finding characteristic signature sequences for families of organisms and studying conserved and variable regions in, for example, 16S rRNA. PMID:24564200

  16. Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment.

    PubMed

    Nagar, Anurag; Hahsler, Michael

    2013-01-01

    Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as progressive alignment, whose run time grows with the square of the number and the length of the aligned sequences and requires significant computational resources. In this work, we present a method to efficiently discover regions of high similarity across multiple sequences without performing expensive sequence alignment. The method is based on approximating edit distance between segments of sequences using p-mer frequency counts. Then, efficient high-throughput data stream clustering is used to group highly similar segments into so called quasi-alignments. Quasi-alignments have numerous applications such as identifying species and their taxonomic class from sequences, comparing sequences for similarities, and, as in this paper, discovering conserved regions across related sequences. In this paper, we show that quasi-alignments can be used to discover highly similar segments across multiple sequences from related or different genomes efficiently and accurately. Experiments on a large number of unaligned 16S rRNA sequences obtained from the Greengenes database show that the method is able to identify conserved regions which agree with known hypervariable regions in 16S rRNA. Furthermore, the experiments show that the proposed method scales well for large data sets with a run time that grows only linearly with the number and length of sequences, whereas for existing multiple sequence alignment heuristics the run time grows super-linearly. Quasi-alignment-based algorithms can detect highly similar regions and conserved areas across multiple sequences. Since the run time is linear and the sequences are converted into a compact clustering model, we are able to identify conserved regions fast or even interactively using a standard PC. Our method has many potential applications such as finding characteristic signature sequences for families of organisms and studying conserved and variable regions in, for example, 16S rRNA.

  17. Iterative pass optimization of sequence data

    NASA Technical Reports Server (NTRS)

    Wheeler, Ward C.

    2003-01-01

    The problem of determining the minimum-cost hypothetical ancestral sequences for a given cladogram is known to be NP-complete. This "tree alignment" problem has motivated the considerable effort placed in multiple sequence alignment procedures. Wheeler in 1996 proposed a heuristic method, direct optimization, to calculate cladogram costs without the intervention of multiple sequence alignment. This method, though more efficient in time and more effective in cladogram length than many alignment-based procedures, greedily optimizes nodes based on descendent information only. In their proposal of an exact multiple alignment solution, Sankoff et al. in 1976 described a heuristic procedure--the iterative improvement method--to create alignments at internal nodes by solving a series of median problems. The combination of a three-sequence direct optimization with iterative improvement and a branch-length-based cladogram cost procedure, provides an algorithm that frequently results in superior (i.e., lower) cladogram costs. This iterative pass optimization is both computation and memory intensive, but economies can be made to reduce this burden. An example in arthropod systematics is discussed. c2003 The Willi Hennig Society. Published by Elsevier Science (USA). All rights reserved.

  18. mTM-align: a server for fast protein structure database search and multiple protein structure alignment.

    PubMed

    Dong, Runze; Pan, Shuo; Peng, Zhenling; Zhang, Yang; Yang, Jianyi

    2018-05-21

    With the rapid increase of the number of protein structures in the Protein Data Bank, it becomes urgent to develop algorithms for efficient protein structure comparisons. In this article, we present the mTM-align server, which consists of two closely related modules: one for structure database search and the other for multiple structure alignment. The database search is speeded up based on a heuristic algorithm and a hierarchical organization of the structures in the database. The multiple structure alignment is performed using the recently developed algorithm mTM-align. Benchmark tests demonstrate that our algorithms outperform other peering methods for both modules, in terms of speed and accuracy. One of the unique features for the server is the interplay between database search and multiple structure alignment. The server provides service not only for performing fast database search, but also for making accurate multiple structure alignment with the structures found by the search. For the database search, it takes about 2-5 min for a structure of a medium size (∼300 residues). For the multiple structure alignment, it takes a few seconds for ∼10 structures of medium sizes. The server is freely available at: http://yanglab.nankai.edu.cn/mTM-align/.

  19. CAB-Align: A Flexible Protein Structure Alignment Method Based on the Residue-Residue Contact Area.

    PubMed

    Terashi, Genki; Takeda-Shitaka, Mayuko

    2015-01-01

    Proteins are flexible, and this flexibility has an essential functional role. Flexibility can be observed in loop regions, rearrangements between secondary structure elements, and conformational changes between entire domains. However, most protein structure alignment methods treat protein structures as rigid bodies. Thus, these methods fail to identify the equivalences of residue pairs in regions with flexibility. In this study, we considered that the evolutionary relationship between proteins corresponds directly to the residue-residue physical contacts rather than the three-dimensional (3D) coordinates of proteins. Thus, we developed a new protein structure alignment method, contact area-based alignment (CAB-align), which uses the residue-residue contact area to identify regions of similarity. The main purpose of CAB-align is to identify homologous relationships at the residue level between related protein structures. The CAB-align procedure comprises two main steps: First, a rigid-body alignment method based on local and global 3D structure superposition is employed to generate a sufficient number of initial alignments. Then, iterative dynamic programming is executed to find the optimal alignment. We evaluated the performance and advantages of CAB-align based on four main points: (1) agreement with the gold standard alignment, (2) alignment quality based on an evolutionary relationship without 3D coordinate superposition, (3) consistency of the multiple alignments, and (4) classification agreement with the gold standard classification. Comparisons of CAB-align with other state-of-the-art protein structure alignment methods (TM-align, FATCAT, and DaliLite) using our benchmark dataset showed that CAB-align performed robustly in obtaining high-quality alignments and generating consistent multiple alignments with high coverage and accuracy rates, and it performed extremely well when discriminating between homologous and nonhomologous pairs of proteins in both single and multi-domain comparisons. The CAB-align software is freely available to academic users as stand-alone software at http://www.pharm.kitasato-u.ac.jp/bmd/bmd/Publications.html.

  20. A statistically harmonized alignment-classification in image space enables accurate and robust alignment of noisy images in single particle analysis.

    PubMed

    Kawata, Masaaki; Sato, Chikara

    2007-06-01

    In determining the three-dimensional (3D) structure of macromolecular assemblies in single particle analysis, a large representative dataset of two-dimensional (2D) average images from huge number of raw images is a key for high resolution. Because alignments prior to averaging are computationally intensive, currently available multireference alignment (MRA) software does not survey every possible alignment. This leads to misaligned images, creating blurred averages and reducing the quality of the final 3D reconstruction. We present a new method, in which multireference alignment is harmonized with classification (multireference multiple alignment: MRMA). This method enables a statistical comparison of multiple alignment peaks, reflecting the similarities between each raw image and a set of reference images. Among the selected alignment candidates for each raw image, misaligned images are statistically excluded, based on the principle that aligned raw images of similar projections have a dense distribution around the correctly aligned coordinates in image space. This newly developed method was examined for accuracy and speed using model image sets with various signal-to-noise ratios, and with electron microscope images of the Transient Receptor Potential C3 and the sodium channel. In every data set, the newly developed method outperformed conventional methods in robustness against noise and in speed, creating 2D average images of higher quality. This statistically harmonized alignment-classification combination should greatly improve the quality of single particle analysis.

  1. DNA Multiple Sequence Alignment Guided by Protein Domains: The MSA-PAD 2.0 Method.

    PubMed

    Balech, Bachir; Monaco, Alfonso; Perniola, Michele; Santamaria, Monica; Donvito, Giacinto; Vicario, Saverio; Maggi, Giorgio; Pesole, Graziano

    2018-01-01

    Multiple sequence alignment (MSA) is a fundamental component in many DNA sequence analyses including metagenomics studies and phylogeny inference. When guided by protein profiles, DNA multiple alignments assume a higher precision and robustness. Here we present details of the use of the upgraded version of MSA-PAD (2.0), which is a DNA multiple sequence alignment framework able to align DNA sequences coding for single/multiple protein domains guided by PFAM or user-defined annotations. MSA-PAD has two alignment strategies, called "Gene" and "Genome," accounting for coding domains order and genomic rearrangements, respectively. Novel options were added to the present version, where the MSA can be guided by protein profiles provided by the user. This allows MSA-PAD 2.0 to run faster and to add custom protein profiles sometimes not present in PFAM database according to the user's interest. MSA-PAD 2.0 is currently freely available as a Web application at https://recasgateway.cloud.ba.infn.it/ .

  2. A greedy, graph-based algorithm for the alignment of multiple homologous gene lists.

    PubMed

    Fostier, Jan; Proost, Sebastian; Dhoedt, Bart; Saeys, Yvan; Demeester, Piet; Van de Peer, Yves; Vandepoele, Klaas

    2011-03-15

    Many comparative genomics studies rely on the correct identification of homologous genomic regions using accurate alignment tools. In such case, the alphabet of the input sequences consists of complete genes, rather than nucleotides or amino acids. As optimal multiple sequence alignment is computationally impractical, a progressive alignment strategy is often employed. However, such an approach is susceptible to the propagation of alignment errors in early pairwise alignment steps, especially when dealing with strongly diverged genomic regions. In this article, we present a novel accurate and efficient greedy, graph-based algorithm for the alignment of multiple homologous genomic segments, represented as ordered gene lists. Based on provable properties of the graph structure, several heuristics are developed to resolve local alignment conflicts that occur due to gene duplication and/or rearrangement events on the different genomic segments. The performance of the algorithm is assessed by comparing the alignment results of homologous genomic segments in Arabidopsis thaliana to those obtained by using both a progressive alignment method and an earlier graph-based implementation. Especially for datasets that contain strongly diverged segments, the proposed method achieves a substantially higher alignment accuracy, and proves to be sufficiently fast for large datasets including a few dozens of eukaryotic genomes. http://bioinformatics.psb.ugent.be/software. The algorithm is implemented as a part of the i-ADHoRe 3.0 package.

  3. Protein alignment algorithms with an efficient backtracking routine on multiple GPUs.

    PubMed

    Blazewicz, Jacek; Frohmberg, Wojciech; Kierzynka, Michal; Pesch, Erwin; Wojciechowski, Pawel

    2011-05-20

    Pairwise sequence alignment methods are widely used in biological research. The increasing number of sequences is perceived as one of the upcoming challenges for sequence alignment methods in the nearest future. To overcome this challenge several GPU (Graphics Processing Unit) computing approaches have been proposed lately. These solutions show a great potential of a GPU platform but in most cases address the problem of sequence database scanning and computing only the alignment score whereas the alignment itself is omitted. Thus, the need arose to implement the global and semiglobal Needleman-Wunsch, and Smith-Waterman algorithms with a backtracking procedure which is needed to construct the alignment. In this paper we present the solution that performs the alignment of every given sequence pair, which is a required step for progressive multiple sequence alignment methods, as well as for DNA recognition at the DNA assembly stage. Performed tests show that the implementation, with performance up to 6.3 GCUPS on a single GPU for affine gap penalties, is very efficient in comparison to other CPU and GPU-based solutions. Moreover, multiple GPUs support with load balancing makes the application very scalable. The article shows that the backtracking procedure of the sequence alignment algorithms may be designed to fit in with the GPU architecture. Therefore, our algorithm, apart from scores, is able to compute pairwise alignments. This opens a wide range of new possibilities, allowing other methods from the area of molecular biology to take advantage of the new computational architecture. Performed tests show that the efficiency of the implementation is excellent. Moreover, the speed of our GPU-based algorithms can be almost linearly increased when using more than one graphics card.

  4. Alignment method for solar collector arrays

    DOEpatents

    Driver, Jr., Richard B

    2012-10-23

    The present invention is directed to an improved method for establishing camera fixture location for aligning mirrors on a solar collector array (SCA) comprising multiple mirror modules. The method aligns the mirrors on a module by comparing the location of the receiver image in photographs with the predicted theoretical receiver image location. To accurately align an entire SCA, a common reference is used for all of the individual module images within the SCA. The improved method can use relative pixel location information in digital photographs along with alignment fixture inclinometer data to calculate relative locations of the fixture between modules. The absolute locations are determined by minimizing alignment asymmetry for the SCA. The method inherently aligns all of the mirrors in an SCA to the receiver, even with receiver position and module-to-module alignment errors.

  5. The Application of the Weighted k-Partite Graph Problem to the Multiple Alignment for Metabolic Pathways.

    PubMed

    Chen, Wenbin; Hendrix, William; Samatova, Nagiza F

    2017-12-01

    The problem of aligning multiple metabolic pathways is one of very challenging problems in computational biology. A metabolic pathway consists of three types of entities: reactions, compounds, and enzymes. Based on similarities between enzymes, Tohsato et al. gave an algorithm for aligning multiple metabolic pathways. However, the algorithm given by Tohsato et al. neglects the similarities among reactions, compounds, enzymes, and pathway topology. How to design algorithms for the alignment problem of multiple metabolic pathways based on the similarity of reactions, compounds, and enzymes? It is a difficult computational problem. In this article, we propose an algorithm for the problem of aligning multiple metabolic pathways based on the similarities among reactions, compounds, enzymes, and pathway topology. First, we compute a weight between each pair of like entities in different input pathways based on the entities' similarity score and topological structure using Ay et al.'s methods. We then construct a weighted k-partite graph for the reactions, compounds, and enzymes. We extract a mapping between these entities by solving the maximum-weighted k-partite matching problem by applying a novel heuristic algorithm. By analyzing the alignment results of multiple pathways in different organisms, we show that the alignments found by our algorithm correctly identify common subnetworks among multiple pathways.

  6. Multiple alignment analysis on phylogenetic tree of the spread of SARS epidemic using distance method

    NASA Astrophysics Data System (ADS)

    Amiroch, S.; Pradana, M. S.; Irawan, M. I.; Mukhlash, I.

    2017-09-01

    Multiple Alignment (MA) is a particularly important tool for studying the viral genome and determine the evolutionary process of the specific virus. Application of MA in the case of the spread of the Severe acute respiratory syndrome (SARS) epidemic is an interesting thing because this virus epidemic a few years ago spread so quickly that medical attention in many countries. Although there has been a lot of software to process multiple sequences, but the use of pairwise alignment to process MA is very important to consider. In previous research, the alignment between the sequences to process MA algorithm, Super Pairwise Alignment, but in this study used a dynamic programming algorithm Needleman wunchs simulated in Matlab. From the analysis of MA obtained and stable region and unstable which indicates the position where the mutation occurs, the system network topology that produced the phylogenetic tree of the SARS epidemic distance method, and system area networks mutation.

  7. MultiSETTER: web server for multiple RNA structure comparison.

    PubMed

    Čech, Petr; Hoksza, David; Svozil, Daniel

    2015-08-12

    Understanding the architecture and function of RNA molecules requires methods for comparing and analyzing their tertiary and quaternary structures. While structural superposition of short RNAs is achievable in a reasonable time, large structures represent much bigger challenge. Therefore, we have developed a fast and accurate algorithm for RNA pairwise structure superposition called SETTER and implemented it in the SETTER web server. However, though biological relationships can be inferred by a pairwise structure alignment, key features preserved by evolution can be identified only from a multiple structure alignment. Thus, we extended the SETTER algorithm to the alignment of multiple RNA structures and developed the MultiSETTER algorithm. In this paper, we present the updated version of the SETTER web server that implements a user friendly interface to the MultiSETTER algorithm. The server accepts RNA structures either as the list of PDB IDs or as user-defined PDB files. After the superposition is computed, structures are visualized in 3D and several reports and statistics are generated. To the best of our knowledge, the MultiSETTER web server is the first publicly available tool for a multiple RNA structure alignment. The MultiSETTER server offers the visual inspection of an alignment in 3D space which may reveal structural and functional relationships not captured by other multiple alignment methods based either on a sequence or on secondary structure motifs.

  8. Spatio-temporal alignment of multiple sensors

    NASA Astrophysics Data System (ADS)

    Zhang, Tinghua; Ni, Guoqiang; Fan, Guihua; Sun, Huayan; Yang, Biao

    2018-01-01

    Aiming to achieve the spatio-temporal alignment of multi sensor on the same platform for space target observation, a joint spatio-temporal alignment method is proposed. To calibrate the parameters and measure the attitude of cameras, an astronomical calibration method is proposed based on star chart simulation and collinear invariant features of quadrilateral diagonal between the observed star chart. In order to satisfy a temporal correspondence and spatial alignment similarity simultaneously, the method based on the astronomical calibration and attitude measurement in this paper formulates the video alignment to fold the spatial and temporal alignment into a joint alignment framework. The advantage of this method is reinforced by exploiting the similarities and prior knowledge of velocity vector field between adjacent frames, which is calculated by the SIFT Flow algorithm. The proposed method provides the highest spatio-temporal alignment accuracy compared to the state-of-the-art methods on sequences recorded from multi sensor at different times.

  9. A simplified implementation of edge detection in MATLAB is faster and more sensitive than fast fourier transform for actin fiber alignment quantification.

    PubMed

    Kemeny, Steven Frank; Clyne, Alisa Morss

    2011-04-01

    Fiber alignment plays a critical role in the structure and function of cells and tissues. While fiber alignment quantification is important to experimental analysis and several different methods for quantifying fiber alignment exist, many studies focus on qualitative rather than quantitative analysis perhaps due to the complexity of current fiber alignment methods. Speed and sensitivity were compared in edge detection and fast Fourier transform (FFT) for measuring actin fiber alignment in cells exposed to shear stress. While edge detection using matrix multiplication was consistently more sensitive than FFT, image processing time was significantly longer. However, when MATLAB functions were used to implement edge detection, MATLAB's efficient element-by-element calculations and fast filtering techniques reduced computation cost 100 times compared to the matrix multiplication edge detection method. The new computation time was comparable to the FFT method, and MATLAB edge detection produced well-distributed fiber angle distributions that statistically distinguished aligned and unaligned fibers in half as many sample images. When the FFT sensitivity was improved by dividing images into smaller subsections, processing time grew larger than the time required for MATLAB edge detection. Implementation of edge detection in MATLAB is simpler, faster, and more sensitive than FFT for fiber alignment quantification.

  10. Parametric and non-parametric masking of randomness in sequence alignments can be improved and leads to better resolved trees.

    PubMed

    Kück, Patrick; Meusemann, Karen; Dambach, Johannes; Thormann, Birthe; von Reumont, Björn M; Wägele, Johann W; Misof, Bernhard

    2010-03-31

    Methods of alignment masking, which refers to the technique of excluding alignment blocks prior to tree reconstructions, have been successful in improving the signal-to-noise ratio in sequence alignments. However, the lack of formally well defined methods to identify randomness in sequence alignments has prevented a routine application of alignment masking. In this study, we compared the effects on tree reconstructions of the most commonly used profiling method (GBLOCKS) which uses a predefined set of rules in combination with alignment masking, with a new profiling approach (ALISCORE) based on Monte Carlo resampling within a sliding window, using different data sets and alignment methods. While the GBLOCKS approach excludes variable sections above a certain threshold which choice is left arbitrary, the ALISCORE algorithm is free of a priori rating of parameter space and therefore more objective. ALISCORE was successfully extended to amino acids using a proportional model and empirical substitution matrices to score randomness in multiple sequence alignments. A complex bootstrap resampling leads to an even distribution of scores of randomly similar sequences to assess randomness of the observed sequence similarity. Testing performance on real data, both masking methods, GBLOCKS and ALISCORE, helped to improve tree resolution. The sliding window approach was less sensitive to different alignments of identical data sets and performed equally well on all data sets. Concurrently, ALISCORE is capable of dealing with different substitution patterns and heterogeneous base composition. ALISCORE and the most relaxed GBLOCKS gap parameter setting performed best on all data sets. Correspondingly, Neighbor-Net analyses showed the most decrease in conflict. Alignment masking improves signal-to-noise ratio in multiple sequence alignments prior to phylogenetic reconstruction. Given the robust performance of alignment profiling, alignment masking should routinely be used to improve tree reconstructions. Parametric methods of alignment profiling can be easily extended to more complex likelihood based models of sequence evolution which opens the possibility of further improvements.

  11. SATe-II: very fast and accurate simultaneous estimation of multiple sequence alignments and phylogenetic trees.

    PubMed

    Liu, Kevin; Warnow, Tandy J; Holder, Mark T; Nelesen, Serita M; Yu, Jiaye; Stamatakis, Alexandros P; Linder, C Randal

    2012-01-01

    Highly accurate estimation of phylogenetic trees for large data sets is difficult, in part because multiple sequence alignments must be accurate for phylogeny estimation methods to be accurate. Coestimation of alignments and trees has been attempted but currently only SATé estimates reasonably accurate trees and alignments for large data sets in practical time frames (Liu K., Raghavan S., Nelesen S., Linder C.R., Warnow T. 2009b. Rapid and accurate large-scale coestimation of sequence alignments and phylogenetic trees. Science. 324:1561-1564). Here, we present a modification to the original SATé algorithm that improves upon SATé (which we now call SATé-I) in terms of speed and of phylogenetic and alignment accuracy. SATé-II uses a different divide-and-conquer strategy than SATé-I and so produces smaller more closely related subsets than SATé-I; as a result, SATé-II produces more accurate alignments and trees, can analyze larger data sets, and runs more efficiently than SATé-I. Generally, SATé is a metamethod that takes an existing multiple sequence alignment method as an input parameter and boosts the quality of that alignment method. SATé-II-boosted alignment methods are significantly more accurate than their unboosted versions, and trees based upon these improved alignments are more accurate than trees based upon the original alignments. Because SATé-I used maximum likelihood (ML) methods that treat gaps as missing data to estimate trees and because we found a correlation between the quality of tree/alignment pairs and ML scores, we explored the degree to which SATé's performance depends on using ML with gaps treated as missing data to determine the best tree/alignment pair. We present two lines of evidence that using ML with gaps treated as missing data to optimize the alignment and tree produces very poor results. First, we show that the optimization problem where a set of unaligned DNA sequences is given and the output is the tree and alignment of those sequences that maximize likelihood under the Jukes-Cantor model is uninformative in the worst possible sense. For all inputs, all trees optimize the likelihood score. Second, we show that a greedy heuristic that uses GTR+Gamma ML to optimize the alignment and the tree can produce very poor alignments and trees. Therefore, the excellent performance of SATé-II and SATé-I is not because ML is used as an optimization criterion for choosing the best tree/alignment pair but rather due to the particular divide-and-conquer realignment techniques employed.

  12. AlexSys: a knowledge-based expert system for multiple sequence alignment construction and analysis

    PubMed Central

    Aniba, Mohamed Radhouene; Poch, Olivier; Marchler-Bauer, Aron; Thompson, Julie Dawn

    2010-01-01

    Multiple sequence alignment (MSA) is a cornerstone of modern molecular biology and represents a unique means of investigating the patterns of conservation and diversity in complex biological systems. Many different algorithms have been developed to construct MSAs, but previous studies have shown that no single aligner consistently outperforms the rest. This has led to the development of a number of ‘meta-methods’ that systematically run several aligners and merge the output into one single solution. Although these methods generally produce more accurate alignments, they are inefficient because all the aligners need to be run first and the choice of the best solution is made a posteriori. Here, we describe the development of a new expert system, AlexSys, for the multiple alignment of protein sequences. AlexSys incorporates an intelligent inference engine to automatically select an appropriate aligner a priori, depending only on the nature of the input sequences. The inference engine was trained on a large set of reference multiple alignments, using a novel machine learning approach. Applying AlexSys to a test set of 178 alignments, we show that the expert system represents a good compromise between alignment quality and running time, making it suitable for high throughput projects. AlexSys is freely available from http://alnitak.u-strasbg.fr/∼aniba/alexsys. PMID:20530533

  13. Reconstruction of interatomic vectors by principle component analysis of nuclear magnetic resonance data in multiple alignments

    NASA Astrophysics Data System (ADS)

    Hus, Jean-Christophe; Bruschweiler, Rafael

    2002-07-01

    A general method is presented for the reconstruction of interatomic vector orientations from nuclear magnetic resonance (NMR) spectroscopic data of tensor interactions of rank 2, such as dipolar coupling and chemical shielding anisotropy interactions, in solids and partially aligned liquid-state systems. The method, called PRIMA, is based on a principal component analysis of the covariance matrix of the NMR parameters collected for multiple alignments. The five nonzero eigenvalues and their eigenvectors efficiently allow the approximate reconstruction of the vector orientations of the underlying interactions. The method is demonstrated for an isotropic distribution of sample orientations as well as for finite sets of orientations and internuclear vectors encountered in protein systems.

  14. Evolutionary distances in the twilight zone--a rational kernel approach.

    PubMed

    Schwarz, Roland F; Fletcher, William; Förster, Frank; Merget, Benjamin; Wolf, Matthias; Schultz, Jörg; Markowetz, Florian

    2010-12-31

    Phylogenetic tree reconstruction is traditionally based on multiple sequence alignments (MSAs) and heavily depends on the validity of this information bottleneck. With increasing sequence divergence, the quality of MSAs decays quickly. Alignment-free methods, on the other hand, are based on abstract string comparisons and avoid potential alignment problems. However, in general they are not biologically motivated and ignore our knowledge about the evolution of sequences. Thus, it is still a major open question how to define an evolutionary distance metric between divergent sequences that makes use of indel information and known substitution models without the need for a multiple alignment. Here we propose a new evolutionary distance metric to close this gap. It uses finite-state transducers to create a biologically motivated similarity score which models substitutions and indels, and does not depend on a multiple sequence alignment. The sequence similarity score is defined in analogy to pairwise alignments and additionally has the positive semi-definite property. We describe its derivation and show in simulation studies and real-world examples that it is more accurate in reconstructing phylogenies than competing methods. The result is a new and accurate way of determining evolutionary distances in and beyond the twilight zone of sequence alignments that is suitable for large datasets.

  15. GeneSilico protein structure prediction meta-server.

    PubMed

    Kurowski, Michal A; Bujnicki, Janusz M

    2003-07-01

    Rigorous assessments of protein structure prediction have demonstrated that fold recognition methods can identify remote similarities between proteins when standard sequence search methods fail. It has been shown that the accuracy of predictions is improved when refined multiple sequence alignments are used instead of single sequences and if different methods are combined to generate a consensus model. There are several meta-servers available that integrate protein structure predictions performed by various methods, but they do not allow for submission of user-defined multiple sequence alignments and they seldom offer confidentiality of the results. We developed a novel WWW gateway for protein structure prediction, which combines the useful features of other meta-servers available, but with much greater flexibility of the input. The user may submit an amino acid sequence or a multiple sequence alignment to a set of methods for primary, secondary and tertiary structure prediction. Fold-recognition results (target-template alignments) are converted into full-atom 3D models and the quality of these models is uniformly assessed. A consensus between different FR methods is also inferred. The results are conveniently presented on-line on a single web page over a secure, password-protected connection. The GeneSilico protein structure prediction meta-server is freely available for academic users at http://genesilico.pl/meta.

  16. GeneSilico protein structure prediction meta-server

    PubMed Central

    Kurowski, Michal A.; Bujnicki, Janusz M.

    2003-01-01

    Rigorous assessments of protein structure prediction have demonstrated that fold recognition methods can identify remote similarities between proteins when standard sequence search methods fail. It has been shown that the accuracy of predictions is improved when refined multiple sequence alignments are used instead of single sequences and if different methods are combined to generate a consensus model. There are several meta-servers available that integrate protein structure predictions performed by various methods, but they do not allow for submission of user-defined multiple sequence alignments and they seldom offer confidentiality of the results. We developed a novel WWW gateway for protein structure prediction, which combines the useful features of other meta-servers available, but with much greater flexibility of the input. The user may submit an amino acid sequence or a multiple sequence alignment to a set of methods for primary, secondary and tertiary structure prediction. Fold-recognition results (target-template alignments) are converted into full-atom 3D models and the quality of these models is uniformly assessed. A consensus between different FR methods is also inferred. The results are conveniently presented on-line on a single web page over a secure, password-protected connection. The GeneSilico protein structure prediction meta-server is freely available for academic users at http://genesilico.pl/meta. PMID:12824313

  17. Prediction of β-turns in proteins from multiple alignment using neural network

    PubMed Central

    Kaur, Harpreet; Raghava, Gajendra Pal Singh

    2003-01-01

    A neural network-based method has been developed for the prediction of β-turns in proteins by using multiple sequence alignment. Two feed-forward back-propagation networks with a single hidden layer are used where the first-sequence structure network is trained with the multiple sequence alignment in the form of PSI-BLAST–generated position-specific scoring matrices. The initial predictions from the first network and PSIPRED-predicted secondary structure are used as input to the second structure-structure network to refine the predictions obtained from the first net. A significant improvement in prediction accuracy has been achieved by using evolutionary information contained in the multiple sequence alignment. The final network yields an overall prediction accuracy of 75.5% when tested by sevenfold cross-validation on a set of 426 nonhomologous protein chains. The corresponding Qpred, Qobs, and Matthews correlation coefficient values are 49.8%, 72.3%, and 0.43, respectively, and are the best among all the previously published β-turn prediction methods. The Web server BetaTPred2 (http://www.imtech.res.in/raghava/betatpred2/) has been developed based on this approach. PMID:12592033

  18. Multiple nodes transfer alignment for airborne missiles based on inertial sensor network

    NASA Astrophysics Data System (ADS)

    Si, Fan; Zhao, Yan

    2017-09-01

    Transfer alignment is an important initialization method for airborne missiles because the alignment accuracy largely determines the performance of the missile. However, traditional alignment methods are limited by complicated and unknown flexure angle, and cannot meet the actual requirement when wing flexure deformation occurs. To address this problem, we propose a new method that uses the relative navigation parameters between the weapons and fighter to achieve transfer alignment. First, in the relative inertial navigation algorithm, the relative attitudes and positions are constantly computed in wing flexure deformation situations. Secondly, the alignment results of each weapon are processed using a data fusion algorithm to improve the overall performance. Finally, the feasibility and performance of the proposed method were evaluated under two typical types of deformation, and the simulation results demonstrated that the new transfer alignment method is practical and has high-precision.

  19. Combining peak- and chromatogram-based retention time alignment algorithms for multiple chromatography-mass spectrometry datasets.

    PubMed

    Hoffmann, Nils; Keck, Matthias; Neuweger, Heiko; Wilhelm, Mathias; Högy, Petra; Niehaus, Karsten; Stoye, Jens

    2012-08-27

    Modern analytical methods in biology and chemistry use separation techniques coupled to sensitive detectors, such as gas chromatography-mass spectrometry (GC-MS) and liquid chromatography-mass spectrometry (LC-MS). These hyphenated methods provide high-dimensional data. Comparing such data manually to find corresponding signals is a laborious task, as each experiment usually consists of thousands of individual scans, each containing hundreds or even thousands of distinct signals. In order to allow for successful identification of metabolites or proteins within such data, especially in the context of metabolomics and proteomics, an accurate alignment and matching of corresponding features between two or more experiments is required. Such a matching algorithm should capture fluctuations in the chromatographic system which lead to non-linear distortions on the time axis, as well as systematic changes in recorded intensities. Many different algorithms for the retention time alignment of GC-MS and LC-MS data have been proposed and published, but all of them focus either on aligning previously extracted peak features or on aligning and comparing the complete raw data containing all available features. In this paper we introduce two algorithms for retention time alignment of multiple GC-MS datasets: multiple alignment by bidirectional best hits peak assignment and cluster extension (BIPACE) and center-star multiple alignment by pairwise partitioned dynamic time warping (CeMAPP-DTW). We show how the similarity-based peak group matching method BIPACE may be used for multiple alignment calculation individually and how it can be used as a preprocessing step for the pairwise alignments performed by CeMAPP-DTW. We evaluate the algorithms individually and in combination on a previously published small GC-MS dataset studying the Leishmania parasite and on a larger GC-MS dataset studying grains of wheat (Triticum aestivum). We have shown that BIPACE achieves very high precision and recall and a very low number of false positive peak assignments on both evaluation datasets. CeMAPP-DTW finds a high number of true positives when executed on its own, but achieves even better results when BIPACE is used to constrain its search space. The source code of both algorithms is included in the OpenSource software framework Maltcms, which is available from http://maltcms.sf.net. The evaluation scripts of the present study are available from the same source.

  20. Combining peak- and chromatogram-based retention time alignment algorithms for multiple chromatography-mass spectrometry datasets

    PubMed Central

    2012-01-01

    Background Modern analytical methods in biology and chemistry use separation techniques coupled to sensitive detectors, such as gas chromatography-mass spectrometry (GC-MS) and liquid chromatography-mass spectrometry (LC-MS). These hyphenated methods provide high-dimensional data. Comparing such data manually to find corresponding signals is a laborious task, as each experiment usually consists of thousands of individual scans, each containing hundreds or even thousands of distinct signals. In order to allow for successful identification of metabolites or proteins within such data, especially in the context of metabolomics and proteomics, an accurate alignment and matching of corresponding features between two or more experiments is required. Such a matching algorithm should capture fluctuations in the chromatographic system which lead to non-linear distortions on the time axis, as well as systematic changes in recorded intensities. Many different algorithms for the retention time alignment of GC-MS and LC-MS data have been proposed and published, but all of them focus either on aligning previously extracted peak features or on aligning and comparing the complete raw data containing all available features. Results In this paper we introduce two algorithms for retention time alignment of multiple GC-MS datasets: multiple alignment by bidirectional best hits peak assignment and cluster extension (BIPACE) and center-star multiple alignment by pairwise partitioned dynamic time warping (CeMAPP-DTW). We show how the similarity-based peak group matching method BIPACE may be used for multiple alignment calculation individually and how it can be used as a preprocessing step for the pairwise alignments performed by CeMAPP-DTW. We evaluate the algorithms individually and in combination on a previously published small GC-MS dataset studying the Leishmania parasite and on a larger GC-MS dataset studying grains of wheat (Triticum aestivum). Conclusions We have shown that BIPACE achieves very high precision and recall and a very low number of false positive peak assignments on both evaluation datasets. CeMAPP-DTW finds a high number of true positives when executed on its own, but achieves even better results when BIPACE is used to constrain its search space. The source code of both algorithms is included in the OpenSource software framework Maltcms, which is available from http://maltcms.sf.net. The evaluation scripts of the present study are available from the same source. PMID:22920415

  1. Vertical decomposition with Genetic Algorithm for Multiple Sequence Alignment

    PubMed Central

    2011-01-01

    Background Many Bioinformatics studies begin with a multiple sequence alignment as the foundation for their research. This is because multiple sequence alignment can be a useful technique for studying molecular evolution and analyzing sequence structure relationships. Results In this paper, we have proposed a Vertical Decomposition with Genetic Algorithm (VDGA) for Multiple Sequence Alignment (MSA). In VDGA, we divide the sequences vertically into two or more subsequences, and then solve them individually using a guide tree approach. Finally, we combine all the subsequences to generate a new multiple sequence alignment. This technique is applied on the solutions of the initial generation and of each child generation within VDGA. We have used two mechanisms to generate an initial population in this research: the first mechanism is to generate guide trees with randomly selected sequences and the second is shuffling the sequences inside such trees. Two different genetic operators have been implemented with VDGA. To test the performance of our algorithm, we have compared it with existing well-known methods, namely PRRP, CLUSTALX, DIALIGN, HMMT, SB_PIMA, ML_PIMA, MULTALIGN, and PILEUP8, and also other methods, based on Genetic Algorithms (GA), such as SAGA, MSA-GA and RBT-GA, by solving a number of benchmark datasets from BAliBase 2.0. Conclusions The experimental results showed that the VDGA with three vertical divisions was the most successful variant for most of the test cases in comparison to other divisions considered with VDGA. The experimental results also confirmed that VDGA outperformed the other methods considered in this research. PMID:21867510

  2. A dynamic programming approach for the alignment of signal peaks in multiple gas chromatography-mass spectrometry experiments.

    PubMed

    Robinson, Mark D; De Souza, David P; Keen, Woon Wai; Saunders, Eleanor C; McConville, Malcolm J; Speed, Terence P; Likić, Vladimir A

    2007-10-29

    Gas chromatography-mass spectrometry (GC-MS) is a robust platform for the profiling of certain classes of small molecules in biological samples. When multiple samples are profiled, including replicates of the same sample and/or different sample states, one needs to account for retention time drifts between experiments. This can be achieved either by the alignment of chromatographic profiles prior to peak detection, or by matching signal peaks after they have been extracted from chromatogram data matrices. Automated retention time correction is particularly important in non-targeted profiling studies. A new approach for matching signal peaks based on dynamic programming is presented. The proposed approach relies on both peak retention times and mass spectra. The alignment of more than two peak lists involves three steps: (1) all possible pairs of peak lists are aligned, and similarity of each pair of peak lists is estimated; (2) the guide tree is built based on the similarity between the peak lists; (3) peak lists are progressively aligned starting with the two most similar peak lists, following the guide tree until all peak lists are exhausted. When two or more experiments are performed on different sample states and each consisting of multiple replicates, peak lists within each set of replicate experiments are aligned first (within-state alignment), and subsequently the resulting alignments are aligned themselves (between-state alignment). When more than two sets of replicate experiments are present, the between-state alignment also employs the guide tree. We demonstrate the usefulness of this approach on GC-MS metabolic profiling experiments acquired on wild-type and mutant Leishmania mexicana parasites. We propose a progressive method to match signal peaks across multiple GC-MS experiments based on dynamic programming. A sensitive peak similarity function is proposed to balance peak retention time and peak mass spectra similarities. This approach can produce the optimal alignment between an arbitrary number of peak lists, and models explicitly within-state and between-state peak alignment. The accuracy of the proposed method was close to the accuracy of manually-curated peak matching, which required tens of man-hours for the analyzed data sets. The proposed approach may offer significant advantages for processing of high-throughput metabolomics data, especially when large numbers of experimental replicates and multiple sample states are analyzed.

  3. ASPIC: a novel method to predict the exon-intron structure of a gene that is optimally compatible to a set of transcript sequences.

    PubMed

    Bonizzoni, Paola; Rizzi, Raffaella; Pesole, Graziano

    2005-10-05

    Currently available methods to predict splice sites are mainly based on the independent and progressive alignment of transcript data (mostly ESTs) to the genomic sequence. Apart from often being computationally expensive, this approach is vulnerable to several problems--hence the need to develop novel strategies. We propose a method, based on a novel multiple genome-EST alignment algorithm, for the detection of splice sites. To avoid limitations of splice sites prediction (mainly, over-predictions) due to independent single EST alignments to the genomic sequence our approach performs a multiple alignment of transcript data to the genomic sequence based on the combined analysis of all available data. We recast the problem of predicting constitutive and alternative splicing as an optimization problem, where the optimal multiple transcript alignment minimizes the number of exons and hence of splice site observations. We have implemented a splice site predictor based on this algorithm in the software tool ASPIC (Alternative Splicing PredICtion). It is distinguished from other methods based on BLAST-like tools by the incorporation of entirely new ad hoc procedures for accurate and computationally efficient transcript alignment and adopts dynamic programming for the refinement of intron boundaries. ASPIC also provides the minimal set of non-mergeable transcript isoforms compatible with the detected splicing events. The ASPIC web resource is dynamically interconnected with the Ensembl and Unigene databases and also implements an upload facility. Extensive bench marking shows that ASPIC outperforms other existing methods in the detection of novel splicing isoforms and in the minimization of over-predictions. ASPIC also requires a lower computation time for processing a single gene and an EST cluster. The ASPIC web resource is available at http://aspic.algo.disco.unimib.it/aspic-devel/.

  4. Matt: local flexibility aids protein multiple structure alignment.

    PubMed

    Menke, Matthew; Berger, Bonnie; Cowen, Lenore

    2008-01-01

    Even when there is agreement on what measure a protein multiple structure alignment should be optimizing, finding the optimal alignment is computationally prohibitive. One approach used by many previous methods is aligned fragment pair chaining, where short structural fragments from all the proteins are aligned against each other optimally, and the final alignment chains these together in geometrically consistent ways. Ye and Godzik have recently suggested that adding geometric flexibility may help better model protein structures in a variety of contexts. We introduce the program Matt (Multiple Alignment with Translations and Twists), an aligned fragment pair chaining algorithm that, in intermediate steps, allows local flexibility between fragments: small translations and rotations are temporarily allowed to bring sets of aligned fragments closer, even if they are physically impossible under rigid body transformations. After a dynamic programming assembly guided by these "bent" alignments, geometric consistency is restored in the final step before the alignment is output. Matt is tested against other recent multiple protein structure alignment programs on the popular Homstrad and SABmark benchmark datasets. Matt's global performance is competitive with the other programs on Homstrad, but outperforms the other programs on SABmark, a benchmark of multiple structure alignments of proteins with more distant homology. On both datasets, Matt demonstrates an ability to better align the ends of alpha-helices and beta-strands, an important characteristic of any structure alignment program intended to help construct a structural template library for threading approaches to the inverse protein-folding problem. The related question of whether Matt alignments can be used to distinguish distantly homologous structure pairs from pairs of proteins that are not homologous is also considered. For this purpose, a p-value score based on the length of the common core and average root mean squared deviation (RMSD) of Matt alignments is shown to largely separate decoys from homologous protein structures in the SABmark benchmark dataset. We postulate that Matt's strong performance comes from its ability to model proteins in different conformational states and, perhaps even more important, its ability to model backbone distortions in more distantly related proteins.

  5. eShadow: A tool for comparing closely related sequences

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ovcharenko, Ivan; Boffelli, Dario; Loots, Gabriela G.

    2004-01-15

    Primate sequence comparisons are difficult to interpret due to the high degree of sequence similarity shared between such closely related species. Recently, a novel method, phylogenetic shadowing, has been pioneered for predicting functional elements in the human genome through the analysis of multiple primate sequence alignments. We have expanded this theoretical approach to create a computational tool, eShadow, for the identification of elements under selective pressure in multiple sequence alignments of closely related genomes, such as in comparisons of human to primate or mouse to rat DNA. This tool integrates two different statistical methods and allows for the dynamic visualizationmore » of the resulting conservation profile. eShadow also includes a versatile optimization module capable of training the underlying Hidden Markov Model to differentially predict functional sequences. This module grants the tool high flexibility in the analysis of multiple sequence alignments and in comparing sequences with different divergence rates. Here, we describe the eShadow comparative tool and its potential uses for analyzing both multiple nucleotide and protein alignments to predict putative functional elements. The eShadow tool is publicly available at http://eshadow.dcode.org/« less

  6. Fine-tuning structural RNA alignments in the twilight zone.

    PubMed

    Bremges, Andreas; Schirmer, Stefanie; Giegerich, Robert

    2010-04-30

    A widely used method to find conserved secondary structure in RNA is to first construct a multiple sequence alignment, and then fold the alignment, optimizing a score based on thermodynamics and covariance. This method works best around 75% sequence similarity. However, in a "twilight zone" below 55% similarity, the sequence alignment tends to obscure the covariance signal used in the second phase. Therefore, while the overall shape of the consensus structure may still be found, the degree of conservation cannot be estimated reliably. Based on a combination of available methods, we present a method named planACstar for improving structure conservation in structural alignments in the twilight zone. After constructing a consensus structure by alignment folding, planACstar abandons the original sequence alignment, refolds the sequences individually, but consistent with the consensus, aligns the structures, irrespective of sequence, by a pure structure alignment method, and derives an improved sequence alignment from the alignment of structures, to be re-submitted to alignment folding, etc.. This circle may be iterated as long as structural conservation improves, but normally, one step suffices. Employing the tools ClustalW, RNAalifold, and RNAforester, we find that for sequences with 30-55% sequence identity, structural conservation can be improved by 10% on average, with a large variation, measured in terms of RNAalifold's own criterion, the structure conservation index.

  7. Measuring the scale dependence of intrinsic alignments using multiple shear estimates

    NASA Astrophysics Data System (ADS)

    Leonard, C. Danielle; Mandelbaum, Rachel

    2018-06-01

    We present a new method for measuring the scale dependence of the intrinsic alignment (IA) contamination to the galaxy-galaxy lensing signal, which takes advantage of multiple shear estimation methods applied to the same source galaxy sample. By exploiting the resulting correlation of both shape noise and cosmic variance, our method can provide an increase in the signal-to-noise of the measured IA signal as compared to methods which rely on the difference of the lensing signal from multiple photometric redshift bins. For a galaxy-galaxy lensing measurement which uses LSST sources and DESI lenses, the signal-to-noise on the IA signal from our method is predicted to improve by a factor of ˜2 relative to the method of Blazek et al. (2012), for pairs of shear estimates which yield substantially different measured IA amplitudes and highly correlated shape noise terms. We show that statistical error necessarily dominates the measurement of intrinsic alignments using our method. We also consider a physically motivated extension of the Blazek et al. (2012) method which assumes that all nearby galaxy pairs, rather than only excess pairs, are subject to IA. In this case, the signal-to-noise of the method of Blazek et al. (2012) is improved.

  8. DendroBLAST: approximate phylogenetic trees in the absence of multiple sequence alignments.

    PubMed

    Kelly, Steven; Maini, Philip K

    2013-01-01

    The rapidly growing availability of genome information has created considerable demand for both fast and accurate phylogenetic inference algorithms. We present a novel method called DendroBLAST for reconstructing phylogenetic dendrograms/trees from protein sequences using BLAST. This method differs from other methods by incorporating a simple model of sequence evolution to test the effect of introducing sequence changes on the reliability of the bipartitions in the inferred tree. Using realistic simulated sequence data we demonstrate that this method produces phylogenetic trees that are more accurate than other commonly-used distance based methods though not as accurate as maximum likelihood methods from good quality multiple sequence alignments. In addition to tests on simulated data, we use DendroBLAST to generate input trees for a supertree reconstruction of the phylogeny of the Archaea. This independent analysis produces an approximate phylogeny of the Archaea that has both high precision and recall when compared to previously published analysis of the same dataset using conventional methods. Taken together these results demonstrate that approximate phylogenetic trees can be produced in the absence of multiple sequence alignments, and we propose that these trees will provide a platform for improving and informing downstream bioinformatic analysis. A web implementation of the DendroBLAST method is freely available for use at http://www.dendroblast.com/.

  9. Fine-tuning structural RNA alignments in the twilight zone

    PubMed Central

    2010-01-01

    Background A widely used method to find conserved secondary structure in RNA is to first construct a multiple sequence alignment, and then fold the alignment, optimizing a score based on thermodynamics and covariance. This method works best around 75% sequence similarity. However, in a "twilight zone" below 55% similarity, the sequence alignment tends to obscure the covariance signal used in the second phase. Therefore, while the overall shape of the consensus structure may still be found, the degree of conservation cannot be estimated reliably. Results Based on a combination of available methods, we present a method named planACstar for improving structure conservation in structural alignments in the twilight zone. After constructing a consensus structure by alignment folding, planACstar abandons the original sequence alignment, refolds the sequences individually, but consistent with the consensus, aligns the structures, irrespective of sequence, by a pure structure alignment method, and derives an improved sequence alignment from the alignment of structures, to be re-submitted to alignment folding, etc.. This circle may be iterated as long as structural conservation improves, but normally, one step suffices. Conclusions Employing the tools ClustalW, RNAalifold, and RNAforester, we find that for sequences with 30-55% sequence identity, structural conservation can be improved by 10% on average, with a large variation, measured in terms of RNAalifold's own criterion, the structure conservation index. PMID:20433706

  10. Sequence harmony: detecting functional specificity from alignments

    PubMed Central

    Feenstra, K. Anton; Pirovano, Walter; Krab, Klaas; Heringa, Jaap

    2007-01-01

    Multiple sequence alignments are often used for the identification of key specificity-determining residues within protein families. We present a web server implementation of the Sequence Harmony (SH) method previously introduced. SH accurately detects subfamily specific positions from a multiple alignment by scoring compositional differences between subfamilies, without imposing conservation. The SH web server allows a quick selection of subtype specific sites from a multiple alignment given a subfamily grouping. In addition, it allows the predicted sites to be directly mapped onto a protein structure and displayed. We demonstrate the use of the SH server using the family of plant mitochondrial alternative oxidases (AOX). In addition, we illustrate the usefulness of combining sequence and structural information by showing that the predicted sites are clustered into a few distinct regions in an AOX homology model. The SH web server can be accessed at www.ibi.vu.nl/programs/seqharmwww. PMID:17584793

  11. Robust prediction of consensus secondary structures using averaged base pairing probability matrices.

    PubMed

    Kiryu, Hisanori; Kin, Taishin; Asai, Kiyoshi

    2007-02-15

    Recent transcriptomic studies have revealed the existence of a considerable number of non-protein-coding RNA transcripts in higher eukaryotic cells. To investigate the functional roles of these transcripts, it is of great interest to find conserved secondary structures from multiple alignments on a genomic scale. Since multiple alignments are often created using alignment programs that neglect the special conservation patterns of RNA secondary structures for computational efficiency, alignment failures can cause potential risks of overlooking conserved stem structures. We investigated the dependence of the accuracy of secondary structure prediction on the quality of alignments. We compared three algorithms that maximize the expected accuracy of secondary structures as well as other frequently used algorithms. We found that one of our algorithms, called McCaskill-MEA, was more robust against alignment failures than others. The McCaskill-MEA method first computes the base pairing probability matrices for all the sequences in the alignment and then obtains the base pairing probability matrix of the alignment by averaging over these matrices. The consensus secondary structure is predicted from this matrix such that the expected accuracy of the prediction is maximized. We show that the McCaskill-MEA method performs better than other methods, particularly when the alignment quality is low and when the alignment consists of many sequences. Our model has a parameter that controls the sensitivity and specificity of predictions. We discussed the uses of that parameter for multi-step screening procedures to search for conserved secondary structures and for assigning confidence values to the predicted base pairs. The C++ source code that implements the McCaskill-MEA algorithm and the test dataset used in this paper are available at http://www.ncrna.org/papers/McCaskillMEA/. Supplementary data are available at Bioinformatics online.

  12. Method for the fabrication of three-dimensional microstructures by deep X-ray lithography

    DOEpatents

    Sweatt, William C.; Christenson, Todd R.

    2005-04-05

    A method for the fabrication of three-dimensional microstructures by deep X-ray lithography (DXRL) comprises a masking process that uses a patterned mask with inclined mask holes and off-normal exposures with a DXRL beam aligned with the inclined mask holes. Microstructural features that are oriented in different directions can be obtained by using multiple off-normal exposures through additional mask holes having different orientations. Various methods can be used to block the non-aligned mask holes from the beam when using multiple exposures. A method for fabricating a precision 3D X-ray mask comprises forming an intermediate mask and a master mask on a common support membrane.

  13. DNAAlignEditor: DNA alignment editor tool

    PubMed Central

    Sanchez-Villeda, Hector; Schroeder, Steven; Flint-Garcia, Sherry; Guill, Katherine E; Yamasaki, Masanori; McMullen, Michael D

    2008-01-01

    Background With advances in DNA re-sequencing methods and Next-Generation parallel sequencing approaches, there has been a large increase in genomic efforts to define and analyze the sequence variability present among individuals within a species. For very polymorphic species such as maize, this has lead to a need for intuitive, user-friendly software that aids the biologist, often with naïve programming capability, in tracking, editing, displaying, and exporting multiple individual sequence alignments. To fill this need we have developed a novel DNA alignment editor. Results We have generated a nucleotide sequence alignment editor (DNAAlignEditor) that provides an intuitive, user-friendly interface for manual editing of multiple sequence alignments with functions for input, editing, and output of sequence alignments. The color-coding of nucleotide identity and the display of associated quality score aids in the manual alignment editing process. DNAAlignEditor works as a client/server tool having two main components: a relational database that collects the processed alignments and a user interface connected to database through universal data access connectivity drivers. DNAAlignEditor can be used either as a stand-alone application or as a network application with multiple users concurrently connected. Conclusion We anticipate that this software will be of general interest to biologists and population genetics in editing DNA sequence alignments and analyzing natural sequence variation regardless of species, and will be particularly useful for manual alignment editing of sequences in species with high levels of polymorphism. PMID:18366684

  14. Biclustering as a method for RNA local multiple sequence alignment.

    PubMed

    Wang, Shu; Gutell, Robin R; Miranker, Daniel P

    2007-12-15

    Biclustering is a clustering method that simultaneously clusters both the domain and range of a relation. A challenge in multiple sequence alignment (MSA) is that the alignment of sequences is often intended to reveal groups of conserved functional subsequences. Simultaneously, the grouping of the sequences can impact the alignment; precisely the kind of dual situation biclustering is intended to address. We define a representation of the MSA problem enabling the application of biclustering algorithms. We develop a computer program for local MSA, BlockMSA, that combines biclustering with divide-and-conquer. BlockMSA simultaneously finds groups of similar sequences and locally aligns subsequences within them. Further alignment is accomplished by dividing both the set of sequences and their contents. The net result is both a multiple sequence alignment and a hierarchical clustering of the sequences. BlockMSA was tested on the subsets of the BRAliBase 2.1 benchmark suite that display high variability and on an extension to that suite to larger problem sizes. Also, alignments were evaluated of two large datasets of current biological interest, T box sequences and Group IC1 Introns. The results were compared with alignments computed by ClustalW, MAFFT, MUCLE and PROBCONS alignment programs using Sum of Pairs (SPS) and Consensus Count. Results for the benchmark suite are sensitive to problem size. On problems of 15 or greater sequences, BlockMSA is consistently the best. On none of the problems in the test suite are there appreciable differences in scores among BlockMSA, MAFFT and PROBCONS. On the T box sequences, BlockMSA does the most faithful job of reproducing known annotations. MAFFT and PROBCONS do not. On the Intron sequences, BlockMSA, MAFFT and MUSCLE are comparable at identifying conserved regions. BlockMSA is implemented in Java. Source code and supplementary datasets are available at http://aug.csres.utexas.edu/msa/

  15. Resolving the multiple sequence alignment problem using biogeography-based optimization with multiple populations.

    PubMed

    Zemali, El-Amine; Boukra, Abdelmadjid

    2015-08-01

    The multiple sequence alignment (MSA) is one of the most challenging problems in bioinformatics, it involves discovering similarity between a set of protein or DNA sequences. This paper introduces a new method for the MSA problem called biogeography-based optimization with multiple populations (BBOMP). It is based on a recent metaheuristic inspired from the mathematics of biogeography named biogeography-based optimization (BBO). To improve the exploration ability of BBO, we have introduced a new concept allowing better exploration of the search space. It consists of manipulating multiple populations having each one its own parameters. These parameters are used to build up progressive alignments allowing more diversity. At each iteration, the best found solution is injected in each population. Moreover, to improve solution quality, six operators are defined. These operators are selected with a dynamic probability which changes according to the operators efficiency. In order to test proposed approach performance, we have considered a set of datasets from Balibase 2.0 and compared it with many recent algorithms such as GAPAM, MSA-GA, QEAMSA and RBT-GA. The results show that the proposed approach achieves better average score than the previously cited methods.

  16. AntiClustal: Multiple Sequence Alignment by antipole clustering and linear approximate 1-median computation.

    PubMed

    Di Pietro, C; Di Pietro, V; Emmanuele, G; Ferro, A; Maugeri, T; Modica, E; Pigola, G; Pulvirenti, A; Purrello, M; Ragusa, M; Scalia, M; Shasha, D; Travali, S; Zimmitti, V

    2003-01-01

    In this paper we present a new Multiple Sequence Alignment (MSA) algorithm called AntiClusAl. The method makes use of the commonly use idea of aligning homologous sequences belonging to classes generated by some clustering algorithm, and then continue the alignment process ina bottom-up way along a suitable tree structure. The final result is then read at the root of the tree. Multiple sequence alignment in each cluster makes use of the progressive alignment with the 1-median (center) of the cluster. The 1-median of set S of sequences is the element of S which minimizes the average distance from any other sequence in S. Its exact computation requires quadratic time. The basic idea of our proposed algorithm is to make use of a simple and natural algorithmic technique based on randomized tournaments which has been successfully applied to large size search problems in general metric spaces. In particular a clustering algorithm called Antipole tree and an approximate linear 1-median computation are used. Our algorithm compared with Clustal W, a widely used tool to MSA, shows a better running time results with fully comparable alignment quality. A successful biological application showing high aminoacid conservation during evolution of Xenopus laevis SOD2 is also cited.

  17. Aligning Metabolic Pathways Exploiting Binary Relation of Reactions.

    PubMed

    Huang, Yiran; Zhong, Cheng; Lin, Hai Xiang; Huang, Jing

    2016-01-01

    Metabolic pathway alignment has been widely used to find one-to-one and/or one-to-many reaction mappings to identify the alternative pathways that have similar functions through different sets of reactions, which has important applications in reconstructing phylogeny and understanding metabolic functions. The existing alignment methods exhaustively search reaction sets, which may become infeasible for large pathways. To address this problem, we present an effective alignment method for accurately extracting reaction mappings between two metabolic pathways. We show that connected relation between reactions can be formalized as binary relation of reactions in metabolic pathways, and the multiplications of zero-one matrices for binary relations of reactions can be accomplished in finite steps. By utilizing the multiplications of zero-one matrices for binary relation of reactions, we efficiently obtain reaction sets in a small number of steps without exhaustive search, and accurately uncover biologically relevant reaction mappings. Furthermore, we introduce a measure of topological similarity of nodes (reactions) by comparing the structural similarity of the k-neighborhood subgraphs of the nodes in aligning metabolic pathways. We employ this similarity metric to improve the accuracy of the alignments. The experimental results on the KEGG database show that when compared with other state-of-the-art methods, in most cases, our method obtains better performance in the node correctness and edge correctness, and the number of the edges of the largest common connected subgraph for one-to-one reaction mappings, and the number of correct one-to-many reaction mappings. Our method is scalable in finding more reaction mappings with better biological relevance in large metabolic pathways.

  18. PSI/TM-Coffee: a web server for fast and accurate multiple sequence alignments of regular and transmembrane proteins using homology extension on reduced databases.

    PubMed

    Floden, Evan W; Tommaso, Paolo D; Chatzou, Maria; Magis, Cedrik; Notredame, Cedric; Chang, Jia-Ming

    2016-07-08

    The PSI/TM-Coffee web server performs multiple sequence alignment (MSA) of proteins by combining homology extension with a consistency based alignment approach. Homology extension is performed with Position Specific Iterative (PSI) BLAST searches against a choice of redundant and non-redundant databases. The main novelty of this server is to allow databases of reduced complexity to rapidly perform homology extension. This server also gives the possibility to use transmembrane proteins (TMPs) reference databases to allow even faster homology extension on this important category of proteins. Aside from an MSA, the server also outputs topological prediction of TMPs using the HMMTOP algorithm. Previous benchmarking of the method has shown this approach outperforms the most accurate alignment methods such as MSAProbs, Kalign, PROMALS, MAFFT, ProbCons and PRALINE™. The web server is available at http://tcoffee.crg.cat/tmcoffee. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  19. Biological intuition in alignment-free methods: response to Posada.

    PubMed

    Ragan, Mark A; Chan, Cheong Xin

    2013-08-01

    A recent editorial in Journal of Molecular Evolution highlights opportunities and challenges facing molecular evolution in the era of next-generation sequencing. Abundant sequence data should allow more-complex models to be fit at higher confidence, making phylogenetic inference more reliable and improving our understanding of evolution at the molecular level. However, concern that approaches based on multiple sequence alignment may be computationally infeasible for large datasets is driving the development of so-called alignment-free methods for sequence comparison and phylogenetic inference. The recent editorial characterized these approaches as model-free, not based on the concept of homology, and lacking in biological intuition. We argue here that alignment-free methods have not abandoned models or homology, and can be biologically intuitive.

  20. MUSCLE: multiple sequence alignment with high accuracy and high throughput.

    PubMed

    Edgar, Robert C

    2004-01-01

    We describe MUSCLE, a new computer program for creating multiple alignments of protein sequences. Elements of the algorithm include fast distance estimation using kmer counting, progressive alignment using a new profile function we call the log-expectation score, and refinement using tree-dependent restricted partitioning. The speed and accuracy of MUSCLE are compared with T-Coffee, MAFFT and CLUSTALW on four test sets of reference alignments: BAliBASE, SABmark, SMART and a new benchmark, PREFAB. MUSCLE achieves the highest, or joint highest, rank in accuracy on each of these sets. Without refinement, MUSCLE achieves average accuracy statistically indistinguishable from T-Coffee and MAFFT, and is the fastest of the tested methods for large numbers of sequences, aligning 5000 sequences of average length 350 in 7 min on a current desktop computer. The MUSCLE program, source code and PREFAB test data are freely available at http://www.drive5. com/muscle.

  1. YAHA: fast and flexible long-read alignment with optimal breakpoint detection.

    PubMed

    Faust, Gregory G; Hall, Ira M

    2012-10-01

    With improved short-read assembly algorithms and the recent development of long-read sequencers, split mapping will soon be the preferred method for structural variant (SV) detection. Yet, current alignment tools are not well suited for this. We present YAHA, a fast and flexible hash-based aligner. YAHA is as fast and accurate as BWA-SW at finding the single best alignment per query and is dramatically faster and more sensitive than both SSAHA2 and MegaBLAST at finding all possible alignments. Unlike other aligners that report all, or one, alignment per query, or that use simple heuristics to select alignments, YAHA uses a directed acyclic graph to find the optimal set of alignments that cover a query using a biologically relevant breakpoint penalty. YAHA can also report multiple mappings per defined segment of the query. We show that YAHA detects more breakpoints in less time than BWA-SW across all SV classes, and especially excels at complex SVs comprising multiple breakpoints. YAHA is currently supported on 64-bit Linux systems. Binaries and sample data are freely available for download from http://faculty.virginia.edu/irahall/YAHA. imh4y@virginia.edu.

  2. Quantification of Cardiomyocyte Alignment from Three-Dimensional (3D) Confocal Microscopy of Engineered Tissue.

    PubMed

    Kowalski, William J; Yuan, Fangping; Nakane, Takeichiro; Masumoto, Hidetoshi; Dwenger, Marc; Ye, Fei; Tinney, Joseph P; Keller, Bradley B

    2017-08-01

    Biological tissues have complex, three-dimensional (3D) organizations of cells and matrix factors that provide the architecture necessary to meet morphogenic and functional demands. Disordered cell alignment is associated with congenital heart disease, cardiomyopathy, and neurodegenerative diseases and repairing or replacing these tissues using engineered constructs may improve regenerative capacity. However, optimizing cell alignment within engineered tissues requires quantitative 3D data on cell orientations and both efficient and validated processing algorithms. We developed an automated method to measure local 3D orientations based on structure tensor analysis and incorporated an adaptive subregion size to account for multiple scales. Our method calculates the statistical concentration parameter, κ, to quantify alignment, as well as the traditional orientational order parameter. We validated our method using synthetic images and accurately measured principal axis and concentration. We then applied our method to confocal stacks of cleared, whole-mount engineered cardiac tissues generated from human-induced pluripotent stem cells or embryonic chick cardiac cells and quantified cardiomyocyte alignment. We found significant differences in alignment based on cellular composition and tissue geometry. These results from our synthetic images and confocal data demonstrate the efficiency and accuracy of our method to measure alignment in 3D tissues.

  3. Measuring the distance between multiple sequence alignments.

    PubMed

    Blackburne, Benjamin P; Whelan, Simon

    2012-02-15

    Multiple sequence alignment (MSA) is a core method in bioinformatics. The accuracy of such alignments may influence the success of downstream analyses such as phylogenetic inference, protein structure prediction, and functional prediction. The importance of MSA has lead to the proliferation of MSA methods, with different objective functions and heuristics to search for the optimal MSA. Different methods of inferring MSAs produce different results in all but the most trivial cases. By measuring the differences between inferred alignments, we may be able to develop an understanding of how these differences (i) relate to the objective functions and heuristics used in MSA methods, and (ii) affect downstream analyses. We introduce four metrics to compare MSAs, which include the position in a sequence where a gap occurs or the location on a phylogenetic tree where an insertion or deletion (indel) event occurs. We use both real and synthetic data to explore the information given by these metrics and demonstrate how the different metrics in combination can yield more information about MSA methods and the differences between them. MetAl is a free software implementation of these metrics in Haskell. Source and binaries for Windows, Linux and Mac OS X are available from http://kumiho.smith.man.ac.uk/whelan/software/metal/.

  4. PFAAT version 2.0: a tool for editing, annotating, and analyzing multiple sequence alignments.

    PubMed

    Caffrey, Daniel R; Dana, Paul H; Mathur, Vidhya; Ocano, Marco; Hong, Eun-Jong; Wang, Yaoyu E; Somaroo, Shyamal; Caffrey, Brian E; Potluri, Shobha; Huang, Enoch S

    2007-10-11

    By virtue of their shared ancestry, homologous sequences are similar in their structure and function. Consequently, multiple sequence alignments are routinely used to identify trends that relate to function. This type of analysis is particularly productive when it is combined with structural and phylogenetic analysis. Here we describe the release of PFAAT version 2.0, a tool for editing, analyzing, and annotating multiple sequence alignments. Support for multiple annotations is a key component of this release as it provides a framework for most of the new functionalities. The sequence annotations are accessible from the alignment and tree, where they are typically used to label sequences or hyperlink them to related databases. Sequence annotations can be created manually or extracted automatically from UniProt entries. Once a multiple sequence alignment is populated with sequence annotations, sequences can be easily selected and sorted through a sophisticated search dialog. The selected sequences can be further analyzed using statistical methods that explicitly model relationships between the sequence annotations and residue properties. Residue annotations are accessible from the alignment viewer and are typically used to designate binding sites or properties for a particular residue. Residue annotations are also searchable, and allow one to quickly select alignment columns for further sequence analysis, e.g. computing percent identities. Other features include: novel algorithms to compute sequence conservation, mapping conservation scores to a 3D structure in Jmol, displaying secondary structure elements, and sorting sequences by residue composition. PFAAT provides a framework whereby end-users can specify knowledge for a protein family in the form of annotation. The annotations can be combined with sophisticated analysis to test hypothesis that relate to sequence, structure and function.

  5. JDet: interactive calculation and visualization of function-related conservation patterns in multiple sequence alignments and structures.

    PubMed

    Muth, Thilo; García-Martín, Juan A; Rausell, Antonio; Juan, David; Valencia, Alfonso; Pazos, Florencio

    2012-02-15

    We have implemented in a single package all the features required for extracting, visualizing and manipulating fully conserved positions as well as those with a family-dependent conservation pattern in multiple sequence alignments. The program allows, among other things, to run different methods for extracting these positions, combine the results and visualize them in protein 3D structures and sequence spaces. JDet is a multiplatform application written in Java. It is freely available, including the source code, at http://csbg.cnb.csic.es/JDet. The package includes two of our recently developed programs for detecting functional positions in protein alignments (Xdet and S3Det), and support for other methods can be added as plug-ins. A help file and a guided tutorial for JDet are also available.

  6. Multiple sequence alignment using multi-objective based bacterial foraging optimization algorithm.

    PubMed

    Rani, R Ranjani; Ramyachitra, D

    2016-12-01

    Multiple sequence alignment (MSA) is a widespread approach in computational biology and bioinformatics. MSA deals with how the sequences of nucleotides and amino acids are sequenced with possible alignment and minimum number of gaps between them, which directs to the functional, evolutionary and structural relationships among the sequences. Still the computation of MSA is a challenging task to provide an efficient accuracy and statistically significant results of alignments. In this work, the Bacterial Foraging Optimization Algorithm was employed to align the biological sequences which resulted in a non-dominated optimal solution. It employs Multi-objective, such as: Maximization of Similarity, Non-gap percentage, Conserved blocks and Minimization of gap penalty. BAliBASE 3.0 benchmark database was utilized to examine the proposed algorithm against other methods In this paper, two algorithms have been proposed: Hybrid Genetic Algorithm with Artificial Bee Colony (GA-ABC) and Bacterial Foraging Optimization Algorithm. It was found that Hybrid Genetic Algorithm with Artificial Bee Colony performed better than the existing optimization algorithms. But still the conserved blocks were not obtained using GA-ABC. Then BFO was used for the alignment and the conserved blocks were obtained. The proposed Multi-Objective Bacterial Foraging Optimization Algorithm (MO-BFO) was compared with widely used MSA methods Clustal Omega, Kalign, MUSCLE, MAFFT, Genetic Algorithm (GA), Ant Colony Optimization (ACO), Artificial Bee Colony (ABC), Particle Swarm Optimization (PSO) and Hybrid Genetic Algorithm with Artificial Bee Colony (GA-ABC). The final results show that the proposed MO-BFO algorithm yields better alignment than most widely used methods. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  7. The Harvest suite for rapid core-genome alignment and visualization of thousands of intraspecific microbial genomes.

    PubMed

    Treangen, Todd J; Ondov, Brian D; Koren, Sergey; Phillippy, Adam M

    2014-01-01

    Whole-genome sequences are now available for many microbial species and clades, however existing whole-genome alignment methods are limited in their ability to perform sequence comparisons of multiple sequences simultaneously. Here we present the Harvest suite of core-genome alignment and visualization tools for the rapid and simultaneous analysis of thousands of intraspecific microbial strains. Harvest includes Parsnp, a fast core-genome multi-aligner, and Gingr, a dynamic visual platform. Together they provide interactive core-genome alignments, variant calls, recombination detection, and phylogenetic trees. Using simulated and real data we demonstrate that our approach exhibits unrivaled speed while maintaining the accuracy of existing methods. The Harvest suite is open-source and freely available from: http://github.com/marbl/harvest.

  8. Combining multiple thresholding binarization values to improve OCR output

    NASA Astrophysics Data System (ADS)

    Lund, William B.; Kennard, Douglas J.; Ringger, Eric K.

    2013-01-01

    For noisy, historical documents, a high optical character recognition (OCR) word error rate (WER) can render the OCR text unusable. Since image binarization is often the method used to identify foreground pixels, a body of research seeks to improve image-wide binarization directly. Instead of relying on any one imperfect binarization technique, our method incorporates information from multiple simple thresholding binarizations of the same image to improve text output. Using a new corpus of 19th century newspaper grayscale images for which the text transcription is known, we observe WERs of 13.8% and higher using current binarization techniques and a state-of-the-art OCR engine. Our novel approach combines the OCR outputs from multiple thresholded images by aligning the text output and producing a lattice of word alternatives from which a lattice word error rate (LWER) is calculated. Our results show a LWER of 7.6% when aligning two threshold images and a LWER of 6.8% when aligning five. From the word lattice we commit to one hypothesis by applying the methods of Lund et al. (2011) achieving an improvement over the original OCR output and a 8.41% WER result on this data set.

  9. Unified Alignment of Protein-Protein Interaction Networks.

    PubMed

    Malod-Dognin, Noël; Ban, Kristina; Pržulj, Nataša

    2017-04-19

    Paralleling the increasing availability of protein-protein interaction (PPI) network data, several network alignment methods have been proposed. Network alignments have been used to uncover functionally conserved network parts and to transfer annotations. However, due to the computational intractability of the network alignment problem, aligners are heuristics providing divergent solutions and no consensus exists on a gold standard, or which scoring scheme should be used to evaluate them. We comprehensively evaluate the alignment scoring schemes and global network aligners on large scale PPI data and observe that three methods, HUBALIGN, L-GRAAL and NATALIE, regularly produce the most topologically and biologically coherent alignments. We study the collective behaviour of network aligners and observe that PPI networks are almost entirely aligned with a handful of aligners that we unify into a new tool, Ulign. Ulign enables complete alignment of two networks, which traditional global and local aligners fail to do. Also, multiple mappings of Ulign define biologically relevant soft clusterings of proteins in PPI networks, which may be used for refining the transfer of annotations across networks. Hence, PPI networks are already well investigated by current aligners, so to gain additional biological insights, a paradigm shift is needed. We propose such a shift come from aligning all available data types collectively rather than any particular data type in isolation from others.

  10. Improvements on a privacy-protection algorithm for DNA sequences with generalization lattices.

    PubMed

    Li, Guang; Wang, Yadong; Su, Xiaohong

    2012-10-01

    When developing personal DNA databases, there must be an appropriate guarantee of anonymity, which means that the data cannot be related back to individuals. DNA lattice anonymization (DNALA) is a successful method for making personal DNA sequences anonymous. However, it uses time-consuming multiple sequence alignment and a low-accuracy greedy clustering algorithm. Furthermore, DNALA is not an online algorithm, and so it cannot quickly return results when the database is updated. This study improves the DNALA method. Specifically, we replaced the multiple sequence alignment in DNALA with global pairwise sequence alignment to save time, and we designed a hybrid clustering algorithm comprised of a maximum weight matching (MWM)-based algorithm and an online algorithm. The MWM-based algorithm is more accurate than the greedy algorithm in DNALA and has the same time complexity. The online algorithm can process data quickly when the database is updated. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.

  11. A novel approach to multiple sequence alignment using hadoop data grids.

    PubMed

    Sudha Sadasivam, G; Baktavatchalam, G

    2010-01-01

    Multiple alignment of protein sequences helps to determine evolutionary linkage and to predict molecular structures. The factors to be considered while aligning multiple sequences are speed and accuracy of alignment. Although dynamic programming algorithms produce accurate alignments, they are computation intensive. In this paper we propose a time efficient approach to sequence alignment that also produces quality alignment. The dynamic nature of the algorithm coupled with data and computational parallelism of hadoop data grids improves the accuracy and speed of sequence alignment. The principle of block splitting in hadoop coupled with its scalability facilitates alignment of very large sequences.

  12. System and method for 2D workpiece alignment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Weaver, William T.; Carlson, Charles T.; Smith, Scott A.

    2015-07-14

    A carrier capable of holding one or more workpieces is disclosed. The carrier includes movable projections located along the sides of each cell in the carrier. This carrier, in conjunction with a separate alignment apparatus, aligns each workpiece within its respective cell against several alignment pins, using a multiple step alignment process to guarantee proper positioning of the workpiece in the cell. First, the workpieces are moved toward one side of the cell. Once the workpieces have been aligned against this side, the workpieces are then moved toward an adjacent orthogonal side such that the workpieces are aligned to twomore » sides of the cell. Once aligned, the workpiece is held in place by the projections located along each side of each cell. In addition, the alignment pins are also used to align the associated mask, thereby guaranteeing that the mask is properly aligned to the workpiece.« less

  13. BatMis: a fast algorithm for k-mismatch mapping.

    PubMed

    Tennakoon, Chandana; Purbojati, Rikky W; Sung, Wing-Kin

    2012-08-15

    Second-generation sequencing (SGS) generates millions of reads that need to be aligned to a reference genome allowing errors. Although current aligners can efficiently map reads allowing a small number of mismatches, they are not well suited for handling a large number of mismatches. The efficiency of aligners can be improved using various heuristics, but the sensitivity and accuracy of the alignments are sacrificed. In this article, we introduce Basic Alignment tool for Mismatches (BatMis)--an efficient method to align short reads to a reference allowing k mismatches. BatMis is a Burrows-Wheeler transformation based aligner that uses a seed and extend approach, and it is an exact method. Benchmark tests show that BatMis performs better than competing aligners in solving the k-mismatch problem. Furthermore, it can compete favorably even when compared with the heuristic modes of the other aligners. BatMis is a useful alternative for applications where fast k-mismatch mappings, unique mappings or multiple mappings of SGS data are required. BatMis is written in C/C++ and is freely available from http://code.google.com/p/batmis/

  14. Evolutionary profiles from the QR factorization of multiple sequence alignments

    PubMed Central

    Sethi, Anurag; O'Donoghue, Patrick; Luthey-Schulten, Zaida

    2005-01-01

    We present an algorithm to generate complete evolutionary profiles that represent the topology of the molecular phylogenetic tree of the homologous group. The method, based on the multidimensional QR factorization of numerically encoded multiple sequence alignments, removes redundancy from the alignments and orders the protein sequences by increasing linear dependence, resulting in the identification of a minimal basis set of sequences that spans the evolutionary space of the homologous group of proteins. We observe a general trend that these smaller, more evolutionarily balanced profiles have comparable and, in many cases, better performance in database searches than conventional profiles containing hundreds of sequences, constructed in an iterative and computationally intensive procedure. For more diverse families or superfamilies, with sequence identity <30%, structural alignments, based purely on the geometry of the protein structures, provide better alignments than pure sequence-based methods. Merging the structure and sequence information allows the construction of accurate profiles for distantly related groups. These structure-based profiles outperformed other sequence-based methods for finding distant homologs and were used to identify a putative class II cysteinyl-tRNA synthetase (CysRS) in several archaea that eluded previous annotation studies. Phylogenetic analysis showed the putative class II CysRSs to be a monophyletic group and homology modeling revealed a constellation of active site residues similar to that in the known class I CysRS. PMID:15741270

  15. Multiview echocardiography fusion using an electromagnetic tracking system.

    PubMed

    Punithakumar, Kumaradevan; Hareendranathan, Abhilash R; Paakkanen, Riitta; Khan, Nehan; Noga, Michelle; Boulanger, Pierre; Becher, Harald

    2016-08-01

    Three-dimensional ultrasound is an emerging modality for the assessment of complex cardiac anatomy and function. The advantages of this modality include lack of ionizing radiation, portability, low cost, and high temporal resolution. Major limitations include limited field-of-view, reliance on frequently limited acoustic windows, and poor signal to noise ratio. This study proposes a novel approach to combine multiple views into a single image using an electromagnetic tracking system in order to improve the field-of-view. The novel method has several advantages: 1) it does not rely on image information for alignment, and therefore, the method does not require image overlap; 2) the alignment accuracy of the proposed approach is not affected by any poor image quality as in the case of image registration based approaches; 3) in contrast to previous optical tracking based system, the proposed approach does not suffer from line-of-sight limitation; and 4) it does not require any initial calibration. In this pilot project, we were able to show that using a heart phantom, our method can fuse multiple echocardiographic images and improve the field-of view. Quantitative evaluations showed that the proposed method yielded a nearly optimal alignment of image data sets in three-dimensional space. The proposed method demonstrates the electromagnetic system can be used for the fusion of multiple echocardiography images with a seamless integration of sensors to the transducer.

  16. De novo identification of highly diverged protein repeats by probabilistic consistency.

    PubMed

    Biegert, A; Söding, J

    2008-03-15

    An estimated 25% of all eukaryotic proteins contain repeats, which underlines the importance of duplication for evolving new protein functions. Internal repeats often correspond to structural or functional units in proteins. Methods capable of identifying diverged repeated segments or domains at the sequence level can therefore assist in predicting domain structures, inferring hypotheses about function and mechanism, and investigating the evolution of proteins from smaller fragments. We present HHrepID, a method for the de novo identification of repeats in protein sequences. It is able to detect the sequence signature of structural repeats in many proteins that have not yet been known to possess internal sequence symmetry, such as outer membrane beta-barrels. HHrepID uses HMM-HMM comparison to exploit evolutionary information in the form of multiple sequence alignments of homologs. In contrast to a previous method, the new method (1) generates a multiple alignment of repeats; (2) utilizes the transitive nature of homology through a novel merging procedure with fully probabilistic treatment of alignments; (3) improves alignment quality through an algorithm that maximizes the expected accuracy; (4) is able to identify different kinds of repeats within complex architectures by a probabilistic domain boundary detection method and (5) improves sensitivity through a new approach to assess statistical significance. Server: http://toolkit.tuebingen.mpg.de/hhrepid; Executables: ftp://ftp.tuebingen.mpg.de/pub/protevo/HHrepID

  17. SARA-Coffee web server, a tool for the computation of RNA sequence and structure multiple alignments

    PubMed Central

    Di Tommaso, Paolo; Bussotti, Giovanni; Kemena, Carsten; Capriotti, Emidio; Chatzou, Maria; Prieto, Pablo; Notredame, Cedric

    2014-01-01

    This article introduces the SARA-Coffee web server; a service allowing the online computation of 3D structure based multiple RNA sequence alignments. The server makes it possible to combine sequences with and without known 3D structures. Given a set of sequences SARA-Coffee outputs a multiple sequence alignment along with a reliability index for every sequence, column and aligned residue. SARA-Coffee combines SARA, a pairwise structural RNA aligner with the R-Coffee multiple RNA aligner in a way that has been shown to improve alignment accuracy over most sequence aligners when enough structural data is available. The server can be accessed from http://tcoffee.crg.cat/apps/tcoffee/do:saracoffee. PMID:24972831

  18. Scalable multi-sample single-cell data analysis by Partition-Assisted Clustering and Multiple Alignments of Networks

    PubMed Central

    Samusik, Nikolay; Wang, Xiaowei; Guan, Leying; Nolan, Garry P.

    2017-01-01

    Mass cytometry (CyTOF) has greatly expanded the capability of cytometry. It is now easy to generate multiple CyTOF samples in a single study, with each sample containing single-cell measurement on 50 markers for more than hundreds of thousands of cells. Current methods do not adequately address the issues concerning combining multiple samples for subpopulation discovery, and these issues can be quickly and dramatically amplified with increasing number of samples. To overcome this limitation, we developed Partition-Assisted Clustering and Multiple Alignments of Networks (PAC-MAN) for the fast automatic identification of cell populations in CyTOF data closely matching that of expert manual-discovery, and for alignments between subpopulations across samples to define dataset-level cellular states. PAC-MAN is computationally efficient, allowing the management of very large CyTOF datasets, which are increasingly common in clinical studies and cancer studies that monitor various tissue samples for each subject. PMID:29281633

  19. AlignMiner: a Web-based tool for detection of divergent regions in multiple sequence alignments of conserved sequences

    PubMed Central

    2010-01-01

    Background Multiple sequence alignments are used to study gene or protein function, phylogenetic relations, genome evolution hypotheses and even gene polymorphisms. Virtually without exception, all available tools focus on conserved segments or residues. Small divergent regions, however, are biologically important for specific quantitative polymerase chain reaction, genotyping, molecular markers and preparation of specific antibodies, and yet have received little attention. As a consequence, they must be selected empirically by the researcher. AlignMiner has been developed to fill this gap in bioinformatic analyses. Results AlignMiner is a Web-based application for detection of conserved and divergent regions in alignments of conserved sequences, focusing particularly on divergence. It accepts alignments (protein or nucleic acid) obtained using any of a variety of algorithms, which does not appear to have a significant impact on the final results. AlignMiner uses different scoring methods for assessing conserved/divergent regions, Entropy being the method that provides the highest number of regions with the greatest length, and Weighted being the most restrictive. Conserved/divergent regions can be generated either with respect to the consensus sequence or to one master sequence. The resulting data are presented in a graphical interface developed in AJAX, which provides remarkable user interaction capabilities. Users do not need to wait until execution is complete and can.even inspect their results on a different computer. Data can be downloaded onto a user disk, in standard formats. In silico and experimental proof-of-concept cases have shown that AlignMiner can be successfully used to designing specific polymerase chain reaction primers as well as potential epitopes for antibodies. Primer design is assisted by a module that deploys several oligonucleotide parameters for designing primers "on the fly". Conclusions AlignMiner can be used to reliably detect divergent regions via several scoring methods that provide different levels of selectivity. Its predictions have been verified by experimental means. Hence, it is expected that its usage will save researchers' time and ensure an objective selection of the best-possible divergent region when closely related sequences are analysed. AlignMiner is freely available at http://www.scbi.uma.es/alignminer. PMID:20525162

  20. Heuristics for multiobjective multiple sequence alignment.

    PubMed

    Abbasi, Maryam; Paquete, Luís; Pereira, Francisco B

    2016-07-15

    Aligning multiple sequences arises in many tasks in Bioinformatics. However, the alignments produced by the current software packages are highly dependent on the parameters setting, such as the relative importance of opening gaps with respect to the increase of similarity. Choosing only one parameter setting may provide an undesirable bias in further steps of the analysis and give too simplistic interpretations. In this work, we reformulate multiple sequence alignment from a multiobjective point of view. The goal is to generate several sequence alignments that represent a trade-off between maximizing the substitution score and minimizing the number of indels/gaps in the sum-of-pairs score function. This trade-off gives to the practitioner further information about the similarity of the sequences, from which she could analyse and choose the most plausible alignment. We introduce several heuristic approaches, based on local search procedures, that compute a set of sequence alignments, which are representative of the trade-off between the two objectives (substitution score and indels). Several algorithm design options are discussed and analysed, with particular emphasis on the influence of the starting alignment and neighborhood search definitions on the overall performance. A perturbation technique is proposed to improve the local search, which provides a wide range of high-quality alignments. The proposed approach is tested experimentally on a wide range of instances. We performed several experiments with sequences obtained from the benchmark database BAliBASE 3.0. To evaluate the quality of the results, we calculate the hypervolume indicator of the set of score vectors returned by the algorithms. The results obtained allow us to identify reasonably good choices of parameters for our approach. Further, we compared our method in terms of correctly aligned pairs ratio and columns correctly aligned ratio with respect to reference alignments. Experimental results show that our approaches can obtain better results than TCoffee and Clustal Omega in terms of the first ratio.

  1. Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome

    PubMed Central

    Margulies, Elliott H.; Cooper, Gregory M.; Asimenos, George; Thomas, Daryl J.; Dewey, Colin N.; Siepel, Adam; Birney, Ewan; Keefe, Damian; Schwartz, Ariel S.; Hou, Minmei; Taylor, James; Nikolaev, Sergey; Montoya-Burgos, Juan I.; Löytynoja, Ari; Whelan, Simon; Pardi, Fabio; Massingham, Tim; Brown, James B.; Bickel, Peter; Holmes, Ian; Mullikin, James C.; Ureta-Vidal, Abel; Paten, Benedict; Stone, Eric A.; Rosenbloom, Kate R.; Kent, W. James; Bouffard, Gerard G.; Guan, Xiaobin; Hansen, Nancy F.; Idol, Jacquelyn R.; Maduro, Valerie V.B.; Maskeri, Baishali; McDowell, Jennifer C.; Park, Morgan; Thomas, Pamela J.; Young, Alice C.; Blakesley, Robert W.; Muzny, Donna M.; Sodergren, Erica; Wheeler, David A.; Worley, Kim C.; Jiang, Huaiyang; Weinstock, George M.; Gibbs, Richard A.; Graves, Tina; Fulton, Robert; Mardis, Elaine R.; Wilson, Richard K.; Clamp, Michele; Cuff, James; Gnerre, Sante; Jaffe, David B.; Chang, Jean L.; Lindblad-Toh, Kerstin; Lander, Eric S.; Hinrichs, Angie; Trumbower, Heather; Clawson, Hiram; Zweig, Ann; Kuhn, Robert M.; Barber, Galt; Harte, Rachel; Karolchik, Donna; Field, Matthew A.; Moore, Richard A.; Matthewson, Carrie A.; Schein, Jacqueline E.; Marra, Marco A.; Antonarakis, Stylianos E.; Batzoglou, Serafim; Goldman, Nick; Hardison, Ross; Haussler, David; Miller, Webb; Pachter, Lior; Green, Eric D.; Sidow, Arend

    2007-01-01

    A key component of the ongoing ENCODE project involves rigorous comparative sequence analyses for the initially targeted 1% of the human genome. Here, we present orthologous sequence generation, alignment, and evolutionary constraint analyses of 23 mammalian species for all ENCODE targets. Alignments were generated using four different methods; comparisons of these methods reveal large-scale consistency but substantial differences in terms of small genomic rearrangements, sensitivity (sequence coverage), and specificity (alignment accuracy). We describe the quantitative and qualitative trade-offs concomitant with alignment method choice and the levels of technical error that need to be accounted for in applications that require multisequence alignments. Using the generated alignments, we identified constrained regions using three different methods. While the different constraint-detecting methods are in general agreement, there are important discrepancies relating to both the underlying alignments and the specific algorithms. However, by integrating the results across the alignments and constraint-detecting methods, we produced constraint annotations that were found to be robust based on multiple independent measures. Analyses of these annotations illustrate that most classes of experimentally annotated functional elements are enriched for constrained sequences; however, large portions of each class (with the exception of protein-coding sequences) do not overlap constrained regions. The latter elements might not be under primary sequence constraint, might not be constrained across all mammals, or might have expendable molecular functions. Conversely, 40% of the constrained sequences do not overlap any of the functional elements that have been experimentally identified. Together, these findings demonstrate and quantify how many genomic functional elements await basic molecular characterization. PMID:17567995

  2. FASMA: a service to format and analyze sequences in multiple alignments.

    PubMed

    Costantini, Susan; Colonna, Giovanni; Facchiano, Angelo M

    2007-12-01

    Multiple sequence alignments are successfully applied in many studies for under- standing the structural and functional relations among single nucleic acids and protein sequences as well as whole families. Because of the rapid growth of sequence databases, multiple sequence alignments can often be very large and difficult to visualize and analyze. We offer a new service aimed to visualize and analyze the multiple alignments obtained with different external algorithms, with new features useful for the comparison of the aligned sequences as well as for the creation of a final image of the alignment. The service is named FASMA and is available at http://bioinformatica.isa.cnr.it/FASMA/.

  3. Sleeve Push Technique: A Novel Method of Space Gaining.

    PubMed

    Verma, Sanjeev; Bhupali, Nameksh Raj; Gupta, Deepak Kumar; Singh, Sombir; Singh, Satinder Pal

    2018-01-01

    Space gaining is frequently required in orthodontics. Multiple loops were initially used for space gaining and alignment. The most common used mechanics for space gaining is the use of nickel-titanium open coil springs. The disadvantage of nickel-titanium coil spring is that they cannot be used until the arches are well aligned to receive the stiffer stainless steel wires. Therefore, a new method of gaining space during initial alignment and leveling has been developed and named as sleeve push technique (SPT). The nickel-titanium wires, i.e. 0.012 inches and 0.014 inches along with archwire sleeve (protective tubing) can be used in a modified way to gain space along with alignment. This method helps in gaining space right from day 1 of treatment. The archwire sleeve and nickel-titanium wire in this new SPT act as a mutually synergistic combination and provide the orthodontist with a completely new technique for space opening.

  4. Fabricating cooled electronic system with liquid-cooled cold plate and thermal spreader

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chainer, Timothy J.; Graybill, David P.; Iyengar, Madhusudan K.

    Methods are provided for facilitating cooling of an electronic component. The method includes providing a liquid-cooled cold plate and a thermal spreader associated with the cold plate. The cold plate includes multiple coolant-carrying channel sections extending within the cold plate, and a thermal conduction surface with a larger surface area than a surface area of the component to be cooled. The thermal spreader includes one or more heat pipes including multiple heat pipe sections. One or more heat pipe sections are partially aligned to a first region of the cold plate, that is, where aligned to the surface to bemore » cooled, and partially aligned to a second region of the cold plate, which is outside the first region. The one or more heat pipes facilitate distribution of heat from the electronic component to coolant-carrying channel sections of the cold plate located in the second region of the cold plate.« less

  5. Fabricating cooled electronic system with liquid-cooled cold plate and thermal spreader

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chainer, Timothy J.; Graybill, David P.; Iyengar, Madhusudan K.

    Methods are provided for facilitating cooling of an electronic component. The methods include providing a liquid-cooled cold plate and a thermal spreader associated with the cold plate. The cold plate includes multiple coolant-carrying channel sections extending within the cold plate, and a thermal conduction surface with a larger surface area than a surface area of the component to be cooled. The thermal spreader includes one or more heat pipes including multiple heat pipe sections. One or more heat pipe sections are partially aligned to a first region of the cold plate, that is, where aligned to the surface to bemore » cooled, and partially aligned to a second region of the cold plate, which is outside the first region. The one or more heat pipes facilitate distribution of heat from the electronic component to coolant-carrying channel sections of the cold plate located in the second region of the cold plate.« less

  6. System and method for detecting components of a mixture including tooth elements for alignment

    DOEpatents

    Sommer, Gregory Jon; Schaff, Ulrich Y.

    2016-11-22

    Examples are described including assay platforms having tooth elements. An impinging element may sequentially engage tooth elements on the assay platform to sequentially align corresponding detection regions with a detection unit. In this manner, multiple measurements may be made of detection regions on the assay platform without necessarily requiring the starting and stopping of a motor.

  7. Retrieving transient conformational molecular structure information from inner-shell photoionization of laser-aligned molecules

    PubMed Central

    Wang, Xu; Le, Anh-Thu; Yu, Chao; Lucchese, R. R.; Lin, C. D.

    2016-01-01

    We discuss a scheme to retrieve transient conformational molecular structure information using photoelectron angular distributions (PADs) that have averaged over partial alignments of isolated molecules. The photoelectron is pulled out from a localized inner-shell molecular orbital by an X-ray photon. We show that a transient change in the atomic positions from their equilibrium will lead to a sensitive change in the alignment-averaged PADs, which can be measured and used to retrieve the former. Exploiting the experimental convenience of changing the photon polarization direction, we show that it is advantageous to use PADs obtained from multiple photon polarization directions. A simple single-scattering model is proposed and benchmarked to describe the photoionization process and to do the retrieval using a multiple-parameter fitting method. PMID:27025410

  8. Retrieving transient conformational molecular structure information from inner-shell photoionization of laser-aligned molecules

    NASA Astrophysics Data System (ADS)

    Wang, Xu; Le, Anh-Thu; Yu, Chao; Lucchese, R. R.; Lin, C. D.

    2016-03-01

    We discuss a scheme to retrieve transient conformational molecular structure information using photoelectron angular distributions (PADs) that have averaged over partial alignments of isolated molecules. The photoelectron is pulled out from a localized inner-shell molecular orbital by an X-ray photon. We show that a transient change in the atomic positions from their equilibrium will lead to a sensitive change in the alignment-averaged PADs, which can be measured and used to retrieve the former. Exploiting the experimental convenience of changing the photon polarization direction, we show that it is advantageous to use PADs obtained from multiple photon polarization directions. A simple single-scattering model is proposed and benchmarked to describe the photoionization process and to do the retrieval using a multiple-parameter fitting method.

  9. QuickProbs 2: Towards rapid construction of high-quality alignments of large protein families

    PubMed Central

    Gudyś, Adam; Deorowicz, Sebastian

    2017-01-01

    The ever-increasing size of sequence databases caused by the development of high throughput sequencing, poses to multiple alignment algorithms one of the greatest challenges yet. As we show, well-established techniques employed for increasing alignment quality, i.e., refinement and consistency, are ineffective when large protein families are investigated. We present QuickProbs 2, an algorithm for multiple sequence alignment. Based on probabilistic models, equipped with novel column-oriented refinement and selective consistency, it offers outstanding accuracy. When analysing hundreds of sequences, Quick-Probs 2 is noticeably better than ClustalΩ and MAFFT, the previous leaders for processing numerous protein families. In the case of smaller sets, for which consistency-based methods are the best performing, QuickProbs 2 is also superior to the competitors. Due to low computational requirements of selective consistency and utilization of massively parallel architectures, presented algorithm has similar execution times to ClustalΩ, and is orders of magnitude faster than full consistency approaches, like MSAProbs or PicXAA. All these make QuickProbs 2 an excellent tool for aligning families ranging from few, to hundreds of proteins. PMID:28139687

  10. Homeotropic alignment of multiple bent-core liquid crystal phases using a polydimethylsiloxane alignment layer

    NASA Astrophysics Data System (ADS)

    Carlson, Eric D.; Foley, Lee M.; Guzman, Edward; Korblova, Eva D.; Visvanathan, Rayshan; Ryu, SeongHo; Gim, Min-Jun; Tuchband, Michael R.; Yoon, Dong Ki; Clark, Noel A.; Walba, David M.

    2017-08-01

    The control of the molecular orientation of liquid crystals (LCs) is important in both understanding phase properties and the continuing development of new LC technologies including displays, organic transistors, and electro-optic devices. Many techniques have been developed for successfully inducing alignment of calamitic LCs, though these techniques typically do not translate to the alignment of bent-core liquid crystals (BCLCs). Some techniques have been utilized to align various phases of BCLCs, but these techniques are often unsuccessful for general alignment of multiple materials and/or multiple phases. Here, we demonstrate that glass cells treated with polydimethylsiloxane (PDMS) thin films induce high quality homeotropic alignment of multiple mesophases of four BCLCs. On cooling to the lowest temperature phase the homeotropic alignment is lost, and spherulitic growth is seen in crystal and crystal-like phases including the dark conglomerate (DC) and helical nanofilament (HNF) phases. Evidence of homeotropic alignment is observed using polarized optical microscopy. We speculate that the methyl groups on the surface of the PDMS films strongly interact with the aliphatic tails of each mesogens, resulting in homeotropic alignment.

  11. Parallel alignment of bacteria using near-field optical force array for cell sorting

    NASA Astrophysics Data System (ADS)

    Zhao, H. T.; Zhang, Y.; Chin, L. K.; Yap, P. H.; Wang, K.; Ser, W.; Liu, A. Q.

    2017-08-01

    This paper presents a near-field approach to align multiple rod-shaped bacteria based on the interference pattern in silicon nano-waveguide arrays. The bacteria in the optical field will be first trapped by the gradient force and then rotated by the scattering force to the equilibrium position. In the experiment, the Shigella bacteria is rotated 90 deg and aligned to horizontal direction in 9.4 s. Meanwhile, 150 Shigella is trapped on the surface in 5 min and 86% is aligned with angle < 5 deg. This method is a promising toolbox for the research of parallel single-cell biophysical characterization, cell-cell interaction, etc.

  12. Statistical discovery of site inter-dependencies in sub-molecular hierarchical protein structuring

    PubMed Central

    2012-01-01

    Background Much progress has been made in understanding the 3D structure of proteins using methods such as NMR and X-ray crystallography. The resulting 3D structures are extremely informative, but do not always reveal which sites and residues within the structure are of special importance. Recently, there are indications that multiple-residue, sub-domain structural relationships within the larger 3D consensus structure of a protein can be inferred from the analysis of the multiple sequence alignment data of a protein family. These intra-dependent clusters of associated sites are used to indicate hierarchical inter-residue relationships within the 3D structure. To reveal the patterns of associations among individual amino acids or sub-domain components within the structure, we apply a k-modes attribute (aligned site) clustering algorithm to the ubiquitin and transthyretin families in order to discover associations among groups of sites within the multiple sequence alignment. We then observe what these associations imply within the 3D structure of these two protein families. Results The k-modes site clustering algorithm we developed maximizes the intra-group interdependencies based on a normalized mutual information measure. The clusters formed correspond to sub-structural components or binding and interface locations. Applying this data-directed method to the ubiquitin and transthyretin protein family multiple sequence alignments as a test bed, we located numerous interesting associations of interdependent sites. These clusters were then arranged into cluster tree diagrams which revealed four structural sub-domains within the single domain structure of ubiquitin and a single large sub-domain within transthyretin associated with the interface among transthyretin monomers. In addition, several clusters of mutually interdependent sites were discovered for each protein family, each of which appear to play an important role in the molecular structure and/or function. Conclusions Our results demonstrate that the method we present here using a k-modes site clustering algorithm based on interdependency evaluation among sites obtained from a sequence alignment of homologous proteins can provide significant insights into the complex, hierarchical inter-residue structural relationships within the 3D structure of a protein family. PMID:22793672

  13. Statistical discovery of site inter-dependencies in sub-molecular hierarchical protein structuring.

    PubMed

    Durston, Kirk K; Chiu, David Ky; Wong, Andrew Kc; Li, Gary Cl

    2012-07-13

    Much progress has been made in understanding the 3D structure of proteins using methods such as NMR and X-ray crystallography. The resulting 3D structures are extremely informative, but do not always reveal which sites and residues within the structure are of special importance. Recently, there are indications that multiple-residue, sub-domain structural relationships within the larger 3D consensus structure of a protein can be inferred from the analysis of the multiple sequence alignment data of a protein family. These intra-dependent clusters of associated sites are used to indicate hierarchical inter-residue relationships within the 3D structure. To reveal the patterns of associations among individual amino acids or sub-domain components within the structure, we apply a k-modes attribute (aligned site) clustering algorithm to the ubiquitin and transthyretin families in order to discover associations among groups of sites within the multiple sequence alignment. We then observe what these associations imply within the 3D structure of these two protein families. The k-modes site clustering algorithm we developed maximizes the intra-group interdependencies based on a normalized mutual information measure. The clusters formed correspond to sub-structural components or binding and interface locations. Applying this data-directed method to the ubiquitin and transthyretin protein family multiple sequence alignments as a test bed, we located numerous interesting associations of interdependent sites. These clusters were then arranged into cluster tree diagrams which revealed four structural sub-domains within the single domain structure of ubiquitin and a single large sub-domain within transthyretin associated with the interface among transthyretin monomers. In addition, several clusters of mutually interdependent sites were discovered for each protein family, each of which appear to play an important role in the molecular structure and/or function. Our results demonstrate that the method we present here using a k-modes site clustering algorithm based on interdependency evaluation among sites obtained from a sequence alignment of homologous proteins can provide significant insights into the complex, hierarchical inter-residue structural relationships within the 3D structure of a protein family.

  14. Whole-proteome phylogeny of large dsDNA viruses and parvoviruses through a composition vector method related to dynamical language model

    PubMed Central

    2010-01-01

    Background The vast sequence divergence among different virus groups has presented a great challenge to alignment-based analysis of virus phylogeny. Due to the problems caused by the uncertainty in alignment, existing tools for phylogenetic analysis based on multiple alignment could not be directly applied to the whole-genome comparison and phylogenomic studies of viruses. There has been a growing interest in alignment-free methods for phylogenetic analysis using complete genome data. Among the alignment-free methods, a dynamical language (DL) method proposed by our group has successfully been applied to the phylogenetic analysis of bacteria and chloroplast genomes. Results In this paper, the DL method is used to analyze the whole-proteome phylogeny of 124 large dsDNA viruses and 30 parvoviruses, two data sets with large difference in genome size. The trees from our analyses are in good agreement to the latest classification of large dsDNA viruses and parvoviruses by the International Committee on Taxonomy of Viruses (ICTV). Conclusions The present method provides a new way for recovering the phylogeny of large dsDNA viruses and parvoviruses, and also some insights on the affiliation of a number of unclassified viruses. In comparison, some alignment-free methods such as the CV Tree method can be used for recovering the phylogeny of large dsDNA viruses, but they are not suitable for resolving the phylogeny of parvoviruses with a much smaller genome size. PMID:20565983

  15. Multiple sequence alignment in HTML: colored, possibly hyperlinked, compact representations.

    PubMed

    Campagne, F; Maigret, B

    1998-02-01

    Protein sequence alignments are widely used in protein structure prediction, protein engineering, modeling of proteins, etc. This type of representation is useful at different stages of scientific activity: looking at previous results, working on a research project, and presenting the results. There is a need to make it available through a network (intranet or WWW), in a way that allows biologists, chemists, and noncomputer specialists to look at the data and carry on research--possibly in a collaborative research. Previous methods (text-based, Java-based) are reported and their advantages are discussed. We have developed two novel approaches to represent the alignments as colored, hyper-linked HTML pages. The first method creates an HTML page that uses efficiently the image cache mechanism of a WWW browser, thereby allowing the user to browse different alignments without waiting for the images to be loaded through the network, but only for the first viewed alignment. The generated pages can be browsed with any HTML2.0-compliant browser. The second method that we propose uses W3C-CSS1-style sheets to render alignments. This new method generates pages that require recent browsers to be viewed. We implemented these methods in the Viseur program and made a WWW service available that allows a user to convert an MSF alignment file in HTML for WWW publishing. The latter service is available at http:@www.lctn.u-nancy.fr/viseur/services.htm l.

  16. Alignment-free microbial phylogenomics under scenarios of sequence divergence, genome rearrangement and lateral genetic transfer.

    PubMed

    Bernard, Guillaume; Chan, Cheong Xin; Ragan, Mark A

    2016-07-01

    Alignment-free (AF) approaches have recently been highlighted as alternatives to methods based on multiple sequence alignment in phylogenetic inference. However, the sensitivity of AF methods to genome-scale evolutionary scenarios is little known. Here, using simulated microbial genome data we systematically assess the sensitivity of nine AF methods to three important evolutionary scenarios: sequence divergence, lateral genetic transfer (LGT) and genome rearrangement. Among these, AF methods are most sensitive to the extent of sequence divergence, less sensitive to low and moderate frequencies of LGT, and most robust against genome rearrangement. We describe the application of AF methods to three well-studied empirical genome datasets, and introduce a new application of the jackknife to assess node support. Our results demonstrate that AF phylogenomics is computationally scalable to multi-genome data and can generate biologically meaningful phylogenies and insights into microbial evolution.

  17. Accurate multiple sequence-structure alignment of RNA sequences using combinatorial optimization.

    PubMed

    Bauer, Markus; Klau, Gunnar W; Reinert, Knut

    2007-07-27

    The discovery of functional non-coding RNA sequences has led to an increasing interest in algorithms related to RNA analysis. Traditional sequence alignment algorithms, however, fail at computing reliable alignments of low-homology RNA sequences. The spatial conformation of RNA sequences largely determines their function, and therefore RNA alignment algorithms have to take structural information into account. We present a graph-based representation for sequence-structure alignments, which we model as an integer linear program (ILP). We sketch how we compute an optimal or near-optimal solution to the ILP using methods from combinatorial optimization, and present results on a recently published benchmark set for RNA alignments. The implementation of our algorithm yields better alignments in terms of two published scores than the other programs that we tested: This is especially the case with an increasing number of input sequences. Our program LARA is freely available for academic purposes from http://www.planet-lisa.net.

  18. Reliability of lower limb alignment measures using an established landmark-based method with a customized computer software program

    PubMed Central

    Sled, Elizabeth A.; Sheehy, Lisa M.; Felson, David T.; Costigan, Patrick A.; Lam, Miu; Cooke, T. Derek V.

    2010-01-01

    The objective of the study was to evaluate the reliability of frontal plane lower limb alignment measures using a landmark-based method by (1) comparing inter- and intra-reader reliability between measurements of alignment obtained manually with those using a computer program, and (2) determining inter- and intra-reader reliability of computer-assisted alignment measures from full-limb radiographs. An established method for measuring alignment was used, involving selection of 10 femoral and tibial bone landmarks. 1) To compare manual and computer methods, we used digital images and matching paper copies of five alignment patterns simulating healthy and malaligned limbs drawn using AutoCAD. Seven readers were trained in each system. Paper copies were measured manually and repeat measurements were performed daily for 3 days, followed by a similar routine with the digital images using the computer. 2) To examine the reliability of computer-assisted measures from full-limb radiographs, 100 images (200 limbs) were selected as a random sample from 1,500 full-limb digital radiographs which were part of the Multicenter Osteoarthritis (MOST) Study. Three trained readers used the software program to measure alignment twice from the batch of 100 images, with two or more weeks between batch handling. Manual and computer measures of alignment showed excellent agreement (intraclass correlations [ICCs] 0.977 – 0.999 for computer analysis; 0.820 – 0.995 for manual measures). The computer program applied to full-limb radiographs produced alignment measurements with high inter- and intra-reader reliability (ICCs 0.839 – 0.998). In conclusion, alignment measures using a bone landmark-based approach and a computer program were highly reliable between multiple readers. PMID:19882339

  19. Application of Alignment Methodologies to Spatial Ontologies in the Hydro Domain

    NASA Astrophysics Data System (ADS)

    Lieberman, J. E.; Cheatham, M.; Varanka, D.

    2015-12-01

    Ontologies are playing an increasing role in facilitating mediation and translation between datasets representing diverse schemas, vocabularies, or knowledge communities. This role is relatively straightforward when there is one ontology comprising all relevant common concepts that can be mapped to entities in each dataset. Frequently, one common ontology has not been agreed to. Either each dataset is represented by a distinct ontology, or there are multiple candidates for commonality. Either the one most appropriate (expressive, relevant, correct) ontology must be chosen, or else concepts and relationships matched across multiple ontologies through an alignment process so that they may be used in concert to carry out mediation or other semantic operations. A resulting alignment can be effective to the extent that entities in in the ontologies represent differing terminology for comparable conceptual knowledge. In cases such as spatial ontologies, though, ontological entities may also represent disparate conceptualizations of space according to the discernment methods and application domains on which they are based. One ontology's wetland concept may overlap in space with another ontology's recharge zone or wildlife range or water feature. In order to evaluate alignment with respect to spatial ontologies, alignment has been applied to a series of ontologies pertaining to surface water that are used variously in hydrography (characterization of water features), hydrology (study of water cycling), and water quality (nutrient and contaminant transport) application domains. There is frequently a need to mediate between datasets in each domain in order to develop broader understanding of surface water systems, so there is a practical as well theoretical value in the alignment. From a domain expertise standpoint, the ontologies under consideration clearly contain some concepts that are spatially as well as conceptually identical and then others with less clear similarities in either sense. Our study serves both to determine the limits of standard methods for aligning spatial ontologies and to suggest new methods of calculating similarity axioms that take into account semantic, spatial, and cognitive criteria relevant to fitness for relevant usage scenarios.

  20. Fourier transform power spectrum is a potential measure of tissue alignment in standard MRI: A multiple sclerosis study.

    PubMed

    Sharma, Shrushrita; Zhang, Yunyan

    2017-01-01

    Loss of tissue coherency in brain white matter is found in many neurological diseases such as multiple sclerosis (MS). While several approaches have been proposed to evaluate white matter coherency including fractional anisotropy and fiber tracking in diffusion-weighted imaging, few are available for standard magnetic resonance imaging (MRI). Here we present an image post-processing method for this purpose based on Fourier transform (FT) power spectrum. T2-weighted images were collected from 19 patients (10 relapsing-remitting and 9 secondary progressive MS) and 19 age- and gender-matched controls. Image processing steps included: computation, normalization, and thresholding of FT power spectrum; determination of tissue alignment profile and dominant alignment direction; and calculation of alignment complexity using a new measure named angular entropy. To test the validity of this method, we used a highly organized brain white matter structure, corpus callosum. Six regions of interest were examined from the left, central and right aspects of both genu and splenium. We found that the dominant orientation of each ROI derived from our method was significantly correlated with the predicted directions based on anatomy. There was greater angular entropy in patients than controls, and a trend to be greater in secondary progressive MS patients. These findings suggest that it is possible to detect tissue alignment and anisotropy using traditional MRI, which are routinely acquired in clinical practice. Analysis of FT power spectrum may become a new approach for advancing the evaluation and management of patients with MS and similar disorders. Further confirmation is warranted.

  1. Simultaneous dual-color fluorescence microscope: a characterization study.

    PubMed

    Li, Zheng; Chen, Xiaodong; Ren, Liqiang; Song, Jie; Li, Yuhua; Zheng, Bin; Liu, Hong

    2013-01-01

    High spatial resolution and geometric accuracy is crucial for chromosomal analysis of clinical cytogenetic applications. High resolution and rapid simultaneous acquisition of multiple fluorescent wavelengths can be achieved by utilizing concurrent imaging with multiple detectors. However, such class of microscopic systems functions differently from traditional fluorescence microscopes. To develop a practical characterization framework to assess and optimize the performance of a high resolution and dual-color fluorescence microscope designed for clinical chromosomal analysis. A dual-band microscopic imaging system utilizes a dichroic mirror, two sets of specially selected optical filters, and two detectors to simultaneously acquire two fluorescent wavelengths. The system's geometric distortion, linearity, the modulation transfer function, and the dual detectors' alignment were characterized. Experiment results show that the geometric distortion at lens periphery is less than 1%. Both fluorescent channels show linear signal responses, but there exists discrepancy between the two due to the detectors' non-uniform response ratio to different wavelengths. In terms of the spatial resolution, the two contrast transfer function curves trend agreeably with the spatial frequency. The alignment measurement allows quantitatively assessing the cameras' alignment. A result image of adjusted alignment is demonstrated to show the reduced discrepancy by using the alignment measurement method. In this paper, we present a system characterization study and its methods for a specially designed imaging system for clinical cytogenetic applications. The presented characterization methods are not only unique to this dual-color imaging system but also applicable to evaluation and optimization of other similar multi-color microscopic image systems for improving their clinical utilities for future cytogenetic applications.

  2. Phylogeny Reconstruction with Alignment-Free Method That Corrects for Horizontal Gene Transfer.

    PubMed

    Bromberg, Raquel; Grishin, Nick V; Otwinowski, Zbyszek

    2016-06-01

    Advances in sequencing have generated a large number of complete genomes. Traditionally, phylogenetic analysis relies on alignments of orthologs, but defining orthologs and separating them from paralogs is a complex task that may not always be suited to the large datasets of the future. An alternative to traditional, alignment-based approaches are whole-genome, alignment-free methods. These methods are scalable and require minimal manual intervention. We developed SlopeTree, a new alignment-free method that estimates evolutionary distances by measuring the decay of exact substring matches as a function of match length. SlopeTree corrects for horizontal gene transfer, for composition variation and low complexity sequences, and for branch-length nonlinearity caused by multiple mutations at the same site. We tested SlopeTree on 495 bacteria, 73 archaea, and 72 strains of Escherichia coli and Shigella. We compared our trees to the NCBI taxonomy, to trees based on concatenated alignments, and to trees produced by other alignment-free methods. The results were consistent with current knowledge about prokaryotic evolution. We assessed differences in tree topology over different methods and settings and found that the majority of bacteria and archaea have a core set of proteins that evolves by descent. In trees built from complete genomes rather than sets of core genes, we observed some grouping by phenotype rather than phylogeny, for instance with a cluster of sulfur-reducing thermophilic bacteria coming together irrespective of their phyla. The source-code for SlopeTree is available at: http://prodata.swmed.edu/download/pub/slopetree_v1/slopetree.tar.gz.

  3. Phylogeny Reconstruction with Alignment-Free Method That Corrects for Horizontal Gene Transfer

    PubMed Central

    Grishin, Nick V.; Otwinowski, Zbyszek

    2016-01-01

    Advances in sequencing have generated a large number of complete genomes. Traditionally, phylogenetic analysis relies on alignments of orthologs, but defining orthologs and separating them from paralogs is a complex task that may not always be suited to the large datasets of the future. An alternative to traditional, alignment-based approaches are whole-genome, alignment-free methods. These methods are scalable and require minimal manual intervention. We developed SlopeTree, a new alignment-free method that estimates evolutionary distances by measuring the decay of exact substring matches as a function of match length. SlopeTree corrects for horizontal gene transfer, for composition variation and low complexity sequences, and for branch-length nonlinearity caused by multiple mutations at the same site. We tested SlopeTree on 495 bacteria, 73 archaea, and 72 strains of Escherichia coli and Shigella. We compared our trees to the NCBI taxonomy, to trees based on concatenated alignments, and to trees produced by other alignment-free methods. The results were consistent with current knowledge about prokaryotic evolution. We assessed differences in tree topology over different methods and settings and found that the majority of bacteria and archaea have a core set of proteins that evolves by descent. In trees built from complete genomes rather than sets of core genes, we observed some grouping by phenotype rather than phylogeny, for instance with a cluster of sulfur-reducing thermophilic bacteria coming together irrespective of their phyla. The source-code for SlopeTree is available at: http://prodata.swmed.edu/download/pub/slopetree_v1/slopetree.tar.gz. PMID:27336403

  4. Label-aligned Multi-task Feature Learning for Multimodal Classification of Alzheimer’s Disease and Mild Cognitive Impairment

    PubMed Central

    Zu, Chen; Jie, Biao; Liu, Mingxia; Chen, Songcan

    2015-01-01

    Multimodal classification methods using different modalities of imaging and non-imaging data have recently shown great advantages over traditional single-modality-based ones for diagnosis and prognosis of Alzheimer’s disease (AD), as well as its prodromal stage, i.e., mild cognitive impairment (MCI). However, to the best of our knowledge, most existing methods focus on mining the relationship across multiple modalities of the same subjects, while ignoring the potentially useful relationship across different subjects. Accordingly, in this paper, we propose a novel learning method for multimodal classification of AD/MCI, by fully exploring the relationships across both modalities and subjects. Specifically, our proposed method includes two subsequent components, i.e., label-aligned multi-task feature selection and multimodal classification. In the first step, the feature selection learning from multiple modalities are treated as different learning tasks and a group sparsity regularizer is imposed to jointly select a subset of relevant features. Furthermore, to utilize the discriminative information among labeled subjects, a new label-aligned regularization term is added into the objective function of standard multi-task feature selection, where label-alignment means that all multi-modality subjects with the same class labels should be closer in the new feature-reduced space. In the second step, a multi-kernel support vector machine (SVM) is adopted to fuse the selected features from multi-modality data for final classification. To validate our method, we perform experiments on the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database using baseline MRI and FDG-PET imaging data. The experimental results demonstrate that our proposed method achieves better classification performance compared with several state-of-the-art methods for multimodal classification of AD/MCI. PMID:26572145

  5. Retrieving transient conformational molecular structure information from inner-shell photoionization of laser-aligned molecules

    DOE PAGES

    Wang, Xu; Le, Anh -Thu; Yu, Chao; ...

    2016-03-30

    We discuss a scheme to retrieve transient conformational molecular structure information using photoelectron angular distributions (PADs) that have averaged over partial alignments of isolated molecules. The photoelectron is pulled out from a localized inner-shell molecular orbital by an X-ray photon. We show that a transient change in the atomic positions from their equilibrium will lead to a sensitive change in the alignment-averaged PADs, which can be measured and used to retrieve the former. Exploiting the experimental convenience of changing the photon polarization direction, we show that it is advantageous to use PADs obtained from multiple photon polarization directions. Lastly, amore » simple single-scattering model is proposed and benchmarked to describe the photoionization process and to do the retrieval using a multiple-parameter fitting method.« less

  6. Structural phylogeny by profile extraction and multiple superimposition using electrostatic congruence as a discriminator

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chakraborty, Sandeep; Rao, Basuthkar J.; Baker, Nathan A.

    2013-04-01

    Phylogenetic analysis of proteins using multiple sequence alignment (MSA) assumes an underlying evolutionary relationship in these proteins which occasionally remains undetected due to considerable sequence divergence. Structural alignment programs have been developed to unravel such fuzzy relationships. However, none of these structure based methods have used electrostatic properties to discriminate between spatially equivalent residues. We present a methodology for MSA of a set of related proteins with known structures using electrostatic properties as an additional discriminator (STEEP). STEEP first extracts a profile, then generates a multiple structural superimposition providing a consolidated spatial framework for comparing residues and finally emits themore » MSA. Residues that are aligned differently by including or excluding electrostatic properties can be targeted by directed evolution experiments to transform the enzymatic properties of one protein into another. We have compared STEEP results to those obtained from a MSA program (ClustalW) and a structural alignment method (MUSTANG) for chymotrypsin serine proteases. Subsequently, we used PhyML to generate phylogenetic trees for the serine and metallo-β-lactamase superfamilies from the STEEP generated MSA, and corroborated the accepted relationships in these superfamilies. We have observed that STEEP acts as a functional classifier when electrostatic congruence is used as a discriminator, and thus identifies potential targets for directed evolution experiments. In summary, STEEP is unique among phylogenetic methods for its ability to use electrostatic congruence to specify mutations that might be the source of the functional divergence in a protein family. Based on our results, we also hypothesize that the active site and its close vicinity contains enough information to infer the correct phylogeny for related proteins.« less

  7. Adaptive Local Realignment of Protein Sequences.

    PubMed

    DeBlasio, Dan; Kececioglu, John

    2018-06-11

    While mutation rates can vary markedly over the residues of a protein, multiple sequence alignment tools typically use the same values for their scoring-function parameters across a protein's entire length. We present a new approach, called adaptive local realignment, that in contrast automatically adapts to the diversity of mutation rates along protein sequences. This builds upon a recent technique known as parameter advising, which finds global parameter settings for an aligner, to now adaptively find local settings. Our approach in essence identifies local regions with low estimated accuracy, constructs a set of candidate realignments using a carefully-chosen collection of parameter settings, and replaces the region if a realignment has higher estimated accuracy. This new method of local parameter advising, when combined with prior methods for global advising, boosts alignment accuracy as much as 26% over the best default setting on hard-to-align protein benchmarks, and by 6.4% over global advising alone. Adaptive local realignment has been implemented within the Opal aligner using the Facet accuracy estimator.

  8. MICA: Multiple interval-based curve alignment

    NASA Astrophysics Data System (ADS)

    Mann, Martin; Kahle, Hans-Peter; Beck, Matthias; Bender, Bela Johannes; Spiecker, Heinrich; Backofen, Rolf

    2018-01-01

    MICA enables the automatic synchronization of discrete data curves. To this end, characteristic points of the curves' shapes are identified. These landmarks are used within a heuristic curve registration approach to align profile pairs by mapping similar characteristics onto each other. In combination with a progressive alignment scheme, this enables the computation of multiple curve alignments. Multiple curve alignments are needed to derive meaningful representative consensus data of measured time or data series. MICA was already successfully applied to generate representative profiles of tree growth data based on intra-annual wood density profiles or cell formation data. The MICA package provides a command-line and graphical user interface. The R interface enables the direct embedding of multiple curve alignment computation into larger analyses pipelines. Source code, binaries and documentation are freely available at https://github.com/BackofenLab/MICA

  9. Kernel-aligned multi-view canonical correlation analysis for image recognition

    NASA Astrophysics Data System (ADS)

    Su, Shuzhi; Ge, Hongwei; Yuan, Yun-Hao

    2016-09-01

    Existing kernel-based correlation analysis methods mainly adopt a single kernel in each view. However, only a single kernel is usually insufficient to characterize nonlinear distribution information of a view. To solve the problem, we transform each original feature vector into a 2-dimensional feature matrix by means of kernel alignment, and then propose a novel kernel-aligned multi-view canonical correlation analysis (KAMCCA) method on the basis of the feature matrices. Our proposed method can simultaneously employ multiple kernels to better capture the nonlinear distribution information of each view, so that correlation features learned by KAMCCA can have well discriminating power in real-world image recognition. Extensive experiments are designed on five real-world image datasets, including NIR face images, thermal face images, visible face images, handwritten digit images, and object images. Promising experimental results on the datasets have manifested the effectiveness of our proposed method.

  10. Alignment of gold nanorods by angular photothermal depletion

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Taylor, Adam B.; Chow, Timothy T. Y.; Chon, James W. M., E-mail: jchon@swin.edu.au

    2014-02-24

    In this paper, we demonstrate that a high degree of alignment can be imposed upon randomly oriented gold nanorod films by angular photothermal depletion with linearly polarized laser irradiation. The photothermal reshaping of gold nanorods is observed to follow quadratic melting model rather than the threshold melting model, which distorts the angular and spectral hole created on 2D distribution map of nanorods to be an open crater shape. We have accounted these observations to the alignment procedures and demonstrated good agreement between experiment and simulations. The use of multiple laser depletion wavelengths allowed alignment criteria over a large range ofmore » aspect ratios, achieving 80% of the rods in the target angular range. We extend the technique to demonstrate post-alignment in a multilayer of randomly oriented gold nanorod films, with arbitrary control of alignment shown across the layers. Photothermal angular depletion alignment of gold nanorods is a simple, promising post-alignment method for creating future 3D or multilayer plasmonic nanorod based devices and structures.« less

  11. SINA: accurate high-throughput multiple sequence alignment of ribosomal RNA genes.

    PubMed

    Pruesse, Elmar; Peplies, Jörg; Glöckner, Frank Oliver

    2012-07-15

    In the analysis of homologous sequences, computation of multiple sequence alignments (MSAs) has become a bottleneck. This is especially troublesome for marker genes like the ribosomal RNA (rRNA) where already millions of sequences are publicly available and individual studies can easily produce hundreds of thousands of new sequences. Methods have been developed to cope with such numbers, but further improvements are needed to meet accuracy requirements. In this study, we present the SILVA Incremental Aligner (SINA) used to align the rRNA gene databases provided by the SILVA ribosomal RNA project. SINA uses a combination of k-mer searching and partial order alignment (POA) to maintain very high alignment accuracy while satisfying high throughput performance demands. SINA was evaluated in comparison with the commonly used high throughput MSA programs PyNAST and mothur. The three BRAliBase III benchmark MSAs could be reproduced with 99.3, 97.6 and 96.1 accuracy. A larger benchmark MSA comprising 38 772 sequences could be reproduced with 98.9 and 99.3% accuracy using reference MSAs comprising 1000 and 5000 sequences. SINA was able to achieve higher accuracy than PyNAST and mothur in all performed benchmarks. Alignment of up to 500 sequences using the latest SILVA SSU/LSU Ref datasets as reference MSA is offered at http://www.arb-silva.de/aligner. This page also links to Linux binaries, user manual and tutorial. SINA is made available under a personal use license.

  12. Score distributions of gapped multiple sequence alignments down to the low-probability tail

    NASA Astrophysics Data System (ADS)

    Fieth, Pascal; Hartmann, Alexander K.

    2016-08-01

    Assessing the significance of alignment scores of optimally aligned DNA or amino acid sequences can be achieved via the knowledge of the score distribution of random sequences. But this requires obtaining the distribution in the biologically relevant high-scoring region, where the probabilities are exponentially small. For gapless local alignments of infinitely long sequences this distribution is known analytically to follow a Gumbel distribution. Distributions for gapped local alignments and global alignments of finite lengths can only be obtained numerically. To obtain result for the small-probability region, specific statistical mechanics-based rare-event algorithms can be applied. In previous studies, this was achieved for pairwise alignments. They showed that, contrary to results from previous simple sampling studies, strong deviations from the Gumbel distribution occur in case of finite sequence lengths. Here we extend the studies to multiple sequence alignments with gaps, which are much more relevant for practical applications in molecular biology. We study the distributions of scores over a large range of the support, reaching probabilities as small as 10-160, for global and local (sum-of-pair scores) multiple alignments. We find that even after suitable rescaling, eliminating the sequence-length dependence, the distributions for multiple alignment differ from the pairwise alignment case. Furthermore, we also show that the previously discussed Gaussian correction to the Gumbel distribution needs to be refined, also for the case of pairwise alignments.

  13. Accelerated Profile HMM Searches

    PubMed Central

    Eddy, Sean R.

    2011-01-01

    Profile hidden Markov models (profile HMMs) and probabilistic inference methods have made important contributions to the theory of sequence database homology search. However, practical use of profile HMM methods has been hindered by the computational expense of existing software implementations. Here I describe an acceleration heuristic for profile HMMs, the “multiple segment Viterbi” (MSV) algorithm. The MSV algorithm computes an optimal sum of multiple ungapped local alignment segments using a striped vector-parallel approach previously described for fast Smith/Waterman alignment. MSV scores follow the same statistical distribution as gapped optimal local alignment scores, allowing rapid evaluation of significance of an MSV score and thus facilitating its use as a heuristic filter. I also describe a 20-fold acceleration of the standard profile HMM Forward/Backward algorithms using a method I call “sparse rescaling”. These methods are assembled in a pipeline in which high-scoring MSV hits are passed on for reanalysis with the full HMM Forward/Backward algorithm. This accelerated pipeline is implemented in the freely available HMMER3 software package. Performance benchmarks show that the use of the heuristic MSV filter sacrifices negligible sensitivity compared to unaccelerated profile HMM searches. HMMER3 is substantially more sensitive and 100- to 1000-fold faster than HMMER2. HMMER3 is now about as fast as BLAST for protein searches. PMID:22039361

  14. A Novel Center Star Multiple Sequence Alignment Algorithm Based on Affine Gap Penalty and K-Band

    NASA Astrophysics Data System (ADS)

    Zou, Quan; Shan, Xiao; Jiang, Yi

    Multiple sequence alignment is one of the most important topics in computational biology, but it cannot deal with the large data so far. As the development of copy-number variant(CNV) and Single Nucleotide Polymorphisms(SNP) research, many researchers want to align numbers of similar sequences for detecting CNV and SNP. In this paper, we propose a novel multiple sequence alignment algorithm based on affine gap penalty and k-band. It can align more quickly and accurately, that will be helpful for mining CNV and SNP. Experiments prove the performance of our algorithm.

  15. Development of unidentified dna-specific hif 1α gene of lizard (hemidactylus platyurus) which plays a role in tissue regeneration process

    NASA Astrophysics Data System (ADS)

    Novianti, T.; Sadikin, M.; Widia, S.; Juniantito, V.; Arida, E. A.

    2018-03-01

    Development of unidentified specific gene is essential to analyze the availability these genes in biological process. Identification unidentified specific DNA of HIF 1α genes is important to analyze their contribution in tissue regeneration process in lizard tail (Hemidactylus platyurus). Bioinformatics and PCR techniques are relatively an easier method to identify an unidentified gene. The most widely used method is BLAST (Basic Local Alignment Sequence Tools) method for alignment the sequences from the other organism. BLAST technique is online software from website https://blast.ncbi.nlm.nih.gov/Blast.cgi that capable to generate the similar sequences from closest kinship to distant kindship. Gecko japonicus is a species that it has closest kinship with H. platyurus. Comparing HIF 1 α gene sequence of G. japonicus with the other species used multiple alignment methods from Mega7 software. Conserved base areas were identified using Clustal IX method. Primary DNA of HIF 1 α gene was design by Primer3 software. HIF 1α gene of lizard (H. platyurus) was successfully amplified using a real-time PCR machine by primary DNA that we had designed from Gecko japonicus. Identification unidentified gene of HIF 1a lizard has been done successfully with multiple alignment method. The study was conducted by analyzing during the growth of tail on day 1, 3, 5, 7, 10, 13 and 17 of lizard tail after autotomy. Process amplification of HIF 1α gene was described by CT value in real time PCR machine. HIF 1α expression of gene is quantified by Livak formula. Chi-square statistic test is 0.000 which means that there is a different expression of HIF 1 α gene in every growth day treatment.

  16. DIALIGN P: fast pair-wise and multiple sequence alignment using parallel processors.

    PubMed

    Schmollinger, Martin; Nieselt, Kay; Kaufmann, Michael; Morgenstern, Burkhard

    2004-09-09

    Parallel computing is frequently used to speed up computationally expensive tasks in Bioinformatics. Herein, a parallel version of the multi-alignment program DIALIGN is introduced. We propose two ways of dividing the program into independent sub-routines that can be run on different processors: (a) pair-wise sequence alignments that are used as a first step to multiple alignment account for most of the CPU time in DIALIGN. Since alignments of different sequence pairs are completely independent of each other, they can be distributed to multiple processors without any effect on the resulting output alignments. (b) For alignments of large genomic sequences, we use a heuristics by splitting up sequences into sub-sequences based on a previously introduced anchored alignment procedure. For our test sequences, this combined approach reduces the program running time of DIALIGN by up to 97%. By distributing sub-routines to multiple processors, the running time of DIALIGN can be crucially improved. With these improvements, it is possible to apply the program in large-scale genomics and proteomics projects that were previously beyond its scope.

  17. AlignMe—a membrane protein sequence alignment web server

    PubMed Central

    Stamm, Marcus; Staritzbichler, René; Khafizov, Kamil; Forrest, Lucy R.

    2014-01-01

    We present a web server for pair-wise alignment of membrane protein sequences, using the program AlignMe. The server makes available two operational modes of AlignMe: (i) sequence to sequence alignment, taking two sequences in fasta format as input, combining information about each sequence from multiple sources and producing a pair-wise alignment (PW mode); and (ii) alignment of two multiple sequence alignments to create family-averaged hydropathy profile alignments (HP mode). For the PW sequence alignment mode, four different optimized parameter sets are provided, each suited to pairs of sequences with a specific similarity level. These settings utilize different types of inputs: (position-specific) substitution matrices, secondary structure predictions and transmembrane propensities from transmembrane predictions or hydrophobicity scales. In the second (HP) mode, each input multiple sequence alignment is converted into a hydrophobicity profile averaged over the provided set of sequence homologs; the two profiles are then aligned. The HP mode enables qualitative comparison of transmembrane topologies (and therefore potentially of 3D folds) of two membrane proteins, which can be useful if the proteins have low sequence similarity. In summary, the AlignMe web server provides user-friendly access to a set of tools for analysis and comparison of membrane protein sequences. Access is available at http://www.bioinfo.mpg.de/AlignMe PMID:24753425

  18. Sub-Diffraction Limited Writing based on Laser Induced Periodic Surface Structures (LIPSS).

    PubMed

    He, Xiaolong; Datta, Anurup; Nam, Woongsik; Traverso, Luis M; Xu, Xianfan

    2016-10-10

    Controlled fabrication of single and multiple nanostructures far below the diffraction limit using a method based on laser induced periodic surface structure (LIPSS) is presented. In typical LIPSS, multiple lines with a certain spatial periodicity, but often not well-aligned, were produced. In this work, well-controlled and aligned nanowires and nanogrooves with widths as small as 40 nm and 60 nm with desired orientation and length are fabricated. Moreover, single nanowire and nanogroove were fabricated based on the same mechanism for forming multiple, periodic structures. Combining numerical modeling and AFM/SEM analyses, it was found these nanostructures were formed through the interference between the incident laser radiation and the surface plasmons, the mechanism for forming LIPSS on a dielectric surface using a high power femtosecond laser. We expect that our method, in particular, the fabrication of single nanowires and nanogrooves could be a promising alternative for fabrication of nanoscale devices due to its simplicity, flexibility, and versatility.

  19. Sub-Diffraction Limited Writing based on Laser Induced Periodic Surface Structures (LIPSS)

    PubMed Central

    He, Xiaolong; Datta, Anurup; Nam, Woongsik; Traverso, Luis M.; Xu, Xianfan

    2016-01-01

    Controlled fabrication of single and multiple nanostructures far below the diffraction limit using a method based on laser induced periodic surface structure (LIPSS) is presented. In typical LIPSS, multiple lines with a certain spatial periodicity, but often not well-aligned, were produced. In this work, well-controlled and aligned nanowires and nanogrooves with widths as small as 40 nm and 60 nm with desired orientation and length are fabricated. Moreover, single nanowire and nanogroove were fabricated based on the same mechanism for forming multiple, periodic structures. Combining numerical modeling and AFM/SEM analyses, it was found these nanostructures were formed through the interference between the incident laser radiation and the surface plasmons, the mechanism for forming LIPSS on a dielectric surface using a high power femtosecond laser. We expect that our method, in particular, the fabrication of single nanowires and nanogrooves could be a promising alternative for fabrication of nanoscale devices due to its simplicity, flexibility, and versatility. PMID:27721428

  20. A configuration space of homologous proteins conserving mutual information and allowing a phylogeny inference based on pair-wise Z-score probabilities.

    PubMed

    Bastien, Olivier; Ortet, Philippe; Roy, Sylvaine; Maréchal, Eric

    2005-03-10

    Popular methods to reconstruct molecular phylogenies are based on multiple sequence alignments, in which addition or removal of data may change the resulting tree topology. We have sought a representation of homologous proteins that would conserve the information of pair-wise sequence alignments, respect probabilistic properties of Z-scores (Monte Carlo methods applied to pair-wise comparisons) and be the basis for a novel method of consistent and stable phylogenetic reconstruction. We have built up a spatial representation of protein sequences using concepts from particle physics (configuration space) and respecting a frame of constraints deduced from pair-wise alignment score properties in information theory. The obtained configuration space of homologous proteins (CSHP) allows the representation of real and shuffled sequences, and thereupon an expression of the TULIP theorem for Z-score probabilities. Based on the CSHP, we propose a phylogeny reconstruction using Z-scores. Deduced trees, called TULIP trees, are consistent with multiple-alignment based trees. Furthermore, the TULIP tree reconstruction method provides a solution for some previously reported incongruent results, such as the apicomplexan enolase phylogeny. The CSHP is a unified model that conserves mutual information between proteins in the way physical models conserve energy. Applications include the reconstruction of evolutionary consistent and robust trees, the topology of which is based on a spatial representation that is not reordered after addition or removal of sequences. The CSHP and its assigned phylogenetic topology, provide a powerful and easily updated representation for massive pair-wise genome comparisons based on Z-score computations.

  1. Image Alignment for Multiple Camera High Dynamic Range Microscopy.

    PubMed

    Eastwood, Brian S; Childs, Elisabeth C

    2012-01-09

    This paper investigates the problem of image alignment for multiple camera high dynamic range (HDR) imaging. HDR imaging combines information from images taken with different exposure settings. Combining information from multiple cameras requires an alignment process that is robust to the intensity differences in the images. HDR applications that use a limited number of component images require an alignment technique that is robust to large exposure differences. We evaluate the suitability for HDR alignment of three exposure-robust techniques. We conclude that image alignment based on matching feature descriptors extracted from radiant power images from calibrated cameras yields the most accurate and robust solution. We demonstrate the use of this alignment technique in a high dynamic range video microscope that enables live specimen imaging with a greater level of detail than can be captured with a single camera.

  2. Image Alignment for Multiple Camera High Dynamic Range Microscopy

    PubMed Central

    Eastwood, Brian S.; Childs, Elisabeth C.

    2012-01-01

    This paper investigates the problem of image alignment for multiple camera high dynamic range (HDR) imaging. HDR imaging combines information from images taken with different exposure settings. Combining information from multiple cameras requires an alignment process that is robust to the intensity differences in the images. HDR applications that use a limited number of component images require an alignment technique that is robust to large exposure differences. We evaluate the suitability for HDR alignment of three exposure-robust techniques. We conclude that image alignment based on matching feature descriptors extracted from radiant power images from calibrated cameras yields the most accurate and robust solution. We demonstrate the use of this alignment technique in a high dynamic range video microscope that enables live specimen imaging with a greater level of detail than can be captured with a single camera. PMID:22545028

  3. Embedding strategies for effective use of information from multiple sequence alignments.

    PubMed Central

    Henikoff, S.; Henikoff, J. G.

    1997-01-01

    We describe a new strategy for utilizing multiple sequence alignment information to detect distant relationships in searches of sequence databases. A single sequence representing a protein family is enriched by replacing conserved regions with position-specific scoring matrices (PSSMs) or consensus residues derived from multiple alignments of family members. In comprehensive tests of these and other family representations, PSSM-embedded queries produced the best results overall when used with a special version of the Smith-Waterman searching algorithm. Moreover, embedding consensus residues instead of PSSMs improved performance with readily available single sequence query searching programs, such as BLAST and FASTA. Embedding PSSMs or consensus residues into a representative sequence improves searching performance by extracting multiple alignment information from motif regions while retaining single sequence information where alignment is uncertain. PMID:9070452

  4. Taxonaut: an application software for comparative display of multiple taxonomies with a use case of GBIF Species API.

    PubMed

    Ytow, Nozomi

    2016-01-01

    The Species API of the Global Biodiversity Information Facility (GBIF) provides public access to taxonomic data aggregated from multiple data sources. Each data source follows its own classification which can be inconsistent with classifications from other sources. Even with a reference classification e.g. the GBIF Backbone taxonomy, a comprehensive method to compare classifications in the data aggregation is essential, especially for non-expert users. A Java application was developed to compare multiple taxonomies graphically using classification data acquired from GBIF's ChecklistBank via the GBIF Species API. It uses a table to display taxonomies where each column represents a taxonomy under comparison, with an aligner column to organise taxa by name. Each cell contains the name of a taxon if the classification in that column contains the name. Each column also has a cell showing the hierarchy of the taxonomy by a folder metaphor where taxa are aligned and synchronised in the aligner column. A set of those comparative tables shows taxa categorised by relationship between taxonomies. The result set is also available as tables in an Excel format file.

  5. Using reconfigurable hardware to accelerate multiple sequence alignment with ClustalW.

    PubMed

    Oliver, Tim; Schmidt, Bertil; Nathan, Darran; Clemens, Ralf; Maskell, Douglas

    2005-08-15

    Aligning hundreds of sequences using progressive alignment tools such as ClustalW requires several hours on state-of-the-art workstations. We present a new approach to compute multiple sequence alignments in far shorter time using reconfigurable hardware. This results in an implementation of ClustalW with significant runtime savings on a standard off-the-shelf FPGA.

  6. Resolving the double tension: Toward a new approach to measurement modeling in cross-national research

    NASA Astrophysics Data System (ADS)

    Medina, Tait Runnfeldt

    The increasing global reach of survey research provides sociologists with new opportunities to pursue theory building and refinement through comparative analysis. However, comparison across a broad array of diverse contexts introduces methodological complexities related to the development of constructs (i.e., measurement modeling) that if not adequately recognized and properly addressed undermine the quality of research findings and cast doubt on the validity of substantive conclusions. The motivation for this dissertation arises from a concern that the availability of cross-national survey data has outpaced sociologists' ability to appropriately analyze and draw meaningful conclusions from such data. I examine the implicit assumptions and detail the limitations of three commonly used measurement models in cross-national analysis---summative scale, pooled factor model, and multiple-group factor model with measurement invariance. Using the orienting lens of the double tension I argue that a new approach to measurement modeling that incorporates important cross-national differences into the measurement process is needed. Two such measurement models---multiple-group factor model with partial measurement invariance (Byrne, Shavelson and Muthen 1989) and the alignment method (Asparouhov and Muthen 2014; Muthen and Asparouhov 2014)---are discussed in detail and illustrated using a sociologically relevant substantive example. I demonstrate that the former approach is vulnerable to an identification problem that arbitrarily impacts substantive conclusions. I conclude that the alignment method is built on model assumptions that are consistent with theoretical understandings of cross-national comparability and provides an approach to measurement modeling and construct development that is uniquely suited for cross-national research. The dissertation makes three major contributions: First, it provides theoretical justification for a new cross-national measurement model and explicates a link between theoretical conceptions of cross-national comparability and a statistical method. Second, it provides a clear and detailed discussion of model identification in multiple-group confirmatory factor analysis that is missing from the literature. This discussion sets the stage for the introduction of the identification problem within multiple-group confirmatory factor analysis with partial measurement invariance and the alternative approach to model identification employed by the alignment method. Third, it offers the first pedagogical presentation of the alignment method using a sociologically relevant example.

  7. Enzyme sequence similarity improves the reaction alignment method for cross-species pathway comparison

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ovacik, Meric A.; Androulakis, Ioannis P., E-mail: yannis@rci.rutgers.edu; Biomedical Engineering Department, Rutgers University, Piscataway, NJ 08854

    2013-09-15

    Pathway-based information has become an important source of information for both establishing evolutionary relationships and understanding the mode of action of a chemical or pharmaceutical among species. Cross-species comparison of pathways can address two broad questions: comparison in order to inform evolutionary relationships and to extrapolate species differences used in a number of different applications including drug and toxicity testing. Cross-species comparison of metabolic pathways is complex as there are multiple features of a pathway that can be modeled and compared. Among the various methods that have been proposed, reaction alignment has emerged as the most successful at predicting phylogeneticmore » relationships based on NCBI taxonomy. We propose an improvement of the reaction alignment method by accounting for sequence similarity in addition to reaction alignment method. Using nine species, including human and some model organisms and test species, we evaluate the standard and improved comparison methods by analyzing glycolysis and citrate cycle pathways conservation. In addition, we demonstrate how organism comparison can be conducted by accounting for the cumulative information retrieved from nine pathways in central metabolism as well as a more complete study involving 36 pathways common in all nine species. Our results indicate that reaction alignment with enzyme sequence similarity results in a more accurate representation of pathway specific cross-species similarities and differences based on NCBI taxonomy.« less

  8. Attenuation-emission alignment in cardiac PET∕CT based on consistency conditions

    PubMed Central

    Alessio, Adam M.; Kinahan, Paul E.; Champley, Kyle M.; Caldwell, James H.

    2010-01-01

    Purpose: In cardiac PET and PET∕CT imaging, misaligned transmission and emission images are a common problem due to respiratory and cardiac motion. This misalignment leads to erroneous attenuation correction and can cause errors in perfusion mapping and quantification. This study develops and tests a method for automated alignment of attenuation and emission data. Methods: The CT-based attenuation map is iteratively transformed until the attenuation corrected emission data minimize an objective function based on the Radon consistency conditions. The alignment process is derived from previous work by Welch et al. [“Attenuation correction in PET using consistency information,” IEEE Trans. Nucl. Sci. 45, 3134–3141 (1998)] for stand-alone PET imaging. The process was evaluated with the simulated data and measured patient data from multiple cardiac ammonia PET∕CT exams. The alignment procedure was applied to simulations of five different noise levels with three different initial attenuation maps. For the measured patient data, the alignment procedure was applied to eight attenuation-emission combinations with initially acceptable alignment and eight combinations with unacceptable alignment. The initially acceptable alignment studies were forced out of alignment a known amount and quantitatively evaluated for alignment and perfusion accuracy. The initially unacceptable studies were compared to the proposed aligned images in a blinded side-by-side review. Results: The proposed automatic alignment procedure reduced errors in the simulated data and iteratively approaches global minimum solutions with the patient data. In simulations, the alignment procedure reduced the root mean square error to less than 5 mm and reduces the axial translation error to less than 1 mm. In patient studies, the procedure reduced the translation error by >50% and resolved perfusion artifacts after a known misalignment for the eight initially acceptable patient combinations. The side-by-side review of the proposed aligned attenuation-emission maps and initially misaligned attenuation-emission maps revealed that reviewers preferred the proposed aligned maps in all cases, except one inconclusive case. Conclusions: The proposed alignment procedure offers an automatic method to reduce attenuation correction artifacts in cardiac PET∕CT and provides a viable supplement to subjective manual realignment tools. PMID:20384256

  9. MONKEY: Identifying conserved transcription-factor binding sitesin multiple alignments using a binding site-specific evolutionarymodel

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Moses, Alan M.; Chiang, Derek Y.; Pollard, Daniel A.

    2004-10-28

    We introduce a method (MONKEY) to identify conserved transcription-factor binding sites in multispecies alignments. MONKEY employs probabilistic models of factor specificity and binding site evolution, on which basis we compute the likelihood that putative sites are conserved and assign statistical significance to each hit. Using genomes from the genus Saccharomyces, we illustrate how the significance of real sites increases with evolutionary distance and explore the relationship between conservation and function.

  10. Automatic initialization for 3D bone registration

    NASA Astrophysics Data System (ADS)

    Foroughi, Pezhman; Taylor, Russell H.; Fichtinger, Gabor

    2008-03-01

    In image-guided bone surgery, sample points collected from the surface of the bone are registered to the preoperative CT model using well-known registration methods such as Iterative Closest Point (ICP). These techniques are generally very sensitive to the initial alignment of the datasets. Poor initialization significantly increases the chances of getting trapped local minima. In order to reduce the risk of local minima, the registration is manually initialized by locating the sample points close to the corresponding points on the CT model. In this paper, we present an automatic initialization method that aligns the sample points collected from the surface of pelvis with CT model of the pelvis. The main idea is to exploit a mean shape of pelvis created from a large number of CT scans as the prior knowledge to guide the initial alignment. The mean shape is constant for all registrations and facilitates the inclusion of application-specific information into the registration process. The CT model is first aligned with the mean shape using the bilateral symmetry of the pelvis and the similarity of multiple projections. The surface points collected using ultrasound are then aligned with the pelvis mean shape. This will, in turn, lead to initial alignment of the sample points with the CT model. The experiments using a dry pelvis and two cadavers show that the method can align the randomly dislocated datasets close enough for successful registration. The standard ICP has been used for final registration of datasets.

  11. A precise method for adjusting the optical system of laser sub-aperture

    NASA Astrophysics Data System (ADS)

    Song, Xing; Zhang, Xue-min; Yang, Jianfeng; Xue, Li

    2018-02-01

    In order to adapt to the requirement of modern astronomical observation and warfare, the resolution of the space telescope is needed to improve, sub-aperture stitching imaging technique is one method to improve the resolution, which could be used not only the foundation and space-based large optical systems, also used in laser transmission and microscopic imaging. A large aperture main mirror of sub-aperture stitching imaging system is composed of multiple sub-mirrors distributed according to certain laws. All sub-mirrors are off-axis mirror, so the alignment of sub-aperture stitching imaging system is more complicated than a single off-axis optical system. An alignment method based on auto-collimation imaging and interferometric imaging is introduced in this paper, by using this alignment method, a sub-aperture stitching imaging system which is composed of 12 sub-mirrors was assembled with high resolution, the beam coincidence precision is better than 0.01mm, and the system wave aberration is better than 0.05λ.

  12. Rapid detection, classification and accurate alignment of up to a million or more related protein sequences.

    PubMed

    Neuwald, Andrew F

    2009-08-01

    The patterns of sequence similarity and divergence present within functionally diverse, evolutionarily related proteins contain implicit information about corresponding biochemical similarities and differences. A first step toward accessing such information is to statistically analyze these patterns, which, in turn, requires that one first identify and accurately align a very large set of protein sequences. Ideally, the set should include many distantly related, functionally divergent subgroups. Because it is extremely difficult, if not impossible for fully automated methods to align such sequences correctly, researchers often resort to manual curation based on detailed structural and biochemical information. However, multiply-aligning vast numbers of sequences in this way is clearly impractical. This problem is addressed using Multiply-Aligned Profiles for Global Alignment of Protein Sequences (MAPGAPS). The MAPGAPS program uses a set of multiply-aligned profiles both as a query to detect and classify related sequences and as a template to multiply-align the sequences. It relies on Karlin-Altschul statistics for sensitivity and on PSI-BLAST (and other) heuristics for speed. Using as input a carefully curated multiple-profile alignment for P-loop GTPases, MAPGAPS correctly aligned weakly conserved sequence motifs within 33 distantly related GTPases of known structure. By comparison, the sequence- and structurally based alignment methods hmmalign and PROMALS3D misaligned at least 11 and 23 of these regions, respectively. When applied to a dataset of 65 million protein sequences, MAPGAPS identified, classified and aligned (with comparable accuracy) nearly half a million putative P-loop GTPase sequences. A C++ implementation of MAPGAPS is available at http://mapgaps.igs.umaryland.edu. Supplementary data are available at Bioinformatics online.

  13. Simultaneous gene finding in multiple genomes.

    PubMed

    König, Stefanie; Romoth, Lars W; Gerischer, Lizzy; Stanke, Mario

    2016-11-15

    As the tree of life is populated with sequenced genomes ever more densely, the new challenge is the accurate and consistent annotation of entire clades of genomes. We address this problem with a new approach to comparative gene finding that takes a multiple genome alignment of closely related species and simultaneously predicts the location and structure of protein-coding genes in all input genomes, thereby exploiting negative selection and sequence conservation. The model prefers potential gene structures in the different genomes that are in agreement with each other, or-if not-where the exon gains and losses are plausible given the species tree. We formulate the multi-species gene finding problem as a binary labeling problem on a graph. The resulting optimization problem is NP hard, but can be efficiently approximated using a subgradient-based dual decomposition approach. The proposed method was tested on whole-genome alignments of 12 vertebrate and 12 Drosophila species. The accuracy was evaluated for human, mouse and Drosophila melanogaster and compared to competing methods. Results suggest that our method is well-suited for annotation of (a large number of) genomes of closely related species within a clade, in particular, when RNA-Seq data are available for many of the genomes. The transfer of existing annotations from one genome to another via the genome alignment is more accurate than previous approaches that are based on protein-spliced alignments, when the genomes are at close to medium distances. The method is implemented in C ++ as part of Augustus and available open source at http://bioinf.uni-greifswald.de/augustus/ CONTACT: stefaniekoenig@ymail.com or mario.stanke@uni-greifswald.deSupplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  14. Simultaneous alignment and clustering of peptide data using a Gibbs sampling approach.

    PubMed

    Andreatta, Massimo; Lund, Ole; Nielsen, Morten

    2013-01-01

    Proteins recognizing short peptide fragments play a central role in cellular signaling. As a result of high-throughput technologies, peptide-binding protein specificities can be studied using large peptide libraries at dramatically lower cost and time. Interpretation of such large peptide datasets, however, is a complex task, especially when the data contain multiple receptor binding motifs, and/or the motifs are found at different locations within distinct peptides. The algorithm presented in this article, based on Gibbs sampling, identifies multiple specificities in peptide data by performing two essential tasks simultaneously: alignment and clustering of peptide data. We apply the method to de-convolute binding motifs in a panel of peptide datasets with different degrees of complexity spanning from the simplest case of pre-aligned fixed-length peptides to cases of unaligned peptide datasets of variable length. Example applications described in this article include mixtures of binders to different MHC class I and class II alleles, distinct classes of ligands for SH3 domains and sub-specificities of the HLA-A*02:01 molecule. The Gibbs clustering method is available online as a web server at http://www.cbs.dtu.dk/services/GibbsCluster.

  15. Classification of G-protein coupled receptors based on a rich generation of convolutional neural network, N-gram transformation and multiple sequence alignments.

    PubMed

    Li, Man; Ling, Cheng; Xu, Qi; Gao, Jingyang

    2018-02-01

    Sequence classification is crucial in predicting the function of newly discovered sequences. In recent years, the prediction of the incremental large-scale and diversity of sequences has heavily relied on the involvement of machine-learning algorithms. To improve prediction accuracy, these algorithms must confront the key challenge of extracting valuable features. In this work, we propose a feature-enhanced protein classification approach, considering the rich generation of multiple sequence alignment algorithms, N-gram probabilistic language model and the deep learning technique. The essence behind the proposed method is that if each group of sequences can be represented by one feature sequence, composed of homologous sites, there should be less loss when the sequence is rebuilt, when a more relevant sequence is added to the group. On the basis of this consideration, the prediction becomes whether a query sequence belonging to a group of sequences can be transferred to calculate the probability that the new feature sequence evolves from the original one. The proposed work focuses on the hierarchical classification of G-protein Coupled Receptors (GPCRs), which begins by extracting the feature sequences from the multiple sequence alignment results of the GPCRs sub-subfamilies. The N-gram model is then applied to construct the input vectors. Finally, these vectors are imported into a convolutional neural network to make a prediction. The experimental results elucidate that the proposed method provides significant performance improvements. The classification error rate of the proposed method is reduced by at least 4.67% (family level I) and 5.75% (family Level II), in comparison with the current state-of-the-art methods. The implementation program of the proposed work is freely available at: https://github.com/alanFchina/CNN .

  16. A survey and evaluations of histogram-based statistics in alignment-free sequence comparison.

    PubMed

    Luczak, Brian B; James, Benjamin T; Girgis, Hani Z

    2017-12-06

    Since the dawn of the bioinformatics field, sequence alignment scores have been the main method for comparing sequences. However, alignment algorithms are quadratic, requiring long execution time. As alternatives, scientists have developed tens of alignment-free statistics for measuring the similarity between two sequences. We surveyed tens of alignment-free k-mer statistics. Additionally, we evaluated 33 statistics and multiplicative combinations between the statistics and/or their squares. These statistics are calculated on two k-mer histograms representing two sequences. Our evaluations using global alignment scores revealed that the majority of the statistics are sensitive and capable of finding similar sequences to a query sequence. Therefore, any of these statistics can filter out dissimilar sequences quickly. Further, we observed that multiplicative combinations of the statistics are highly correlated with the identity score. Furthermore, combinations involving sequence length difference or Earth Mover's distance, which takes the length difference into account, are always among the highest correlated paired statistics with identity scores. Similarly, paired statistics including length difference or Earth Mover's distance are among the best performers in finding the K-closest sequences. Interestingly, similar performance can be obtained using histograms of shorter words, resulting in reducing the memory requirement and increasing the speed remarkably. Moreover, we found that simple single statistics are sufficient for processing next-generation sequencing reads and for applications relying on local alignment. Finally, we measured the time requirement of each statistic. The survey and the evaluations will help scientists with identifying efficient alternatives to the costly alignment algorithm, saving thousands of computational hours. The source code of the benchmarking tool is available as Supplementary Materials. © The Author 2017. Published by Oxford University Press.

  17. Cooled electronic system with liquid-cooled cold plate and thermal spreader coupled to electronic component

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chainer, Timothy J.; Graybill, David P.; Iyengar, Madhusudan K.

    Apparatus and method are provided for facilitating cooling of an electronic component. The apparatus includes a liquid-cooled cold plate and a thermal spreader associated with the cold plate. The cold plate includes multiple coolant-carrying channel sections extending within the cold plate, and a thermal conduction surface with a larger surface area than a surface area of the component to be cooled. The thermal spreader includes one or more heat pipes including multiple heat pipe sections. One or more heat pipe sections are partially aligned to a first region of the cold plate, that is, where aligned to the surface tomore » be cooled, and partially aligned to a second region of the cold plate, which is outside the first region. The one or more heat pipes facilitate distribution of heat from the electronic component to coolant-carrying channel sections of the cold plate located in the second region of the cold plate.« less

  18. Cooled electronic system with liquid-cooled cold plate and thermal spreader coupled to electronic component

    DOEpatents

    Chainer, Timothy J.; Graybill, David P.; Iyengar, Madhusudan K.; Kamath, Vinod; Kochuparambil, Bejoy J.; Schmidt, Roger R.; Steinke, Mark E.

    2016-08-09

    Apparatus and method are provided for facilitating cooling of an electronic component. The apparatus includes a liquid-cooled cold plate and a thermal spreader associated with the cold plate. The cold plate includes multiple coolant-carrying channel sections extending within the cold plate, and a thermal conduction surface with a larger surface area than a surface area of the component to be cooled. The thermal spreader includes one or more heat pipes including multiple heat pipe sections. One or more heat pipe sections are partially aligned to a first region of the cold plate, that is, where aligned to the surface to be cooled, and partially aligned to a second region of the cold plate, which is outside the first region. The one or more heat pipes facilitate distribution of heat from the electronic component to coolant-carrying channel sections of the cold plate located in the second region of the cold plate.

  19. Cooled electronic system with liquid-cooled cold plate and thermal spreader coupled to electronic component

    DOEpatents

    Chainer, Timothy J.; Graybill, David P.; Iyengar, Madhusudan K.; Kamath, Vinod; Kochuparambil, Bejoy J.; Schmidt, Roger R.; Steinke, Mark E.

    2016-04-05

    Apparatus and method are provided for facilitating cooling of an electronic component. The apparatus includes a liquid-cooled cold plate and a thermal spreader associated with the cold plate. The cold plate includes multiple coolant-carrying channel sections extending within the cold plate, and a thermal conduction surface with a larger surface area than a surface area of the component to be cooled. The thermal spreader includes one or more heat pipes including multiple heat pipe sections. One or more heat pipe sections are partially aligned to a first region of the cold plate, that is, where aligned to the surface to be cooled, and partially aligned to a second region of the cold plate, which is outside the first region. The one or more heat pipes facilitate distribution of heat from the electronic component to coolant-carrying channel sections of the cold plate located in the second region of the cold plate.

  20. Taxonaut: an application software for comparative display of multiple taxonomies with a use case of GBIF Species API

    PubMed Central

    2016-01-01

    Abstract Background The Species API of the Global Biodiversity Information Facility (GBIF) provides public access to taxonomic data aggregated from multiple data sources. Each data source follows its own classification which can be inconsistent with classifications from other sources. Even with a reference classification e.g. the GBIF Backbone taxonomy, a comprehensive method to compare classifications in the data aggregation is essential, especially for non-expert users. New information A Java application was developed to compare multiple taxonomies graphically using classification data acquired from GBIF’s ChecklistBank via the GBIF Species API. It uses a table to display taxonomies where each column represents a taxonomy under comparison, with an aligner column to organise taxa by name. Each cell contains the name of a taxon if the classification in that column contains the name. Each column also has a cell showing the hierarchy of the taxonomy by a folder metaphor where taxa are aligned and synchronised in the aligner column. A set of those comparative tables shows taxa categorised by relationship between taxonomies. The result set is also available as tables in an Excel format file. PMID:27932916

  1. Aligning Solution-Derived Carbon Nanotube Film with Full Surface Coverage for High-Performance Electronics Applications.

    PubMed

    Zhu, Ma-Guang; Si, Jia; Zhang, Zhiyong; Peng, Lian-Mao

    2018-06-01

    The main challenge for application of solution-derived carbon nanotubes (CNTs) in high performance field-effect transistor (FET) is how to align CNTs into an array with high density and full surface coverage. A directional shrinking transfer method is developed to realize high density aligned array based on randomly orientated CNT network film. Through transferring a solution-derived CNT network film onto a stretched retractable film followed by a shrinking process, alignment degree and density of CNT film increase with the shrinking multiple. The quadruply shrunk CNT films present well alignment, which is identified by the polarized Raman spectroscopy and electrical transport measurements. Based on the high quality and high density aligned CNT array, the fabricated FETs with channel length of 300 nm present ultrahigh performance including on-state current I on of 290 µA µm -1 (V ds = -1.5 V and V gs = -2 V) and peak transconductance g m of 150 µS µm -1 , which are, respectively, among the highest corresponding values in the reported CNT array FETs. High quality and high semiconducting purity CNT arrays with high density and full coverage obtained through this method promote the development of high performance CNT-based electronics. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  2. BiPACE 2D--graph-based multiple alignment for comprehensive 2D gas chromatography-mass spectrometry.

    PubMed

    Hoffmann, Nils; Wilhelm, Mathias; Doebbe, Anja; Niehaus, Karsten; Stoye, Jens

    2014-04-01

    Comprehensive 2D gas chromatography-mass spectrometry is an established method for the analysis of complex mixtures in analytical chemistry and metabolomics. It produces large amounts of data that require semiautomatic, but preferably automatic handling. This involves the location of significant signals (peaks) and their matching and alignment across different measurements. To date, there exist only a few openly available algorithms for the retention time alignment of peaks originating from such experiments that scale well with increasing sample and peak numbers, while providing reliable alignment results. We describe BiPACE 2D, an automated algorithm for retention time alignment of peaks from 2D gas chromatography-mass spectrometry experiments and evaluate it on three previously published datasets against the mSPA, SWPA and Guineu algorithms. We also provide a fourth dataset from an experiment studying the H2 production of two different strains of Chlamydomonas reinhardtii that is available from the MetaboLights database together with the experimental protocol, peak-detection results and manually curated multiple peak alignment for future comparability with newly developed algorithms. BiPACE 2D is contained in the freely available Maltcms framework, version 1.3, hosted at http://maltcms.sf.net, under the terms of the L-GPL v3 or Eclipse Open Source licenses. The software used for the evaluation along with the underlying datasets is available at the same location. The C.reinhardtii dataset is freely available at http://www.ebi.ac.uk/metabolights/MTBLS37.

  3. Precision alignment and calibration of optical systems using computer generated holograms

    NASA Astrophysics Data System (ADS)

    Coyle, Laura Elizabeth

    As techniques for manufacturing and metrology advance, optical systems are being designed with more complexity than ever before. Given these prescriptions, alignment and calibration can be a limiting factor in their final performance. Computer generated holograms (CGHs) have several unique properties that make them powerful tools for meeting these demanding tolerances. This work will present three novel methods for alignment and calibration of optical systems using computer generated holograms. Alignment methods using CGHs require that the optical wavefront created by the CGH be related to a mechanical datum to locate it space. An overview of existing methods is provided as background, then two new alignment methods are discussed in detail. In the first method, the CGH contact Ball Alignment Tool (CBAT) is used to align a ball or sphere mounted retroreflector (SMR) to a Fresnel zone plate pattern with micron level accuracy. The ball is bonded directly onto the CGH substrate and provides permanent, accurate registration between the optical wavefront and a mechanical reference to locate the CGH in space. A prototype CBAT was built and used to align and bond an SMR to a CGH. In the second method, CGH references are used to align axi-symmetric optics in four degrees of freedom with low uncertainty and real time feedback. The CGHs create simultaneous 3D optical references where the zero order reflection sets tilt and the first diffracted order sets centration. The flexibility of the CGH design can be used to accommodate a wide variety of optical systems and maximize sensitivity to misalignments. A 2-CGH prototype system was aligned multiplied times and the alignment uncertainty was quantified and compared to an error model. Finally, an enhanced calibration method is presented. It uses multiple perturbed measurements of a master sphere to improve the calibration of CGH-based Fizeau interferometers ultimately measuring aspheric test surfaces. The improvement in the calibration is a function of the interferometer error and the aspheric departure of the desired test surface. This calibration is most effective at reducing coma and trefoil from figure error or misalignments of the interferometer components. The enhanced calibration can reduce overall measurement uncertainty or allow the budgeted error contribution from another source to be increased. A single set of sphere measurements can be used to calculate calibration maps for closely related aspheres, including segmented primary mirrors for telescopes. A parametric model is developed and compared to the simulated calibration of a case study interferometer.

  4. CombAlign: a code for generating a one-to-many sequence alignment from a set of pairwise structure-based sequence alignments.

    PubMed

    Zhou, Carol L Ecale

    2015-01-01

    In order to better define regions of similarity among related protein structures, it is useful to identify the residue-residue correspondences among proteins. Few codes exist for constructing a one-to-many multiple sequence alignment derived from a set of structure or sequence alignments, and a need was evident for creating such a tool for combining pairwise structure alignments that would allow for insertion of gaps in the reference structure. This report describes a new Python code, CombAlign, which takes as input a set of pairwise sequence alignments (which may be structure based) and generates a one-to-many, gapped, multiple structure- or sequence-based sequence alignment (MSSA). The use and utility of CombAlign was demonstrated by generating gapped MSSAs using sets of pairwise structure-based sequence alignments between structure models of the matrix protein (VP40) and pre-small/secreted glycoprotein (sGP) of Reston Ebolavirus and the corresponding proteins of several other filoviruses. The gapped MSSAs revealed structure-based residue-residue correspondences, which enabled identification of structurally similar versus differing regions in the Reston proteins compared to each of the other corresponding proteins. CombAlign is a new Python code that generates a one-to-many, gapped, multiple structure- or sequence-based sequence alignment (MSSA) given a set of pairwise sequence alignments (which may be structure based). CombAlign has utility in assisting the user in distinguishing structurally conserved versus divergent regions on a reference protein structure relative to other closely related proteins. CombAlign was developed in Python 2.6, and the source code is available for download from the GitHub code repository.

  5. Practical, computer-aided registration of multiple, three-dimensional, magnetic-resonance observations of the human brain

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Diegert, C.; Sanders, J.A.; Orrison, W.W. Jr.

    1992-12-31

    Researchers working with MR observations generally agree that far more information is available in a volume (3D) observation than is considered for diagnosis. The key to the new alignment method is in basing it on available information on surfaces. Using the skin surface is effective a robust algorithm can reliably extract this surface from almost any scan of the head, and a human operator`s exquisite sensitivity to facial features is allows him to manually align skin surfaces with precision. Following the definitions, we report on a preliminary experiment where we align three MR observations taken during a single MR examination,more » each weighting arterial, venous, and tissue features. When accurately aligned, a neurosurgeon can use these features as anatomical landmarks for planning and executing interventional procedures.« less

  6. High-speed multiple sequence alignment on a reconfigurable platform.

    PubMed

    Oliver, Tim; Schmidt, Bertil; Maskell, Douglas; Nathan, Darran; Clemens, Ralf

    2006-01-01

    Progressive alignment is a widely used approach to compute multiple sequence alignments (MSAs). However, aligning several hundred sequences by popular progressive alignment tools requires hours on sequential computers. Due to the rapid growth of sequence databases biologists have to compute MSAs in a far shorter time. In this paper we present a new approach to MSA on reconfigurable hardware platforms to gain high performance at low cost. We have constructed a linear systolic array to perform pairwise sequence distance computations using dynamic programming. This results in an implementation with significant runtime savings on a standard FPGA.

  7. Effect of the sequence data deluge on the performance of methods for detecting protein functional residues.

    PubMed

    Garrido-Martín, Diego; Pazos, Florencio

    2018-02-27

    The exponential accumulation of new sequences in public databases is expected to improve the performance of all the approaches for predicting protein structural and functional features. Nevertheless, this was never assessed or quantified for some widely used methodologies, such as those aimed at detecting functional sites and functional subfamilies in protein multiple sequence alignments. Using raw protein sequences as only input, these approaches can detect fully conserved positions, as well as those with a family-dependent conservation pattern. Both types of residues are routinely used as predictors of functional sites and, consequently, understanding how the sequence content of the databases affects them is relevant and timely. In this work we evaluate how the growth and change with time in the content of sequence databases affect five sequence-based approaches for detecting functional sites and subfamilies. We do that by recreating historical versions of the multiple sequence alignments that would have been obtained in the past based on the database contents at different time points, covering a period of 20 years. Applying the methods to these historical alignments allows quantifying the temporal variation in their performance. Our results show that the number of families to which these methods can be applied sharply increases with time, while their ability to detect potentially functional residues remains almost constant. These results are informative for the methods' developers and final users, and may have implications in the design of new sequencing initiatives.

  8. Deblurring of Class-Averaged Images in Single-Particle Electron Microscopy.

    PubMed

    Park, Wooram; Madden, Dean R; Rockmore, Daniel N; Chirikjian, Gregory S

    2010-03-01

    This paper proposes a method for deblurring of class-averaged images in single-particle electron microscopy (EM). Since EM images of biological samples are very noisy, the images which are nominally identical projection images are often grouped, aligned and averaged in order to cancel or reduce the background noise. However, the noise in the individual EM images generates errors in the alignment process, which creates an inherent limit on the accuracy of the resulting class averages. This inaccurate class average due to the alignment errors can be viewed as the result of a convolution of an underlying clear image with a blurring function. In this work, we develop a deconvolution method that gives an estimate for the underlying clear image from a blurred class-averaged image using precomputed statistics of misalignment. Since this convolution is over the group of rigid body motions of the plane, SE(2), we use the Fourier transform for SE(2) in order to convert the convolution into a matrix multiplication in the corresponding Fourier space. For practical implementation we use a Hermite-function-based image modeling technique, because Hermite expansions enable lossless Cartesian-polar coordinate conversion using the Laguerre-Fourier expansions, and Hermite expansion and Laguerre-Fourier expansion retain their structures under the Fourier transform. Based on these mathematical properties, we can obtain the deconvolution of the blurred class average using simple matrix multiplication. Tests of the proposed deconvolution method using synthetic and experimental EM images confirm the performance of our method.

  9. Development and application of an algorithm to compute weighted multiple glycan alignments.

    PubMed

    Hosoda, Masae; Akune, Yukie; Aoki-Kinoshita, Kiyoko F

    2017-05-01

    A glycan consists of monosaccharides linked by glycosidic bonds, has branches and forms complex molecular structures. Databases have been developed to store large amounts of glycan-binding experiments, including glycan arrays with glycan-binding proteins. However, there are few bioinformatics techniques to analyze large amounts of data for glycans because there are few tools that can handle the complexity of glycan structures. Thus, we have developed the MCAW (Multiple Carbohydrate Alignment with Weights) tool that can align multiple glycan structures, to aid in the understanding of their function as binding recognition molecules. We have described in detail the first algorithm to perform multiple glycan alignments by modeling glycans as trees. To test our tool, we prepared several data sets, and as a result, we found that the glycan motif could be successfully aligned without any prior knowledge applied to the tool, and the known recognition binding sites of glycans could be aligned at a high rate amongst all our datasets tested. We thus claim that our tool is able to find meaningful glycan recognition and binding patterns using data obtained by glycan-binding experiments. The development and availability of an effective multiple glycan alignment tool opens possibilities for many other glycoinformatics analysis, making this work a big step towards furthering glycomics analysis. http://www.rings.t.soka.ac.jp. kkiyoko@soka.ac.jp. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.

  10. GenomeVista

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Poliakov, Alexander; Couronne, Olivier

    2002-11-04

    Aligning large vertebrate genomes that are structurally complex poses a variety of problems not encountered on smaller scales. Such genomes are rich in repetitive elements and contain multiple segmental duplications, which increases the difficulty of identifying true orthologous SNA segments in alignments. The sizes of the sequences make many alignment algorithms designed for comparing single proteins extremely inefficient when processing large genomic intervals. We integrated both local and global alignment tools and developed a suite of programs for automatically aligning large vertebrate genomes and identifying conserved non-coding regions in the alignments. Our method uses the BLAT local alignment program tomore » find anchors on the base genome to identify regions of possible homology for a query sequence. These regions are postprocessed to find the best candidates which are then globally aligned using the AVID global alignment program. In the last step conserved non-coding segments are identified using VISTA. Our methods are fast and the resulting alignments exhibit a high degree of sensitivity, covering more than 90% of known coding exons in the human genome. The GenomeVISTA software is a suite of Perl programs that is built on a MySQL database platform. The scheduler gets control data from the database, builds a queve of jobs, and dispatches them to a PC cluster for execution. The main program, running on each node of the cluster, processes individual sequences. A Perl library acts as an interface between the database and the above programs. The use of a separate library allows the programs to function independently of the database schema. The library also improves on the standard Perl MySQL database interfere package by providing auto-reconnect functionality and improved error handling.« less

  11. Aligning the unalignable: bacteriophage whole genome alignments.

    PubMed

    Bérard, Sèverine; Chateau, Annie; Pompidor, Nicolas; Guertin, Paul; Bergeron, Anne; Swenson, Krister M

    2016-01-13

    In recent years, many studies focused on the description and comparison of large sets of related bacteriophage genomes. Due to the peculiar mosaic structure of these genomes, few informative approaches for comparing whole genomes exist: dot plots diagrams give a mostly qualitative assessment of the similarity/dissimilarity between two or more genomes, and clustering techniques are used to classify genomes. Multiple alignments are conspicuously absent from this scene. Indeed, whole genome aligners interpret lack of similarity between sequences as an indication of rearrangements, insertions, or losses. This behavior makes them ill-prepared to align bacteriophage genomes, where even closely related strains can accomplish the same biological function with highly dissimilar sequences. In this paper, we propose a multiple alignment strategy that exploits functional collinearity shared by related strains of bacteriophages, and uses partial orders to capture mosaicism of sets of genomes. As classical alignments do, the computed alignments can be used to predict that genes have the same biological function, even in the absence of detectable similarity. The Alpha aligner implements these ideas in visual interactive displays, and is used to compute several examples of alignments of Staphylococcus aureus and Mycobacterium bacteriophages, involving up to 29 genomes. Using these datasets, we prove that Alpha alignments are at least as good as those computed by standard aligners. Comparison with the progressive Mauve aligner - which implements a partial order strategy, but whose alignments are linearized - shows a greatly improved interactive graphic display, while avoiding misalignments. Multiple alignments of whole bacteriophage genomes work, and will become an important conceptual and visual tool in comparative genomics of sets of related strains. A python implementation of Alpha, along with installation instructions for Ubuntu and OSX, is available on bitbucket (https://bitbucket.org/thekswenson/alpha).

  12. GASP: Gapped Ancestral Sequence Prediction for proteins

    PubMed Central

    Edwards, Richard J; Shields, Denis C

    2004-01-01

    Background The prediction of ancestral protein sequences from multiple sequence alignments is useful for many bioinformatics analyses. Predicting ancestral sequences is not a simple procedure and relies on accurate alignments and phylogenies. Several algorithms exist based on Maximum Parsimony or Maximum Likelihood methods but many current implementations are unable to process residues with gaps, which may represent insertion/deletion (indel) events or sequence fragments. Results Here we present a new algorithm, GASP (Gapped Ancestral Sequence Prediction), for predicting ancestral sequences from phylogenetic trees and the corresponding multiple sequence alignments. Alignments may be of any size and contain gaps. GASP first assigns the positions of gaps in the phylogeny before using a likelihood-based approach centred on amino acid substitution matrices to assign ancestral amino acids. Important outgroup information is used by first working down from the tips of the tree to the root, using descendant data only to assign probabilities, and then working back up from the root to the tips using descendant and outgroup data to make predictions. GASP was tested on a number of simulated datasets based on real phylogenies. Prediction accuracy for ungapped data was similar to three alternative algorithms tested, with GASP performing better in some cases and worse in others. Adding simple insertions and deletions to the simulated data did not have a detrimental effect on GASP accuracy. Conclusions GASP (Gapped Ancestral Sequence Prediction) will predict ancestral sequences from multiple protein alignments of any size. Although not as accurate in all cases as some of the more sophisticated maximum likelihood approaches, it can process a wide range of input phylogenies and will predict ancestral sequences for gapped and ungapped residues alike. PMID:15350199

  13. SATCHMO-JS: a webserver for simultaneous protein multiple sequence alignment and phylogenetic tree construction.

    PubMed

    Hagopian, Raffi; Davidson, John R; Datta, Ruchira S; Samad, Bushra; Jarvis, Glen R; Sjölander, Kimmen

    2010-07-01

    We present the jump-start simultaneous alignment and tree construction using hidden Markov models (SATCHMO-JS) web server for simultaneous estimation of protein multiple sequence alignments (MSAs) and phylogenetic trees. The server takes as input a set of sequences in FASTA format, and outputs a phylogenetic tree and MSA; these can be viewed online or downloaded from the website. SATCHMO-JS is an extension of the SATCHMO algorithm, and employs a divide-and-conquer strategy to jump-start SATCHMO at a higher point in the phylogenetic tree, reducing the computational complexity of the progressive all-versus-all HMM-HMM scoring and alignment. Results on a benchmark dataset of 983 structurally aligned pairs from the PREFAB benchmark dataset show that SATCHMO-JS provides a statistically significant improvement in alignment accuracy over MUSCLE, Multiple Alignment using Fast Fourier Transform (MAFFT), ClustalW and the original SATCHMO algorithm. The SATCHMO-JS webserver is available at http://phylogenomics.berkeley.edu/satchmo-js. The datasets used in these experiments are available for download at http://phylogenomics.berkeley.edu/satchmo-js/supplementary/.

  14. PASTA for Proteins.

    PubMed

    Collins, Kodi; Warnow, Tandy

    2018-06-19

    PASTA is a multiple sequence method that uses divide-and-conquer plus iteration to enable base alignment methods to scale with high accuracy to large sequence datasets. By default, PASTA included MAFFT L-INS-i; our new extension of PASTA enables the use of MAFFT G-INS-i, MAFFT Homologs, CONTRAlign, and ProbCons. We analyzed the performance of each base method and PASTA using these base methods on 224 datasets from BAliBASE 4 with at least 50 sequences. We show that PASTA enables the most accurate base methods to scale to larger datasets at reduced computational effort, and generally improves alignment and tree accuracy on the largest BAliBASE datasets. PASTA is available at https://github.com/kodicollins/pasta and has also been integrated into the original PASTA repository at https://github.com/smirarab/pasta. Supplementary data are available at Bioinformatics online.

  15. Phylo-mLogo: an interactive and hierarchical multiple-logo visualization tool for alignment of many sequences

    PubMed Central

    Shih, Arthur Chun-Chieh; Lee, DT; Peng, Chin-Lin; Wu, Yu-Wei

    2007-01-01

    Background When aligning several hundreds or thousands of sequences, such as epidemic virus sequences or homologous/orthologous sequences of some big gene families, to reconstruct the epidemiological history or their phylogenies, how to analyze and visualize the alignment results of many sequences has become a new challenge for computational biologists. Although there are several tools available for visualization of very long sequence alignments, few of them are applicable to the alignments of many sequences. Results A multiple-logo alignment visualization tool, called Phylo-mLogo, is presented in this paper. Phylo-mLogo calculates the variabilities and homogeneities of alignment sequences by base frequencies or entropies. Different from the traditional representations of sequence logos, Phylo-mLogo not only displays the global logo patterns of the whole alignment of multiple sequences, but also demonstrates their local homologous logos for each clade hierarchically. In addition, Phylo-mLogo also allows the user to focus only on the analysis of some important, structurally or functionally constrained sites in the alignment selected by the user or by built-in automatic calculation. Conclusion With Phylo-mLogo, the user can symbolically and hierarchically visualize hundreds of aligned sequences simultaneously and easily check the changes of their amino acid sites when analyzing many homologous/orthologous or influenza virus sequences. More information of Phylo-mLogo can be found at URL . PMID:17319966

  16. Processing methods for differential analysis of LC/MS profile data

    PubMed Central

    Katajamaa, Mikko; Orešič, Matej

    2005-01-01

    Background Liquid chromatography coupled to mass spectrometry (LC/MS) has been widely used in proteomics and metabolomics research. In this context, the technology has been increasingly used for differential profiling, i.e. broad screening of biomolecular components across multiple samples in order to elucidate the observed phenotypes and discover biomarkers. One of the major challenges in this domain remains development of better solutions for processing of LC/MS data. Results We present a software package MZmine that enables differential LC/MS analysis of metabolomics data. This software is a toolbox containing methods for all data processing stages preceding differential analysis: spectral filtering, peak detection, alignment and normalization. Specifically, we developed and implemented a new recursive peak search algorithm and a secondary peak picking method for improving already aligned results, as well as a normalization tool that uses multiple internal standards. Visualization tools enable comparative viewing of data across multiple samples. Peak lists can be exported into other data analysis programs. The toolbox has already been utilized in a wide range of applications. We demonstrate its utility on an example of metabolic profiling of Catharanthus roseus cell cultures. Conclusion The software is freely available under the GNU General Public License and it can be obtained from the project web page at: . PMID:16026613

  17. Processing methods for differential analysis of LC/MS profile data.

    PubMed

    Katajamaa, Mikko; Oresic, Matej

    2005-07-18

    Liquid chromatography coupled to mass spectrometry (LC/MS) has been widely used in proteomics and metabolomics research. In this context, the technology has been increasingly used for differential profiling, i.e. broad screening of biomolecular components across multiple samples in order to elucidate the observed phenotypes and discover biomarkers. One of the major challenges in this domain remains development of better solutions for processing of LC/MS data. We present a software package MZmine that enables differential LC/MS analysis of metabolomics data. This software is a toolbox containing methods for all data processing stages preceding differential analysis: spectral filtering, peak detection, alignment and normalization. Specifically, we developed and implemented a new recursive peak search algorithm and a secondary peak picking method for improving already aligned results, as well as a normalization tool that uses multiple internal standards. Visualization tools enable comparative viewing of data across multiple samples. Peak lists can be exported into other data analysis programs. The toolbox has already been utilized in a wide range of applications. We demonstrate its utility on an example of metabolic profiling of Catharanthus roseus cell cultures. The software is freely available under the GNU General Public License and it can be obtained from the project web page at: http://mzmine.sourceforge.net/.

  18. MACSIMS : multiple alignment of complete sequences information management system

    PubMed Central

    Thompson, Julie D; Muller, Arnaud; Waterhouse, Andrew; Procter, Jim; Barton, Geoffrey J; Plewniak, Frédéric; Poch, Olivier

    2006-01-01

    Background In the post-genomic era, systems-level studies are being performed that seek to explain complex biological systems by integrating diverse resources from fields such as genomics, proteomics or transcriptomics. New information management systems are now needed for the collection, validation and analysis of the vast amount of heterogeneous data available. Multiple alignments of complete sequences provide an ideal environment for the integration of this information in the context of the protein family. Results MACSIMS is a multiple alignment-based information management program that combines the advantages of both knowledge-based and ab initio sequence analysis methods. Structural and functional information is retrieved automatically from the public databases. In the multiple alignment, homologous regions are identified and the retrieved data is evaluated and propagated from known to unknown sequences with these reliable regions. In a large-scale evaluation, the specificity of the propagated sequence features is estimated to be >99%, i.e. very few false positive predictions are made. MACSIMS is then used to characterise mutations in a test set of 100 proteins that are known to be involved in human genetic diseases. The number of sequence features associated with these proteins was increased by 60%, compared to the features available in the public databases. An XML format output file allows automatic parsing of the MACSIM results, while a graphical display using the JalView program allows manual analysis. Conclusion MACSIMS is a new information management system that incorporates detailed analyses of protein families at the structural, functional and evolutionary levels. MACSIMS thus provides a unique environment that facilitates knowledge extraction and the presentation of the most pertinent information to the biologist. A web server and the source code are available at . PMID:16792820

  19. Clinical Phenotype Classifications Based on Static Varus Alignment and Varus Thrust in Japanese Patients With Medial Knee Osteoarthritis

    PubMed Central

    Iijima, Hirotaka; Fukutani, Naoto; Fukumoto, Takahiko; Uritani, Daisuke; Kaneda, Eishi; Ota, Kazuo; Kuroki, Hiroshi; Matsuda, Shuichi

    2015-01-01

    Objective To investigate the association between knee pain during gait and 4 clinical phenotypes based on static varus alignment and varus thrust in patients with medial knee osteoarthritis (OA). Methods Patients in an orthopedic clinic (n = 266) diagnosed as having knee OA (Kellgren/Lawrence [K/L] grade ≥1) were divided into 4 phenotype groups according to the presence or absence of static varus alignment and varus thrust (dynamic varus): no varus (n = 173), dynamic varus (n = 17), static varus (n = 50), and static varus + dynamic varus (n = 26). The knee range of motion, spatiotemporal gait parameters, visual analog scale scores for knee pain, and scores on the Japanese Knee Osteoarthritis Measure were used to assess clinical outcomes. Multiple logistic regression analyses identified the relationship between knee pain during gait and the 4 phenotypes, adjusted for possible risk factors, including age, sex, body mass index, K/L grade, and gait velocity. Results Multiple logistic regression analysis showed that varus thrust without varus alignment was associated with knee pain during gait (odds ratio [OR] 3.30, 95% confidence interval [95% CI] 1.08–12.4), and that varus thrust combined with varus alignment was strongly associated with knee pain during gait (OR 17.1, 95% CI 3.19–320.0). Sensitivity analyses applying alternative cutoff values for defining static varus alignment showed comparable results. Conclusion Varus thrust with or without static varus alignment was associated with the occurrence of knee pain during gait. Tailored interventions based on individual malalignment phenotypes may improve clinical outcomes in patients with knee OA. PMID:26017348

  20. Mechanical design of multiple zone plates precision alignment apparatus for hard X-ray focusing in twenty-nanometer scale

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shu, Deming; Liu, Jie; Gleber, Sophie C.

    An enhanced mechanical design of multiple zone plates precision alignment apparatus for hard x-ray focusing in a twenty-nanometer scale is provided. The precision alignment apparatus includes a zone plate alignment base frame; a plurality of zone plates; and a plurality of zone plate holders, each said zone plate holder for mounting and aligning a respective zone plate for hard x-ray focusing. At least one respective positioning stage drives and positions each respective zone plate holder. Each respective positioning stage is mounted on the zone plate alignment base frame. A respective linkage component connects each respective positioning stage and the respectivemore » zone plate holder. The zone plate alignment base frame, each zone plate holder and each linkage component is formed of a selected material for providing thermal expansion stability and positioning stability for the precision alignment apparatus.« less

  1. mRAISE: an alternative algorithmic approach to ligand-based virtual screening

    NASA Astrophysics Data System (ADS)

    von Behren, Mathias M.; Bietz, Stefan; Nittinger, Eva; Rarey, Matthias

    2016-08-01

    Ligand-based virtual screening is a well established method to find new lead molecules in todays drug discovery process. In order to be applicable in day to day practice, such methods have to face multiple challenges. The most important part is the reliability of the results, which can be shown and compared in retrospective studies. Furthermore, in the case of 3D methods, they need to provide biologically relevant molecular alignments of the ligands, that can be further investigated by a medicinal chemist. Last but not least, they have to be able to screen large databases in reasonable time. Many algorithms for ligand-based virtual screening have been proposed in the past, most of them based on pairwise comparisons. Here, a new method is introduced called mRAISE. Based on structural alignments, it uses a descriptor-based bitmap search engine (RAISE) to achieve efficiency. Alignments created on the fly by the search engine get evaluated with an independent shape-based scoring function also used for ranking of compounds. The correct ranking as well as the alignment quality of the method are evaluated and compared to other state of the art methods. On the commonly used Directory of Useful Decoys dataset mRAISE achieves an average area under the ROC curve of 0.76, an average enrichment factor at 1 % of 20.2 and an average hit rate at 1 % of 55.5. With these results, mRAISE is always among the top performing methods with available data for comparison. To access the quality of the alignments calculated by ligand-based virtual screening methods, we introduce a new dataset containing 180 prealigned ligands for 11 diverse targets. Within the top ten ranked conformations, the alignment closest to X-ray structure calculated with mRAISE has a root-mean-square deviation of less than 2.0 Å for 80.8 % of alignment pairs and achieves a median of less than 2.0 Å for eight of the 11 cases. The dataset used to rate the quality of the calculated alignments is freely available at http://www.zbh.uni-hamburg.de/mraise-dataset.html. The table of all PDB codes contained in the ensembles can be found in the supplementary material. The software tool mRAISE is freely available for evaluation purposes and academic use (see http://www.zbh.uni-hamburg.de/raise).

  2. Least Squares Approach to the Alignment of the Generic High Precision Tracking System

    NASA Astrophysics Data System (ADS)

    de Renstrom, Pawel Brückman; Haywood, Stephen

    2006-04-01

    A least squares method to solve a generic alignment problem of a high granularity tracking system is presented. The algorithm is based on an analytical linear expansion and allows for multiple nested fits, e.g. imposing a common vertex for groups of particle tracks is of particular interest. We present a consistent and complete recipe to impose constraints on either implicit or explicit parameters. The method has been applied to the full simulation of a subset of the ATLAS silicon tracking system. The ultimate goal is to determine ≈35,000 degrees of freedom (DoF's). We present a limited scale exercise exploring various aspects of the solution.

  3. Hydra multiple head star sensor and its in-flight self-calibration of optical heads alignment

    NASA Astrophysics Data System (ADS)

    Majewski, L.; Blarre, L.; Perrimon, N.; Kocher, Y.; Martinez, P. E.; Dussy, S.

    2017-11-01

    HYDRA is EADS SODERN new product line of APS-based autonomous star trackers. The baseline is a multiple head sensor made of three separated optical heads and one electronic unit. Actually the concept which was chosen offers more than three single-head star trackers working independently. Since HYDRA merges all fields of view the result is a more accurate, more robust and completely autonomous multiple-head sensor, releasing the AOCS from the need to manage the outputs of independent single-head star trackers. Specific to the multiple head architecture and the underlying data fusion, is the calibration of the relative alignments between the sensor optical heads. The performance of the sensor is related to its estimation of such alignments. HYDRA design is first reminded in this paper along with simplification it can bring at system level (AOCS). Then self-calibration of optical heads alignment is highlighted through descriptions and simulation results, thus demonstrating the performances of a key part of HYDRA multiple-head concept.

  4. The Astro-Blaster.

    ERIC Educational Resources Information Center

    Mancuso, Richard V.; Long, Kevin R.

    1995-01-01

    Presents the Astro-Blaster as a method of the laws of conservation of momentum and energy during the creation of a supernova. Several elastic balls are aligned for a drop, followed by multiple collisions which result in the top ball reaching tremendous heights relative to the drop height. (JRH)

  5. Retention time alignment of LC/MS data by a divide-and-conquer algorithm.

    PubMed

    Zhang, Zhongqi

    2012-04-01

    Liquid chromatography-mass spectrometry (LC/MS) has become the method of choice for characterizing complex mixtures. These analyses often involve quantitative comparison of components in multiple samples. To achieve automated sample comparison, the components of interest must be detected and identified, and their retention times aligned and peak areas calculated. This article describes a simple pairwise iterative retention time alignment algorithm, based on the divide-and-conquer approach, for alignment of ion features detected in LC/MS experiments. In this iterative algorithm, ion features in the sample run are first aligned with features in the reference run by applying a single constant shift of retention time. The sample chromatogram is then divided into two shorter chromatograms, which are aligned to the reference chromatogram the same way. Each shorter chromatogram is further divided into even shorter chromatograms. This process continues until each chromatogram is sufficiently narrow so that ion features within it have a similar retention time shift. In six pairwise LC/MS alignment examples containing a total of 6507 confirmed true corresponding feature pairs with retention time shifts up to five peak widths, the algorithm successfully aligned these features with an error rate of 0.2%. The alignment algorithm is demonstrated to be fast, robust, fully automatic, and superior to other algorithms. After alignment and gap-filling of detected ion features, their abundances can be tabulated for direct comparison between samples.

  6. Ingot slicing machine and method

    NASA Technical Reports Server (NTRS)

    Kuo, Y. S. (Inventor)

    1984-01-01

    An improved method for simultaneously slicing one or a multiplicity of boules of silicon into silicon wafers is described. A plurality of vertical stacks of horizontal saw blades of circular configuration are arranged in juxtaposed coaxial alignment. Each blade is characterized by having a cutting diameter slightly greater than the cutting diameter of the blade arranged immediately above, imparting a simultaneous rotation to the blades.

  7. Laser illumination of multiple capillaries that form a waveguide

    DOEpatents

    Dhadwal, Harbans S.; Quesada, Mark A.; Studier, F. William

    1998-08-04

    A system and method are disclosed for efficient laser illumination of the interiors of multiple capillaries simultaneously, and collection of light emitted from them. Capillaries in a parallel array can form an optical waveguide wherein refraction at the cylindrical surfaces confines side-on illuminating light to the core of each successive capillary in the array. Methods are provided for determining conditions where capillaries will form a waveguide and for assessing and minimizing losses due to reflection. Light can be delivered to the arrayed capillaries through an integrated fiber optic transmitter or through a pair of such transmitters aligned coaxially at opposite sides of the array. Light emitted from materials within the capillaries can be carried to a detection system through optical fibers, each of which collects light from a single capillary, with little cross talk between the capillaries. The collection ends of the optical fibers can be in a parallel array with the same spacing as the capillary array, so that the collection fibers can all be aligned to the capillaries simultaneously. Applicability includes improving the efficiency of many analytical methods that use capillaries, including particularly high-throughput DNA sequencing and diagnostic methods based on capillary electrophoresis.

  8. Laser illumination of multiple capillaries that form a waveguide

    DOEpatents

    Dhadwal, H.S.; Quesada, M.A.; Studier, F.W.

    1998-08-04

    A system and method are disclosed for efficient laser illumination of the interiors of multiple capillaries simultaneously, and collection of light emitted from them. Capillaries in a parallel array can form an optical waveguide wherein refraction at the cylindrical surfaces confines side-on illuminating light to the core of each successive capillary in the array. Methods are provided for determining conditions where capillaries will form a waveguide and for assessing and minimizing losses due to reflection. Light can be delivered to the arrayed capillaries through an integrated fiber optic transmitter or through a pair of such transmitters aligned coaxially at opposite sides of the array. Light emitted from materials within the capillaries can be carried to a detection system through optical fibers, each of which collects light from a single capillary, with little cross talk between the capillaries. The collection ends of the optical fibers can be in a parallel array with the same spacing as the capillary array, so that the collection fibers can all be aligned to the capillaries simultaneously. Applicability includes improving the efficiency of many analytical methods that use capillaries, including particularly high-throughput DNA sequencing and diagnostic methods based on capillary electrophoresis. 35 figs.

  9. SeqFIRE: a web application for automated extraction of indel regions and conserved blocks from protein multiple sequence alignments.

    PubMed

    Ajawatanawong, Pravech; Atkinson, Gemma C; Watson-Haigh, Nathan S; Mackenzie, Bryony; Baldauf, Sandra L

    2012-07-01

    Analyses of multiple sequence alignments generally focus on well-defined conserved sequence blocks, while the rest of the alignment is largely ignored or discarded. This is especially true in phylogenomics, where large multigene datasets are produced through automated pipelines. However, some of the most powerful phylogenetic markers have been found in the variable length regions of multiple alignments, particularly insertions/deletions (indels) in protein sequences. We have developed Sequence Feature and Indel Region Extractor (SeqFIRE) to enable the automated identification and extraction of indels from protein sequence alignments. The program can also extract conserved blocks and identify fast evolving sites using a combination of conservation and entropy. All major variables can be adjusted by the user, allowing them to identify the sets of variables most suited to a particular analysis or dataset. Thus, all major tasks in preparing an alignment for further analysis are combined in a single flexible and user-friendly program. The output includes a numbered list of indels, alignments in NEXUS format with indels annotated or removed and indel-only matrices. SeqFIRE is a user-friendly web application, freely available online at www.seqfire.org/.

  10. Analyses of the radiation of birnaviruses from diverse host phyla and of their evolutionary affinities with other double-stranded RNA and positive strand RNA viruses using robust structure-based multiple sequence alignments and advanced phylogenetic methods

    PubMed Central

    2013-01-01

    Background Birnaviruses form a distinct family of double-stranded RNA viruses infecting animals as different as vertebrates, mollusks, insects and rotifers. With such a wide host range, they constitute a good model for studying the adaptation to the host. Additionally, several lines of evidence link birnaviruses to positive strand RNA viruses and suggest that phylogenetic analyses may provide clues about transition. Results We characterized the genome of a birnavirus from the rotifer Branchionus plicalitis. We used X-ray structures of RNA-dependent RNA polymerases and capsid proteins to obtain multiple structure alignments that allowed us to obtain reliable multiple sequence alignments and we employed “advanced” phylogenetic methods to study the evolutionary relationships between some positive strand and double-stranded RNA viruses. We showed that the rotifer birnavirus genome exhibited an organization remarkably similar to other birnaviruses. As this host was phylogenetically very distant from the other known species targeted by birnaviruses, we revisited the evolutionary pathways within the Birnaviridae family using phylogenetic reconstruction methods. We also applied a number of phylogenetic approaches based on structurally conserved domains/regions of the capsid and RNA-dependent RNA polymerase proteins to study the evolutionary relationships between birnaviruses, other double-stranded RNA viruses and positive strand RNA viruses. Conclusions We show that there is a good correlation between the phylogeny of the birnaviruses and that of their hosts at the phylum level using the RNA-dependent RNA polymerase (genomic segment B) on the one hand and a concatenation of the capsid protein, protease and ribonucleoprotein (genomic segment A) on the other hand. This correlation tends to vanish within phyla. The use of advanced phylogenetic methods and robust structure-based multiple sequence alignments allowed us to obtain a more accurate picture (in terms of probability of the tree topologies) of the evolutionary affinities between double-stranded RNA and positive strand RNA viruses. In particular, we were able to show that there exists a good statistical support for the claims that dsRNA viruses are not monophyletic and that viruses with permuted RdRps belong to a common evolution lineage as previously proposed by other groups. We also propose a tree topology with a good statistical support describing the evolutionary relationships between the Picornaviridae, Caliciviridae, Flaviviridae families and a group including the Alphatetraviridae, Nodaviridae, Permutotretraviridae, Birnaviridae, and Cystoviridae families. PMID:23865988

  11. High-resolution imaging of basal cell carcinoma: a comparison between multiphoton microscopy with fluorescence lifetime imaging and reflectance confocal microscopy.

    PubMed

    Manfredini, Marco; Arginelli, Federica; Dunsby, Christopher; French, Paul; Talbot, Clifford; König, Karsten; Pellacani, Giovanni; Ponti, Giovanni; Seidenari, Stefania

    2013-02-01

    The aim of this study was to compare morphological aspects of basal cell carcinoma (BCC) as assessed by two different imaging methods: in vivo reflectance confocal microscopy (RCM) and multiphoton tomography with fluorescence lifetime imaging implementation (MPT-FLIM). The study comprised 16 BCCs for which a complete set of RCM and MPT-FLIM images were available. The presence of seven MPT-FLIM descriptors was evaluated. The presence of seven RCM equivalent parameters was scored in accordance to their extension. Chi-squared test with Fisher's exact test and Spearman's rank correlation coefficient were determined between MPT-FLIM scores and adjusted-RCM scores. MPT-FLIM and RCM descriptors of BCC were coupled to match the descriptors that define the same pathological structures. The comparison included: Streaming and Aligned elongated cells, Streaming with multiple directions and Double alignment, Palisading (RCM) and Palisading (MPT-FLIM), Typical tumor islands, and Cell islands surrounded by fibers, Dark silhouettes and Phantom islands, Plump bright cells and Melanophages, Vessels (RCM), and Vessels (MPT-FLIM). The parameters that were significantly correlated were Melanophages/Plump Bright Cells, Aligned elongated cells/Streaming, Double alignment/Streaming with multiple directions, and Palisading (MPT-FLIM)/Palisading (RCM). According to our data, both methods are suitable to image BCC's features. The concordance between MPT-FLIM and RCM is high, with some limitations due to the technical differences between the two devices. The hardest difficulty when comparing the images generated by the two imaging modalities is represented by their different field of view. © 2012 John Wiley & Sons A/S.

  12. Discovering Sequence Motifs with Arbitrary Insertions and Deletions

    PubMed Central

    Frith, Martin C.; Saunders, Neil F. W.; Kobe, Bostjan; Bailey, Timothy L.

    2008-01-01

    Biology is encoded in molecular sequences: deciphering this encoding remains a grand scientific challenge. Functional regions of DNA, RNA, and protein sequences often exhibit characteristic but subtle motifs; thus, computational discovery of motifs in sequences is a fundamental and much-studied problem. However, most current algorithms do not allow for insertions or deletions (indels) within motifs, and the few that do have other limitations. We present a method, GLAM2 (Gapped Local Alignment of Motifs), for discovering motifs allowing indels in a fully general manner, and a companion method GLAM2SCAN for searching sequence databases using such motifs. glam2 is a generalization of the gapless Gibbs sampling algorithm. It re-discovers variable-width protein motifs from the PROSITE database significantly more accurately than the alternative methods PRATT and SAM-T2K. Furthermore, it usefully refines protein motifs from the ELM database: in some cases, the refined motifs make orders of magnitude fewer overpredictions than the original ELM regular expressions. GLAM2 performs respectably on the BAliBASE multiple alignment benchmark, and may be superior to leading multiple alignment methods for “motif-like” alignments with N- and C-terminal extensions. Finally, we demonstrate the use of GLAM2 to discover protein kinase substrate motifs and a gapped DNA motif for the LIM-only transcriptional regulatory complex: using GLAM2SCAN, we identify promising targets for the latter. GLAM2 is especially promising for short protein motifs, and it should improve our ability to identify the protein cleavage sites, interaction sites, post-translational modification attachment sites, etc., that underlie much of biology. It may be equally useful for arbitrarily gapped motifs in DNA and RNA, although fewer examples of such motifs are known at present. GLAM2 is public domain software, available for download at http://bioinformatics.org.au/glam2. PMID:18437229

  13. NoFold: RNA structure clustering without folding or alignment.

    PubMed

    Middleton, Sarah A; Kim, Junhyong

    2014-11-01

    Structures that recur across multiple different transcripts, called structure motifs, often perform a similar function-for example, recruiting a specific RNA-binding protein that then regulates translation, splicing, or subcellular localization. Identifying common motifs between coregulated transcripts may therefore yield significant insight into their binding partners and mechanism of regulation. However, as most methods for clustering structures are based on folding individual sequences or doing many pairwise alignments, this results in a tradeoff between speed and accuracy that can be problematic for large-scale data sets. Here we describe a novel method for comparing and characterizing RNA secondary structures that does not require folding or pairwise alignment of the input sequences. Our method uses the idea of constructing a distance function between two objects by their respective distances to a collection of empirical examples or models, which in our case consists of 1973 Rfam family covariance models. Using this as a basis for measuring structural similarity, we developed a clustering pipeline called NoFold to automatically identify and annotate structure motifs within large sequence data sets. We demonstrate that NoFold can simultaneously identify multiple structure motifs with an average sensitivity of 0.80 and precision of 0.98 and generally exceeds the performance of existing methods. We also perform a cross-validation analysis of the entire set of Rfam families, achieving an average sensitivity of 0.57. We apply NoFold to identify motifs enriched in dendritically localized transcripts and report 213 enriched motifs, including both known and novel structures. © 2014 Middleton and Kim; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  14. Template based protein structure modeling by global optimization in CASP11.

    PubMed

    Joo, Keehyoung; Joung, InSuk; Lee, Sun Young; Kim, Jong Yun; Cheng, Qianyi; Manavalan, Balachandran; Joung, Jong Young; Heo, Seungryong; Lee, Juyong; Nam, Mikyung; Lee, In-Ho; Lee, Sung Jong; Lee, Jooyoung

    2016-09-01

    For the template-based modeling (TBM) of CASP11 targets, we have developed three new protein modeling protocols (nns for server prediction and LEE and LEER for human prediction) by improving upon our previous CASP protocols (CASP7 through CASP10). We applied the powerful global optimization method of conformational space annealing to three stages of optimization, including multiple sequence-structure alignment, three-dimensional (3D) chain building, and side-chain remodeling. For more successful fold recognition, a new alignment method called CRFalign was developed. It can incorporate sensitive positional and environmental dependence in alignment scores as well as strong nonlinear correlations among various features. Modifications and adjustments were made to the form of the energy function and weight parameters pertaining to the chain building procedure. For the side-chain remodeling step, residue-type dependence was introduced to the cutoff value that determines the entry of a rotamer to the side-chain modeling library. The improved performance of the nns server method is attributed to successful fold recognition achieved by combining several methods including CRFalign and to the current modeling formulation that can incorporate native-like structural aspects present in multiple templates. The LEE protocol is identical to the nns one except that CASP11-released server models are used as templates. The success of LEE in utilizing CASP11 server models indicates that proper template screening and template clustering assisted by appropriate cluster ranking promises a new direction to enhance protein 3D modeling. Proteins 2016; 84(Suppl 1):221-232. © 2015 Wiley Periodicals, Inc. © 2015 Wiley Periodicals, Inc.

  15. Multicolor microcontact printing of proteins on nanoporous surface for patterned immunoassay

    NASA Astrophysics Data System (ADS)

    Ng, Elaine; Gopal, Ashwini; Hoshino, Kazunori; Zhang, Xiaojing

    2011-07-01

    The large scale patterning of therapeutic proteins is a key to the efficient design, characterization, and production of biologics for cost effective, high throughput, and point-of-care detection and analysis system. We demonstrate an efficient method for protein deposition and adsorption on nanoporous silica substrates in specific patterns using a method called "micro-contact printing". Multiple color-tagged proteins can be printed through sequential application of such micro-patterning technique. Two groups of experiments were performed. In the first group, the protein stamp was aligned precisely with the printing sites, where the stamp was applied multiple times. Optimal conditions were identified for protein transfer and adsorption using the pore size of 4 nm and thickness of 30 nm porous silica thin film. In the second group, we demonstrate the patterning of two-color rabbit immunoglobin labeled with fluorescein isothiocyanate and tetramethyl rhodamine iso-thiocyanate on porous silica substrates that have a pore size 4 nm, porosity 57% and thickness of the porous layer 30 nm. A pair of protein stamps, with corresponding alignment markings and coupled patterns, were aligned and used to produce a two-colored stamp pattern of proteins on porous silica. Different colored proteins can be applied to exemplify the diverse protein composition within a sample. This method of multicolor microcontact printing can be used to perform a fluorescence-based patterned enzyme-linked immunosorbent assay to detect the presence of various proteins within a sample.

  16. Simple chained guide trees give high-quality protein multiple sequence alignments

    PubMed Central

    Boyce, Kieran; Sievers, Fabian; Higgins, Desmond G.

    2014-01-01

    Guide trees are used to decide the order of sequence alignment in the progressive multiple sequence alignment heuristic. These guide trees are often the limiting factor in making large alignments, and considerable effort has been expended over the years in making these quickly or accurately. In this article we show that, at least for protein families with large numbers of sequences that can be benchmarked with known structures, simple chained guide trees give the most accurate alignments. These also happen to be the fastest and simplest guide trees to construct, computationally. Such guide trees have a striking effect on the accuracy of alignments produced by some of the most widely used alignment packages. There is a marked increase in accuracy and a marked decrease in computational time, once the number of sequences goes much above a few hundred. This is true, even if the order of sequences in the guide tree is random. PMID:25002495

  17. Parallel seed-based approach to multiple protein structure similarities detection

    DOE PAGES

    Chapuis, Guillaume; Le Boudic-Jamin, Mathilde; Andonov, Rumen; ...

    2015-01-01

    Finding similarities between protein structures is a crucial task in molecular biology. Most of the existing tools require proteins to be aligned in order-preserving way and only find single alignments even when multiple similar regions exist. We propose a new seed-based approach that discovers multiple pairs of similar regions. Its computational complexity is polynomial and it comes with a quality guarantee—the returned alignments have both root mean squared deviations (coordinate-based as well as internal-distances based) lower than a given threshold, if such exist. We do not require the alignments to be order preserving (i.e., we consider nonsequential alignments), which makesmore » our algorithm suitable for detecting similar domains when comparing multidomain proteins as well as to detect structural repetitions within a single protein. Because the search space for nonsequential alignments is much larger than for sequential ones, the computational burden is addressed by extensive use of parallel computing techniques: a coarse-grain level parallelism making use of available CPU cores for computation and a fine-grain level parallelism exploiting bit-level concurrency as well as vector instructions.« less

  18. Alignment and integration of complex networks by hypergraph-based spectral clustering

    NASA Astrophysics Data System (ADS)

    Michoel, Tom; Nachtergaele, Bruno

    2012-11-01

    Complex networks possess a rich, multiscale structure reflecting the dynamical and functional organization of the systems they model. Often there is a need to analyze multiple networks simultaneously, to model a system by more than one type of interaction, or to go beyond simple pairwise interactions, but currently there is a lack of theoretical and computational methods to address these problems. Here we introduce a framework for clustering and community detection in such systems using hypergraph representations. Our main result is a generalization of the Perron-Frobenius theorem from which we derive spectral clustering algorithms for directed and undirected hypergraphs. We illustrate our approach with applications for local and global alignment of protein-protein interaction networks between multiple species, for tripartite community detection in folksonomies, and for detecting clusters of overlapping regulatory pathways in directed networks.

  19. Alignment and integration of complex networks by hypergraph-based spectral clustering.

    PubMed

    Michoel, Tom; Nachtergaele, Bruno

    2012-11-01

    Complex networks possess a rich, multiscale structure reflecting the dynamical and functional organization of the systems they model. Often there is a need to analyze multiple networks simultaneously, to model a system by more than one type of interaction, or to go beyond simple pairwise interactions, but currently there is a lack of theoretical and computational methods to address these problems. Here we introduce a framework for clustering and community detection in such systems using hypergraph representations. Our main result is a generalization of the Perron-Frobenius theorem from which we derive spectral clustering algorithms for directed and undirected hypergraphs. We illustrate our approach with applications for local and global alignment of protein-protein interaction networks between multiple species, for tripartite community detection in folksonomies, and for detecting clusters of overlapping regulatory pathways in directed networks.

  20. WEB-server for search of a periodicity in amino acid and nucleotide sequences

    NASA Astrophysics Data System (ADS)

    E Frenkel, F.; Skryabin, K. G.; Korotkov, E. V.

    2017-12-01

    A new web server (http://victoria.biengi.ac.ru/splinter/login.php) was designed and developed to search for periodicity in nucleotide and amino acid sequences. The web server operation is based upon a new mathematical method of searching for multiple alignments, which is founded on the position weight matrices optimization, as well as on implementation of the two-dimensional dynamic programming. This approach allows the construction of multiple alignments of the indistinctly similar amino acid and nucleotide sequences that accumulated more than 1.5 substitutions per a single amino acid or a nucleotide without performing the sequences paired comparisons. The article examines the principles of the web server operation and two examples of studying amino acid and nucleotide sequences, as well as information that could be obtained using the web server.

  1. ProfileGrids: a sequence alignment visualization paradigm that avoids the limitations of Sequence Logos.

    PubMed

    Roca, Alberto I

    2014-01-01

    The 2013 BioVis Contest provided an opportunity to evaluate different paradigms for visualizing protein multiple sequence alignments. Such data sets are becoming extremely large and thus taxing current visualization paradigms. Sequence Logos represent consensus sequences but have limitations for protein alignments. As an alternative, ProfileGrids are a new protein sequence alignment visualization paradigm that represents an alignment as a color-coded matrix of the residue frequency occurring at every homologous position in the aligned protein family. The JProfileGrid software program was used to analyze the BioVis contest data sets to generate figures for comparison with the Sequence Logo reference images. The ProfileGrid representation allows for the clear and effective analysis of protein multiple sequence alignments. This includes both a general overview of the conservation and diversity sequence patterns as well as the interactive ability to query the details of the protein residue distributions in the alignment. The JProfileGrid software is free and available from http://www.ProfileGrid.org.

  2. SubVis: an interactive R package for exploring the effects of multiple substitution matrices on pairwise sequence alignment

    PubMed Central

    Coan, Heather B.; Youker, Robert T.

    2017-01-01

    Understanding how proteins mutate is critical to solving a host of biological problems. Mutations occur when an amino acid is substituted for another in a protein sequence. The set of likelihoods for amino acid substitutions is stored in a matrix and input to alignment algorithms. The quality of the resulting alignment is used to assess the similarity of two or more sequences and can vary according to assumptions modeled by the substitution matrix. Substitution strategies with minor parameter variations are often grouped together in families. For example, the BLOSUM and PAM matrix families are commonly used because they provide a standard, predefined way of modeling substitutions. However, researchers often do not know if a given matrix family or any individual matrix within a family is the most suitable. Furthermore, predefined matrix families may inaccurately reflect a particular hypothesis that a researcher wishes to model or otherwise result in unsatisfactory alignments. In these cases, the ability to compare the effects of one or more custom matrices may be needed. This laborious process is often performed manually because the ability to simultaneously load multiple matrices and then compare their effects on alignments is not readily available in current software tools. This paper presents SubVis, an interactive R package for loading and applying multiple substitution matrices to pairwise alignments. Users can simultaneously explore alignments resulting from multiple predefined and custom substitution matrices. SubVis utilizes several of the alignment functions found in R, a common language among protein scientists. Functions are tied together with the Shiny platform which allows the modification of input parameters. Information regarding alignment quality and individual amino acid substitutions is displayed with the JavaScript language which provides interactive visualizations for revealing both high-level and low-level alignment information. PMID:28674656

  3. DISCO: Distance and Spectrum Correlation Optimization Alignment for Two Dimensional Gas Chromatography Time-of-Flight Mass Spectrometry-based Metabolomics

    PubMed Central

    Wang, Bing; Fang, Aiqin; Heim, John; Bogdanov, Bogdan; Pugh, Scott; Libardoni, Mark; Zhang, Xiang

    2010-01-01

    A novel peak alignment algorithm using a distance and spectrum correlation optimization (DISCO) method has been developed for two-dimensional gas chromatography time-of-flight mass spectrometry (GC×GC/TOF-MS) based metabolomics. This algorithm uses the output of the instrument control software, ChromaTOF, as its input data. It detects and merges multiple peak entries of the same metabolite into one peak entry in each input peak list. After a z-score transformation of metabolite retention times, DISCO selects landmark peaks from all samples based on both two-dimensional retention times and mass spectrum similarity of fragment ions measured by Pearson’s correlation coefficient. A local linear fitting method is employed in the original two-dimensional retention time space to correct retention time shifts. A progressive retention time map searching method is used to align metabolite peaks in all samples together based on optimization of the Euclidean distance and mass spectrum similarity. The effectiveness of the DISCO algorithm is demonstrated using data sets acquired under different experiment conditions and a spiked-in experiment. PMID:20476746

  4. Application of the MAFFT sequence alignment program to large data—reexamination of the usefulness of chained guide trees

    PubMed Central

    Yamada, Kazunori D.; Tomii, Kentaro; Katoh, Kazutaka

    2016-01-01

    Motivation: Large multiple sequence alignments (MSAs), consisting of thousands of sequences, are becoming more and more common, due to advances in sequencing technologies. The MAFFT MSA program has several options for building large MSAs, but their performances have not been sufficiently assessed yet, because realistic benchmarking of large MSAs has been difficult. Recently, such assessments have been made possible through the HomFam and ContTest benchmark protein datasets. Along with the development of these datasets, an interesting theory was proposed: chained guide trees increase the accuracy of MSAs of structurally conserved regions. This theory challenges the basis of progressive alignment methods and needs to be examined by being compared with other known methods including computationally intensive ones. Results: We used HomFam, ContTest and OXFam (an extended version of OXBench) to evaluate several methods enabled in MAFFT: (1) a progressive method with approximate guide trees, (2) a progressive method with chained guide trees, (3) a combination of an iterative refinement method and a progressive method and (4) a less approximate progressive method that uses a rigorous guide tree and consistency score. Other programs, Clustal Omega and UPP, available for large MSAs, were also included into the comparison. The effect of method 2 (chained guide trees) was positive in ContTest but negative in HomFam and OXFam. Methods 3 and 4 increased the benchmark scores more consistently than method 2 for the three datasets, suggesting that they are safer to use. Availability and Implementation: http://mafft.cbrc.jp/alignment/software/ Contact: katoh@ifrec.osaka-u.ac.jp Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27378296

  5. Highly Enhanced Gas Adsorption Properties in Vertically Aligned MoS2 Layers.

    PubMed

    Cho, Soo-Yeon; Kim, Seon Joon; Lee, Youhan; Kim, Jong-Seon; Jung, Woo-Bin; Yoo, Hae-Wook; Kim, Jihan; Jung, Hee-Tae

    2015-09-22

    In this work, we demonstrate that gas adsorption is significantly higher in edge sites of vertically aligned MoS2 compared to that of the conventional basal plane exposed MoS2 films. To compare the effect of the alignment of MoS2 on the gas adsorption properties, we synthesized three distinct MoS2 films with different alignment directions ((1) horizontally aligned MoS2 (basal plane exposed), (2) mixture of horizontally aligned MoS2 and vertically aligned layers (basal and edge exposed), and (3) vertically aligned MoS2 (edge exposed)) by using rapid sulfurization method of CVD process. Vertically aligned MoS2 film shows about 5-fold enhanced sensitivity to NO2 gas molecules compared to horizontally aligned MoS2 film. Vertically aligned MoS2 has superior resistance variation compared to horizontally aligned MoS2 even with same surface area exposed to identical concentration of gas molecules. We found that electrical response to target gas molecules correlates directly with the density of the exposed edge sites of MoS2 due to high adsorption of gas molecules onto edge sites of vertically aligned MoS2. Density functional theory (DFT) calculations corroborate the experimental results as stronger NO2 binding energies are computed for multiple configurations near the edge sites of MoS2, which verifies that electrical response to target gas molecules (NO2) correlates directly with the density of the exposed edge sites of MoS2 due to high adsorption of gas molecules onto edge sites of vertically aligned MoS2. We believe that this observation extends to other 2D TMD materials as well as MoS2 and can be applied to significantly enhance the gas sensor performance in these materials.

  6. Quantum Communication without Alignment using Multiple-Qubit Single-Photon States

    NASA Astrophysics Data System (ADS)

    Aolita, L.; Walborn, S. P.

    2007-03-01

    We propose a scheme for encoding logical qubits in a subspace protected against collective rotations around the propagation axis using the polarization and transverse spatial degrees of freedom of single photons. This encoding allows for quantum key distribution without the need of a shared reference frame. We present methods to generate entangled states of two logical qubits using present day down-conversion sources and linear optics, and show that the application of these entangled logical states to quantum information schemes allows for alignment-free tests of Bell’s inequalities, quantum dense coding, and quantum teleportation.

  7. Analysis of Ribosome Inactivating Protein (RIP): A Bioinformatics Approach

    NASA Astrophysics Data System (ADS)

    Jothi, G. Edward Gnana; Majilla, G. Sahaya Jose; Subhashini, D.; Deivasigamani, B.

    2012-10-01

    In spite of the medical advances in recent years, the world is in need of different sources to encounter certain health issues.Ribosome Inactivating Proteins (RIPs) were found to be one among them. In order to get easy access about RIPs, there is a need to analyse RIPs towards constructing a database on RIPs. Also, multiple sequence alignment was done towards screening for homologues of significant RIPs from rare sources against RIPs from easily available sources in terms of similarity. Protein sequences were retrieved from SWISS-PROT and are further analysed using pair wise and multiple sequence alignment.Analysis shows that, 151 RIPs have been characterized to date. Amongst them, there are 87 type I, 37 type II, 1 type III and 25 unknown RIPs. The sequence length information of various RIPs about the availability of full or partial sequence was also found. The multiple sequence alignment of 37 type I RIP using the online server Multalin, indicates the presence of 20 conserved residues. Pairwise alignment and multiple sequence alignment of certain selected RIPs in two groups namely Group I and Group II were carried out and the consensus level was found to be 98%, 98% and 90% respectively.

  8. Differential evolution-simulated annealing for multiple sequence alignment

    NASA Astrophysics Data System (ADS)

    Addawe, R. C.; Addawe, J. M.; Sueño, M. R. K.; Magadia, J. C.

    2017-10-01

    Multiple sequence alignments (MSA) are used in the analysis of molecular evolution and sequence structure relationships. In this paper, a hybrid algorithm, Differential Evolution - Simulated Annealing (DESA) is applied in optimizing multiple sequence alignments (MSAs) based on structural information, non-gaps percentage and totally conserved columns. DESA is a robust algorithm characterized by self-organization, mutation, crossover, and SA-like selection scheme of the strategy parameters. Here, the MSA problem is treated as a multi-objective optimization problem of the hybrid evolutionary algorithm, DESA. Thus, we name the algorithm as DESA-MSA. Simulated sequences and alignments were generated to evaluate the accuracy and efficiency of DESA-MSA using different indel sizes, sequence lengths, deletion rates and insertion rates. The proposed hybrid algorithm obtained acceptable solutions particularly for the MSA problem evaluated based on the three objectives.

  9. Query-seeded iterative sequence similarity searching improves selectivity 5–20-fold

    PubMed Central

    Li, Weizhong; Lopez, Rodrigo

    2017-01-01

    Abstract Iterative similarity search programs, like psiblast, jackhmmer, and psisearch, are much more sensitive than pairwise similarity search methods like blast and ssearch because they build a position specific scoring model (a PSSM or HMM) that captures the pattern of sequence conservation characteristic to a protein family. But models are subject to contamination; once an unrelated sequence has been added to the model, homologs of the unrelated sequence will also produce high scores, and the model can diverge from the original protein family. Examination of alignment errors during psiblast PSSM contamination suggested a simple strategy for dramatically reducing PSSM contamination. psiblast PSSMs are built from the query-based multiple sequence alignment (MSA) implied by the pairwise alignments between the query model (PSSM, HMM) and the subject sequences in the library. When the original query sequence residues are inserted into gapped positions in the aligned subject sequence, the resulting PSSM rarely produces alignment over-extensions or alignments to unrelated sequences. This simple step, which tends to anchor the PSSM to the original query sequence and slightly increase target percent identity, can reduce the frequency of false-positive alignments more than 20-fold compared with psiblast and jackhmmer, with little loss in search sensitivity. PMID:27923999

  10. A machine-learning approach reveals that alignment properties alone can accurately predict inference of lateral gene transfer from discordant phylogenies.

    PubMed

    Roettger, Mayo; Martin, William; Dagan, Tal

    2009-09-01

    Among the methods currently used in phylogenomic practice to detect the presence of lateral gene transfer (LGT), one of the most frequently employed is the comparison of gene tree topologies for different genes. In cases where the phylogenies for different genes are incompatible, or discordant, for well-supported branches there are three simple interpretations for the result: 1) gene duplications (paralogy) followed by many independent gene losses have occurred, 2) LGT has occurred, or 3) the phylogeny is well supported but for reasons unknown is nonetheless incorrect. Here, we focus on the third possibility by examining the properties of 22,437 published multiple sequence alignments, the Bayesian maximum likelihood trees for which either do or do not suggest the occurrence of LGT by the criterion of discordant branches. The alignments that produce discordant phylogenies differ significantly in several salient alignment properties from those that do not. Using a support vector machine, we were able to predict the inference of discordant tree topologies with up to 80% accuracy from alignment properties alone.

  11. New Challenges of the Computation of Multiple Sequence Alignments in the High-Throughput Era (2010 JGI/ANL HPC Workshop)

    ScienceCinema

    Notredame, Cedric

    2018-05-02

    Cedric Notredame from the Centre for Genomic Regulation gives a presentation on New Challenges of the Computation of Multiple Sequence Alignments in the High-Throughput Era at the JGI/Argonne HPC Workshop on January 26, 2010.

  12. Two Simple and Efficient Algorithms to Compute the SP-Score Objective Function of a Multiple Sequence Alignment.

    PubMed

    Ranwez, Vincent

    2016-01-01

    Multiple sequence alignment (MSA) is a crucial step in many molecular analyses and many MSA tools have been developed. Most of them use a greedy approach to construct a first alignment that is then refined by optimizing the sum of pair score (SP-score). The SP-score estimation is thus a bottleneck for most MSA tools since it is repeatedly required and is time consuming. Given an alignment of n sequences and L sites, I introduce here optimized solutions reaching O(nL) time complexity for affine gap cost, instead of O(n2L), which are easy to implement.

  13. Multiple Frequency Audio Signal Communication as a Mechanism for Neurophysiology and Video Data Synchronization

    PubMed Central

    Topper, Nicholas C.; Burke, S.N.; Maurer, A.P.

    2014-01-01

    BACKGROUND Current methods for aligning neurophysiology and video data are either prepackaged, requiring the additional purchase of a software suite, or use a blinking LED with a stationary pulse-width and frequency. These methods lack significant user interface for adaptation, are expensive, or risk a misalignment of the two data streams. NEW METHOD A cost-effective means to obtain high-precision alignment of behavioral and neurophysiological data is obtained by generating an audio-pulse embedded with two domains of information, a low-frequency binary-counting signal and a high, randomly changing frequency. This enabled the derivation of temporal information while maintaining enough entropy in the system for algorithmic alignment. RESULTS The sample to frame index constructed using the audio input correlation method described in this paper enables video and data acquisition to be aligned at a sub-frame level of precision. COMPARISONS WITH EXISTING METHOD Traditionally, a synchrony pulse is recorded on-screen via a flashing diode. The higher sampling rate of the audio input of the camcorder enables the timing of an event to be detected with greater precision. CONCLUSIONS While On-line analysis and synchronization using specialized equipment may be the ideal situation in some cases, the method presented in the current paper presents a viable, low cost alternative, and gives the flexibility to interface with custom off-line analysis tools. Moreover, the ease of constructing and implements this set-up presented in the current paper makes it applicable to a wide variety of applications that require video recording. PMID:25256648

  14. TaxI: a software tool for DNA barcoding using distance methods

    PubMed Central

    Steinke, Dirk; Vences, Miguel; Salzburger, Walter; Meyer, Axel

    2005-01-01

    DNA barcoding is a promising approach to the diagnosis of biological diversity in which DNA sequences serve as the primary key for information retrieval. Most existing software for evolutionary analysis of DNA sequences was designed for phylogenetic analyses and, hence, those algorithms do not offer appropriate solutions for the rapid, but precise analyses needed for DNA barcoding, and are also unable to process the often large comparative datasets. We developed a flexible software tool for DNA taxonomy, named TaxI. This program calculates sequence divergences between a query sequence (taxon to be barcoded) and each sequence of a dataset of reference sequences defined by the user. Because the analysis is based on separate pairwise alignments this software is also able to work with sequences characterized by multiple insertions and deletions that are difficult to align in large sequence sets (i.e. thousands of sequences) by multiple alignment algorithms because of computational restrictions. Here, we demonstrate the utility of this approach with two datasets of fish larvae and juveniles from Lake Constance and juvenile land snails under different models of sequence evolution. Sets of ribosomal 16S rRNA sequences, characterized by multiple indels, performed as good as or better than cox1 sequence sets in assigning sequences to species, demonstrating the suitability of rRNA genes for DNA barcoding. PMID:16214755

  15. Parallel algorithms for large-scale biological sequence alignment on Xeon-Phi based clusters.

    PubMed

    Lan, Haidong; Chan, Yuandong; Xu, Kai; Schmidt, Bertil; Peng, Shaoliang; Liu, Weiguo

    2016-07-19

    Computing alignments between two or more sequences are common operations frequently performed in computational molecular biology. The continuing growth of biological sequence databases establishes the need for their efficient parallel implementation on modern accelerators. This paper presents new approaches to high performance biological sequence database scanning with the Smith-Waterman algorithm and the first stage of progressive multiple sequence alignment based on the ClustalW heuristic on a Xeon Phi-based compute cluster. Our approach uses a three-level parallelization scheme to take full advantage of the compute power available on this type of architecture; i.e. cluster-level data parallelism, thread-level coarse-grained parallelism, and vector-level fine-grained parallelism. Furthermore, we re-organize the sequence datasets and use Xeon Phi shuffle operations to improve I/O efficiency. Evaluations show that our method achieves a peak overall performance up to 220 GCUPS for scanning real protein sequence databanks on a single node consisting of two Intel E5-2620 CPUs and two Intel Xeon Phi 7110P cards. It also exhibits good scalability in terms of sequence length and size, and number of compute nodes for both database scanning and multiple sequence alignment. Furthermore, the achieved performance is highly competitive in comparison to optimized Xeon Phi and GPU implementations. Our implementation is available at https://github.com/turbo0628/LSDBS-mpi .

  16. New nurse transition: success through aligning multiple identities.

    PubMed

    Leong, Yee Mun Jessica; Crossman, Joanna

    2015-01-01

    The purpose of this paper is to explore the perceptions of new nurses in Singapore of their experiences of role transition and to examine the implications for managers in terms of employee training, development and retention. This qualitative study was conducted using a constructivist grounded theory approach. In total 26 novice nurses and five preceptors (n=31) from five different hospitals participated in the study. Data were collected from semi-structured interviews and reflective journal entries and analysed using the constant comparative method. The findings revealed that novice nurses remained emotionally and physically challenged when experiencing role transition. Two major constructs appear to play an important part in the transition process; learning how to Fit in and aligning personal with professional and organisational identities. The findings highlight factors that facilitate or impede Fitting in and aligning these identities. Although the concept of Fitting in and its relation to the attrition of novice nurses has been explored in global studies, that relationship has not yet been theorised as the dynamic alignment of multiple identities. Also, whilst most research around Fitting in, identity and retention has been conducted in western countries, little is known about these issues and their interrelationship in the context of Singapore. The study should inform decision making by healthcare organisations, nurse managers and nursing training institutions with respect to improving the transition experience of novice nurses.

  17. School-Based Assessment of ADHD: Purpose, Alignment with Best Practice Guidelines, and Training

    ERIC Educational Resources Information Center

    Ogg, Julia; Fefer, Sarah; Sundman-Wheat, Ashley; McMahan, Melanie; Stewart, Tiffany; Chappel, Ashley; Bateman, Lisa

    2013-01-01

    Youth exhibiting symptoms of attention deficit hyperactivity disorder are frequently referred to school psychologists because of academic, social, and behavioral difficulties that they face. To address these difficulties, evidence-based assessment methods have been outlined for multiple purposes of assessment. The goals of this study were to…

  18. Swarm intelligence in bioinformatics: methods and implementations for discovering patterns of multiple sequences.

    PubMed

    Cui, Zhihua; Zhang, Yi

    2014-02-01

    As a promising and innovative research field, bioinformatics has attracted increasing attention recently. Beneath the enormous number of open problems in this field, one fundamental issue is about the accurate and efficient computational methodology that can deal with tremendous amounts of data. In this paper, we survey some applications of swarm intelligence to discover patterns of multiple sequences. To provide a deep insight, ant colony optimization, particle swarm optimization, artificial bee colony and artificial fish swarm algorithm are selected, and their applications to multiple sequence alignment and motif detecting problem are discussed.

  19. Bellerophon: A program to detect chimeric sequences in multiple sequence alignments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Huber, Thomas; Faulkner, Geoffrey; Hugenholtz, Philip

    2003-12-23

    Bellerophon is a program for detecting chimeric sequences in multiple sequence datasets by an adaption of partial treeing analysis. Bellerophon was specifically developed to detect 16S rRNA gene chimeras in PCR-clone libraries of environmental samples but can be applied to other nucleotide sequence alignments.

  20. Nanomechanical testing system

    DOEpatents

    Vodnick, David James; Dwivedi, Arpit; Keranen, Lucas Paul; Okerlund, Michael David; Schmitz, Roger William; Warren, Oden Lee; Young, Christopher David

    2014-07-08

    An automated testing system includes systems and methods to facilitate inline production testing of samples at a micro (multiple microns) or less scale with a mechanical testing instrument. In an example, the system includes a probe changing assembly for coupling and decoupling a probe of the instrument. The probe changing assembly includes a probe change unit configured to grasp one of a plurality of probes in a probe magazine and couple one of the probes with an instrument probe receptacle. An actuator is coupled with the probe change unit, and the actuator is configured to move and align the probe change unit with the probe magazine and the instrument probe receptacle. In another example, the automated testing system includes a multiple degree of freedom stage for aligning a sample testing location with the instrument. The stage includes a sample stage and a stage actuator assembly including translational and rotational actuators.

  1. Nanomechanical testing system

    DOEpatents

    Vodnick, David James; Dwivedi, Arpit; Keranen, Lucas Paul; Okerlund, Michael David; Schmitz, Roger William; Warren, Oden Lee; Young, Christopher David

    2015-01-27

    An automated testing system includes systems and methods to facilitate inline production testing of samples at a micro (multiple microns) or less scale with a mechanical testing instrument. In an example, the system includes a probe changing assembly for coupling and decoupling a probe of the instrument. The probe changing assembly includes a probe change unit configured to grasp one of a plurality of probes in a probe magazine and couple one of the probes with an instrument probe receptacle. An actuator is coupled with the probe change unit, and the actuator is configured to move and align the probe change unit with the probe magazine and the instrument probe receptacle. In another example, the automated testing system includes a multiple degree of freedom stage for aligning a sample testing location with the instrument. The stage includes a sample stage and a stage actuator assembly including translational and rotational actuators.

  2. Nanomechanical testing system

    DOEpatents

    Vodnick, David James; Dwivedi, Arpit; Keranen, Lucas Paul; Okerlund, Michael David; Schmitz, Roger William; Warren, Oden Lee; Young, Christopher David

    2015-02-24

    An automated testing system includes systems and methods to facilitate inline production testing of samples at a micro (multiple microns) or less scale with a mechanical testing instrument. In an example, the system includes a probe changing assembly for coupling and decoupling a probe of the instrument. The probe changing assembly includes a probe change unit configured to grasp one of a plurality of probes in a probe magazine and couple one of the probes with an instrument probe receptacle. An actuator is coupled with the probe change unit, and the actuator is configured to move and align the probe change unit with the probe magazine and the instrument probe receptacle. In another example, the automated testing system includes a multiple degree of freedom stage for aligning a sample testing location with the instrument. The stage includes a sample stage and a stage actuator assembly including translational and rotational actuators.

  3. A distributed system for fast alignment of next-generation sequencing data.

    PubMed

    Srimani, Jaydeep K; Wu, Po-Yen; Phan, John H; Wang, May D

    2010-12-01

    We developed a scalable distributed computing system using the Berkeley Open Interface for Network Computing (BOINC) to align next-generation sequencing (NGS) data quickly and accurately. NGS technology is emerging as a promising platform for gene expression analysis due to its high sensitivity compared to traditional genomic microarray technology. However, despite the benefits, NGS datasets can be prohibitively large, requiring significant computing resources to obtain sequence alignment results. Moreover, as the data and alignment algorithms become more prevalent, it will become necessary to examine the effect of the multitude of alignment parameters on various NGS systems. We validate the distributed software system by (1) computing simple timing results to show the speed-up gained by using multiple computers, (2) optimizing alignment parameters using simulated NGS data, and (3) computing NGS expression levels for a single biological sample using optimal parameters and comparing these expression levels to that of a microarray sample. Results indicate that the distributed alignment system achieves approximately a linear speed-up and correctly distributes sequence data to and gathers alignment results from multiple compute clients.

  4. ProfileGrids: a sequence alignment visualization paradigm that avoids the limitations of Sequence Logos

    PubMed Central

    2014-01-01

    Background The 2013 BioVis Contest provided an opportunity to evaluate different paradigms for visualizing protein multiple sequence alignments. Such data sets are becoming extremely large and thus taxing current visualization paradigms. Sequence Logos represent consensus sequences but have limitations for protein alignments. As an alternative, ProfileGrids are a new protein sequence alignment visualization paradigm that represents an alignment as a color-coded matrix of the residue frequency occurring at every homologous position in the aligned protein family. Results The JProfileGrid software program was used to analyze the BioVis contest data sets to generate figures for comparison with the Sequence Logo reference images. Conclusions The ProfileGrid representation allows for the clear and effective analysis of protein multiple sequence alignments. This includes both a general overview of the conservation and diversity sequence patterns as well as the interactive ability to query the details of the protein residue distributions in the alignment. The JProfileGrid software is free and available from http://www.ProfileGrid.org. PMID:25237393

  5. The ConSurf-DB: pre-calculated evolutionary conservation profiles of protein structures.

    PubMed

    Goldenberg, Ofir; Erez, Elana; Nimrod, Guy; Ben-Tal, Nir

    2009-01-01

    ConSurf-DB is a repository for evolutionary conservation analysis of the proteins of known structures in the Protein Data Bank (PDB). Sequence homologues of each of the PDB entries were collected and aligned using standard methods. The evolutionary conservation of each amino acid position in the alignment was calculated using the Rate4Site algorithm, implemented in the ConSurf web server. The algorithm takes into account the phylogenetic relations between the aligned proteins and the stochastic nature of the evolutionary process explicitly. Rate4Site assigns a conservation level for each position in the multiple sequence alignment using an empirical Bayesian inference. Visual inspection of the conservation patterns on the 3D structure often enables the identification of key residues that comprise the functionally important regions of the protein. The repository is updated with the latest PDB entries on a monthly basis and will be rebuilt annually. ConSurf-DB is available online at http://consurfdb.tau.ac.il/

  6. The ConSurf-DB: pre-calculated evolutionary conservation profiles of protein structures

    PubMed Central

    Goldenberg, Ofir; Erez, Elana; Nimrod, Guy; Ben-Tal, Nir

    2009-01-01

    ConSurf-DB is a repository for evolutionary conservation analysis of the proteins of known structures in the Protein Data Bank (PDB). Sequence homologues of each of the PDB entries were collected and aligned using standard methods. The evolutionary conservation of each amino acid position in the alignment was calculated using the Rate4Site algorithm, implemented in the ConSurf web server. The algorithm takes into account the phylogenetic relations between the aligned proteins and the stochastic nature of the evolutionary process explicitly. Rate4Site assigns a conservation level for each position in the multiple sequence alignment using an empirical Bayesian inference. Visual inspection of the conservation patterns on the 3D structure often enables the identification of key residues that comprise the functionally important regions of the protein. The repository is updated with the latest PDB entries on a monthly basis and will be rebuilt annually. ConSurf-DB is available online at http://consurfdb.tau.ac.il/ PMID:18971256

  7. Optimizing multiple sequence alignments using a genetic algorithm based on three objectives: structural information, non-gaps percentage and totally conserved columns.

    PubMed

    Ortuño, Francisco M; Valenzuela, Olga; Rojas, Fernando; Pomares, Hector; Florido, Javier P; Urquiza, Jose M; Rojas, Ignacio

    2013-09-01

    Multiple sequence alignments (MSAs) are widely used approaches in bioinformatics to carry out other tasks such as structure predictions, biological function analyses or phylogenetic modeling. However, current tools usually provide partially optimal alignments, as each one is focused on specific biological features. Thus, the same set of sequences can produce different alignments, above all when sequences are less similar. Consequently, researchers and biologists do not agree about which is the most suitable way to evaluate MSAs. Recent evaluations tend to use more complex scores including further biological features. Among them, 3D structures are increasingly being used to evaluate alignments. Because structures are more conserved in proteins than sequences, scores with structural information are better suited to evaluate more distant relationships between sequences. The proposed multiobjective algorithm, based on the non-dominated sorting genetic algorithm, aims to jointly optimize three objectives: STRIKE score, non-gaps percentage and totally conserved columns. It was significantly assessed on the BAliBASE benchmark according to the Kruskal-Wallis test (P < 0.01). This algorithm also outperforms other aligners, such as ClustalW, Multiple Sequence Alignment Genetic Algorithm (MSA-GA), PRRP, DIALIGN, Hidden Markov Model Training (HMMT), Pattern-Induced Multi-sequence Alignment (PIMA), MULTIALIGN, Sequence Alignment Genetic Algorithm (SAGA), PILEUP, Rubber Band Technique Genetic Algorithm (RBT-GA) and Vertical Decomposition Genetic Algorithm (VDGA), according to the Wilcoxon signed-rank test (P < 0.05), whereas it shows results not significantly different to 3D-COFFEE (P > 0.05) with the advantage of being able to use less structures. Structural information is included within the objective function to evaluate more accurately the obtained alignments. The source code is available at http://www.ugr.es/~fortuno/MOSAStrE/MO-SAStrE.zip.

  8. A new method to cluster genomes based on cumulative Fourier power spectrum.

    PubMed

    Dong, Rui; Zhu, Ziyue; Yin, Changchuan; He, Rong L; Yau, Stephen S-T

    2018-06-20

    Analyzing phylogenetic relationships using mathematical methods has always been of importance in bioinformatics. Quantitative research may interpret the raw biological data in a precise way. Multiple Sequence Alignment (MSA) is used frequently to analyze biological evolutions, but is very time-consuming. When the scale of data is large, alignment methods cannot finish calculation in reasonable time. Therefore, we present a new method using moments of cumulative Fourier power spectrum in clustering the DNA sequences. Each sequence is translated into a vector in Euclidean space. Distances between the vectors can reflect the relationships between sequences. The mapping between the spectra and moment vector is one-to-one, which means that no information is lost in the power spectra during the calculation. We cluster and classify several datasets including Influenza A, primates, and human rhinovirus (HRV) datasets to build up the phylogenetic trees. Results show that the new proposed cumulative Fourier power spectrum is much faster and more accurately than MSA and another alignment-free method known as k-mer. The research provides us new insights in the study of phylogeny, evolution, and efficient DNA comparison algorithms for large genomes. The computer programs of the cumulative Fourier power spectrum are available at GitHub (https://github.com/YaulabTsinghua/cumulative-Fourier-power-spectrum). Copyright © 2018. Published by Elsevier B.V.

  9. A line-source method for aligning on-board and other pinhole SPECT systems

    PubMed Central

    Yan, Susu; Bowsher, James; Yin, Fang-Fang

    2013-01-01

    Purpose: In order to achieve functional and molecular imaging as patients are in position for radiation therapy, a robotic multipinhole SPECT system is being developed. Alignment of the SPECT system—to the linear accelerator (LINAC) coordinate frame and to the coordinate frames of other on-board imaging systems such as cone-beam CT (CBCT)—is essential for target localization and image reconstruction. An alignment method that utilizes line sources and one pinhole projection is proposed and investigated to achieve this goal. Potentially, this method could also be applied to the calibration of the other pinhole SPECT systems. Methods: An alignment model consisting of multiple alignment parameters was developed which maps line sources in three-dimensional (3D) space to their two-dimensional (2D) projections on the SPECT detector. In a computer-simulation study, 3D coordinates of line-sources were defined in a reference room coordinate frame, such as the LINAC coordinate frame. Corresponding 2D line-source projections were generated by computer simulation that included SPECT blurring and noise effects. The Radon transform was utilized to detect angles (α) and offsets (ρ) of the line-source projections. Alignment parameters were then estimated by a nonlinear least squares method, based on the α and ρ values and the alignment model. Alignment performance was evaluated as a function of number of line sources, Radon transform accuracy, finite line-source width, intrinsic camera resolution, Poisson noise, and acquisition geometry. Experimental evaluations were performed using a physical line-source phantom and a pinhole-collimated gamma camera attached to a robot. Results: In computer-simulation studies, when there was no error in determining angles (α) and offsets (ρ) of the measured projections, six alignment parameters (three translational and three rotational) were estimated perfectly using three line sources. When angles (α) and offsets (ρ) were provided by the Radon transform, estimation accuracy was reduced. The estimation error was associated with rounding errors of Radon transform, finite line-source width, Poisson noise, number of line sources, intrinsic camera resolution, and detector acquisition geometry. Statistically, the estimation accuracy was significantly improved by using four line sources rather than three and by thinner line-source projections (obtained by better intrinsic detector resolution). With five line sources, median errors were 0.2 mm for the detector translations, 0.7 mm for the detector radius of rotation, and less than 0.5° for detector rotation, tilt, and twist. In experimental evaluations, average errors relative to a different, independent registration technique were about 1.8 mm for detector translations, 1.1 mm for the detector radius of rotation (ROR), 0.5° and 0.4° for detector rotation and tilt, respectively, and 1.2° for detector twist. Conclusions: Alignment parameters can be estimated using one pinhole projection of line sources. Alignment errors are largely associated with limited accuracy of the Radon transform in determining angles (α) and offsets (ρ) of the line-source projections. This alignment method may be important for multipinhole SPECT, where relative pinhole alignment may vary during rotation. For pinhole and multipinhole SPECT imaging on-board radiation therapy machines, the method could provide alignment of SPECT coordinates with those of CBCT and the LINAC. PMID:24320537

  10. BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements.

    PubMed

    De Witte, Dieter; Van de Velde, Jan; Decap, Dries; Van Bel, Michiel; Audenaert, Pieter; Demeester, Piet; Dhoedt, Bart; Vandepoele, Klaas; Fostier, Jan

    2015-12-01

    The accurate discovery and annotation of regulatory elements remains a challenging problem. The growing number of sequenced genomes creates new opportunities for comparative approaches to motif discovery. Putative binding sites are then considered to be functional if they are conserved in orthologous promoter sequences of multiple related species. Existing methods for comparative motif discovery usually rely on pregenerated multiple sequence alignments, which are difficult to obtain for more diverged species such as plants. As a consequence, misaligned regulatory elements often remain undetected. We present a novel algorithm that supports both alignment-free and alignment-based motif discovery in the promoter sequences of related species. Putative motifs are exhaustively enumerated as words over the IUPAC alphabet and screened for conservation using the branch length score. Additionally, a confidence score is established in a genome-wide fashion. In order to take advantage of a cloud computing infrastructure, the MapReduce programming model is adopted. The method is applied to four monocotyledon plant species and it is shown that high-scoring motifs are significantly enriched for open chromatin regions in Oryza sativa and for transcription factor binding sites inferred through protein-binding microarrays in O.sativa and Zea mays. Furthermore, the method is shown to recover experimentally profiled ga2ox1-like KN1 binding sites in Z.mays. BLSSpeller was written in Java. Source code and manual are available at http://bioinformatics.intec.ugent.be/blsspeller Klaas.Vandepoele@psb.vib-ugent.be or jan.fostier@intec.ugent.be. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.

  11. BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements

    PubMed Central

    De Witte, Dieter; Van de Velde, Jan; Decap, Dries; Van Bel, Michiel; Audenaert, Pieter; Demeester, Piet; Dhoedt, Bart; Vandepoele, Klaas; Fostier, Jan

    2015-01-01

    Motivation: The accurate discovery and annotation of regulatory elements remains a challenging problem. The growing number of sequenced genomes creates new opportunities for comparative approaches to motif discovery. Putative binding sites are then considered to be functional if they are conserved in orthologous promoter sequences of multiple related species. Existing methods for comparative motif discovery usually rely on pregenerated multiple sequence alignments, which are difficult to obtain for more diverged species such as plants. As a consequence, misaligned regulatory elements often remain undetected. Results: We present a novel algorithm that supports both alignment-free and alignment-based motif discovery in the promoter sequences of related species. Putative motifs are exhaustively enumerated as words over the IUPAC alphabet and screened for conservation using the branch length score. Additionally, a confidence score is established in a genome-wide fashion. In order to take advantage of a cloud computing infrastructure, the MapReduce programming model is adopted. The method is applied to four monocotyledon plant species and it is shown that high-scoring motifs are significantly enriched for open chromatin regions in Oryza sativa and for transcription factor binding sites inferred through protein-binding microarrays in O.sativa and Zea mays. Furthermore, the method is shown to recover experimentally profiled ga2ox1-like KN1 binding sites in Z.mays. Availability and implementation: BLSSpeller was written in Java. Source code and manual are available at http://bioinformatics.intec.ugent.be/blsspeller Contact: Klaas.Vandepoele@psb.vib-ugent.be or jan.fostier@intec.ugent.be Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26254488

  12. Fungi in Thailand: a case study of the efficacy of an ITS barcode for automatically identifying species within the Annulohypoxylon and Hypoxylon genera.

    PubMed

    Suwannasai, Nuttika; Martín, María P; Phosri, Cherdchai; Sihanonth, Prakitsin; Whalley, Anthony J S; Spouge, John L

    2013-01-01

    Thailand, a part of the Indo-Burma biodiversity hotspot, has many endemic animals and plants. Some of its fungal species are difficult to recognize and separate, complicating assessments of biodiversity. We assessed species diversity within the fungal genera Annulohypoxylon and Hypoxylon, which produce biologically active and potentially therapeutic compounds, by applying classical taxonomic methods to 552 teleomorphs collected from across Thailand. Using probability of correct identification (PCI), we also assessed the efficacy of automated species identification with a fungal barcode marker, ITS, in the model system of Annulohypoxylon and Hypoxylon. The 552 teleomorphs yielded 137 ITS sequences; in addition, we examined 128 GenBank ITS sequences, to assess biases in evaluating a DNA barcode with GenBank data. The use of multiple sequence alignment in a barcode database like BOLD raises some concerns about non-protein barcode markers like ITS, so we also compared species identification using different alignment methods. Our results suggest the following. (1) Multiple sequence alignment of ITS sequences is competitive with pairwise alignment when identifying species, so BOLD should be able to preserve its present bioinformatics workflow for species identification for ITS, and possibly therefore with at least some other non-protein barcode markers. (2) Automated species identification is insensitive to a specific choice of evolutionary distance, contributing to resolution of a current debate in DNA barcoding. (3) Statistical methods are available to address, at least partially, the possibility of expert misidentification of species. Phylogenetic trees discovered a cryptic species and strongly supported monophyletic clades for many Annulohypoxylon and Hypoxylon species, suggesting that ITS can contribute usefully to a barcode for these fungi. The PCIs here, derived solely from ITS, suggest that a fungal barcode will require secondary markers in Annulohypoxylon and Hypoxylon, however. The URL http://tinyurl.com/spouge-barcode contains computer programs and other supplementary material relevant to this article.

  13. A new fast method for inferring multiple consensus trees using k-medoids.

    PubMed

    Tahiri, Nadia; Willems, Matthieu; Makarenkov, Vladimir

    2018-04-05

    Gene trees carry important information about specific evolutionary patterns which characterize the evolution of the corresponding gene families. However, a reliable species consensus tree cannot be inferred from a multiple sequence alignment of a single gene family or from the concatenation of alignments corresponding to gene families having different evolutionary histories. These evolutionary histories can be quite different due to horizontal transfer events or to ancient gene duplications which cause the emergence of paralogs within a genome. Many methods have been proposed to infer a single consensus tree from a collection of gene trees. Still, the application of these tree merging methods can lead to the loss of specific evolutionary patterns which characterize some gene families or some groups of gene families. Thus, the problem of inferring multiple consensus trees from a given set of gene trees becomes relevant. We describe a new fast method for inferring multiple consensus trees from a given set of phylogenetic trees (i.e. additive trees or X-trees) defined on the same set of species (i.e. objects or taxa). The traditional consensus approach yields a single consensus tree. We use the popular k-medoids partitioning algorithm to divide a given set of trees into several clusters of trees. We propose novel versions of the well-known Silhouette and Caliński-Harabasz cluster validity indices that are adapted for tree clustering with k-medoids. The efficiency of the new method was assessed using both synthetic and real data, such as a well-known phylogenetic dataset consisting of 47 gene trees inferred for 14 archaeal organisms. The method described here allows inference of multiple consensus trees from a given set of gene trees. It can be used to identify groups of gene trees having similar intragroup and different intergroup evolutionary histories. The main advantage of our method is that it is much faster than the existing tree clustering approaches, while providing similar or better clustering results in most cases. This makes it particularly well suited for the analysis of large genomic and phylogenetic datasets.

  14. Using analogy to learn about phenomena at scales outside human perception.

    PubMed

    Resnick, Ilyse; Davatzes, Alexandra; Newcombe, Nora S; Shipley, Thomas F

    2017-01-01

    Understanding and reasoning about phenomena at scales outside human perception (for example, geologic time) is critical across science, technology, engineering, and mathematics. Thus, devising strong methods to support acquisition of reasoning at such scales is an important goal in science, technology, engineering, and mathematics education. In two experiments, we examine the use of analogical principles in learning about geologic time. Across both experiments we find that using a spatial analogy (for example, a time line) to make multiple alignments, and keeping all unrelated components of the analogy held constant (for example, keep the time line the same length), leads to better understanding of the magnitude of geologic time. Effective approaches also include hierarchically and progressively aligning scale information (Experiment 1) and active prediction in making alignments paired with immediate feedback (Experiments 1 and 2).

  15. Phylo-VISTA: Interactive visualization of multiple DNA sequence alignments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shah, Nameeta; Couronne, Olivier; Pennacchio, Len A.

    The power of multi-sequence comparison for biological discovery is well established. The need for new capabilities to visualize and compare cross-species alignment data is intensified by the growing number of genomic sequence datasets being generated for an ever-increasing number of organisms. To be efficient these visualization algorithms must support the ability to accommodate consistently a wide range of evolutionary distances in a comparison framework based upon phylogenetic relationships. Results: We have developed Phylo-VISTA, an interactive tool for analyzing multiple alignments by visualizing a similarity measure for multiple DNA sequences. The complexity of visual presentation is effectively organized using a frameworkmore » based upon interspecies phylogenetic relationships. The phylogenetic organization supports rapid, user-guided interspecies comparison. To aid in navigation through large sequence datasets, Phylo-VISTA leverages concepts from VISTA that provide a user with the ability to select and view data at varying resolutions. The combination of multiresolution data visualization and analysis, combined with the phylogenetic framework for interspecies comparison, produces a highly flexible and powerful tool for visual data analysis of multiple sequence alignments. Availability: Phylo-VISTA is available at http://www-gsd.lbl. gov/phylovista. It requires an Internet browser with Java Plugin 1.4.2 and it is integrated into the global alignment program LAGAN at http://lagan.stanford.edu« less

  16. LAMBADA and InflateGRO2: efficient membrane alignment and insertion of membrane proteins for molecular dynamics simulations.

    PubMed

    Schmidt, Thomas H; Kandt, Christian

    2012-10-22

    At the beginning of each molecular dynamics membrane simulation stands the generation of a suitable starting structure which includes the working steps of aligning membrane and protein and seamlessly accommodating the protein in the membrane. Here we introduce two efficient and complementary methods based on pre-equilibrated membrane patches, automating these steps. Using a voxel-based cast of the coarse-grained protein, LAMBADA computes a hydrophilicity profile-derived scoring function based on which the optimal rotation and translation operations are determined to align protein and membrane. Employing an entirely geometrical approach, LAMBADA is independent from any precalculated data and aligns even large membrane proteins within minutes on a regular workstation. LAMBADA is the first tool performing the entire alignment process automatically while providing the user with the explicit 3D coordinates of the aligned protein and membrane. The second tool is an extension of the InflateGRO method addressing the shortcomings of its predecessor in a fully automated workflow. Determining the exact number of overlapping lipids based on the area occupied by the protein and restricting expansion, compression and energy minimization steps to a subset of relevant lipids through automatically calculated and system-optimized operation parameters, InflateGRO2 yields optimal lipid packing and reduces lipid vacuum exposure to a minimum preserving as much of the equilibrated membrane structure as possible. Applicable to atomistic and coarse grain structures in MARTINI format, InflateGRO2 offers high accuracy, fast performance, and increased application flexibility permitting the easy preparation of systems exhibiting heterogeneous lipid composition as well as embedding proteins into multiple membranes. Both tools can be used separately, in combination with other methods, or in tandem permitting a fully automated workflow while retaining a maximum level of usage control and flexibility. To assess the performance of both methods, we carried out test runs using 22 membrane proteins of different size and transmembrane structure.

  17. Phylogenetic relationships within the cyst-forming nematodes (Nematoda, Heteroderidae) based on analysis of sequences from the ITS regions of ribosomal DNA.

    PubMed

    Subbotin, S A; Vierstraete, A; De Ley, P; Rowe, J; Waeyenberge, L; Moens, M; Vanfleteren, J R

    2001-10-01

    The ITS1, ITS2, and 5.8S gene sequences of nuclear ribosomal DNA from 40 taxa of the family Heteroderidae (including the genera Afenestrata, Cactodera, Heterodera, Globodera, Punctodera, Meloidodera, Cryphodera, and Thecavermiculatus) were sequenced and analyzed. The ITS regions displayed high levels of sequence divergence within Heteroderinae and compared to outgroup taxa. Unlike recent findings in root knot nematodes, ITS sequence polymorphism does not appear to complicate phylogenetic analysis of cyst nematodes. Phylogenetic analyses with maximum-parsimony, minimum-evolution, and maximum-likelihood methods were performed with a range of computer alignments, including elision and culled alignments. All multiple alignments and phylogenetic methods yielded similar basic structure for phylogenetic relationships of Heteroderidae. The cyst-forming nematodes are represented by six main clades corresponding to morphological characters and host specialization, with certain clades assuming different positions depending on alignment procedure and/or method of phylogenetic inference. Hypotheses of monophyly of Punctoderinae and Heteroderinae are, respectively, strongly and moderately supported by the ITS data across most alignments. Close relationships were revealed between the Avenae and the Sacchari groups and between the Humuli group and the species H. salixophila within Heteroderinae. The Goettingiana group occupies a basal position within this subfamily. The validity of the genera Afenestrata and Bidera was tested and is discussed based on molecular data. We conclude that ITS sequence data are appropriate for studies of relationships within the different species groups and less so for recovery of more ancient speciations within Heteroderidae. Copyright 2001 Academic Press.

  18. Design of multiple sequence alignment algorithms on parallel, distributed memory supercomputers.

    PubMed

    Church, Philip C; Goscinski, Andrzej; Holt, Kathryn; Inouye, Michael; Ghoting, Amol; Makarychev, Konstantin; Reumann, Matthias

    2011-01-01

    The challenge of comparing two or more genomes that have undergone recombination and substantial amounts of segmental loss and gain has recently been addressed for small numbers of genomes. However, datasets of hundreds of genomes are now common and their sizes will only increase in the future. Multiple sequence alignment of hundreds of genomes remains an intractable problem due to quadratic increases in compute time and memory footprint. To date, most alignment algorithms are designed for commodity clusters without parallelism. Hence, we propose the design of a multiple sequence alignment algorithm on massively parallel, distributed memory supercomputers to enable research into comparative genomics on large data sets. Following the methodology of the sequential progressiveMauve algorithm, we design data structures including sequences and sorted k-mer lists on the IBM Blue Gene/P supercomputer (BG/P). Preliminary results show that we can reduce the memory footprint so that we can potentially align over 250 bacterial genomes on a single BG/P compute node. We verify our results on a dataset of E.coli, Shigella and S.pneumoniae genomes. Our implementation returns results matching those of the original algorithm but in 1/2 the time and with 1/4 the memory footprint for scaffold building. In this study, we have laid the basis for multiple sequence alignment of large-scale datasets on a massively parallel, distributed memory supercomputer, thus enabling comparison of hundreds instead of a few genome sequences within reasonable time.

  19. Improved measurements of RNA structure conservation with generalized centroid estimators.

    PubMed

    Okada, Yohei; Saito, Yutaka; Sato, Kengo; Sakakibara, Yasubumi

    2011-01-01

    Identification of non-protein-coding RNAs (ncRNAs) in genomes is a crucial task for not only molecular cell biology but also bioinformatics. Secondary structures of ncRNAs are employed as a key feature of ncRNA analysis since biological functions of ncRNAs are deeply related to their secondary structures. Although the minimum free energy (MFE) structure of an RNA sequence is regarded as the most stable structure, MFE alone could not be an appropriate measure for identifying ncRNAs since the free energy is heavily biased by the nucleotide composition. Therefore, instead of MFE itself, several alternative measures for identifying ncRNAs have been proposed such as the structure conservation index (SCI) and the base pair distance (BPD), both of which employ MFE structures. However, these measurements are unfortunately not suitable for identifying ncRNAs in some cases including the genome-wide search and incur high false discovery rate. In this study, we propose improved measurements based on SCI and BPD, applying generalized centroid estimators to incorporate the robustness against low quality multiple alignments. Our experiments show that our proposed methods achieve higher accuracy than the original SCI and BPD for not only human-curated structural alignments but also low quality alignments produced by CLUSTAL W. Furthermore, the centroid-based SCI on CLUSTAL W alignments is more accurate than or comparable with that of the original SCI on structural alignments generated with RAF, a high quality structural aligner, for which twofold expensive computational time is required on average. We conclude that our methods are more suitable for genome-wide alignments which are of low quality from the point of view on secondary structures than the original SCI and BPD.

  20. Java bioinformatics analysis web services for multiple sequence alignment--JABAWS:MSA.

    PubMed

    Troshin, Peter V; Procter, James B; Barton, Geoffrey J

    2011-07-15

    JABAWS is a web services framework that simplifies the deployment of web services for bioinformatics. JABAWS:MSA provides services for five multiple sequence alignment (MSA) methods (Probcons, T-coffee, Muscle, Mafft and ClustalW), and is the system employed by the Jalview multiple sequence analysis workbench since version 2.6. A fully functional, easy to set up server is provided as a Virtual Appliance (VA), which can be run on most operating systems that support a virtualization environment such as VMware or Oracle VirtualBox. JABAWS is also distributed as a Web Application aRchive (WAR) and can be configured to run on a single computer and/or a cluster managed by Grid Engine, LSF or other queuing systems that support DRMAA. JABAWS:MSA provides clients full access to each application's parameters, allows administrators to specify named parameter preset combinations and execution limits for each application through simple configuration files. The JABAWS command-line client allows integration of JABAWS services into conventional scripts. JABAWS is made freely available under the Apache 2 license and can be obtained from: http://www.compbio.dundee.ac.uk/jabaws.

  1. Inter-trial alignment of EEG data and phase-locking

    NASA Astrophysics Data System (ADS)

    Testorf, M. E.; Horak, P.; Connolly, A.; Holmes, G. L.; Jobst, B. C.

    2015-09-01

    Neuro-scientific studies are often aimed at imaging brain activity, which is time-locked to external stimuli. This provides the possibility to use statistical methods to extract even weak signal components, which occur with each stimulus. For electroencephalographic recordings this concept is limited by inevitable time jitter, which cannot be controlled in all cases. Our study is based on a cross-correlation analysis of trials to alignment trials based on the recorded data. This is demonstrated both with simulated signals and with clinical EEG data, which were recorded intracranially. Special attention is given to the evaluation of the time-frequency resolved phase-locking across multiple trails.

  2. DNA motif alignment by evolving a population of Markov chains.

    PubMed

    Bi, Chengpeng

    2009-01-30

    Deciphering cis-regulatory elements or de novo motif-finding in genomes still remains elusive although much algorithmic effort has been expended. The Markov chain Monte Carlo (MCMC) method such as Gibbs motif samplers has been widely employed to solve the de novo motif-finding problem through sequence local alignment. Nonetheless, the MCMC-based motif samplers still suffer from local maxima like EM. Therefore, as a prerequisite for finding good local alignments, these motif algorithms are often independently run a multitude of times, but without information exchange between different chains. Hence it would be worth a new algorithm design enabling such information exchange. This paper presents a novel motif-finding algorithm by evolving a population of Markov chains with information exchange (PMC), each of which is initialized as a random alignment and run by the Metropolis-Hastings sampler (MHS). It is progressively updated through a series of local alignments stochastically sampled. Explicitly, the PMC motif algorithm performs stochastic sampling as specified by a population-based proposal distribution rather than individual ones, and adaptively evolves the population as a whole towards a global maximum. The alignment information exchange is accomplished by taking advantage of the pooled motif site distributions. A distinct method for running multiple independent Markov chains (IMC) without information exchange, or dubbed as the IMC motif algorithm, is also devised to compare with its PMC counterpart. Experimental studies demonstrate that the performance could be improved if pooled information were used to run a population of motif samplers. The new PMC algorithm was able to improve the convergence and outperformed other popular algorithms tested using simulated and biological motif sequences.

  3. Towards Long-Range RNA Structure Prediction in Eukaryotic Genes.

    PubMed

    Pervouchine, Dmitri D

    2018-06-15

    The ability to form an intramolecular structure plays a fundamental role in eukaryotic RNA biogenesis. Proximate regions in the primary transcripts fold into a local secondary structure, which is then hierarchically assembled into a tertiary structure that is stabilized by RNA-binding proteins and long-range intramolecular base pairings. While the local RNA structure can be predicted reasonably well for short sequences, long-range structure at the scale of eukaryotic genes remains problematic from the computational standpoint. The aim of this review is to list functional examples of long-range RNA structures, to summarize current comparative methods of structure prediction, and to highlight their advances and limitations in the context of long-range RNA structures. Most comparative methods implement the “first-align-then-fold” principle, i.e., they operate on multiple sequence alignments, while functional RNA structures often reside in non-conserved parts of the primary transcripts. The opposite “first-fold-then-align” approach is currently explored to a much lesser extent. Developing novel methods in both directions will improve the performance of comparative RNA structure analysis and help discover novel long-range structures, their higher-order organization, and RNA⁻RNA interactions across the transcriptome.

  4. A line-source method for aligning on-board and other pinhole SPECT systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yan, Susu; Bowsher, James; Yin, Fang-Fang

    2013-12-15

    Purpose: In order to achieve functional and molecular imaging as patients are in position for radiation therapy, a robotic multipinhole SPECT system is being developed. Alignment of the SPECT system—to the linear accelerator (LINAC) coordinate frame and to the coordinate frames of other on-board imaging systems such as cone-beam CT (CBCT)—is essential for target localization and image reconstruction. An alignment method that utilizes line sources and one pinhole projection is proposed and investigated to achieve this goal. Potentially, this method could also be applied to the calibration of the other pinhole SPECT systems.Methods: An alignment model consisting of multiple alignmentmore » parameters was developed which maps line sources in three-dimensional (3D) space to their two-dimensional (2D) projections on the SPECT detector. In a computer-simulation study, 3D coordinates of line-sources were defined in a reference room coordinate frame, such as the LINAC coordinate frame. Corresponding 2D line-source projections were generated by computer simulation that included SPECT blurring and noise effects. The Radon transform was utilized to detect angles (α) and offsets (ρ) of the line-source projections. Alignment parameters were then estimated by a nonlinear least squares method, based on the α and ρ values and the alignment model. Alignment performance was evaluated as a function of number of line sources, Radon transform accuracy, finite line-source width, intrinsic camera resolution, Poisson noise, and acquisition geometry. Experimental evaluations were performed using a physical line-source phantom and a pinhole-collimated gamma camera attached to a robot.Results: In computer-simulation studies, when there was no error in determining angles (α) and offsets (ρ) of the measured projections, six alignment parameters (three translational and three rotational) were estimated perfectly using three line sources. When angles (α) and offsets (ρ) were provided by the Radon transform, estimation accuracy was reduced. The estimation error was associated with rounding errors of Radon transform, finite line-source width, Poisson noise, number of line sources, intrinsic camera resolution, and detector acquisition geometry. Statistically, the estimation accuracy was significantly improved by using four line sources rather than three and by thinner line-source projections (obtained by better intrinsic detector resolution). With five line sources, median errors were 0.2 mm for the detector translations, 0.7 mm for the detector radius of rotation, and less than 0.5° for detector rotation, tilt, and twist. In experimental evaluations, average errors relative to a different, independent registration technique were about 1.8 mm for detector translations, 1.1 mm for the detector radius of rotation (ROR), 0.5° and 0.4° for detector rotation and tilt, respectively, and 1.2° for detector twist.Conclusions: Alignment parameters can be estimated using one pinhole projection of line sources. Alignment errors are largely associated with limited accuracy of the Radon transform in determining angles (α) and offsets (ρ) of the line-source projections. This alignment method may be important for multipinhole SPECT, where relative pinhole alignment may vary during rotation. For pinhole and multipinhole SPECT imaging on-board radiation therapy machines, the method could provide alignment of SPECT coordinates with those of CBCT and the LINAC.« less

  5. A line-source method for aligning on-board and other pinhole SPECT systems.

    PubMed

    Yan, Susu; Bowsher, James; Yin, Fang-Fang

    2013-12-01

    In order to achieve functional and molecular imaging as patients are in position for radiation therapy, a robotic multipinhole SPECT system is being developed. Alignment of the SPECT system-to the linear accelerator (LINAC) coordinate frame and to the coordinate frames of other on-board imaging systems such as cone-beam CT (CBCT)-is essential for target localization and image reconstruction. An alignment method that utilizes line sources and one pinhole projection is proposed and investigated to achieve this goal. Potentially, this method could also be applied to the calibration of the other pinhole SPECT systems. An alignment model consisting of multiple alignment parameters was developed which maps line sources in three-dimensional (3D) space to their two-dimensional (2D) projections on the SPECT detector. In a computer-simulation study, 3D coordinates of line-sources were defined in a reference room coordinate frame, such as the LINAC coordinate frame. Corresponding 2D line-source projections were generated by computer simulation that included SPECT blurring and noise effects. The Radon transform was utilized to detect angles (α) and offsets (ρ) of the line-source projections. Alignment parameters were then estimated by a nonlinear least squares method, based on the α and ρ values and the alignment model. Alignment performance was evaluated as a function of number of line sources, Radon transform accuracy, finite line-source width, intrinsic camera resolution, Poisson noise, and acquisition geometry. Experimental evaluations were performed using a physical line-source phantom and a pinhole-collimated gamma camera attached to a robot. In computer-simulation studies, when there was no error in determining angles (α) and offsets (ρ) of the measured projections, six alignment parameters (three translational and three rotational) were estimated perfectly using three line sources. When angles (α) and offsets (ρ) were provided by the Radon transform, estimation accuracy was reduced. The estimation error was associated with rounding errors of Radon transform, finite line-source width, Poisson noise, number of line sources, intrinsic camera resolution, and detector acquisition geometry. Statistically, the estimation accuracy was significantly improved by using four line sources rather than three and by thinner line-source projections (obtained by better intrinsic detector resolution). With five line sources, median errors were 0.2 mm for the detector translations, 0.7 mm for the detector radius of rotation, and less than 0.5° for detector rotation, tilt, and twist. In experimental evaluations, average errors relative to a different, independent registration technique were about 1.8 mm for detector translations, 1.1 mm for the detector radius of rotation (ROR), 0.5° and 0.4° for detector rotation and tilt, respectively, and 1.2° for detector twist. Alignment parameters can be estimated using one pinhole projection of line sources. Alignment errors are largely associated with limited accuracy of the Radon transform in determining angles (α) and offsets (ρ) of the line-source projections. This alignment method may be important for multipinhole SPECT, where relative pinhole alignment may vary during rotation. For pinhole and multipinhole SPECT imaging on-board radiation therapy machines, the method could provide alignment of SPECT coordinates with those of CBCT and the LINAC.

  6. Navigating Similarities and Differences in National and International Accreditation Standards: A Proposed Approach Using US Agency Requirements

    ERIC Educational Resources Information Center

    Wilkerson, Judy R.

    2017-01-01

    Purpose: Understanding and navigating the differences in standards, and the roots and rationales underlying accreditation reviews, is necessary for all institutions that seek multiple accreditations. The purpose of this paper is to demonstrate a method to assist institutional-level leaders and assessment practitioners analyze and align these…

  7. TPACK Development in a Three-Year Online Masters Program: How Do Teacher Perceptions Align with Classroom Practice?

    ERIC Educational Resources Information Center

    Staus, Nancy; Gillow-Wiles, Henry; Niess, Margaret

    2014-01-01

    A new primarily distance education Master's degree program was focused on the development of technological pedagogical content knowledge (TPACK) for integrating appropriate digital technologies in mathematics and science classes. In this mixed-method multiple case study, we documented in-service K-8 teachers' perceptions of their TPACK…

  8. DNA Translator and Aligner: HyperCard utilities to aid phylogenetic analysis of molecules.

    PubMed

    Eernisse, D J

    1992-04-01

    DNA Translator and Aligner are molecular phylogenetics HyperCard stacks for Macintosh computers. They manipulate sequence data to provide graphical gene mapping, conversions, translations and manual multiple-sequence alignment editing. DNA Translator is able to convert documented GenBank or EMBL documented sequences into linearized, rescalable gene maps whose gene sequences are extractable by clicking on the corresponding map button or by selection from a scrolling list. Provided gene maps, complete with extractable sequences, consist of nine metazoan, one yeast, and one ciliate mitochondrial DNAs and three green plant chloroplast DNAs. Single or multiple sequences can be manipulated to aid in phylogenetic analysis. Sequences can be translated between nucleic acids and proteins in either direction with flexible support of alternate genetic codes and ambiguous nucleotide symbols. Multiple aligned sequence output from diverse sources can be converted to Nexus, Hennig86 or PHYLIP format for subsequent phylogenetic analysis. Input or output alignments can be examined with Aligner, a convenient accessory stack included in the DNA Translator package. Aligner is an editor for the manual alignment of up to 100 sequences that toggles between display of matched characters and normal unmatched sequences. DNA Translator also generates graphic displays of amino acid coding and codon usage frequency relative to all other, or only synonymous, codons for approximately 70 select organism-organelle combinations. Codon usage data is compatible with spreadsheet or UWGCG formats for incorporation of additional molecules of interest. The complete package is available via anonymous ftp and is free for non-commercial uses.

  9. NetCoffee: a fast and accurate global alignment approach to identify functionally conserved proteins in multiple networks.

    PubMed

    Hu, Jialu; Kehr, Birte; Reinert, Knut

    2014-02-15

    Owing to recent advancements in high-throughput technologies, protein-protein interaction networks of more and more species become available in public databases. The question of how to identify functionally conserved proteins across species attracts a lot of attention in computational biology. Network alignments provide a systematic way to solve this problem. However, most existing alignment tools encounter limitations in tackling this problem. Therefore, the demand for faster and more efficient alignment tools is growing. We present a fast and accurate algorithm, NetCoffee, which allows to find a global alignment of multiple protein-protein interaction networks. NetCoffee searches for a global alignment by maximizing a target function using simulated annealing on a set of weighted bipartite graphs that are constructed using a triplet approach similar to T-Coffee. To assess its performance, NetCoffee was applied to four real datasets. Our results suggest that NetCoffee remedies several limitations of previous algorithms, outperforms all existing alignment tools in terms of speed and nevertheless identifies biologically meaningful alignments. The source code and data are freely available for download under the GNU GPL v3 license at https://code.google.com/p/netcoffee/.

  10. Skeleton-based human action recognition using multiple sequence alignment

    NASA Astrophysics Data System (ADS)

    Ding, Wenwen; Liu, Kai; Cheng, Fei; Zhang, Jin; Li, YunSong

    2015-05-01

    Human action recognition and analysis is an active research topic in computer vision for many years. This paper presents a method to represent human actions based on trajectories consisting of 3D joint positions. This method first decompose action into a sequence of meaningful atomic actions (actionlets), and then label actionlets with English alphabets according to the Davies-Bouldin index value. Therefore, an action can be represented using a sequence of actionlet symbols, which will preserve the temporal order of occurrence of each of the actionlets. Finally, we employ sequence comparison to classify multiple actions through using string matching algorithms (Needleman-Wunsch). The effectiveness of the proposed method is evaluated on datasets captured by commodity depth cameras. Experiments of the proposed method on three challenging 3D action datasets show promising results.

  11. More are better, but the details matter: combinations of multiple Fresnel zone plates for improved resolution and efficiency in X-ray microscopy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Li, Kenan; Jacobsen, Chris

    Fresnel zone plates used for X-ray nanofocusing face high-aspect-ratio nanofabrication challenges in combining narrow transverse features (for high spatial resolution) along with extended optical modulation along the X-ray beam direction (to improve efficiency). The stacking of multiple Fresnel zone plates along the beam direction has already been shown to offer improved characteristics of resolution and efficiency when compared with thin single zone plates. Using multislice wave propagation simulation methods, here a number of new schemes for the stacking of multiple Fresnel zone plates are considered. These include consideration of optimal thickness and spacing in the axial direction, and methods tomore » capture a fraction of the light otherwise diffracted into unwanted orders, and instead bring it into the desired first-order focus. In conclusion, the alignment tolerances for stacking multiple Fresnel zone plates are also considered.« less

  12. Simultaneous fluorescence and quantitative phase microscopy with single-pixel detectors

    NASA Astrophysics Data System (ADS)

    Liu, Yang; Suo, Jinli; Zhang, Yuanlong; Dai, Qionghai

    2018-02-01

    Multimodal microscopy offers high flexibilities for biomedical observation and diagnosis. Conventional multimodal approaches either use multiple cameras or a single camera spatially multiplexing different modes. The former needs expertise demanding alignment and the latter suffers from limited spatial resolution. Here, we report an alignment-free full-resolution simultaneous fluorescence and quantitative phase imaging approach using single-pixel detectors. By combining reference-free interferometry with single-pixel detection, we encode the phase and fluorescence of the sample in two detection arms at the same time. Then we employ structured illumination and the correlated measurements between the sample and the illuminations for reconstruction. The recovered fluorescence and phase images are inherently aligned thanks to single-pixel detection. To validate the proposed method, we built a proof-of-concept setup for first imaging the phase of etched glass with the depth of a few hundred nanometers and then imaging the fluorescence and phase of the quantum dot drop. This method holds great potential for multispectral fluorescence microscopy with additional single-pixel detectors or a spectrometer. Besides, this cost-efficient multimodal system might find broad applications in biomedical science and neuroscience.

  13. 7TMRmine: a Web server for hierarchical mining of 7TMR proteins

    PubMed Central

    Lu, Guoqing; Wang, Zhifang; Jones, Alan M; Moriyama, Etsuko N

    2009-01-01

    Background Seven-transmembrane region-containing receptors (7TMRs) play central roles in eukaryotic signal transduction. Due to their biomedical importance, thorough mining of 7TMRs from diverse genomes has been an active target of bioinformatics and pharmacogenomics research. The need for new and accurate 7TMR/GPCR prediction tools is paramount with the accelerated rate of acquisition of diverse sequence information. Currently available and often used protein classification methods (e.g., profile hidden Markov Models) are highly accurate for identifying their membership information among already known 7TMR subfamilies. However, these alignment-based methods are less effective for identifying remote similarities, e.g., identifying proteins from highly divergent or possibly new 7TMR families. In this regard, more sensitive (e.g., alignment-free) methods are needed to complement the existing protein classification methods. A better strategy would be to combine different classifiers, from more specific to more sensitive methods, to identify a broader spectrum of 7TMR protein candidates. Description We developed a Web server, 7TMRmine, by integrating alignment-free and alignment-based classifiers specifically trained to identify candidate 7TMR proteins as well as transmembrane (TM) prediction methods. This new tool enables researchers to easily assess the distribution of GPCR functionality in diverse genomes or individual newly-discovered proteins. 7TMRmine is easily customized and facilitates exploratory analysis of diverse genomes. Users can integrate various alignment-based, alignment-free, and TM-prediction methods in any combination and in any hierarchical order. Sixteen classifiers (including two TM-prediction methods) are available on the 7TMRmine Web server. Not only can the 7TMRmine tool be used for 7TMR mining, but also for general TM-protein analysis. Users can submit protein sequences for analysis, or explore pre-analyzed results for multiple genomes. The server currently includes prediction results and the summary statistics for 68 genomes. Conclusion 7TMRmine facilitates the discovery of 7TMR proteins. By combining prediction results from different classifiers in a multi-level filtering process, prioritized sets of 7TMR candidates can be obtained for further investigation. 7TMRmine can be also used as a general TM-protein classifier. Comparisons of TM and 7TMR protein distributions among 68 genomes revealed interesting differences in evolution of these protein families among major eukaryotic phyla. PMID:19538753

  14. Prediction of protein secondary structure content for the twilight zone sequences.

    PubMed

    Homaeian, Leila; Kurgan, Lukasz A; Ruan, Jishou; Cios, Krzysztof J; Chen, Ke

    2007-11-15

    Secondary protein structure carries information about local structural arrangements, which include three major conformations: alpha-helices, beta-strands, and coils. Significant majority of successful methods for prediction of the secondary structure is based on multiple sequence alignment. However, multiple alignment fails to provide accurate results when a sequence comes from the twilight zone, that is, it is characterized by low (<30%) homology. To this end, we propose a novel method for prediction of secondary structure content through comprehensive sequence representation, called PSSC-core. The method uses a multiple linear regression model and introduces a comprehensive feature-based sequence representation to predict amount of helices and strands for sequences from the twilight zone. The PSSC-core method was tested and compared with two other state-of-the-art prediction methods on a set of 2187 twilight zone sequences. The results indicate that our method provides better predictions for both helix and strand content. The PSSC-core is shown to provide statistically significantly better results when compared with the competing methods, reducing the prediction error by 5-7% for helix and 7-9% for strand content predictions. The proposed feature-based sequence representation uses a comprehensive set of physicochemical properties that are custom-designed for each of the helix and strand content predictions. It includes composition and composition moment vectors, frequency of tetra-peptides associated with helical and strand conformations, various property-based groups like exchange groups, chemical groups of the side chains and hydrophobic group, auto-correlations based on hydrophobicity, side-chain masses, hydropathy, and conformational patterns for beta-sheets. The PSSC-core method provides an alternative for predicting the secondary structure content that can be used to validate and constrain results of other structure prediction methods. At the same time, it also provides useful insight into design of successful protein sequence representations that can be used in developing new methods related to prediction of different aspects of the secondary protein structure. (c) 2007 Wiley-Liss, Inc.

  15. Association of low back pain with muscle stiffness and muscle mass of the lumbar back muscles, and sagittal spinal alignment in young and middle-aged medical workers.

    PubMed

    Masaki, Mitsuhiro; Aoyama, Tomoki; Murakami, Takashi; Yanase, Ko; Ji, Xiang; Tateuchi, Hiroshige; Ichihashi, Noriaki

    2017-11-01

    Muscle stiffness of the lumbar back muscles in low back pain (LBP) patients has not been clearly elucidated because quantitative assessment of the stiffness of individual muscles was conventionally difficult. This study aimed to examine the association of LBP with muscle stiffness assessed using ultrasonic shear wave elastography (SWE) and muscle mass of the lumbar back muscle, and spinal alignment in young and middle-aged medical workers. The study comprised 23 asymptomatic medical workers [control (CTR) group] and 9 medical workers with LBP (LBP group). Muscle stiffness and mass of the lumbar back muscles (lumbar erector spinae, multifidus, and quadratus lumborum) in the prone position were measured using ultrasonic SWE. Sagittal spinal alignment in the standing and prone positions was measured using a Spinal Mouse. The association with LBP was investigated by multiple logistic regression analysis with a forward selection method. The analysis was conducted using the shear elastic modulus and muscle thickness of the lumbar back muscles, and spinal alignment, age, body height, body weight, and sex as independent variables. Multiple logistic regression analysis showed that muscle stiffness of the lumbar multifidus muscle and body height were significant and independent determinants of LBP, but that muscle mass and spinal alignment were not. Muscle stiffness of the lumbar multifidus muscle in the LBP group was significantly higher than that in the CTR group. The results of this study suggest that LBP is associated with muscle stiffness of the lumbar multifidus muscle in young and middle-aged medical workers. Copyright © 2017 Elsevier Ltd. All rights reserved.

  16. An ensemble approach for large-scale identification of protein-protein interactions using the alignments of multiple sequences

    PubMed Central

    Wang, Lei; You, Zhu-Hong; Chen, Xing; Li, Jian-Qiang; Yan, Xin; Zhang, Wei; Huang, Yu-An

    2017-01-01

    Protein–Protein Interactions (PPI) is not only the critical component of various biological processes in cells, but also the key to understand the mechanisms leading to healthy and diseased states in organisms. However, it is time-consuming and cost-intensive to identify the interactions among proteins using biological experiments. Hence, how to develop a more efficient computational method rapidly became an attractive topic in the post-genomic era. In this paper, we propose a novel method for inference of protein-protein interactions from protein amino acids sequences only. Specifically, protein amino acids sequence is firstly transformed into Position-Specific Scoring Matrix (PSSM) generated by multiple sequences alignments; then the Pseudo PSSM is used to extract feature descriptors. Finally, ensemble Rotation Forest (RF) learning system is trained to predict and recognize PPIs based solely on protein sequence feature. When performed the proposed method on the three benchmark data sets (Yeast, H. pylori, and independent dataset) for predicting PPIs, our method can achieve good average accuracies of 98.38%, 89.75%, and 96.25%, respectively. In order to further evaluate the prediction performance, we also compare the proposed method with other methods using same benchmark data sets. The experiment results demonstrate that the proposed method consistently outperforms other state-of-the-art method. Therefore, our method is effective and robust and can be taken as a useful tool in exploring and discovering new relationships between proteins. A web server is made publicly available at the URL http://202.119.201.126:8888/PsePSSM/ for academic use. PMID:28029645

  17. Two-dimensional imaging via a narrowband MIMO radar system with two perpendicular linear arrays.

    PubMed

    Wang, Dang-wei; Ma, Xiao-yan; Su, Yi

    2010-05-01

    This paper presents a system model and method for the 2-D imaging application via a narrowband multiple-input multiple-output (MIMO) radar system with two perpendicular linear arrays. Furthermore, the imaging formulation for our method is developed through a Fourier integral processing, and the parameters of antenna array including the cross-range resolution, required size, and sampling interval are also examined. Different from the spatial sequential procedure sampling the scattered echoes during multiple snapshot illuminations in inverse synthetic aperture radar (ISAR) imaging, the proposed method utilizes a spatial parallel procedure to sample the scattered echoes during a single snapshot illumination. Consequently, the complex motion compensation in ISAR imaging can be avoided. Moreover, in our array configuration, multiple narrowband spectrum-shared waveforms coded with orthogonal polyphase sequences are employed. The mainlobes of the compressed echoes from the different filter band could be located in the same range bin, and thus, the range alignment in classical ISAR imaging is not necessary. Numerical simulations based on synthetic data are provided for testing our proposed method.

  18. Multiple frequency audio signal communication as a mechanism for neurophysiology and video data synchronization.

    PubMed

    Topper, Nicholas C; Burke, Sara N; Maurer, Andrew Porter

    2014-12-30

    Current methods for aligning neurophysiology and video data are either prepackaged, requiring the additional purchase of a software suite, or use a blinking LED with a stationary pulse-width and frequency. These methods lack significant user interface for adaptation, are expensive, or risk a misalignment of the two data streams. A cost-effective means to obtain high-precision alignment of behavioral and neurophysiological data is obtained by generating an audio-pulse embedded with two domains of information, a low-frequency binary-counting signal and a high, randomly changing frequency. This enabled the derivation of temporal information while maintaining enough entropy in the system for algorithmic alignment. The sample to frame index constructed using the audio input correlation method described in this paper enables video and data acquisition to be aligned at a sub-frame level of precision. Traditionally, a synchrony pulse is recorded on-screen via a flashing diode. The higher sampling rate of the audio input of the camcorder enables the timing of an event to be detected with greater precision. While on-line analysis and synchronization using specialized equipment may be the ideal situation in some cases, the method presented in the current paper presents a viable, low cost alternative, and gives the flexibility to interface with custom off-line analysis tools. Moreover, the ease of constructing and implements this set-up presented in the current paper makes it applicable to a wide variety of applications that require video recording. Copyright © 2014 Elsevier B.V. All rights reserved.

  19. Multiview alignment hashing for efficient image search.

    PubMed

    Liu, Li; Yu, Mengyang; Shao, Ling

    2015-03-01

    Hashing is a popular and efficient method for nearest neighbor search in large-scale data spaces by embedding high-dimensional feature descriptors into a similarity preserving Hamming space with a low dimension. For most hashing methods, the performance of retrieval heavily depends on the choice of the high-dimensional feature descriptor. Furthermore, a single type of feature cannot be descriptive enough for different images when it is used for hashing. Thus, how to combine multiple representations for learning effective hashing functions is an imminent task. In this paper, we present a novel unsupervised multiview alignment hashing approach based on regularized kernel nonnegative matrix factorization, which can find a compact representation uncovering the hidden semantics and simultaneously respecting the joint probability distribution of data. In particular, we aim to seek a matrix factorization to effectively fuse the multiple information sources meanwhile discarding the feature redundancy. Since the raised problem is regarded as nonconvex and discrete, our objective function is then optimized via an alternate way with relaxation and converges to a locally optimal solution. After finding the low-dimensional representation, the hashing functions are finally obtained through multivariable logistic regression. The proposed method is systematically evaluated on three data sets: 1) Caltech-256; 2) CIFAR-10; and 3) CIFAR-20, and the results show that our method significantly outperforms the state-of-the-art multiview hashing techniques.

  20. A portable foot-parameter-extracting system

    NASA Astrophysics Data System (ADS)

    Zhang, MingKai; Liang, Jin; Li, Wenpan; Liu, Shifan

    2016-03-01

    In order to solve the problem of automatic foot measurement in garment customization, a new automatic footparameter- extracting system based on stereo vision, photogrammetry and heterodyne multiple frequency phase shift technology is proposed and implemented. The key technologies applied in the system are studied, including calibration of projector, alignment of point clouds, and foot measurement. Firstly, a new projector calibration algorithm based on plane model has been put forward to get the initial calibration parameters and a feature point detection scheme of calibration board image is developed. Then, an almost perfect match of two clouds is achieved by performing a first alignment using the Sampled Consensus - Initial Alignment algorithm (SAC-IA) and refining the alignment using the Iterative Closest Point algorithm (ICP). Finally, the approaches used for foot-parameterextracting and the system scheme are presented in detail. Experimental results show that the RMS error of the calibration result is 0.03 pixel and the foot parameter extracting experiment shows the feasibility of the extracting algorithm. Compared with the traditional measurement method, the system can be more portable, accurate and robust.

  1. Log-odds sequence logos

    PubMed Central

    Yu, Yi-Kuo; Capra, John A.; Stojmirović, Aleksandar; Landsman, David; Altschul, Stephen F.

    2015-01-01

    Motivation: DNA and protein patterns are usefully represented by sequence logos. However, the methods for logo generation in common use lack a proper statistical basis, and are non-optimal for recognizing functionally relevant alignment columns. Results: We redefine the information at a logo position as a per-observation multiple alignment log-odds score. Such scores are positive or negative, depending on whether a column’s observations are better explained as arising from relatedness or chance. Within this framework, we propose distinct normalized maximum likelihood and Bayesian measures of column information. We illustrate these measures on High Mobility Group B (HMGB) box proteins and a dataset of enzyme alignments. Particularly in the context of protein alignments, our measures improve the discrimination of biologically relevant positions. Availability and implementation: Our new measures are implemented in an open-source Web-based logo generation program, which is available at http://www.ncbi.nlm.nih.gov/CBBresearch/Yu/logoddslogo/index.html. A stand-alone version of the program is also available from this site. Contact: altschul@ncbi.nlm.nih.gov Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25294922

  2. Multi-temporal and multi-source remote sensing image classification by nonlinear relative normalization

    NASA Astrophysics Data System (ADS)

    Tuia, Devis; Marcos, Diego; Camps-Valls, Gustau

    2016-10-01

    Remote sensing image classification exploiting multiple sensors is a very challenging problem: data from different modalities are affected by spectral distortions and mis-alignments of all kinds, and this hampers re-using models built for one image to be used successfully in other scenes. In order to adapt and transfer models across image acquisitions, one must be able to cope with datasets that are not co-registered, acquired under different illumination and atmospheric conditions, by different sensors, and with scarce ground references. Traditionally, methods based on histogram matching have been used. However, they fail when densities have very different shapes or when there is no corresponding band to be matched between the images. An alternative builds upon manifold alignment. Manifold alignment performs a multidimensional relative normalization of the data prior to product generation that can cope with data of different dimensionality (e.g. different number of bands) and possibly unpaired examples. Aligning data distributions is an appealing strategy, since it allows to provide data spaces that are more similar to each other, regardless of the subsequent use of the transformed data. In this paper, we study a methodology that aligns data from different domains in a nonlinear way through kernelization. We introduce the Kernel Manifold Alignment (KEMA) method, which provides a flexible and discriminative projection map, exploits only a few labeled samples (or semantic ties) in each domain, and reduces to solving a generalized eigenvalue problem. We successfully test KEMA in multi-temporal and multi-source very high resolution classification tasks, as well as on the task of making a model invariant to shadowing for hyperspectral imaging.

  3. RNA-TVcurve: a Web server for RNA secondary structure comparison based on a multi-scale similarity of its triple vector curve representation.

    PubMed

    Li, Ying; Shi, Xiaohu; Liang, Yanchun; Xie, Juan; Zhang, Yu; Ma, Qin

    2017-01-21

    RNAs have been found to carry diverse functionalities in nature. Inferring the similarity between two given RNAs is a fundamental step to understand and interpret their functional relationship. The majority of functional RNAs show conserved secondary structures, rather than sequence conservation. Those algorithms relying on sequence-based features usually have limitations in their prediction performance. Hence, integrating RNA structure features is very critical for RNA analysis. Existing algorithms mainly fall into two categories: alignment-based and alignment-free. The alignment-free algorithms of RNA comparison usually have lower time complexity than alignment-based algorithms. An alignment-free RNA comparison algorithm was proposed, in which novel numerical representations RNA-TVcurve (triple vector curve representation) of RNA sequence and corresponding secondary structure features are provided. Then a multi-scale similarity score of two given RNAs was designed based on wavelet decomposition of their numerical representation. In support of RNA mutation and phylogenetic analysis, a web server (RNA-TVcurve) was designed based on this alignment-free RNA comparison algorithm. It provides three functional modules: 1) visualization of numerical representation of RNA secondary structure; 2) detection of single-point mutation based on secondary structure; and 3) comparison of pairwise and multiple RNA secondary structures. The inputs of the web server require RNA primary sequences, while corresponding secondary structures are optional. For the primary sequences alone, the web server can compute the secondary structures using free energy minimization algorithm in terms of RNAfold tool from Vienna RNA package. RNA-TVcurve is the first integrated web server, based on an alignment-free method, to deliver a suite of RNA analysis functions, including visualization, mutation analysis and multiple RNAs structure comparison. The comparison results with two popular RNA comparison tools, RNApdist and RNAdistance, showcased that RNA-TVcurve can efficiently capture subtle relationships among RNAs for mutation detection and non-coding RNA classification. All the relevant results were shown in an intuitive graphical manner, and can be freely downloaded from this server. RNA-TVcurve, along with test examples and detailed documents, are available at: http://ml.jlu.edu.cn/tvcurve/ .

  4. Accuracy Estimation and Parameter Advising for Protein Multiple Sequence Alignment

    PubMed Central

    DeBlasio, Dan

    2013-01-01

    Abstract We develop a novel and general approach to estimating the accuracy of multiple sequence alignments without knowledge of a reference alignment, and use our approach to address a new task that we call parameter advising: the problem of choosing values for alignment scoring function parameters from a given set of choices to maximize the accuracy of a computed alignment. For protein alignments, we consider twelve independent features that contribute to a quality alignment. An accuracy estimator is learned that is a polynomial function of these features; its coefficients are determined by minimizing its error with respect to true accuracy using mathematical optimization. Compared to prior approaches for estimating accuracy, our new approach (a) introduces novel feature functions that measure nonlocal properties of an alignment yet are fast to evaluate, (b) considers more general classes of estimators beyond linear combinations of features, and (c) develops new regression formulations for learning an estimator from examples; in addition, for parameter advising, we (d) determine the optimal parameter set of a given cardinality, which specifies the best parameter values from which to choose. Our estimator, which we call Facet (for “feature-based accuracy estimator”), yields a parameter advisor that on the hardest benchmarks provides more than a 27% improvement in accuracy over the best default parameter choice, and for parameter advising significantly outperforms the best prior approaches to assessing alignment quality. PMID:23489379

  5. gmos: Rapid Detection of Genome Mosaicism over Short Evolutionary Distances.

    PubMed

    Domazet-Lošo, Mirjana; Domazet-Lošo, Tomislav

    2016-01-01

    Prokaryotic and viral genomes are often altered by recombination and horizontal gene transfer. The existing methods for detecting recombination are primarily aimed at viral genomes or sets of loci, since the expensive computation of underlying statistical models often hinders the comparison of complete prokaryotic genomes. As an alternative, alignment-free solutions are more efficient, but cannot map (align) a query to subject genomes. To address this problem, we have developed gmos (Genome MOsaic Structure), a new program that determines the mosaic structure of query genomes when compared to a set of closely related subject genomes. The program first computes local alignments between query and subject genomes and then reconstructs the query mosaic structure by choosing the best local alignment for each query region. To accomplish the analysis quickly, the program mostly relies on pairwise alignments and constructs multiple sequence alignments over short overlapping subject regions only when necessary. This fine-tuned implementation achieves an efficiency comparable to an alignment-free tool. The program performs well for simulated and real data sets of closely related genomes and can be used for fast recombination detection; for instance, when a new prokaryotic pathogen is discovered. As an example, gmos was used to detect genome mosaicism in a pathogenic Enterococcus faecium strain compared to seven closely related genomes. The analysis took less than two minutes on a single 2.1 GHz processor. The output is available in fasta format and can be visualized using an accessory program, gmosDraw (freely available with gmos).

  6. gmos: Rapid Detection of Genome Mosaicism over Short Evolutionary Distances

    PubMed Central

    Domazet-Lošo, Mirjana; Domazet-Lošo, Tomislav

    2016-01-01

    Prokaryotic and viral genomes are often altered by recombination and horizontal gene transfer. The existing methods for detecting recombination are primarily aimed at viral genomes or sets of loci, since the expensive computation of underlying statistical models often hinders the comparison of complete prokaryotic genomes. As an alternative, alignment-free solutions are more efficient, but cannot map (align) a query to subject genomes. To address this problem, we have developed gmos (Genome MOsaic Structure), a new program that determines the mosaic structure of query genomes when compared to a set of closely related subject genomes. The program first computes local alignments between query and subject genomes and then reconstructs the query mosaic structure by choosing the best local alignment for each query region. To accomplish the analysis quickly, the program mostly relies on pairwise alignments and constructs multiple sequence alignments over short overlapping subject regions only when necessary. This fine-tuned implementation achieves an efficiency comparable to an alignment-free tool. The program performs well for simulated and real data sets of closely related genomes and can be used for fast recombination detection; for instance, when a new prokaryotic pathogen is discovered. As an example, gmos was used to detect genome mosaicism in a pathogenic Enterococcus faecium strain compared to seven closely related genomes. The analysis took less than two minutes on a single 2.1 GHz processor. The output is available in fasta format and can be visualized using an accessory program, gmosDraw (freely available with gmos). PMID:27846272

  7. Compression-based distance (CBD): a simple, rapid, and accurate method for microbiota composition comparison

    PubMed Central

    2013-01-01

    Background Perturbations in intestinal microbiota composition have been associated with a variety of gastrointestinal tract-related diseases. The alleviation of symptoms has been achieved using treatments that alter the gastrointestinal tract microbiota toward that of healthy individuals. Identifying differences in microbiota composition through the use of 16S rRNA gene hypervariable tag sequencing has profound health implications. Current computational methods for comparing microbial communities are usually based on multiple alignments and phylogenetic inference, making them time consuming and requiring exceptional expertise and computational resources. As sequencing data rapidly grows in size, simpler analysis methods are needed to meet the growing computational burdens of microbiota comparisons. Thus, we have developed a simple, rapid, and accurate method, independent of multiple alignments and phylogenetic inference, to support microbiota comparisons. Results We create a metric, called compression-based distance (CBD) for quantifying the degree of similarity between microbial communities. CBD uses the repetitive nature of hypervariable tag datasets and well-established compression algorithms to approximate the total information shared between two datasets. Three published microbiota datasets were used as test cases for CBD as an applicable tool. Our study revealed that CBD recaptured 100% of the statistically significant conclusions reported in the previous studies, while achieving a decrease in computational time required when compared to similar tools without expert user intervention. Conclusion CBD provides a simple, rapid, and accurate method for assessing distances between gastrointestinal tract microbiota 16S hypervariable tag datasets. PMID:23617892

  8. Compression-based distance (CBD): a simple, rapid, and accurate method for microbiota composition comparison.

    PubMed

    Yang, Fang; Chia, Nicholas; White, Bryan A; Schook, Lawrence B

    2013-04-23

    Perturbations in intestinal microbiota composition have been associated with a variety of gastrointestinal tract-related diseases. The alleviation of symptoms has been achieved using treatments that alter the gastrointestinal tract microbiota toward that of healthy individuals. Identifying differences in microbiota composition through the use of 16S rRNA gene hypervariable tag sequencing has profound health implications. Current computational methods for comparing microbial communities are usually based on multiple alignments and phylogenetic inference, making them time consuming and requiring exceptional expertise and computational resources. As sequencing data rapidly grows in size, simpler analysis methods are needed to meet the growing computational burdens of microbiota comparisons. Thus, we have developed a simple, rapid, and accurate method, independent of multiple alignments and phylogenetic inference, to support microbiota comparisons. We create a metric, called compression-based distance (CBD) for quantifying the degree of similarity between microbial communities. CBD uses the repetitive nature of hypervariable tag datasets and well-established compression algorithms to approximate the total information shared between two datasets. Three published microbiota datasets were used as test cases for CBD as an applicable tool. Our study revealed that CBD recaptured 100% of the statistically significant conclusions reported in the previous studies, while achieving a decrease in computational time required when compared to similar tools without expert user intervention. CBD provides a simple, rapid, and accurate method for assessing distances between gastrointestinal tract microbiota 16S hypervariable tag datasets.

  9. Information Compression, Multiple Alignment, and the Representation and Processing of Knowledge in the Brain

    PubMed Central

    Wolff, J. Gerard

    2016-01-01

    The SP theory of intelligence, with its realization in the SP computer model, aims to simplify and integrate observations and concepts across artificial intelligence, mainstream computing, mathematics, and human perception and cognition, with information compression as a unifying theme. This paper describes how abstract structures and processes in the theory may be realized in terms of neurons, their interconnections, and the transmission of signals between neurons. This part of the SP theory—SP-neural—is a tentative and partial model for the representation and processing of knowledge in the brain. Empirical support for the SP theory—outlined in the paper—provides indirect support for SP-neural. In the abstract part of the SP theory (SP-abstract), all kinds of knowledge are represented with patterns, where a pattern is an array of atomic symbols in one or two dimensions. In SP-neural, the concept of a “pattern” is realized as an array of neurons called a pattern assembly, similar to Hebb's concept of a “cell assembly” but with important differences. Central to the processing of information in SP-abstract is information compression via the matching and unification of patterns (ICMUP) and, more specifically, information compression via the powerful concept of multiple alignment, borrowed and adapted from bioinformatics. Processes such as pattern recognition, reasoning and problem solving are achieved via the building of multiple alignments, while unsupervised learning is achieved by creating patterns from sensory information and also by creating patterns from multiple alignments in which there is a partial match between one pattern and another. It is envisaged that, in SP-neural, short-lived neural structures equivalent to multiple alignments will be created via an inter-play of excitatory and inhibitory neural signals. It is also envisaged that unsupervised learning will be achieved by the creation of pattern assemblies from sensory information and from the neural equivalents of multiple alignments, much as in the non-neural SP theory—and significantly different from the “Hebbian” kinds of learning which are widely used in the kinds of artificial neural network that are popular in computer science. The paper discusses several associated issues, with relevant empirical evidence. PMID:27857695

  10. Information Compression, Multiple Alignment, and the Representation and Processing of Knowledge in the Brain.

    PubMed

    Wolff, J Gerard

    2016-01-01

    The SP theory of intelligence , with its realization in the SP computer model , aims to simplify and integrate observations and concepts across artificial intelligence, mainstream computing, mathematics, and human perception and cognition, with information compression as a unifying theme. This paper describes how abstract structures and processes in the theory may be realized in terms of neurons, their interconnections, and the transmission of signals between neurons. This part of the SP theory- SP-neural -is a tentative and partial model for the representation and processing of knowledge in the brain. Empirical support for the SP theory-outlined in the paper-provides indirect support for SP-neural. In the abstract part of the SP theory (SP-abstract), all kinds of knowledge are represented with patterns , where a pattern is an array of atomic symbols in one or two dimensions. In SP-neural, the concept of a "pattern" is realized as an array of neurons called a pattern assembly , similar to Hebb's concept of a "cell assembly" but with important differences. Central to the processing of information in SP-abstract is information compression via the matching and unification of patterns (ICMUP) and, more specifically, information compression via the powerful concept of multiple alignment , borrowed and adapted from bioinformatics. Processes such as pattern recognition, reasoning and problem solving are achieved via the building of multiple alignments, while unsupervised learning is achieved by creating patterns from sensory information and also by creating patterns from multiple alignments in which there is a partial match between one pattern and another. It is envisaged that, in SP-neural, short-lived neural structures equivalent to multiple alignments will be created via an inter-play of excitatory and inhibitory neural signals. It is also envisaged that unsupervised learning will be achieved by the creation of pattern assemblies from sensory information and from the neural equivalents of multiple alignments, much as in the non-neural SP theory-and significantly different from the "Hebbian" kinds of learning which are widely used in the kinds of artificial neural network that are popular in computer science. The paper discusses several associated issues, with relevant empirical evidence.

  11. Assessing the statistical robustness of inter- and intra-basinal carbon isotope chemostratigraphic correlation

    NASA Astrophysics Data System (ADS)

    Hay, C.; Creveling, J. R.; Huybers, P. J.

    2016-12-01

    Excursions in the stable carbon isotopic composition of carbonate rocks (δ13Ccarb) can facilitate correlation of Precambrian and Phanerozoic sedimentary successions at a higher temporal resolution than radiometric and biostratigraphic frameworks typically afford. Within the bounds of litho- and biostratigraphic constraints, stratigraphers often correlate isotopic patterns between distant stratigraphic sections through visual alignment of local maxima and minima of isotopic values. The reproducibility of this method can prove challenging and, thus, evaluating the statistical robustness of intrabasinal composite carbon isotope curves, and global correlations to these reference curves, remains difficult. To assess the reproducibility of stratigraphic alignment of δ13Ccarb data, and correlations between carbon isotope excursions, we employ a numerical dynamic time warping methodology that stretches and squeezes the time axis of a record to obtain an optimal correlation (in a least-squares sense) between time-uncertain series of data. In particular, we assess various alignments between series of Early Cambrian δ13Ccarb data with respect to plausible matches. We first show that an alignment of these records obtained visually, and published previously, is broadly reproducible using dynamic time warping. Alternative alignments with similar goodness of fits are also obtainable, and their stratigraphic plausibility are discussed. This approach should be generalizable to an algorithm for the purposes of developing a library of plausible alignments between multiple time-uncertain stratigraphic records.

  12. Molecular systematics of terraranas (Anura: Brachycephaloidea) with an assessment of the effects of alignment and optimality criteria.

    PubMed

    Padial, José M; Grant, Taran; Frost, Darrel R

    2014-06-26

    Brachycephaloidea is a monophyletic group of frogs with more than 1000 species distributed throughout the New World tropics, subtropics, and Andean regions. Recently, the group has been the target of multiple molecular phylogenetic analyses, resulting in extensive changes in its taxonomy. Here, we test previous hypotheses of phylogenetic relationships for the group by combining available molecular evidence (sequences of 22 genes representing 431 ingroup and 25 outgroup terminals) and performing a tree-alignment analysis under the parsimony optimality criterion using the program POY. To elucidate the effects of alignment and optimality criterion on phylogenetic inferences, we also used the program MAFFT to obtain a similarity-alignment for analysis under both parsimony and maximum likelihood using the programs TNT and GARLI, respectively. Although all three analytical approaches agreed on numerous points, there was also extensive disagreement. Tree-alignment under parsimony supported the monophyly of the ingroup and the sister group relationship of the monophyletic marsupial frogs (Hemiphractidae), while maximum likelihood and parsimony analyses of the MAFFT similarity-alignment did not. All three methods differed with respect to the position of Ceuthomantis smaragdinus (Ceuthomantidae), with tree-alignment using parsimony recovering this species as the sister of Pristimantis + Yunganastes. All analyses rejected the monophyly of Strabomantidae and Strabomantinae as originally defined, and the tree-alignment analysis under parsimony further rejected the recently redefined Craugastoridae and Pristimantinae. Despite the greater emphasis in the systematics literature placed on the choice of optimality criterion for evaluating trees than on the choice of method for aligning DNA sequences, we found that the topological differences attributable to the alignment method were as great as those caused by the optimality criterion. Further, the optimal tree-alignment indicates that insertions and deletions occurred in twice as many aligned positions as implied by the optimal similarity-alignment, confirming previous findings that sequence turnover through insertion and deletion events plays a greater role in molecular evolution than indicated by similarity-alignments. Our results also provide a clear empirical demonstration of the different effects of wildcard taxa produced by missing data in parsimony and maximum likelihood analyses. Specifically, maximum likelihood analyses consistently (81% bootstrap frequency) provided spurious resolution despite a lack of evidence, whereas parsimony correctly depicted the ambiguity due to missing data by collapsing unsupported nodes. We provide a new taxonomy for the group that retains previously recognized Linnaean taxa except for Ceuthomantidae, Strabomantidae, and Strabomantinae. A phenotypically diagnosable superfamily is recognized formally as Brachycephaloidea, with the informal, unranked name terrarana retained as the standard common name for these frogs. We recognize three families within Brachycephaloidea that are currently diagnosable solely on molecular grounds (Brachycephalidae, Craugastoridae, and Eleutherodactylidae), as well as five subfamilies (Craugastorinae, Eleutherodactylinae, Holoadeninae, Phyzelaphryninae, and Pristimantinae) corresponding in large part to previous families and subfamilies. Our analyses upheld the monophyly of all tested genera, but we found numerous subgeneric taxa to be non-monophyletic and modified the taxonomy accordingly.

  13. Structure-Property Relations in Carbon Nanotube Fibers by Downscaling Solution Processing.

    PubMed

    Headrick, Robert J; Tsentalovich, Dmitri E; Berdegué, Julián; Bengio, Elie Amram; Liberman, Lucy; Kleinerman, Olga; Lucas, Matthew S; Talmon, Yeshayahu; Pasquali, Matteo

    2018-03-01

    At the microscopic scale, carbon nanotubes (CNTs) combine impressive tensile strength and electrical conductivity; however, their macroscopic counterparts have not met expectations. The reasons are variously attributed to inherent CNT sample properties (diameter and helicity polydispersity, high defect density, insufficient length) and manufacturing shortcomings (inadequate ordering and packing), which can lead to poor transmission of stress and current. To efficiently investigate the disparity between microscopic and macroscopic properties, a new method is introduced for processing microgram quantities of CNTs into highly oriented and well-packed fibers. CNTs are dissolved into chlorosulfonic acid and processed into aligned films; each film can be peeled and twisted into multiple discrete fibers. Fibers fabricated by this method and solution-spinning are directly compared to determine the impact of alignment, twist, packing density, and length. Surprisingly, these discrete fibers can be twice as strong as their solution-spun counterparts despite a lower degree of alignment. Strength appears to be more sensitive to internal twist and packing density, while fiber conductivity is essentially equivalent among the two sets of samples. Importantly, this rapid fiber manufacturing method uses three orders of magnitude less material than solution spinning, expanding the experimental parameter space and enabling the exploration of unique CNT sources. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  14. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Albrecht, Simon; Winn, Joshua N.; Marcy, Geoffrey W.

    We measure the sky-projected stellar obliquities ({lambda}) in the multiple-transiting planetary systems KOI-94 and Kepler-25, using the Rossiter-McLaughlin effect. In both cases, the host stars are well aligned with the orbital planes of the planets. For KOI-94 we find {lambda} = -11 Degree-Sign {+-} 11 Degree-Sign , confirming a recent result by Hirano and coworkers. Kepler-25 was a more challenging case, because the transit depth is unusually small (0.13%). To obtain the obliquity, it was necessary to use prior knowledge of the star's projected rotation rate and apply two different analysis methods to independent wavelength regions of the spectra. Themore » two methods gave consistent results, {lambda} = 7 Degree-Sign {+-} 8 Degree-Sign and -0. Degree-Sign 5 {+-} 5. Degree-Sign 7. There are now a total of five obliquity measurements for host stars of systems of multiple-transiting planets, all of which are consistent with spin-orbit alignment. This alignment is unlikely to be the result of tidal interactions because of the relatively large orbital distances and low planetary masses in the systems. In this respect, the multiplanet host stars differ from hot-Jupiter host stars, which commonly have large spin-orbit misalignments whenever tidal interactions are weak. In particular, the weak-tide subset of hot-Jupiter hosts has obliquities consistent with an isotropic distribution (p = 0.6), but the multiplanet hosts are incompatible with such a distribution (p {approx} 10{sup -6}). This suggests that high obliquities are confined to hot-Jupiter systems, and provides further evidence that hot-Jupiter formation involves processes that tilt the planetary orbit.« less

  15. Free-viewpoint video of human actors using multiple handheld Kinects.

    PubMed

    Ye, Genzhi; Liu, Yebin; Deng, Yue; Hasler, Nils; Ji, Xiangyang; Dai, Qionghai; Theobalt, Christian

    2013-10-01

    We present an algorithm for creating free-viewpoint video of interacting humans using three handheld Kinect cameras. Our method reconstructs deforming surface geometry and temporal varying texture of humans through estimation of human poses and camera poses for every time step of the RGBZ video. Skeletal configurations and camera poses are found by solving a joint energy minimization problem, which optimizes the alignment of RGBZ data from all cameras, as well as the alignment of human shape templates to the Kinect data. The energy function is based on a combination of geometric correspondence finding, implicit scene segmentation, and correspondence finding using image features. Finally, texture recovery is achieved through jointly optimization on spatio-temporal RGB data using matrix completion. As opposed to previous methods, our algorithm succeeds on free-viewpoint video of human actors under general uncontrolled indoor scenes with potentially dynamic background, and it succeeds even if the cameras are moving.

  16. Protein Sectors: Statistical Coupling Analysis versus Conservation

    PubMed Central

    Teşileanu, Tiberiu; Colwell, Lucy J.; Leibler, Stanislas

    2015-01-01

    Statistical coupling analysis (SCA) is a method for analyzing multiple sequence alignments that was used to identify groups of coevolving residues termed “sectors”. The method applies spectral analysis to a matrix obtained by combining correlation information with sequence conservation. It has been asserted that the protein sectors identified by SCA are functionally significant, with different sectors controlling different biochemical properties of the protein. Here we reconsider the available experimental data and note that it involves almost exclusively proteins with a single sector. We show that in this case sequence conservation is the dominating factor in SCA, and can alone be used to make statistically equivalent functional predictions. Therefore, we suggest shifting the experimental focus to proteins for which SCA identifies several sectors. Correlations in protein alignments, which have been shown to be informative in a number of independent studies, would then be less dominated by sequence conservation. PMID:25723535

  17. Generic accelerated sequence alignment in SeqAn using vectorization and multi-threading.

    PubMed

    Rahn, René; Budach, Stefan; Costanza, Pascal; Ehrhardt, Marcel; Hancox, Jonny; Reinert, Knut

    2018-05-03

    Pairwise sequence alignment is undoubtedly a central tool in many bioinformatics analyses. In this paper, we present a generically accelerated module for pairwise sequence alignments applicable for a broad range of applications. In our module, we unified the standard dynamic programming kernel used for pairwise sequence alignments and extended it with a generalized inter-sequence vectorization layout, such that many alignments can be computed simultaneously by exploiting SIMD (Single Instruction Multiple Data) instructions of modern processors. We then extended the module by adding two layers of thread-level parallelization, where we a) distribute many independent alignments on multiple threads and b) inherently parallelize a single alignment computation using a work stealing approach producing a dynamic wavefront progressing along the minor diagonal. We evaluated our alignment vectorization and parallelization on different processors, including the newest Intel® Xeon® (Skylake) and Intel® Xeon Phi™ (KNL) processors, and use cases. The instruction set AVX512-BW (Byte and Word), available on Skylake processors, can genuinely improve the performance of vectorized alignments. We could run single alignments 1600 times faster on the Xeon Phi™ and 1400 times faster on the Xeon® than executing them with our previous sequential alignment module. The module is programmed in C++ using the SeqAn (Reinert et al., 2017) library and distributed with version 2.4. under the BSD license. We support SSE4, AVX2, AVX512 instructions and included UME::SIMD, a SIMD-instruction wrapper library, to extend our module for further instruction sets. We thoroughly test all alignment components with all major C++ compilers on various platforms. rene.rahn@fu-berlin.de.

  18. Learning of Alignment Rules between Concept Hierarchies

    NASA Astrophysics Data System (ADS)

    Ichise, Ryutaro; Takeda, Hideaki; Honiden, Shinichi

    With the rapid advances of information technology, we are acquiring much information than ever before. As a result, we need tools for organizing this data. Concept hierarchies such as ontologies and information categorizations are powerful and convenient methods for accomplishing this goal, which have gained wide spread acceptance. Although each concept hierarchy is useful, it is difficult to employ multiple concept hierarchies at the same time because it is hard to align their conceptual structures. This paper proposes a rule learning method that inputs information from a source concept hierarchy and finds suitable location for them in a target hierarchy. The key idea is to find the most similar categories in each hierarchy, where similarity is measured by the κ(kappa) statistic that counts instances belonging to both categories. In order to evaluate our method, we conducted experiments using two internet directories: Yahoo! and LYCOS. We map information instances from the source directory into the target directory, and show that our learned rules agree with a human-generated assignment 76% of the time.

  19. Deep RNA sequencing analysis of readthrough gene fusions in human prostate adenocarcinoma and reference samples

    PubMed Central

    2011-01-01

    Background Readthrough fusions across adjacent genes in the genome, or transcription-induced chimeras (TICs), have been estimated using expressed sequence tag (EST) libraries to involve 4-6% of all genes. Deep transcriptional sequencing (RNA-Seq) now makes it possible to study the occurrence and expression levels of TICs in individual samples across the genome. Methods We performed single-end RNA-Seq on three human prostate adenocarcinoma samples and their corresponding normal tissues, as well as brain and universal reference samples. We developed two bioinformatics methods to specifically identify TIC events: a targeted alignment method using artificial exon-exon junctions within 200,000 bp from adjacent genes, and genomic alignment allowing splicing within individual reads. We performed further experimental verification and characterization of selected TIC and fusion events using quantitative RT-PCR and comparative genomic hybridization microarrays. Results Targeted alignment against artificial exon-exon junctions yielded 339 distinct TIC events, including 32 gene pairs with multiple isoforms. The false discovery rate was estimated to be 1.5%. Spliced alignment to the genome was less sensitive, finding only 18% of those found by targeted alignment in 33-nt reads and 59% of those in 50-nt reads. However, spliced alignment revealed 30 cases of TICs with intervening exons, in addition to distant inversions, scrambled genes, and translocations. Our findings increase the catalog of observed TIC gene pairs by 66%. We verified 6 of 6 predicted TICs in all prostate samples, and 2 of 5 predicted novel distant gene fusions, both private events among 54 prostate tumor samples tested. Expression of TICs correlates with that of the upstream gene, which can explain the prostate-specific pattern of some TIC events and the restriction of the SLC45A3-ELK4 e4-e2 TIC to ERG-negative prostate samples, as confirmed in 20 matched prostate tumor and normal samples and 9 lung cancer cell lines. Conclusions Deep transcriptional sequencing and analysis with targeted and spliced alignment methods can effectively identify TIC events across the genome in individual tissues. Prostate and reference samples exhibit a wide range of TIC events, involving more genes than estimated previously using ESTs. Tissue specificity of TIC events is correlated with expression patterns of the upstream gene. Some TIC events, such as MSMB-NCOA4, may play functional roles in cancer. PMID:21261984

  20. Alignment method for parabolic trough solar concentrators

    DOEpatents

    Diver, Richard B [Albuquerque, NM

    2010-02-23

    A Theoretical Overlay Photographic (TOP) alignment method uses the overlay of a theoretical projected image of a perfectly aligned concentrator on a photographic image of the concentrator to align the mirror facets of a parabolic trough solar concentrator. The alignment method is practical and straightforward, and inherently aligns the mirror facets to the receiver. When integrated with clinometer measurements for which gravity and mechanical drag effects have been accounted for and which are made in a manner and location consistent with the alignment method, all of the mirrors on a common drive can be aligned and optimized for any concentrator orientation.

  1. IVisTMSA: Interactive Visual Tools for Multiple Sequence Alignments.

    PubMed

    Pervez, Muhammad Tariq; Babar, Masroor Ellahi; Nadeem, Asif; Aslam, Naeem; Naveed, Nasir; Ahmad, Sarfraz; Muhammad, Shah; Qadri, Salman; Shahid, Muhammad; Hussain, Tanveer; Javed, Maryam

    2015-01-01

    IVisTMSA is a software package of seven graphical tools for multiple sequence alignments. MSApad is an editing and analysis tool. It can load 409% more data than Jalview, STRAP, CINEMA, and Base-by-Base. MSA comparator allows the user to visualize consistent and inconsistent regions of reference and test alignments of more than 21-MB size in less than 12 seconds. MSA comparator is 5,200% efficient and more than 40% efficient as compared to BALiBASE c program and FastSP, respectively. MSA reconstruction tool provides graphical user interfaces for four popular aligners and allows the user to load several sequence files at a time. FASTA generator converts seven formats of alignments of unlimited size into FASTA format in a few seconds. MSA ID calculator calculates identity matrix of more than 11,000 sequences with a sequence length of 2,696 base pairs in less than 100 seconds. Tree and Distance Matrix calculation tools generate phylogenetic tree and distance matrix, respectively, using neighbor joining% identity and BLOSUM 62 matrix.

  2. SWPhylo - A Novel Tool for Phylogenomic Inferences by Comparison of Oligonucleotide Patterns and Integration of Genome-Based and Gene-Based Phylogenetic Trees.

    PubMed

    Yu, Xiaoyu; Reva, Oleg N

    2018-01-01

    Modern phylogenetic studies may benefit from the analysis of complete genome sequences of various microorganisms. Evolutionary inferences based on genome-scale analysis are believed to be more accurate than the gene-based alternative. However, the computational complexity of current phylogenomic procedures, inappropriateness of standard phylogenetic tools to process genome-wide data, and lack of reliable substitution models which correlates with alignment-free phylogenomic approaches deter microbiologists from using these opportunities. For example, the super-matrix and super-tree approaches of phylogenomics use multiple integrated genomic loci or individual gene-based trees to infer an overall consensus tree. However, these approaches potentially multiply errors of gene annotation and sequence alignment not mentioning the computational complexity and laboriousness of the methods. In this article, we demonstrate that the annotation- and alignment-free comparison of genome-wide tetranucleotide frequencies, termed oligonucleotide usage patterns (OUPs), allowed a fast and reliable inference of phylogenetic trees. These were congruent to the corresponding whole genome super-matrix trees in terms of tree topology when compared with other known approaches including 16S ribosomal RNA and GyrA protein sequence comparison, complete genome-based MAUVE, and CVTree methods. A Web-based program to perform the alignment-free OUP-based phylogenomic inferences was implemented at http://swphylo.bi.up.ac.za/. Applicability of the tool was tested on different taxa from subspecies to intergeneric levels. Distinguishing between closely related taxonomic units may be enforced by providing the program with alignments of marker protein sequences, eg, GyrA.

  3. Superposition and alignment of labeled point clouds.

    PubMed

    Fober, Thomas; Glinca, Serghei; Klebe, Gerhard; Hüllermeier, Eyke

    2011-01-01

    Geometric objects are often represented approximately in terms of a finite set of points in three-dimensional euclidean space. In this paper, we extend this representation to what we call labeled point clouds. A labeled point cloud is a finite set of points, where each point is not only associated with a position in three-dimensional space, but also with a discrete class label that represents a specific property. This type of model is especially suitable for modeling biomolecules such as proteins and protein binding sites, where a label may represent an atom type or a physico-chemical property. Proceeding from this representation, we address the question of how to compare two labeled points clouds in terms of their similarity. Using fuzzy modeling techniques, we develop a suitable similarity measure as well as an efficient evolutionary algorithm to compute it. Moreover, we consider the problem of establishing an alignment of the structures in the sense of a one-to-one correspondence between their basic constituents. From a biological point of view, alignments of this kind are of great interest, since mutually corresponding molecular constituents offer important information about evolution and heredity, and can also serve as a means to explain a degree of similarity. In this paper, we therefore develop a method for computing pairwise or multiple alignments of labeled point clouds. To this end, we proceed from an optimal superposition of the corresponding point clouds and construct an alignment which is as much as possible in agreement with the neighborhood structure established by this superposition. We apply our methods to the structural analysis of protein binding sites.

  4. SWPhylo – A Novel Tool for Phylogenomic Inferences by Comparison of Oligonucleotide Patterns and Integration of Genome-Based and Gene-Based Phylogenetic Trees

    PubMed Central

    Yu, Xiaoyu; Reva, Oleg N

    2018-01-01

    Modern phylogenetic studies may benefit from the analysis of complete genome sequences of various microorganisms. Evolutionary inferences based on genome-scale analysis are believed to be more accurate than the gene-based alternative. However, the computational complexity of current phylogenomic procedures, inappropriateness of standard phylogenetic tools to process genome-wide data, and lack of reliable substitution models which correlates with alignment-free phylogenomic approaches deter microbiologists from using these opportunities. For example, the super-matrix and super-tree approaches of phylogenomics use multiple integrated genomic loci or individual gene-based trees to infer an overall consensus tree. However, these approaches potentially multiply errors of gene annotation and sequence alignment not mentioning the computational complexity and laboriousness of the methods. In this article, we demonstrate that the annotation- and alignment-free comparison of genome-wide tetranucleotide frequencies, termed oligonucleotide usage patterns (OUPs), allowed a fast and reliable inference of phylogenetic trees. These were congruent to the corresponding whole genome super-matrix trees in terms of tree topology when compared with other known approaches including 16S ribosomal RNA and GyrA protein sequence comparison, complete genome-based MAUVE, and CVTree methods. A Web-based program to perform the alignment-free OUP-based phylogenomic inferences was implemented at http://swphylo.bi.up.ac.za/. Applicability of the tool was tested on different taxa from subspecies to intergeneric levels. Distinguishing between closely related taxonomic units may be enforced by providing the program with alignments of marker protein sequences, eg, GyrA. PMID:29511354

  5. High accuracy prediction of beta-turns and their types using propensities and multiple alignments.

    PubMed

    Fuchs, Patrick F J; Alix, Alain J P

    2005-06-01

    We have developed a method that predicts both the presence and the type of beta-turns, using a straightforward approach based on propensities and multiple alignments. The propensities were calculated classically, but the way to use them for prediction was completely new: starting from a tetrapeptide sequence on which one wants to evaluate the presence of a beta-turn, the propensity for a given residue is modified by taking into account all the residues present in the multiple alignment at this position. The evaluation of a score is then done by weighting these propensities by the use of Position-specific score matrices generated by PSI-BLAST. The introduction of secondary structure information predicted by PSIPRED or SSPRO2 as well as taking into account the flanking residues around the tetrapeptide improved the accuracy greatly. This latter evaluated on a database of 426 reference proteins (previously used on other studies) by a sevenfold crossvalidation gave very good results with a Matthews Correlation Coefficient (MCC) of 0.42 and an overall prediction accuracy of 74.8%; this places our method among the best ones. A jackknife test was also done, which gave results within the same range. This shows that it is possible to reach neural networks accuracy with considerably less computional cost and complexity. Furthermore, propensities remain excellent descriptors of amino acid tendencies to belong to beta-turns, which can be useful for peptide or protein engineering and design. For beta-turn type prediction, we reached the best accuracy ever published in terms of MCC (except for the irregular type IV) in the range of 0.25-0.30 for types I, II, and I' and 0.13-0.15 for types VIII, II', and IV. To our knowledge, our method is the only one available on the Web that predicts types I' and II'. The accuracy evaluated on two larger databases of 547 and 823 proteins was not improved significantly. All of this was implemented into a Web server called COUDES (French acronym for: Chercher Ou Une Deviation Existe Surement), which is available at the following URL: http://bioserv.rpbs.jussieu.fr/Coudes/index.html within the new bioinformatics platform RPBS.

  6. Phylogenic inference using alignment-free methods for applications in microbial community surveys using 16s rRNA gene

    PubMed Central

    2017-01-01

    The diversity of microbiota is best explored by understanding the phylogenetic structure of the microbial communities. Traditionally, sequence alignment has been used for phylogenetic inference. However, alignment-based approaches come with significant challenges and limitations when massive amounts of data are analyzed. In the recent decade, alignment-free approaches have enabled genome-scale phylogenetic inference. Here we evaluate three alignment-free methods: ACS, CVTree, and Kr for phylogenetic inference with 16s rRNA gene data. We use a taxonomic gold standard to compare the accuracy of alignment-free phylogenetic inference with that of common microbiome-wide phylogenetic inference pipelines based on PyNAST and MUSCLE alignments with FastTree and RAxML. We re-simulate fecal communities from Human Microbiome Project data to evaluate the performance of the methods on datasets with properties of real data. Our comparisons show that alignment-free methods are not inferior to alignment-based methods in giving accurate and robust phylogenic trees. Moreover, consensus ensembles of alignment-free phylogenies are superior to those built from alignment-based methods in their ability to highlight community differences in low power settings. In addition, the overall running times of alignment-based and alignment-free phylogenetic inference are comparable. Taken together our empirical results suggest that alignment-free methods provide a viable approach for microbiome-wide phylogenetic inference. PMID:29136663

  7. ADOMA: A Command Line Tool to Modify ClustalW Multiple Alignment Output.

    PubMed

    Zaal, Dionne; Nota, Benjamin

    2016-01-01

    We present ADOMA, a command line tool that produces alternative outputs from ClustalW multiple alignments of nucleotide or protein sequences. ADOMA can simplify the output of alignments by showing only the different residues between sequences, which is often desirable when only small differences such as single nucleotide polymorphisms are present (e.g., between different alleles). Another feature of ADOMA is that it can enhance the ClustalW output by coloring the residues in the alignment. This tool is easily integrated into automated Linux pipelines for next-generation sequencing data analysis, and may be useful for researchers in a broad range of scientific disciplines including evolutionary biology and biomedical sciences. The source code is freely available at https://sourceforge. net/projects/adoma/. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  8. Bellerophon: a program to detect chimeric sequences in multiple sequence alignments.

    PubMed

    Huber, Thomas; Faulkner, Geoffrey; Hugenholtz, Philip

    2004-09-22

    Bellerophon is a program for detecting chimeric sequences in multiple sequence datasets by an adaption of partial treeing analysis. Bellerophon was specifically developed to detect 16S rRNA gene chimeras in PCR-clone libraries of environmental samples but can be applied to other nucleotide sequence alignments. Bellerophon is available as an interactive web server at http://foo.maths.uq.edu.au/~huber/bellerophon.pl

  9. Phylogenetic Characterization of Transport Protein Superfamilies: Superiority of SuperfamilyTree Programs over Those Based on Multiple Alignments

    PubMed Central

    Chen, Jonathan S.; Reddy, Vamsee; Chen, Joshua H.; Shlykov, Maksim A.; Zheng, Wei Hao; Cho, Jaehoon; Yen, Ming Ren; Saier, Milton H.

    2012-01-01

    Transport proteins function in the translocation of ions, solutes and macromolecules across cellular and organellar membranes. These integral membrane proteins fall into >600 families as tabulated in the Transporter Classification Database (www.tcdb.org). Recent studies, some of which are reported here, define distant phylogenetic relationships between families with the creation of superfamilies. Several of these are analyzed using a novel set of programs designed to allow reliable prediction of phylogenetic trees when sequence divergence is too great to allow the use of multiple alignments. These new programs, called SuperfamilyTree1 and 2 (SFT1 and 2), allow display of protein and family relationships, respectively, based on thousands of comparative BLAST scores rather than multiple alignments. Superfamilies analyzed include: (1) Aerolysins, (2) RTX Toxins, (3) Defensins, (4) Ion Transporters, (5) Bile/Arsenite/Riboflavin Transporters, (6) Cation: Proton Antiporters, and (7) the Glucose/Fructose/Lactose superfamily within the prokaryotic phosphoenol pyruvate-dependent Phosphotransferase System. In addition to defining the phylogenetic relationships of the proteins and families within these seven superfamilies, evidence is provided showing that the SFT programs outperform programs that are based on multiple alignments whenever sequence divergence of superfamily members is extensive. The SFT programs should be applicable to virtually any superfamily of proteins or nucleic acids. PMID:22286036

  10. An Advanced Electrospinning Method of Fabricating Nanofibrous Patterned Architectures with Controlled Deposition and Desired Alignment

    NASA Astrophysics Data System (ADS)

    Rasel, Sheikh Md

    We introduce a versatile advanced method of electrospinning for fabricating various kinds of nanofibrous patterns along with desired alignment, controlled amount of deposition and locally variable density into the architectures. In this method, we employed multiple electrodes whose potentials have been altered in milliseconds with the help of microprocessor based control system. Therefore, key success of this method was that the electrical field as well as charge carrying fibers could be switched shortly from one electrode's location to another, as a result, electrospun fibers could be deposited on the designated areas with desired alignment. A wide range of nanofibrous patterned architectures were constructed using proper arrangement of multiple electrodes. By controlling the concurrent activation time of two adjacent electrodes, we demonstrated that amount of fibers going into the pattern can be adjusted and desired alignment in electrospun fibers can be obtained. We also revealed that the deposition density of electrospun fibers in different areas of patterned architectures can be varied. We showed that by controlling the deposition time between two adjacent electrodes, a number of functionally graded patterns can be generated with uniaxial alignment. We also demonstrated that this handy method was capable of producing random, aligned, and multidirectional nanofibrous mats by engaging a number of electrodes and switching them in desired patterns. A comprehensive study using finite element method was carried out to understand the effects of electrical field. Simulation results revealed that electrical field strength alters shortly based on electrode control switch patterns. Nanofibrous polyvinyl alcohol (PVA) scaffolds and its composite reinforced with wollastonite and wood flour were fabricated using rotating drum electrospinning technique. Morphological, mechanical, and thermal, properties were characterized on PVA/wollastonite and PVA/wood flour nanocomposites containing 0, 5, 10, and 20 wt % of fillers. Morphological analyses carried out by digital optical microscope, scanning electron microscopy, x-ray computed tomography, and Fourier transform infrared spectroscopy, confirmed the presence and well dispersion of fillers in the composites. In addition, improvement of mechanical properties with increased filler content further emphasized the adhesion between matrix and reinforcement. PVA with 20 wt % wollastonite composite exhibited the highest tensile strength (11.99 MPa) and tensile module (198 MPa) as compared to pure PVA (3.92 MPa and 83 MPa, respectively). Moreover, the thermal tests demonstrated that there is no major deviation in the thermal stability due to the addition of wollastonite in PVA scaffolds. Almost similar trend was observed in PVA/wood flour nanocomposites where tensile strength improved by 228 % for 20 wt % of reinforcement. The PVA/wollastonite and PVA/wood flour fibrous nanocomposite which poses higher mechanical properties might be potentially suitable for many advanced applications such as filtration, tissue engineering, and food processing. We believe this study will contribute to further scientific understanding of the patterning mechanism of electrospun nanofibers and to allow for variety of design of specific patterned nanofibrous architectures with desired functional properties. Therefore, this improved scheme of electrospinning can have significant impact in a broad range of applications including tissue engineering scaffolds, filtrations, and nanoelectronics.

  11. Analyzing and synthesizing phylogenies using tree alignment graphs.

    PubMed

    Smith, Stephen A; Brown, Joseph W; Hinchliff, Cody E

    2013-01-01

    Phylogenetic trees are used to analyze and visualize evolution. However, trees can be imperfect datatypes when summarizing multiple trees. This is especially problematic when accommodating for biological phenomena such as horizontal gene transfer, incomplete lineage sorting, and hybridization, as well as topological conflict between datasets. Additionally, researchers may want to combine information from sets of trees that have partially overlapping taxon sets. To address the problem of analyzing sets of trees with conflicting relationships and partially overlapping taxon sets, we introduce methods for aligning, synthesizing and analyzing rooted phylogenetic trees within a graph, called a tree alignment graph (TAG). The TAG can be queried and analyzed to explore uncertainty and conflict. It can also be synthesized to construct trees, presenting an alternative to supertrees approaches. We demonstrate these methods with two empirical datasets. In order to explore uncertainty, we constructed a TAG of the bootstrap trees from the Angiosperm Tree of Life project. Analysis of the resulting graph demonstrates that areas of the dataset that are unresolved in majority-rule consensus tree analyses can be understood in more detail within the context of a graph structure, using measures incorporating node degree and adjacency support. As an exercise in synthesis (i.e., summarization of a TAG constructed from the alignment trees), we also construct a TAG consisting of the taxonomy and source trees from a recent comprehensive bird study. We synthesized this graph into a tree that can be reconstructed in a repeatable fashion and where the underlying source information can be updated. The methods presented here are tractable for large scale analyses and serve as a basis for an alternative to consensus tree and supertree methods. Furthermore, the exploration of these graphs can expose structures and patterns within the dataset that are otherwise difficult to observe.

  12. Analyzing and Synthesizing Phylogenies Using Tree Alignment Graphs

    PubMed Central

    Smith, Stephen A.; Brown, Joseph W.; Hinchliff, Cody E.

    2013-01-01

    Phylogenetic trees are used to analyze and visualize evolution. However, trees can be imperfect datatypes when summarizing multiple trees. This is especially problematic when accommodating for biological phenomena such as horizontal gene transfer, incomplete lineage sorting, and hybridization, as well as topological conflict between datasets. Additionally, researchers may want to combine information from sets of trees that have partially overlapping taxon sets. To address the problem of analyzing sets of trees with conflicting relationships and partially overlapping taxon sets, we introduce methods for aligning, synthesizing and analyzing rooted phylogenetic trees within a graph, called a tree alignment graph (TAG). The TAG can be queried and analyzed to explore uncertainty and conflict. It can also be synthesized to construct trees, presenting an alternative to supertrees approaches. We demonstrate these methods with two empirical datasets. In order to explore uncertainty, we constructed a TAG of the bootstrap trees from the Angiosperm Tree of Life project. Analysis of the resulting graph demonstrates that areas of the dataset that are unresolved in majority-rule consensus tree analyses can be understood in more detail within the context of a graph structure, using measures incorporating node degree and adjacency support. As an exercise in synthesis (i.e., summarization of a TAG constructed from the alignment trees), we also construct a TAG consisting of the taxonomy and source trees from a recent comprehensive bird study. We synthesized this graph into a tree that can be reconstructed in a repeatable fashion and where the underlying source information can be updated. The methods presented here are tractable for large scale analyses and serve as a basis for an alternative to consensus tree and supertree methods. Furthermore, the exploration of these graphs can expose structures and patterns within the dataset that are otherwise difficult to observe. PMID:24086118

  13. A fast cross-validation method for alignment of electron tomography images based on Beer-Lambert law.

    PubMed

    Yan, Rui; Edwards, Thomas J; Pankratz, Logan M; Kuhn, Richard J; Lanman, Jason K; Liu, Jun; Jiang, Wen

    2015-11-01

    In electron tomography, accurate alignment of tilt series is an essential step in attaining high-resolution 3D reconstructions. Nevertheless, quantitative assessment of alignment quality has remained a challenging issue, even though many alignment methods have been reported. Here, we report a fast and accurate method, tomoAlignEval, based on the Beer-Lambert law, for the evaluation of alignment quality. Our method is able to globally estimate the alignment accuracy by measuring the goodness of log-linear relationship of the beam intensity attenuations at different tilt angles. Extensive tests with experimental data demonstrated its robust performance with stained and cryo samples. Our method is not only significantly faster but also more sensitive than measurements of tomogram resolution using Fourier shell correlation method (FSCe/o). From these tests, we also conclude that while current alignment methods are sufficiently accurate for stained samples, inaccurate alignments remain a major limitation for high resolution cryo-electron tomography. Copyright © 2015 Elsevier Inc. All rights reserved.

  14. A fast cross-validation method for alignment of electron tomography images based on Beer-Lambert law

    PubMed Central

    Yan, Rui; Edwards, Thomas J.; Pankratz, Logan M.; Kuhn, Richard J.; Lanman, Jason K.; Liu, Jun; Jiang, Wen

    2015-01-01

    In electron tomography, accurate alignment of tilt series is an essential step in attaining high-resolution 3D reconstructions. Nevertheless, quantitative assessment of alignment quality has remained a challenging issue, even though many alignment methods have been reported. Here, we report a fast and accurate method, tomoAlignEval, based on the Beer-Lambert law, for the evaluation of alignment quality. Our method is able to globally estimate the alignment accuracy by measuring the goodness of log-linear relationship of the beam intensity attenuations at different tilt angles. Extensive tests with experimental data demonstrated its robust performance with stained and cryo samples. Our method is not only significantly faster but also more sensitive than measurements of tomogram resolution using Fourier shell correlation method (FSCe/o). From these tests, we also conclude that while current alignment methods are sufficiently accurate for stained samples, inaccurate alignments remain a major limitation for high resolution cryo-electron tomography. PMID:26455556

  15. Accurate Simulation and Detection of Coevolution Signals in Multiple Sequence Alignments

    PubMed Central

    Ackerman, Sharon H.; Tillier, Elisabeth R.; Gatti, Domenico L.

    2012-01-01

    Background While the conserved positions of a multiple sequence alignment (MSA) are clearly of interest, non-conserved positions can also be important because, for example, destabilizing effects at one position can be compensated by stabilizing effects at another position. Different methods have been developed to recognize the evolutionary relationship between amino acid sites, and to disentangle functional/structural dependencies from historical/phylogenetic ones. Methodology/Principal Findings We have used two complementary approaches to test the efficacy of these methods. In the first approach, we have used a new program, MSAvolve, for the in silico evolution of MSAs, which records a detailed history of all covarying positions, and builds a global coevolution matrix as the accumulated sum of individual matrices for the positions forced to co-vary, the recombinant coevolution, and the stochastic coevolution. We have simulated over 1600 MSAs for 8 protein families, which reflect sequences of different sizes and proteins with widely different functions. The calculated coevolution matrices were compared with the coevolution matrices obtained for the same evolved MSAs with different coevolution detection methods. In a second approach we have evaluated the capacity of the different methods to predict close contacts in the representative X-ray structures of an additional 150 protein families using only experimental MSAs. Conclusions/Significance Methods based on the identification of global correlations between pairs were found to be generally superior to methods based only on local correlations in their capacity to identify coevolving residues using either simulated or experimental MSAs. However, the significant variability in the performance of different methods with different proteins suggests that the simulation of MSAs that replicate the statistical properties of the experimental MSA can be a valuable tool to identify the coevolution detection method that is most effective in each case. PMID:23091608

  16. Theoretically informed Monte Carlo simulation of liquid crystals by sampling of alignment-tensor fields.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Armas-Perez, Julio C.; Londono-Hurtado, Alejandro; Guzman, Orlando

    2015-07-27

    A theoretically informed coarse-grained Monte Carlo method is proposed for studying liquid crystals. The free energy functional of the system is described in the framework of the Landau-de Gennes formalism. The alignment field and its gradients are approximated by finite differences, and the free energy is minimized through a stochastic sampling technique. The validity of the proposed method is established by comparing the results of the proposed approach to those of traditional free energy minimization techniques. Its usefulness is illustrated in the context of three systems, namely, a nematic liquid crystal confined in a slit channel, a nematic liquid crystalmore » droplet, and a chiral liquid crystal in the bulk. It is found that for systems that exhibit multiple metastable morphologies, the proposed Monte Carlo method is generally able to identify lower free energy states that are often missed by traditional approaches. Importantly, the Monte Carlo method identifies such states from random initial configurations, thereby obviating the need for educated initial guesses that can be difficult to formulate.« less

  17. Theoretically informed Monte Carlo simulation of liquid crystals by sampling of alignment-tensor fields

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Armas-Pérez, Julio C.; Londono-Hurtado, Alejandro; Guzmán, Orlando

    2015-07-28

    A theoretically informed coarse-grained Monte Carlo method is proposed for studying liquid crystals. The free energy functional of the system is described in the framework of the Landau-de Gennes formalism. The alignment field and its gradients are approximated by finite differences, and the free energy is minimized through a stochastic sampling technique. The validity of the proposed method is established by comparing the results of the proposed approach to those of traditional free energy minimization techniques. Its usefulness is illustrated in the context of three systems, namely, a nematic liquid crystal confined in a slit channel, a nematic liquid crystalmore » droplet, and a chiral liquid crystal in the bulk. It is found that for systems that exhibit multiple metastable morphologies, the proposed Monte Carlo method is generally able to identify lower free energy states that are often missed by traditional approaches. Importantly, the Monte Carlo method identifies such states from random initial configurations, thereby obviating the need for educated initial guesses that can be difficult to formulate.« less

  18. Alignment methods: strategies, challenges, benchmarking, and comparative overview.

    PubMed

    Löytynoja, Ari

    2012-01-01

    Comparative evolutionary analyses of molecular sequences are solely based on the identities and differences detected between homologous characters. Errors in this homology statement, that is errors in the alignment of the sequences, are likely to lead to errors in the downstream analyses. Sequence alignment and phylogenetic inference are tightly connected and many popular alignment programs use the phylogeny to divide the alignment problem into smaller tasks. They then neglect the phylogenetic tree, however, and produce alignments that are not evolutionarily meaningful. The use of phylogeny-aware methods reduces the error but the resulting alignments, with evolutionarily correct representation of homology, can challenge the existing practices and methods for viewing and visualising the sequences. The inter-dependency of alignment and phylogeny can be resolved by joint estimation of the two; methods based on statistical models allow for inferring the alignment parameters from the data and correctly take into account the uncertainty of the solution but remain computationally challenging. Widely used alignment methods are based on heuristic algorithms and unlikely to find globally optimal solutions. The whole concept of one correct alignment for the sequences is questionable, however, as there typically exist vast numbers of alternative, roughly equally good alignments that should also be considered. This uncertainty is hidden by many popular alignment programs and is rarely correctly taken into account in the downstream analyses. The quest for finding and improving the alignment solution is complicated by the lack of suitable measures of alignment goodness. The difficulty of comparing alternative solutions also affects benchmarks of alignment methods and the results strongly depend on the measure used. As the effects of alignment error cannot be predicted, comparing the alignments' performance in downstream analyses is recommended.

  19. Hartman Testing of X-Ray Telescopes

    NASA Technical Reports Server (NTRS)

    Saha, Timo T.; Biskasch, Michael; Zhang, William W.

    2013-01-01

    Hartmann testing of x-ray telescopes is a simple test method to retrieve and analyze alignment errors and low-order circumferential errors of x-ray telescopes and their components. A narrow slit is scanned along the circumference of the telescope in front of the mirror and the centroids of the images are calculated. From the centroid data, alignment errors, radius variation errors, and cone-angle variation errors can be calculated. Mean cone angle, mean radial height (average radius), and the focal length of the telescope can also be estimated if the centroid data is measured at multiple focal plane locations. In this paper we present the basic equations that are used in the analysis process. These equations can be applied to full circumference or segmented x-ray telescopes. We use the Optical Surface Analysis Code (OSAC) to model a segmented x-ray telescope and show that the derived equations and accompanying analysis retrieves the alignment errors and low order circumferential errors accurately.

  20. From Pixels to Response Maps: Discriminative Image Filtering for Face Alignment in the Wild.

    PubMed

    Asthana, Akshay; Zafeiriou, Stefanos; Tzimiropoulos, Georgios; Cheng, Shiyang; Pantic, Maja

    2015-06-01

    We propose a face alignment framework that relies on the texture model generated by the responses of discriminatively trained part-based filters. Unlike standard texture models built from pixel intensities or responses generated by generic filters (e.g. Gabor), our framework has two important advantages. First, by virtue of discriminative training, invariance to external variations (like identity, pose, illumination and expression) is achieved. Second, we show that the responses generated by discriminatively trained filters (or patch-experts) are sparse and can be modeled using a very small number of parameters. As a result, the optimization methods based on the proposed texture model can better cope with unseen variations. We illustrate this point by formulating both part-based and holistic approaches for generic face alignment and show that our framework outperforms the state-of-the-art on multiple "wild" databases. The code and dataset annotations are available for research purposes from http://ibug.doc.ic.ac.uk/resources.

  1. Phase Retrieval Using a Genetic Algorithm on the Systematic Image-Based Optical Alignment Testbed

    NASA Technical Reports Server (NTRS)

    Taylor, Jaime R.

    2003-01-01

    NASA s Marshall Space Flight Center s Systematic Image-Based Optical Alignment (SIBOA) Testbed was developed to test phase retrieval algorithms and hardware techniques. Individuals working with the facility developed the idea of implementing phase retrieval by breaking the determination of the tip/tilt of each mirror apart from the piston motion (or translation) of each mirror. Presented in this report is an algorithm that determines the optimal phase correction associated only with the piston motion of the mirrors. A description of the Phase Retrieval problem is first presented. The Systematic Image-Based Optical Alignment (SIBOA) Testbeb is then described. A Discrete Fourier Transform (DFT) is necessary to transfer the incoming wavefront (or estimate of phase error) into the spatial frequency domain to compare it with the image. A method for reducing the DFT to seven scalar/matrix multiplications is presented. A genetic algorithm is then used to search for the phase error. The results of this new algorithm on a test problem are presented.

  2. Template-based protein structure modeling using the RaptorX web server.

    PubMed

    Källberg, Morten; Wang, Haipeng; Wang, Sheng; Peng, Jian; Wang, Zhiyong; Lu, Hui; Xu, Jinbo

    2012-07-19

    A key challenge of modern biology is to uncover the functional role of the protein entities that compose cellular proteomes. To this end, the availability of reliable three-dimensional atomic models of proteins is often crucial. This protocol presents a community-wide web-based method using RaptorX (http://raptorx.uchicago.edu/) for protein secondary structure prediction, template-based tertiary structure modeling, alignment quality assessment and sophisticated probabilistic alignment sampling. RaptorX distinguishes itself from other servers by the quality of the alignment between a target sequence and one or multiple distantly related template proteins (especially those with sparse sequence profiles) and by a novel nonlinear scoring function and a probabilistic-consistency algorithm. Consequently, RaptorX delivers high-quality structural models for many targets with only remote templates. At present, it takes RaptorX ~35 min to finish processing a sequence of 200 amino acids. Since its official release in August 2011, RaptorX has processed ~6,000 sequences submitted by ~1,600 users from around the world.

  3. Template-based protein structure modeling using the RaptorX web server

    PubMed Central

    Källberg, Morten; Wang, Haipeng; Wang, Sheng; Peng, Jian; Wang, Zhiyong; Lu, Hui; Xu, Jinbo

    2016-01-01

    A key challenge of modern biology is to uncover the functional role of the protein entities that compose cellular proteomes. To this end, the availability of reliable three-dimensional atomic models of proteins is often crucial. This protocol presents a community-wide web-based method using RaptorX (http://raptorx.uchicago.edu/) for protein secondary structure prediction, template-based tertiary structure modeling, alignment quality assessment and sophisticated probabilistic alignment sampling. RaptorX distinguishes itself from other servers by the quality of the alignment between a target sequence and one or multiple distantly related template proteins (especially those with sparse sequence profiles) and by a novel nonlinear scoring function and a probabilistic-consistency algorithm. Consequently, RaptorX delivers high-quality structural models for many targets with only remote templates. At present, it takes RaptorX ~35 min to finish processing a sequence of 200 amino acids. Since its official release in August 2011, RaptorX has processed ~6,000 sequences submitted by ~1,600 users from around the world. PMID:22814390

  4. Method of making nanopatterns and nanostructures and nanopatterned functional oxide materials

    DOEpatents

    Dravid, Vinayak P; Donthu, Suresh K; Pan, Zixiao

    2014-02-11

    Method for nanopatterning of inorganic materials, such as ceramic (e.g. metal oxide) materials, and organic materials, such as polymer materials, on a variety of substrates to form nanopatterns and/or nanostructures with control of dimensions and location, all without the need for etching the materials and without the need for re-alignment between multiple patterning steps in forming nanostructures, such as heterostructures comprising multiple materials. The method involves patterning a resist-coated substrate using electron beam lithography, removing a portion of the resist to provide a patterned resist-coated substrate, and spin coating the patterned resist-coated substrate with a liquid precursor, such as a sol precursor, of the inorganic or organic material. The remaining resist is removed and the spin coated substrate is heated at an elevated temperature to crystallize the deposited precursor material.

  5. Protein structure modeling for CASP10 by multiple layers of global optimization.

    PubMed

    Joo, Keehyoung; Lee, Juyong; Sim, Sangjin; Lee, Sun Young; Lee, Kiho; Heo, Seungryong; Lee, In-Ho; Lee, Sung Jong; Lee, Jooyoung

    2014-02-01

    In the template-based modeling (TBM) category of CASP10 experiment, we introduced a new protocol called protein modeling system (PMS) to generate accurate protein structures in terms of side-chains as well as backbone trace. In the new protocol, a global optimization algorithm, called conformational space annealing (CSA), is applied to the three layers of TBM procedure: multiple sequence-structure alignment, 3D chain building, and side-chain re-modeling. For 3D chain building, we developed a new energy function which includes new distance restraint terms of Lorentzian type (derived from multiple templates), and new energy terms that combine (physical) energy terms such as dynamic fragment assembly (DFA) energy, DFIRE statistical potential energy, hydrogen bonding term, etc. These physical energy terms are expected to guide the structure modeling especially for loop regions where no template structures are available. In addition, we developed a new quality assessment method based on random forest machine learning algorithm to screen templates, multiple alignments, and final models. For TBM targets of CASP10, we find that, due to the combination of three stages of CSA global optimizations and quality assessment, the modeling accuracy of PMS improves at each additional stage of the protocol. It is especially noteworthy that the side-chains of the final PMS models are far more accurate than the models in the intermediate steps. Copyright © 2013 Wiley Periodicals, Inc.

  6. ExScalibur: A High-Performance Cloud-Enabled Suite for Whole Exome Germline and Somatic Mutation Identification.

    PubMed

    Bao, Riyue; Hernandez, Kyle; Huang, Lei; Kang, Wenjun; Bartom, Elizabeth; Onel, Kenan; Volchenboum, Samuel; Andrade, Jorge

    2015-01-01

    Whole exome sequencing has facilitated the discovery of causal genetic variants associated with human diseases at deep coverage and low cost. In particular, the detection of somatic mutations from tumor/normal pairs has provided insights into the cancer genome. Although there is an abundance of publicly-available software for the detection of germline and somatic variants, concordance is generally limited among variant callers and alignment algorithms. Successful integration of variants detected by multiple methods requires in-depth knowledge of the software, access to high-performance computing resources, and advanced programming techniques. We present ExScalibur, a set of fully automated, highly scalable and modulated pipelines for whole exome data analysis. The suite integrates multiple alignment and variant calling algorithms for the accurate detection of germline and somatic mutations with close to 99% sensitivity and specificity. ExScalibur implements streamlined execution of analytical modules, real-time monitoring of pipeline progress, robust handling of errors and intuitive documentation that allows for increased reproducibility and sharing of results and workflows. It runs on local computers, high-performance computing clusters and cloud environments. In addition, we provide a data analysis report utility to facilitate visualization of the results that offers interactive exploration of quality control files, read alignment and variant calls, assisting downstream customization of potential disease-causing mutations. ExScalibur is open-source and is also available as a public image on Amazon cloud.

  7. Dry contact transfer printing of aligned carbon nanotube patterns and characterization of their optical properties for diameter distribution and alignment.

    PubMed

    Pint, Cary L; Xu, Ya-Qiong; Moghazy, Sharief; Cherukuri, Tonya; Alvarez, Noe T; Haroz, Erik H; Mahzooni, Salma; Doorn, Stephen K; Kono, Junichiro; Pasquali, Matteo; Hauge, Robert H

    2010-02-23

    A scalable and facile approach is demonstrated where as-grown patterns of well-aligned structures composed of single-walled carbon nanotubes (SWNT) synthesized via water-assisted chemical vapor deposition (CVD) can be transferred, or printed, to any host surface in a single dry, room-temperature step using the growth substrate as a stamp. We demonstrate compatibility of this process with multiple transfers for large-scale device and specifically tailored pattern fabrication. Utilizing this transfer approach, anisotropic optical properties of the SWNT films are probed via polarized absorption, Raman, and photoluminescence spectroscopies. Using a simple model to describe optical transitions in the large SWNT species present in the aligned samples, polarized absorption data are demonstrated as an effective tool for accurate assignment of the diameter distribution from broad absorption features located in the infrared. This can be performed on either well-aligned samples or unaligned doped samples, allowing simple and rapid feedback of the SWNT diameter distribution that can be challenging and time-consuming to obtain in other optical methods. Furthermore, we discuss challenges in accurately characterizing alignment in structures of long versus short carbon nanotubes through optical techniques, where SWNT length makes a difference in the information obtained in such measurements. This work provides new insight to the efficient transfer and optical properties of an emerging class of long, large diameter SWNT species typically produced in the CVD process.

  8. CMSA: a heterogeneous CPU/GPU computing system for multiple similar RNA/DNA sequence alignment.

    PubMed

    Chen, Xi; Wang, Chen; Tang, Shanjiang; Yu, Ce; Zou, Quan

    2017-06-24

    The multiple sequence alignment (MSA) is a classic and powerful technique for sequence analysis in bioinformatics. With the rapid growth of biological datasets, MSA parallelization becomes necessary to keep its running time in an acceptable level. Although there are a lot of work on MSA problems, their approaches are either insufficient or contain some implicit assumptions that limit the generality of usage. First, the information of users' sequences, including the sizes of datasets and the lengths of sequences, can be of arbitrary values and are generally unknown before submitted, which are unfortunately ignored by previous work. Second, the center star strategy is suited for aligning similar sequences. But its first stage, center sequence selection, is highly time-consuming and requires further optimization. Moreover, given the heterogeneous CPU/GPU platform, prior studies consider the MSA parallelization on GPU devices only, making the CPUs idle during the computation. Co-run computation, however, can maximize the utilization of the computing resources by enabling the workload computation on both CPU and GPU simultaneously. This paper presents CMSA, a robust and efficient MSA system for large-scale datasets on the heterogeneous CPU/GPU platform. It performs and optimizes multiple sequence alignment automatically for users' submitted sequences without any assumptions. CMSA adopts the co-run computation model so that both CPU and GPU devices are fully utilized. Moreover, CMSA proposes an improved center star strategy that reduces the time complexity of its center sequence selection process from O(mn 2 ) to O(mn). The experimental results show that CMSA achieves an up to 11× speedup and outperforms the state-of-the-art software. CMSA focuses on the multiple similar RNA/DNA sequence alignment and proposes a novel bitmap based algorithm to improve the center star strategy. We can conclude that harvesting the high performance of modern GPU is a promising approach to accelerate multiple sequence alignment. Besides, adopting the co-run computation model can maximize the entire system utilization significantly. The source code is available at https://github.com/wangvsa/CMSA .

  9. HAL: a hierarchical format for storing and analyzing multiple genome alignments.

    PubMed

    Hickey, Glenn; Paten, Benedict; Earl, Dent; Zerbino, Daniel; Haussler, David

    2013-05-15

    Large multiple genome alignments and inferred ancestral genomes are ideal resources for comparative studies of molecular evolution, and advances in sequencing and computing technology are making them increasingly obtainable. These structures can provide a rich understanding of the genetic relationships between all subsets of species they contain. Current formats for storing genomic alignments, such as XMFA and MAF, are all indexed or ordered using a single reference genome, however, which limits the information that can be queried with respect to other species and clades. This loss of information grows with the number of species under comparison, as well as their phylogenetic distance. We present HAL, a compressed, graph-based hierarchical alignment format for storing multiple genome alignments and ancestral reconstructions. HAL graphs are indexed on all genomes they contain. Furthermore, they are organized phylogenetically, which allows for modular and parallel access to arbitrary subclades without fragmentation because of rearrangements that have occurred in other lineages. HAL graphs can be created or read with a comprehensive C++ API. A set of tools is also provided to perform basic operations, such as importing and exporting data, identifying mutations and coordinate mapping (liftover). All documentation and source code for the HAL API and tools are freely available at http://github.com/glennhickey/hal. hickey@soe.ucsc.edu or haussler@soe.ucsc.edu Supplementary data are available at Bioinformatics online.

  10. Technical Note: Rod phantom analysis for comparison of PET detector sampling and reconstruction methods.

    PubMed

    Wollenweber, Scott D; Kemp, Brad J

    2016-11-01

    This investigation aimed to develop a scanner quantification performance methodology and compare multiple metrics between two scanners under different imaging conditions. Most PET scanners are designed to work over a wide dynamic range of patient imaging conditions. Clinical constraints, however, often impact the realization of the entitlement performance for a particular scanner design. Using less injected dose and imaging for a shorter time are often key considerations, all while maintaining "acceptable" image quality and quantitative capability. A dual phantom measurement including resolution inserts was used to measure the effects of in-plane (x, y) and axial (z) system resolution between two PET/CT systems with different block detector crystal dimensions. One of the scanners had significantly thinner slices. Several quantitative measures, including feature contrast recovery, max/min value, and feature profile accuracy were derived from the resulting data and compared between the two scanners and multiple phantoms and alignments. At the clinically relevant count levels used, the scanner with thinner slices had improved performance of approximately 2%, averaged over phantom alignments, measures, and reconstruction methods, for the head-sized phantom, mainly demonstrated with the rods aligned perpendicular to the scanner axis. That same scanner had a slightly decreased performance of -1% for the larger body-size phantom, mostly due to an apparent noise increase in the images. Most of the differences in the metrics between the two scanners were less than 10%. Using the proposed scanner performance methodology, it was shown that smaller detector elements and a larger number of image voxels require higher count density in order to demonstrate improved image quality and quantitation. In a body imaging scenario under typical clinical conditions, the potential advantages of the design must overcome increases in noise due to lower count density.

  11. Dinucleotide controlled null models for comparative RNA gene prediction.

    PubMed

    Gesell, Tanja; Washietl, Stefan

    2008-05-27

    Comparative prediction of RNA structures can be used to identify functional noncoding RNAs in genomic screens. It was shown recently by Babak et al. [BMC Bioinformatics. 8:33] that RNA gene prediction programs can be biased by the genomic dinucleotide content, in particular those programs using a thermodynamic folding model including stacking energies. As a consequence, there is need for dinucleotide-preserving control strategies to assess the significance of such predictions. While there have been randomization algorithms for single sequences for many years, the problem has remained challenging for multiple alignments and there is currently no algorithm available. We present a program called SISSIz that simulates multiple alignments of a given average dinucleotide content. Meeting additional requirements of an accurate null model, the randomized alignments are on average of the same sequence diversity and preserve local conservation and gap patterns. We make use of a phylogenetic substitution model that includes overlapping dependencies and site-specific rates. Using fast heuristics and a distance based approach, a tree is estimated under this model which is used to guide the simulations. The new algorithm is tested on vertebrate genomic alignments and the effect on RNA structure predictions is studied. In addition, we directly combined the new null model with the RNAalifold consensus folding algorithm giving a new variant of a thermodynamic structure based RNA gene finding program that is not biased by the dinucleotide content. SISSIz implements an efficient algorithm to randomize multiple alignments preserving dinucleotide content. It can be used to get more accurate estimates of false positive rates of existing programs, to produce negative controls for the training of machine learning based programs, or as standalone RNA gene finding program. Other applications in comparative genomics that require randomization of multiple alignments can be considered. SISSIz is available as open source C code that can be compiled for every major platform and downloaded here: http://sourceforge.net/projects/sissiz.

  12. HAlign-II: efficient ultra-large multiple sequence alignment and phylogenetic tree reconstruction with distributed and parallel computing.

    PubMed

    Wan, Shixiang; Zou, Quan

    2017-01-01

    Multiple sequence alignment (MSA) plays a key role in biological sequence analyses, especially in phylogenetic tree construction. Extreme increase in next-generation sequencing results in shortage of efficient ultra-large biological sequence alignment approaches for coping with different sequence types. Distributed and parallel computing represents a crucial technique for accelerating ultra-large (e.g. files more than 1 GB) sequence analyses. Based on HAlign and Spark distributed computing system, we implement a highly cost-efficient and time-efficient HAlign-II tool to address ultra-large multiple biological sequence alignment and phylogenetic tree construction. The experiments in the DNA and protein large scale data sets, which are more than 1GB files, showed that HAlign II could save time and space. It outperformed the current software tools. HAlign-II can efficiently carry out MSA and construct phylogenetic trees with ultra-large numbers of biological sequences. HAlign-II shows extremely high memory efficiency and scales well with increases in computing resource. THAlign-II provides a user-friendly web server based on our distributed computing infrastructure. HAlign-II with open-source codes and datasets was established at http://lab.malab.cn/soft/halign.

  13. Multi-profile Bayesian alignment model for LC-MS data analysis with integration of internal standards

    PubMed Central

    Tsai, Tsung-Heng; Tadesse, Mahlet G.; Di Poto, Cristina; Pannell, Lewis K.; Mechref, Yehia; Wang, Yue; Ressom, Habtom W.

    2013-01-01

    Motivation: Liquid chromatography-mass spectrometry (LC-MS) has been widely used for profiling expression levels of biomolecules in various ‘-omic’ studies including proteomics, metabolomics and glycomics. Appropriate LC-MS data preprocessing steps are needed to detect true differences between biological groups. Retention time (RT) alignment, which is required to ensure that ion intensity measurements among multiple LC-MS runs are comparable, is one of the most important yet challenging preprocessing steps. Current alignment approaches estimate RT variability using either single chromatograms or detected peaks, but do not simultaneously take into account the complementary information embedded in the entire LC-MS data. Results: We propose a Bayesian alignment model for LC-MS data analysis. The alignment model provides estimates of the RT variability along with uncertainty measures. The model enables integration of multiple sources of information including internal standards and clustered chromatograms in a mathematically rigorous framework. We apply the model to LC-MS metabolomic, proteomic and glycomic data. The performance of the model is evaluated based on ground-truth data, by measuring correlation of variation, RT difference across runs and peak-matching performance. We demonstrate that Bayesian alignment model improves significantly the RT alignment performance through appropriate integration of relevant information. Availability and implementation: MATLAB code, raw and preprocessed LC-MS data are available at http://omics.georgetown.edu/alignLCMS.html Contact: hwr@georgetown.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24013927

  14. Assessment of Homology Templates and an Anesthetic Binding Site within the γ-Aminobutyric Acid Receptor

    PubMed Central

    Bertaccini, Edward J.; Yoluk, Ozge; Lindahl, Erik R.; Trudell, James R.

    2013-01-01

    Background Anesthetics mediate portions of their activity via modulation of the γ-aminobutyric acid receptor (GABAaR). While its molecular structure remains unknown, significant progress has been made towards understanding its interactions with anesthetics via molecular modeling. Methods The structure of the torpedo acetylcholine receptor (nAChRα), the structures of the α4 and β2 subunits of the human nAChR, the structures of the eukaryotic glutamate-gated chloride channel (GluCl), and the prokaryotic pH sensing channels, from Gloeobacter violaceus and Erwinia chrysanthemi, were aligned with the SAlign and 3DMA algorithms. A multiple sequence alignment from these structures and those of the GABAaR was performed with ClustalW. The Modeler and Rosetta algorithms independently created three-dimensional constructs of the GABAaR from the GluCl template. The CDocker algorithm docked a congeneric series of propofol derivatives into the binding pocket and scored calculated binding affinities for correlation with known GABAaR potentiation EC50’s. Results Multiple structure alignments of templates revealed a clear consensus of residue locations relevant to anesthetic effects except for torpedo nAChR. Within the GABAaR models generated from GluCl, the residues notable for modulating anesthetic action within transmembrane segments 1, 2, and 3 converged on the intersubunit interface between alpha and beta subunits. Docking scores of a propofol derivative series into this binding site showed strong linear correlation with GABAaR potentiation EC50. Conclusion Consensus structural alignment based on homologous templates revealed an intersubunit anesthetic binding cavity within the transmembrane domain of the GABAaR, which showed correlation of ligand docking scores with experimentally measured GABAaR potentiation. PMID:23770602

  15. Dynamics and allostery of the ionotropic glutamate receptors and the ligand binding domain.

    PubMed

    Tobi, Dror

    2016-02-01

    The dynamics of the ligand-binding domain (LBD) and the intact ionotropic glutamate receptor (iGluR) were studied using Gaussian Network Model (GNM) analysis. The dynamics of LBDs with various allosteric modulators is compared using a novel method of multiple alignment of GNM modes of motion. The analysis reveals that allosteric effectors change the dynamics of amino acids at the upper lobe interface of the LBD dimer as well as at the hinge region between the upper- and lower- lobes. For the intact glutamate receptor the analysis show that the clamshell-like movement of the LBD upper and lower lobes is coupled to the bending of the trans-membrane domain (TMD) helices which may open the channel pore. The results offer a new insight on the mechanism of action of allosteric modulators on the iGluR and support the notion of TMD helices bending as a possible mechanism for channel opening. In addition, the study validates the methodology of multiple GNM modes alignment as a useful tool to study allosteric effect and its relation to proteins dynamics. © 2015 Wiley Periodicals, Inc.

  16. Recapitulating phylogenies using k-mers: from trees to networks.

    PubMed

    Bernard, Guillaume; Ragan, Mark A; Chan, Cheong Xin

    2016-01-01

    Ernst Haeckel based his landmark Tree of Life on the supposed ontogenic recapitulation of phylogeny, i.e. that successive embryonic stages during the development of an organism re-trace the morphological forms of its ancestors over the course of evolution. Much of this idea has since been discredited. Today, phylogenies are often based on families of molecular sequences. The standard approach starts with a multiple sequence alignment, in which the sequences are arranged relative to each other in a way that maximises a measure of similarity position-by-position along their entire length. A tree (or sometimes a network) is then inferred. Rigorous multiple sequence alignment is computationally demanding, and evolutionary processes that shape the genomes of many microbes (bacteria, archaea and some morphologically simple eukaryotes) can add further complications. In particular, recombination, genome rearrangement and lateral genetic transfer undermine the assumptions that underlie multiple sequence alignment, and imply that a tree-like structure may be too simplistic. Here, using genome sequences of 143 bacterial and archaeal genomes, we construct a network of phylogenetic relatedness based on the number of shared k -mers (subsequences at fixed length k ). Our findings suggest that the network captures not only key aspects of microbial genome evolution as inferred from a tree, but also features that are not treelike. The method is highly scalable, allowing for investigation of genome evolution across a large number of genomes. Instead of using specific regions or sequences from genome sequences, or indeed Haeckel's idea of ontogeny, we argue that genome phylogenies can be inferred using k -mers from whole-genome sequences. Representing these networks dynamically allows biological questions of interest to be formulated and addressed quickly and in a visually intuitive manner.

  17. The Kinetic Mechanism for DNA Unwinding by Multiple Molecules of Dda Helicase Aligned on DNA†

    PubMed Central

    Eoff, Robert L.; Raney, Kevin D.

    2010-01-01

    Helicases catalyze the separation of double-stranded nucleic acids to form single-stranded intermediates. Using transient state kinetic methods we have determined the kinetic properties of DNA unwinding under conditions that favor a monomeric form of the Dda helicase as well as conditions that allow multiple molecules to function on the same substrate. Multiple helicase molecules can align like a train on the DNA track. The number of base pairs unwound in a single binding event for Dda is increased from ~19 bp for the monomeric form to ~64 bp when as many as four Dda molecules are aligned on the same substrate, while the kinetic step-size (3.2 ± 0.7 bp) and unwinding rate (242 ± 25 bp s−1) appear to be independent of the number of Dda molecules present on a given substrate. The data support a model in which the helicase molecules bound to the same substrate move along the DNA track independently during DNA unwinding. The observed increase in processivity arises from the increased probability that at least one of the helicases will completely unwind the DNA prior to dissociation. These results are in contrast to previous reports in which multiple Dda molecules on the same track greatly enhanced the rate and amplitude for displacement of protein blocks on the track. Therefore, only when the progress of the lead molecule in the train is impeded by some type of block, such as a protein bound to DNA, do the trailing molecules interact with the lead molecule in order to overcome the block. The fact that trailing helicase molecules have little impact on the lead molecule in the train during routine DNA unwinding suggests that the trailing molecules are moving at similar rates as the lead molecule. This result implicates a step in the translocation mechanism as contributing greatly to the overall rate-limiting step for unwinding of duplex DNA. PMID:20408588

  18. MSAViewer: interactive JavaScript visualization of multiple sequence alignments.

    PubMed

    Yachdav, Guy; Wilzbach, Sebastian; Rauscher, Benedikt; Sheridan, Robert; Sillitoe, Ian; Procter, James; Lewis, Suzanna E; Rost, Burkhard; Goldberg, Tatyana

    2016-11-15

    The MSAViewer is a quick and easy visualization and analysis JavaScript component for Multiple Sequence Alignment data of any size. Core features include interactive navigation through the alignment, application of popular color schemes, sorting, selecting and filtering. The MSAViewer is 'web ready': written entirely in JavaScript, compatible with modern web browsers and does not require any specialized software. The MSAViewer is part of the BioJS collection of components. The MSAViewer is released as open source software under the Boost Software License 1.0. Documentation, source code and the viewer are available at http://msa.biojs.net/Supplementary information: Supplementary data are available at Bioinformatics online. msa@bio.sh. © The Author 2016. Published by Oxford University Press.

  19. MSAViewer: interactive JavaScript visualization of multiple sequence alignments

    PubMed Central

    Yachdav, Guy; Wilzbach, Sebastian; Rauscher, Benedikt; Sheridan, Robert; Sillitoe, Ian; Procter, James; Lewis, Suzanna E.; Rost, Burkhard; Goldberg, Tatyana

    2016-01-01

    Summary: The MSAViewer is a quick and easy visualization and analysis JavaScript component for Multiple Sequence Alignment data of any size. Core features include interactive navigation through the alignment, application of popular color schemes, sorting, selecting and filtering. The MSAViewer is ‘web ready’: written entirely in JavaScript, compatible with modern web browsers and does not require any specialized software. The MSAViewer is part of the BioJS collection of components. Availability and Implementation: The MSAViewer is released as open source software under the Boost Software License 1.0. Documentation, source code and the viewer are available at http://msa.biojs.net/. Supplementary information: Supplementary data are available at Bioinformatics online. Contact: msa@bio.sh PMID:27412096

  20. Comparison of two on-orbit attitude sensor alignment methods

    NASA Technical Reports Server (NTRS)

    Krack, Kenneth; Lambertson, Michael; Markley, F. Landis

    1990-01-01

    Compared here are two methods of on-orbit alignment of vector attitude sensors. The first method uses the angular difference between simultaneous measurements from two or more sensors. These angles are compared to the angular differences between the respective reference positions of the sensed objects. The alignments of the sensors are adjusted to minimize the difference between the two sets of angles. In the second method, the sensor alignment is part of a state vector that includes the attitude. The alignments are adjusted along with the attitude to minimize all observation residuals. It is shown that the latter method can result in much less alignment uncertainty when gyroscopes are used for attitude propagation during the alignment estimation. The additional information for this increased accuracy comes from knowledge of relative attitude obtained from the spacecraft gyroscopes. The theoretical calculations of this difference in accuracy are presented. Also presented are numerical estimates of the alignment uncertainties of the fixed-head star trackers on the Extreme Ultraviolet Explorer spacecraft using both methods.

  1. Simple and robust generation of ultrafast laser pulse trains using polarization-independent parallel-aligned thin films

    NASA Astrophysics Data System (ADS)

    Wang, Andong; Jiang, Lan; Li, Xiaowei; Wang, Zhi; Du, Kun; Lu, Yongfeng

    2018-05-01

    Ultrafast laser pulse temporal shaping has been widely applied in various important applications such as laser materials processing, coherent control of chemical reactions, and ultrafast imaging. However, temporal pulse shaping has been limited to only-in-lab technique due to the high cost, low damage threshold, and polarization dependence. Herein we propose a novel design of ultrafast laser pulse train generation device, which consists of multiple polarization-independent parallel-aligned thin films. Various pulse trains with controllable temporal profile can be generated flexibly by multi-reflections within the splitting films. Compared with other pulse train generation techniques, this method has advantages of compact structure, low cost, high damage threshold and polarization independence. These advantages endow it with high potential for broad utilization in ultrafast applications.

  2. Development of a method of alignment between various SOLAR MAXIMUM MISSION experiments

    NASA Technical Reports Server (NTRS)

    1977-01-01

    Results of an engineering study of the methods of alignment between various experiments for the solar maximum mission are described. The configuration studied consists of the instruments, mounts and instrument support platform located within the experiment module. Hardware design, fabrication methods and alignment techniques were studied with regard to optimizing the coalignment between the experiments and the fine sun sensor. The proposed hardware design was reviewed with regard to loads, stress, thermal distortion, alignment error budgets, fabrication techniques, alignment techniques and producibility. Methods of achieving comparable alignment accuracies on previous projects were also reviewed.

  3. Automated image alignment for 2D gel electrophoresis in a high-throughput proteomics pipeline.

    PubMed

    Dowsey, Andrew W; Dunn, Michael J; Yang, Guang-Zhong

    2008-04-01

    The quest for high-throughput proteomics has revealed a number of challenges in recent years. Whilst substantial improvements in automated protein separation with liquid chromatography and mass spectrometry (LC/MS), aka 'shotgun' proteomics, have been achieved, large-scale open initiatives such as the Human Proteome Organization (HUPO) Brain Proteome Project have shown that maximal proteome coverage is only possible when LC/MS is complemented by 2D gel electrophoresis (2-DE) studies. Moreover, both separation methods require automated alignment and differential analysis to relieve the bioinformatics bottleneck and so make high-throughput protein biomarker discovery a reality. The purpose of this article is to describe a fully automatic image alignment framework for the integration of 2-DE into a high-throughput differential expression proteomics pipeline. The proposed method is based on robust automated image normalization (RAIN) to circumvent the drawbacks of traditional approaches. These use symbolic representation at the very early stages of the analysis, which introduces persistent errors due to inaccuracies in modelling and alignment. In RAIN, a third-order volume-invariant B-spline model is incorporated into a multi-resolution schema to correct for geometric and expression inhomogeneity at multiple scales. The normalized images can then be compared directly in the image domain for quantitative differential analysis. Through evaluation against an existing state-of-the-art method on real and synthetically warped 2D gels, the proposed analysis framework demonstrates substantial improvements in matching accuracy and differential sensitivity. High-throughput analysis is established through an accelerated GPGPU (general purpose computation on graphics cards) implementation. Supplementary material, software and images used in the validation are available at http://www.proteomegrid.org/rain/.

  4. Photovoltaic module and interlocked stack of photovoltaic modules

    DOEpatents

    Wares, Brian S.

    2014-09-02

    One embodiment relates to an arrangement of photovoltaic modules configured for transportation. The arrangement includes a plurality of photovoltaic modules, each photovoltaic module including a frame. A plurality of individual male alignment features and a plurality of individual female alignment features are included on each frame. Adjacent photovoltaic modules are interlocked by multiple individual male alignment features on a first module of the adjacent photovoltaic modules fitting into and being surrounded by corresponding individual female alignment features on a second module of the adjacent photovoltaic modules. Other embodiments, features and aspects are also disclosed.

  5. Image-based quantification of fiber alignment within electrospun tissue engineering scaffolds is related to mechanical anisotropy.

    PubMed

    Fee, Timothy; Downs, Crawford; Eberhardt, Alan; Zhou, Yong; Berry, Joel

    2016-07-01

    It is well documented that electrospun tissue engineering scaffolds can be fabricated with variable degrees of fiber alignment to produce scaffolds with anisotropic mechanical properties. Several attempts have been made to quantify the degree of fiber alignment within an electrospun scaffold using image-based methods. However, these methods are limited by the inability to produce a quantitative measure of alignment that can be used to make comparisons across publications. Therefore, we have developed a new approach to quantifying the alignment present within a scaffold from scanning electron microscopic (SEM) images. The alignment is determined by using the Sobel approximation of the image gradient to determine the distribution of gradient angles with an image. This data was fit to a Von Mises distribution to find the dispersion parameter κ, which was used as a quantitative measure of fiber alignment. We fabricated four groups of electrospun polycaprolactone (PCL) + Gelatin scaffolds with alignments ranging from κ = 1.9 (aligned) to κ = 0.25 (random) and tested our alignment quantification method on these scaffolds. It was found that our alignment quantification method could distinguish between scaffolds of different alignments more accurately than two other published methods. Additionally, the alignment parameter κ was found to be a good predictor the mechanical anisotropy of our electrospun scaffolds. The ability to quantify fiber alignment within and make direct comparisons of scaffold fiber alignment across publications can reduce ambiguity between published results where cells are cultured on "highly aligned" fibrous scaffolds. This could have important implications for characterizing mechanics and cellular behavior on aligned tissue engineering scaffolds. © 2016 Wiley Periodicals, Inc. J Biomed Mater Res Part A: 104A: 1680-1686, 2016. © 2016 Wiley Periodicals, Inc.

  6. CHROMA: consensus-based colouring of multiple alignments for publication.

    PubMed

    Goodstadt, L; Ponting, C P

    2001-09-01

    CHROMA annotates multiple protein sequence alignments by consensus to produce formatted and coloured text suitable for incorporation into other documents for publication. The package is designed to be flexible and reliable, and has a simple-to-use graphical user interface running under Microsoft Windows. Both the executables and source code for CHROMA running under Windows and Linux (portable command-line only) are freely available at http://www.lg.ndirect.co.uk/chroma. Software enquiries should be directed to CHROMA@lg.ndirect.co.uk.

  7. W-curve alignments for HIV-1 genomic comparisons.

    PubMed

    Cork, Douglas J; Lembark, Steven; Tovanabutra, Sodsai; Robb, Merlin L; Kim, Jerome H

    2010-06-01

    The W-curve was originally developed as a graphical visualization technique for viewing DNA and RNA sequences. Its ability to render features of DNA also makes it suitable for computational studies. Its main advantage in this area is utilizing a single-pass algorithm for comparing the sequences. Avoiding recursion during sequence alignments offers advantages for speed and in-process resources. The graphical technique also allows for multiple models of comparison to be used depending on the nucleotide patterns embedded in similar whole genomic sequences. The W-curve approach allows us to compare large numbers of samples quickly. We are currently tuning the algorithm to accommodate quirks specific to HIV-1 genomic sequences so that it can be used to aid in diagnostic and vaccine efforts. Tracking the molecular evolution of the virus has been greatly hampered by gap associated problems predominantly embedded within the envelope gene of the virus. Gaps and hypermutation of the virus slow conventional string based alignments of the whole genome. This paper describes the W-curve algorithm itself, and how we have adapted it for comparison of similar HIV-1 genomes. A treebuilding method is developed with the W-curve that utilizes a novel Cylindrical Coordinate distance method and gap analysis method. HIV-1 C2-V5 env sequence regions from a Mother/Infant cohort study are used in the comparison. The output distance matrix and neighbor results produced by the W-curve are functionally equivalent to those from Clustal for C2-V5 sequences in the mother/infant pairs infected with CRF01_AE. Significant potential exists for utilizing this method in place of conventional string based alignment of HIV-1 genomes, such as Clustal X. With W-curve heuristic alignment, it may be possible to obtain clinically useful results in a short time-short enough to affect clinical choices for acute treatment. A description of the W-curve generation process, including a comparison technique of aligning extremes of the curves to effectively phase-shift them past the HIV-1 gap problem, is presented. Besides yielding similar neighbor-joining phenogram topologies, most Mother and Infant C2-V5 sequences in the cohort pairs geometrically map closest to each other, indicating that W-curve heuristics overcame any gap problem.

  8. Sandia Corporation (Albuquerque, NM)

    DOEpatents

    Diver, Richard B.

    2010-02-23

    A Theoretical Overlay Photographic (TOP) alignment method uses the overlay of a theoretical projected image of a perfectly aligned concentrator on a photographic image of the concentrator to align the mirror facets of a parabolic trough solar concentrator. The alignment method is practical and straightforward, and inherently aligns the mirror facets to the receiver. When integrated with clinometer measurements for which gravity and mechanical drag effects have been accounted for and which are made in a manner and location consistent with the alignment method, all of the mirrors on a common drive can be aligned and optimized for any concentrator orientation.

  9. Simultaneous Calibration: A Joint Optimization Approach for Multiple Kinect and External Cameras.

    PubMed

    Liao, Yajie; Sun, Ying; Li, Gongfa; Kong, Jianyi; Jiang, Guozhang; Jiang, Du; Cai, Haibin; Ju, Zhaojie; Yu, Hui; Liu, Honghai

    2017-06-24

    Camera calibration is a crucial problem in many applications, such as 3D reconstruction, structure from motion, object tracking and face alignment. Numerous methods have been proposed to solve the above problem with good performance in the last few decades. However, few methods are targeted at joint calibration of multi-sensors (more than four devices), which normally is a practical issue in the real-time systems. In this paper, we propose a novel method and a corresponding workflow framework to simultaneously calibrate relative poses of a Kinect and three external cameras. By optimizing the final cost function and adding corresponding weights to the external cameras in different locations, an effective joint calibration of multiple devices is constructed. Furthermore, the method is tested in a practical platform, and experiment results show that the proposed joint calibration method can achieve a satisfactory performance in a project real-time system and its accuracy is higher than the manufacturer's calibration.

  10. Simultaneous Calibration: A Joint Optimization Approach for Multiple Kinect and External Cameras

    PubMed Central

    Liao, Yajie; Sun, Ying; Li, Gongfa; Kong, Jianyi; Jiang, Guozhang; Jiang, Du; Cai, Haibin; Ju, Zhaojie; Yu, Hui; Liu, Honghai

    2017-01-01

    Camera calibration is a crucial problem in many applications, such as 3D reconstruction, structure from motion, object tracking and face alignment. Numerous methods have been proposed to solve the above problem with good performance in the last few decades. However, few methods are targeted at joint calibration of multi-sensors (more than four devices), which normally is a practical issue in the real-time systems. In this paper, we propose a novel method and a corresponding workflow framework to simultaneously calibrate relative poses of a Kinect and three external cameras. By optimizing the final cost function and adding corresponding weights to the external cameras in different locations, an effective joint calibration of multiple devices is constructed. Furthermore, the method is tested in a practical platform, and experiment results show that the proposed joint calibration method can achieve a satisfactory performance in a project real-time system and its accuracy is higher than the manufacturer’s calibration. PMID:28672823

  11. Fabrication of polydimethylsiloxane (PDMS) nanofluidic chips with controllable channel size and spacing.

    PubMed

    Peng, Ran; Li, Dongqing

    2016-10-07

    The ability to create reproducible and inexpensive nanofluidic chips is essential to the fundamental research and applications of nanofluidics. This paper presents a novel and cost-effective method for fabricating a single nanochannel or multiple nanochannels in PDMS chips with controllable channel size and spacing. Single nanocracks or nanocrack arrays, positioned by artificial defects, are first generated on a polystyrene surface with controllable size and spacing by a solvent-induced method. Two sets of optimal working parameters are developed to replicate the nanocracks onto the polymer layers to form the nanochannel molds. The nanochannel molds are used to make the bi-layer PDMS microchannel-nanochannel chips by simple soft lithography. An alignment system is developed for bonding the nanofluidic chips under an optical microscope. Using this method, high quality PDMS nanofluidic chips with a single nanochannel or multiple nanochannels of sub-100 nm width and height and centimeter length can be obtained with high repeatability.

  12. Linking GPS and travel diary data using sequence alignment in a study of children's independent mobility

    PubMed Central

    2011-01-01

    Background Global positioning systems (GPS) are increasingly being used in health research to determine the location of study participants. Combining GPS data with data collected via travel/activity diaries allows researchers to assess where people travel in conjunction with data about trip purpose and accompaniment. However, linking GPS and diary data is problematic and to date the only method has been to match the two datasets manually, which is time consuming and unlikely to be practical for larger data sets. This paper assesses the feasibility of a new sequence alignment method of linking GPS and travel diary data in comparison with the manual matching method. Methods GPS and travel diary data obtained from a study of children's independent mobility were linked using sequence alignment algorithms to test the proof of concept. Travel diaries were assessed for quality by counting the number of errors and inconsistencies in each participant's set of diaries. The success of the sequence alignment method was compared for higher versus lower quality travel diaries, and for accompanied versus unaccompanied trips. Time taken and percentage of trips matched were compared for the sequence alignment method and the manual method. Results The sequence alignment method matched 61.9% of all trips. Higher quality travel diaries were associated with higher match rates in both the sequence alignment and manual matching methods. The sequence alignment method performed almost as well as the manual method and was an order of magnitude faster. However, the sequence alignment method was less successful at fully matching trips and at matching unaccompanied trips. Conclusions Sequence alignment is a promising method of linking GPS and travel diary data in large population datasets, especially if limitations in the trip detection algorithm are addressed. PMID:22142322

  13. ChromA: signal-based retention time alignment for chromatography-mass spectrometry data.

    PubMed

    Hoffmann, Nils; Stoye, Jens

    2009-08-15

    We describe ChromA, a web-based alignment tool for chromatography-mass spectrometry data from the metabolomics and proteomics domains. Users can supply their data in open and standardized file formats for retention time alignment using dynamic time warping with different configurable local distance and similarity functions. Additionally, user-defined anchors can be used to constrain and speedup the alignment. A neighborhood around each anchor can be added to increase the flexibility of the constrained alignment. ChromA offers different visualizations of the alignment for easier qualitative interpretation and comparison of the data. For the multiple alignment of more than two data files, the center-star approximation is applied to select a reference among input files to align to. ChromA is available at http://bibiserv.techfak.uni-bielefeld.de/chroma. Executables and source code under the L-GPL v3 license are provided for download at the same location.

  14. What to do When Scalar Invariance Fails: The Extended Alignment Method for Multi-Group Factor Analysis Comparison of Latent Means Across Many Groups.

    PubMed

    Marsh, Herbert W; Guo, Jiesi; Parker, Philip D; Nagengast, Benjamin; Asparouhov, Tihomir; Muthén, Bengt; Dicke, Theresa

    2017-01-12

    Scalar invariance is an unachievable ideal that in practice can only be approximated; often using potentially questionable approaches such as partial invariance based on a stepwise selection of parameter estimates with large modification indices. Study 1 demonstrates an extension of the power and flexibility of the alignment approach for comparing latent factor means in large-scale studies (30 OECD countries, 8 factors, 44 items, N = 249,840), for which scalar invariance is typically not supported in the traditional confirmatory factor analysis approach to measurement invariance (CFA-MI). Importantly, we introduce an alignment-within-CFA (AwC) approach, transforming alignment from a largely exploratory tool into a confirmatory tool, and enabling analyses that previously have not been possible with alignment (testing the invariance of uniquenesses and factor variances/covariances; multiple-group MIMIC models; contrasts on latent means) and structural equation models more generally. Specifically, it also allowed a comparison of gender differences in a 30-country MIMIC AwC (i.e., a SEM with gender as a covariate) and a 60-group AwC CFA (i.e., 30 countries × 2 genders) analysis. Study 2, a simulation study following up issues raised in Study 1, showed that latent means were more accurately estimated with alignment than with the scalar CFA-MI, and particularly with partial invariance scalar models based on the heavily criticized stepwise selection strategy. In summary, alignment augmented by AwC provides applied researchers from diverse disciplines considerable flexibility to address substantively important issues when the traditional CFA-MI scalar model does not fit the data. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  15. Hanford Technical Basis for Multiple Dosimetry Effective Dose Methodology

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hill, Robin L.; Rathbone, Bruce A.

    2010-08-01

    The current method at Hanford for dealing with the results from multiple dosimeters worn during non-uniform irradiation is to use a compartmentalization method to calculate the effective dose (E). The method, as documented in the current version of Section 6.9.3 in the 'Hanford External Dosimetry Technical Basis Manual, PNL-MA-842,' is based on the compartmentalization method presented in the 1997 ANSI/HPS N13.41 standard, 'Criteria for Performing Multiple Dosimetry.' With the adoption of the ICRP 60 methodology in the 2007 revision to 10 CFR 835 came changes that have a direct affect on the compartmentalization method described in the 1997 ANSI/HPS N13.41more » standard, and, thus, to the method used at Hanford. The ANSI/HPS N13.41 standard committee is in the process of updating the standard, but the changes to the standard have not yet been approved. And, the drafts of the revision of the standard tend to align more with ICRP 60 than with the changes specified in the 2007 revision to 10 CFR 835. Therefore, a revised method for calculating effective dose from non-uniform external irradiation using a compartmental method was developed using the tissue weighting factors and remainder organs specified in 10 CFR 835 (2007).« less

  16. Multi-Harmony: detecting functional specificity from sequence alignment

    PubMed Central

    Brandt, Bernd W.; Feenstra, K. Anton; Heringa, Jaap

    2010-01-01

    Many protein families contain sub-families with functional specialization, such as binding different ligands or being involved in different protein–protein interactions. A small number of amino acids generally determine functional specificity. The identification of these residues can aid the understanding of protein function and help finding targets for experimental analysis. Here, we present multi-Harmony, an interactive web sever for detecting sub-type-specific sites in proteins starting from a multiple sequence alignment. Combining our Sequence Harmony (SH) and multi-Relief (mR) methods in one web server allows simultaneous analysis and comparison of specificity residues; furthermore, both methods have been significantly improved and extended. SH has been extended to cope with more than two sub-groups. mR has been changed from a sampling implementation to a deterministic one, making it more consistent and user friendly. For both methods Z-scores are reported. The multi-Harmony web server produces a dynamic output page, which includes interactive connections to the Jalview and Jmol applets, thereby allowing interactive analysis of the results. Multi-Harmony is available at http://www.ibi.vu.nl/ programs/shmrwww. PMID:20525785

  17. A flexible motif search technique based on generalized profiles.

    PubMed

    Bucher, P; Karplus, K; Moeri, N; Hofmann, K

    1996-03-01

    A flexible motif search technique is presented which has two major components: (1) a generalized profile syntax serving as a motif definition language; and (2) a motif search method specifically adapted to the problem of finding multiple instances of a motif in the same sequence. The new profile structure, which is the core of the generalized profile syntax, combines the functions of a variety of motif descriptors implemented in other methods, including regular expression-like patterns, weight matrices, previously used profiles, and certain types of hidden Markov models (HMMs). The relationship between generalized profiles and other biomolecular motif descriptors is analyzed in detail, with special attention to HMMs. Generalized profiles are shown to be equivalent to a particular class of HMMs, and conversion procedures in both directions are given. The conversion procedures provide an interpretation for local alignment in the framework of stochastic models, allowing for clear, simple significance tests. A mathematical statement of the motif search problem defines the new method exactly without linking it to a specific algorithmic solution. Part of the definition includes a new definition of disjointness of alignments.

  18. Improving homology modeling of G-protein coupled receptors through multiple-template derived conserved inter-residue interactions

    NASA Astrophysics Data System (ADS)

    Chaudhari, Rajan; Heim, Andrew J.; Li, Zhijun

    2015-05-01

    Evidenced by the three-rounds of G-protein coupled receptors (GPCR) Dock competitions, improving homology modeling methods of helical transmembrane proteins including the GPCRs, based on templates of low sequence identity, remains an eminent challenge. Current approaches addressing this challenge adopt the philosophy of "modeling first, refinement next". In the present work, we developed an alternative modeling approach through the novel application of available multiple templates. First, conserved inter-residue interactions are derived from each additional template through conservation analysis of each template-target pairwise alignment. Then, these interactions are converted into distance restraints and incorporated in the homology modeling process. This approach was applied to modeling of the human β2 adrenergic receptor using the bovin rhodopsin and the human protease-activated receptor 1 as templates and improved model quality was demonstrated compared to the homology model generated by standard single-template and multiple-template methods. This method of "refined restraints first, modeling next", provides a fast and complementary way to the current modeling approaches. It allows rational identification and implementation of additional conserved distance restraints extracted from multiple templates and/or experimental data, and has the potential to be applicable to modeling of all helical transmembrane proteins.

  19. Molecular differentiation of Russian wild ginseng using mitochondrial nad7 intron 3 region.

    PubMed

    Li, Guisheng; Cui, Yan; Wang, Hongtao; Kwon, Woo-Saeng; Yang, Deok-Chun

    2017-07-01

    Cultivated ginseng is often introduced as a substitute and adulterant of Russian wild ginseng due to its lower cost or misidentification caused by similarity in appearance with wild ginseng. The aim of this study is to develop a simple and reliable method to differentiate Russian wild ginseng from cultivated ginseng. The mitochondrial NADH dehydrogenase subunit 7 ( nad 7) intron 3 regions of Russian wild ginseng and Chinese cultivated ginseng were analyzed. Based on the multiple sequence alignment result, a specific primer for Russian wild ginseng was designed by introducing additional mismatch and allele-specific polymerase chain reaction (PCR) was performed for identification of wild ginseng. Real-time allele-specific PCR with endpoint analysis was used for validation of the developed Russian wild ginseng single nucleotide polymorphism (SNP) marker. An SNP site specific to Russian wild ginseng was exploited by multiple alignments of mitochondrial nad 7 intron 3 regions of different ginseng samples. With the SNP-based specific primer, Russian wild ginseng was successfully discriminated from Chinese and Korean cultivated ginseng samples by allele-specific PCR. The reliability and specificity of the SNP marker was validated by checking 20 individuals of Russian wild ginseng samples with real-time allele-specific PCR assay. An effective DNA method for molecular discrimination of Russian wild ginseng from Chinese and Korean cultivated ginseng was developed. The established real-time allele-specific PCR was simple and reliable, and the present method should be a crucial complement of chemical analysis for authentication of Russian wild ginseng.

  20. Micro-scale and meso-scale architectural cues cooperate and compete to direct aligned tissue formation

    PubMed Central

    Gilchrist, Christopher L.; Ruch, David S.; Little, Dianne; Guilak, Farshid

    2014-01-01

    Tissue and biomaterial microenvironments provide architectural cues that direct important cell behaviors including cell shape, alignment, migration, and resulting tissue formation. These architectural features may be presented to cells across multiple length scales, from nanometers to millimeters in size. In this study, we examined how architectural cues at two distinctly different length scales, “micro-scale” cues on the order of ~1–2 μm, and “meso-scale” cues several orders of magnitude larger (>100 μm), interact to direct aligned neo-tissue formation. Utilizing a micro-photopatterning (μPP) model system to precisely arrange cell-adhesive patterns, we examined the effects of substrate architecture at these length scales on human mesenchymal stem cell (hMSC) organization, gene expression, and fibrillar collagen deposition. Both micro- and meso-scale architectures directed cell alignment and resulting tissue organization, and when combined, meso cues could enhance or compete against micro-scale cues. As meso boundary aspect ratios were increased, meso-scale cues overrode micro-scale cues and controlled tissue alignment, with a characteristic critical width (~500 μm) similar to boundary dimensions that exist in vivo in highly aligned tissues. Meso-scale cues acted via both lateral confinement (in a cell-density-dependent manner) and by permitting end-to-end cell arrangements that yielded greater fibrillar collagen deposition. Despite large differences in fibrillar collagen content and organization between μPP architectural conditions, these changes did not correspond with changes in gene expression of key matrix or tendon-related genes. These findings highlight the complex interplay between geometric cues at multiple length scales and may have implications for tissue engineering strategies, where scaffold designs that incorporate cues at multiple length scales could improve neo-tissue organization and resulting functional outcomes. PMID:25263687

  1. Customisation of the exome data analysis pipeline using a combinatorial approach.

    PubMed

    Pattnaik, Swetansu; Vaidyanathan, Srividya; Pooja, Durgad G; Deepak, Sa; Panda, Binay

    2012-01-01

    The advent of next generation sequencing (NGS) technologies have revolutionised the way biologists produce, analyse and interpret data. Although NGS platforms provide a cost-effective way to discover genome-wide variants from a single experiment, variants discovered by NGS need follow up validation due to the high error rates associated with various sequencing chemistries. Recently, whole exome sequencing has been proposed as an affordable option compared to whole genome runs but it still requires follow up validation of all the novel exomic variants. Customarily, a consensus approach is used to overcome the systematic errors inherent to the sequencing technology, alignment and post alignment variant detection algorithms. However, the aforementioned approach warrants the use of multiple sequencing chemistry, multiple alignment tools, multiple variant callers which may not be viable in terms of time and money for individual investigators with limited informatics know-how. Biologists often lack the requisite training to deal with the huge amount of data produced by NGS runs and face difficulty in choosing from the list of freely available analytical tools for NGS data analysis. Hence, there is a need to customise the NGS data analysis pipeline to preferentially retain true variants by minimising the incidence of false positives and make the choice of right analytical tools easier. To this end, we have sampled different freely available tools used at the alignment and post alignment stage suggesting the use of the most suitable combination determined by a simple framework of pre-existing metrics to create significant datasets.

  2. Automatic variance analysis of multistage care pathways.

    PubMed

    Li, Xiang; Liu, Haifeng; Zhang, Shilei; Mei, Jing; Xie, Guotong; Yu, Yiqin; Li, Jing; Lakshmanan, Geetika T

    2014-01-01

    A care pathway (CP) is a standardized process that consists of multiple care stages, clinical activities and their relations, aimed at ensuring and enhancing the quality of care. However, actual care may deviate from the planned CP, and analysis of these deviations can help clinicians refine the CP and reduce medical errors. In this paper, we propose a CP variance analysis method to automatically identify the deviations between actual patient traces in electronic medical records (EMR) and a multistage CP. As the care stage information is usually unavailable in EMR, we first align every trace with the CP using a hidden Markov model. From the aligned traces, we report three types of deviations for every care stage: additional activities, absent activities and violated constraints, which are identified by using the techniques of temporal logic and binomial tests. The method has been applied to a CP for the management of congestive heart failure and real world EMR, providing meaningful evidence for the further improvement of care quality.

  3. Method for alignment of microwires

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Beardslee, Joseph A.; Lewis, Nathan S.; Sadtler, Bryce

    2017-01-24

    A method of aligning microwires includes modifying the microwires so they are more responsive to a magnetic field. The method also includes using a magnetic field so as to magnetically align the microwires. The method can further include capturing the microwires in a solid support structure that retains the longitudinal alignment of the microwires when the magnetic field is not applied to the microwires.

  4. Global Network Alignment in the Context of Aging.

    PubMed

    Faisal, Fazle Elahi; Zhao, Han; Milenkovic, Tijana

    2015-01-01

    Analogous to sequence alignment, network alignment (NA) can be used to transfer biological knowledge across species between conserved network regions. NA faces two algorithmic challenges: 1) Which cost function to use to capture "similarities" between nodes in different networks? 2) Which alignment strategy to use to rapidly identify "high-scoring" alignments from all possible alignments? We "break down" existing state-of-the-art methods that use both different cost functions and different alignment strategies to evaluate each combination of their cost functions and alignment strategies. We find that a combination of the cost function of one method and the alignment strategy of another method beats the existing methods. Hence, we propose this combination as a novel superior NA method. Then, since human aging is hard to study experimentally due to long lifespan, we use NA to transfer aging-related knowledge from well annotated model species to poorly annotated human. By doing so, we produce novel human aging-related knowledge, which complements currently available knowledge about aging that has been obtained mainly by sequence alignment. We demonstrate significant similarity between topological and functional properties of our novel predictions and those of known aging-related genes. We are the first to use NA to learn more about aging.

  5. BEAUTY: an enhanced BLAST-based search tool that integrates multiple biological information resources into sequence similarity search results.

    PubMed

    Worley, K C; Wiese, B A; Smith, R F

    1995-09-01

    BEAUTY (BLAST enhanced alignment utility) is an enhanced version of the NCBI's BLAST data base search tool that facilitates identification of the functions of matched sequences. We have created new data bases of conserved regions and functional domains for protein sequences in NCBI's Entrez data base, and BEAUTY allows this information to be incorporated directly into BLAST search results. A Conserved Regions Data Base, containing the locations of conserved regions within Entrez protein sequences, was constructed by (1) clustering the entire data base into families, (2) aligning each family using our PIMA multiple sequence alignment program, and (3) scanning the multiple alignments to locate the conserved regions within each aligned sequence. A separate Annotated Domains Data Base was constructed by extracting the locations of all annotated domains and sites from sequences represented in the Entrez, PROSITE, BLOCKS, and PRINTS data bases. BEAUTY performs a BLAST search of those Entrez sequences with conserved regions and/or annotated domains. BEAUTY then uses the information from the Conserved Regions and Annotated Domains data bases to generate, for each matched sequence, a schematic display that allows one to directly compare the relative locations of (1) the conserved regions, (2) annotated domains and sites, and (3) the locally aligned regions matched in the BLAST search. In addition, BEAUTY search results include World-Wide Web hypertext links to a number of external data bases that provide a variety of additional types of information on the function of matched sequences. This convenient integration of protein families, conserved regions, annotated domains, alignment displays, and World-Wide Web resources greatly enhances the biological informativeness of sequence similarity searches. BEAUTY searches can be performed remotely on our system using the "BCM Search Launcher" World-Wide Web pages (URL is < http:/ /gc.bcm.tmc.edu:8088/ search-launcher/launcher.html > ).

  6. Laser beam alignment apparatus and method

    DOEpatents

    Gruhn, C.R.; Hammond, R.B.

    The disclosure related to an apparatus and method for laser beam alignment. Thermoelectric properties of a disc in a laser beam path are used to provide an indication of beam alignment and/or automatic laser alignment.

  7. Laser beam alignment apparatus and method

    DOEpatents

    Gruhn, Charles R.; Hammond, Robert B.

    1981-01-01

    The disclosure relates to an apparatus and method for laser beam alignment. Thermoelectric properties of a disc in a laser beam path are used to provide an indication of beam alignment and/or automatic laser alignment.

  8. Sequence alignment visualization in HTML5 without Java.

    PubMed

    Gille, Christoph; Birgit, Weyand; Gille, Andreas

    2014-01-01

    Java has been extensively used for the visualization of biological data in the web. However, the Java runtime environment is an additional layer of software with an own set of technical problems and security risks. HTML in its new version 5 provides features that for some tasks may render Java unnecessary. Alignment-To-HTML is the first HTML-based interactive visualization for annotated multiple sequence alignments. The server side script interpreter can perform all tasks like (i) sequence retrieval, (ii) alignment computation, (iii) rendering, (iv) identification of a homologous structural models and (v) communication with BioDAS-servers. The rendered alignment can be included in web pages and is displayed in all browsers on all platforms including touch screen tablets. The functionality of the user interface is similar to legacy Java applets and includes color schemes, highlighting of conserved and variable alignment positions, row reordering by drag and drop, interlinked 3D visualization and sequence groups. Novel features are (i) support for multiple overlapping residue annotations, such as chemical modifications, single nucleotide polymorphisms and mutations, (ii) mechanisms to quickly hide residue annotations, (iii) export to MS-Word and (iv) sequence icons. Alignment-To-HTML, the first interactive alignment visualization that runs in web browsers without additional software, confirms that to some extend HTML5 is already sufficient to display complex biological data. The low speed at which programs are executed in browsers is still the main obstacle. Nevertheless, we envision an increased use of HTML and JavaScript for interactive biological software. Under GPL at: http://www.bioinformatics.org/strap/toHTML/.

  9. Automatic prediction of protein domains from sequence information using a hybrid learning system.

    PubMed

    Nagarajan, Niranjan; Yona, Golan

    2004-06-12

    We describe a novel method for detecting the domain structure of a protein from sequence information alone. The method is based on analyzing multiple sequence alignments that are derived from a database search. Multiple measures are defined to quantify the domain information content of each position along the sequence and are combined into a single predictor using a neural network. The output is further smoothed and post-processed using a probabilistic model to predict the most likely transition positions between domains. The method was assessed using the domain definitions in SCOP and CATH for proteins of known structure and was compared with several other existing methods. Our method performs well both in terms of accuracy and sensitivity. It improves significantly over the best methods available, even some of the semi-manual ones, while being fully automatic. Our method can also be used to suggest and verify domain partitions based on structural data. A few examples of predicted domain definitions and alternative partitions, as suggested by our method, are also discussed. An online domain-prediction server is available at http://biozon.org/tools/domains/

  10. SW#db: GPU-Accelerated Exact Sequence Similarity Database Search.

    PubMed

    Korpar, Matija; Šošić, Martin; Blažeka, Dino; Šikić, Mile

    2015-01-01

    In recent years we have witnessed a growth in sequencing yield, the number of samples sequenced, and as a result-the growth of publicly maintained sequence databases. The increase of data present all around has put high requirements on protein similarity search algorithms with two ever-opposite goals: how to keep the running times acceptable while maintaining a high-enough level of sensitivity. The most time consuming step of similarity search are the local alignments between query and database sequences. This step is usually performed using exact local alignment algorithms such as Smith-Waterman. Due to its quadratic time complexity, alignments of a query to the whole database are usually too slow. Therefore, the majority of the protein similarity search methods prior to doing the exact local alignment apply heuristics to reduce the number of possible candidate sequences in the database. However, there is still a need for the alignment of a query sequence to a reduced database. In this paper we present the SW#db tool and a library for fast exact similarity search. Although its running times, as a standalone tool, are comparable to the running times of BLAST, it is primarily intended to be used for exact local alignment phase in which the database of sequences has already been reduced. It uses both GPU and CPU parallelization and was 4-5 times faster than SSEARCH, 6-25 times faster than CUDASW++ and more than 20 times faster than SSW at the time of writing, using multiple queries on Swiss-prot and Uniref90 databases.

  11. CCD Camera Lens Interface for Real-Time Theodolite Alignment

    NASA Technical Reports Server (NTRS)

    Wake, Shane; Scott, V. Stanley, III

    2012-01-01

    Theodolites are a common instrument in the testing, alignment, and building of various systems ranging from a single optical component to an entire instrument. They provide a precise way to measure horizontal and vertical angles. They can be used to align multiple objects in a desired way at specific angles. They can also be used to reference a specific location or orientation of an object that has moved. Some systems may require a small margin of error in position of components. A theodolite can assist with accurately measuring and/or minimizing that error. The technology is an adapter for a CCD camera with lens to attach to a Leica Wild T3000 Theodolite eyepiece that enables viewing on a connected monitor, and thus can be utilized with multiple theodolites simultaneously. This technology removes a substantial part of human error by relying on the CCD camera and monitors. It also allows image recording of the alignment, and therefore provides a quantitative means to measure such error.

  12. Fabrication of high temperature superconductors

    DOEpatents

    Balachandran, Uthamalingam; Dorris, Stephen E.; Ma, Beihai; Li, Meiya

    2003-06-17

    A method of forming a biaxially aligned superconductor on a non-biaxially aligned substrate substantially chemically inert to the biaxially aligned superconductor comprising is disclosed. A non-biaxially aligned substrate chemically inert to the superconductor is provided and a biaxially aligned superconductor material is deposited directly on the non-biaxially aligned substrate. A method forming a plume of superconductor material and contacting the plume and the non-biaxially aligned substrate at an angle greater than 0.degree. and less than 90.degree. to deposit a biaxially aligned superconductor on the non-biaxially aligned substrate is also disclosed. Various superconductors and substrates are illustrated.

  13. Modified alignment CGHs for aspheric surface test

    NASA Astrophysics Data System (ADS)

    Song, Jae-Bong; Yang, Ho-Soon; Rhee, Hyug-Gyo; Lee, Yun-Woo

    2009-08-01

    Computer Generated Holograms (CGH) for optical test are commonly consisted of one main pattern for testing aspheric surface and some alignment patterns for aligning the interferometer, CGH, and the test optics. To align the CGH plate and the test optics, we designed the alignment CGHs modified from the cat's eye alignment method, which are consisted of a couple of CGH patterns. The incident beam passed through the one part of the alignment CGH pattern is focused onto the one radius position of the test aspheric surface, and is reflected to the other part, and vice versa. This method has several merits compared to the conventional cat's eye alignment method. First, this method can be used in testing optics with a center hole, and the center part of CGH plate can be assigned to the alignment pattern. Second, the alignment pattern becomes a concentric circular arc pattern. The whole CGH patterns including the main pattern and alignment patterns are consisted of only concentric circular fringes. This concentric circular pattern can be easily made by the polar coordinated writer with circular scanning. The required diffraction angle becomes relatively small, so the 1st order diffraction beams instead of the 3rd order diffraction beam can be used as alignment beams, and the visibility can be improved. This alignment method also is more sensitive to the tilt and the lateral shift of the test aspheric surface. Using this alignment pattern, a 200 mm diameter F/0.5 aspheric mirror and a 600 mm diameter F/0.9 mirror were tested.

  14. Metabarcoding of marine nematodes – evaluation of reference datasets used in tree-based taxonomy assignment approach

    PubMed Central

    2016-01-01

    Abstract Background Metabarcoding is becoming a common tool used to assess and compare diversity of organisms in environmental samples. Identification of OTUs is one of the critical steps in the process and several taxonomy assignment methods were proposed to accomplish this task. This publication evaluates the quality of reference datasets, alongside with several alignment and phylogeny inference methods used in one of the taxonomy assignment methods, called tree-based approach. This approach assigns anonymous OTUs to taxonomic categories based on relative placements of OTUs and reference sequences on the cladogram and support that these placements receive. New information In tree-based taxonomy assignment approach, reliable identification of anonymous OTUs is based on their placement in monophyletic and highly supported clades together with identified reference taxa. Therefore, it requires high quality reference dataset to be used. Resolution of phylogenetic trees is strongly affected by the presence of erroneous sequences as well as alignment and phylogeny inference methods used in the process. Two preparation steps are essential for the successful application of tree-based taxonomy assignment approach. Curated collections of genetic information do include erroneous sequences. These sequences have detrimental effect on the resolution of cladograms used in tree-based approach. They must be identified and excluded from the reference dataset beforehand. Various combinations of multiple sequence alignment and phylogeny inference methods provide cladograms with different topology and bootstrap support. These combinations of methods need to be tested in order to determine the one that gives highest resolution for the particular reference dataset. Completing the above mentioned preparation steps is expected to decrease the number of unassigned OTUs and thus improve the results of the tree-based taxonomy assignment approach. PMID:27932919

  15. Metabarcoding of marine nematodes - evaluation of reference datasets used in tree-based taxonomy assignment approach.

    PubMed

    Holovachov, Oleksandr

    2016-01-01

    Metabarcoding is becoming a common tool used to assess and compare diversity of organisms in environmental samples. Identification of OTUs is one of the critical steps in the process and several taxonomy assignment methods were proposed to accomplish this task. This publication evaluates the quality of reference datasets, alongside with several alignment and phylogeny inference methods used in one of the taxonomy assignment methods, called tree-based approach. This approach assigns anonymous OTUs to taxonomic categories based on relative placements of OTUs and reference sequences on the cladogram and support that these placements receive. In tree-based taxonomy assignment approach, reliable identification of anonymous OTUs is based on their placement in monophyletic and highly supported clades together with identified reference taxa. Therefore, it requires high quality reference dataset to be used. Resolution of phylogenetic trees is strongly affected by the presence of erroneous sequences as well as alignment and phylogeny inference methods used in the process. Two preparation steps are essential for the successful application of tree-based taxonomy assignment approach. Curated collections of genetic information do include erroneous sequences. These sequences have detrimental effect on the resolution of cladograms used in tree-based approach. They must be identified and excluded from the reference dataset beforehand.Various combinations of multiple sequence alignment and phylogeny inference methods provide cladograms with different topology and bootstrap support. These combinations of methods need to be tested in order to determine the one that gives highest resolution for the particular reference dataset.Completing the above mentioned preparation steps is expected to decrease the number of unassigned OTUs and thus improve the results of the tree-based taxonomy assignment approach.

  16. ExScalibur: A High-Performance Cloud-Enabled Suite for Whole Exome Germline and Somatic Mutation Identification

    PubMed Central

    Huang, Lei; Kang, Wenjun; Bartom, Elizabeth; Onel, Kenan; Volchenboum, Samuel; Andrade, Jorge

    2015-01-01

    Whole exome sequencing has facilitated the discovery of causal genetic variants associated with human diseases at deep coverage and low cost. In particular, the detection of somatic mutations from tumor/normal pairs has provided insights into the cancer genome. Although there is an abundance of publicly-available software for the detection of germline and somatic variants, concordance is generally limited among variant callers and alignment algorithms. Successful integration of variants detected by multiple methods requires in-depth knowledge of the software, access to high-performance computing resources, and advanced programming techniques. We present ExScalibur, a set of fully automated, highly scalable and modulated pipelines for whole exome data analysis. The suite integrates multiple alignment and variant calling algorithms for the accurate detection of germline and somatic mutations with close to 99% sensitivity and specificity. ExScalibur implements streamlined execution of analytical modules, real-time monitoring of pipeline progress, robust handling of errors and intuitive documentation that allows for increased reproducibility and sharing of results and workflows. It runs on local computers, high-performance computing clusters and cloud environments. In addition, we provide a data analysis report utility to facilitate visualization of the results that offers interactive exploration of quality control files, read alignment and variant calls, assisting downstream customization of potential disease-causing mutations. ExScalibur is open-source and is also available as a public image on Amazon cloud. PMID:26271043

  17. Sequence analysis by iterated maps, a review.

    PubMed

    Almeida, Jonas S

    2014-05-01

    Among alignment-free methods, Iterated Maps (IMs) are on a particular extreme: they are also scale free (order free). The use of IMs for sequence analysis is also distinct from other alignment-free methodologies in being rooted in statistical mechanics instead of computational linguistics. Both of these roots go back over two decades to the use of fractal geometry in the characterization of phase-space representations. The time series analysis origin of the field is betrayed by the title of the manuscript that started this alignment-free subdomain in 1990, 'Chaos Game Representation'. The clash between the analysis of sequences as continuous series and the better established use of Markovian approaches to discrete series was almost immediate, with a defining critique published in same journal 2 years later. The rest of that decade would go by before the scale-free nature of the IM space was uncovered. The ensuing decade saw this scalability generalized for non-genomic alphabets as well as an interest in its use for graphic representation of biological sequences. Finally, in the past couple of years, in step with the emergence of BigData and MapReduce as a new computational paradigm, there is a surprising third act in the IM story. Multiple reports have described gains in computational efficiency of multiple orders of magnitude over more conventional sequence analysis methodologies. The stage appears to be now set for a recasting of IMs with a central role in processing nextgen sequencing results.

  18. Centroid stabilization in alignment of FOA corner cube: designing of a matched filter

    NASA Astrophysics Data System (ADS)

    Awwal, Abdul; Wilhelmsen, Karl; Roberts, Randy; Leach, Richard; Miller Kamm, Victoria; Ngo, Tony; Lowe-Webb, Roger

    2015-02-01

    The current automation of image-based alignment of NIF high energy laser beams is providing the capability of executing multiple target shots per day. An important aspect of performing multiple shots in a day is to reduce additional time spent aligning specific beams due to perturbations in those beam images. One such alignment is beam centration through the second and third harmonic generating crystals in the final optics assembly (FOA), which employs two retro-reflecting corner cubes to represent the beam center. The FOA houses the frequency conversion crystals for third harmonic generation as the beams enters the target chamber. Beam-to-beam variations and systematic beam changes over time in the FOA corner-cube images can lead to a reduction in accuracy as well as increased convergence durations for the template based centroid detector. This work presents a systematic approach of maintaining FOA corner cube centroid templates so that stable position estimation is applied thereby leading to fast convergence of alignment control loops. In the matched filtering approach, a template is designed based on most recent images taken in the last 60 days. The results show that new filter reduces the divergence of the position estimation of FOA images.

  19. GUIDANCE2: accurate detection of unreliable alignment regions accounting for the uncertainty of multiple parameters.

    PubMed

    Sela, Itamar; Ashkenazy, Haim; Katoh, Kazutaka; Pupko, Tal

    2015-07-01

    Inference of multiple sequence alignments (MSAs) is a critical part of phylogenetic and comparative genomics studies. However, from the same set of sequences different MSAs are often inferred, depending on the methodologies used and the assumed parameters. Much effort has recently been devoted to improving the ability to identify unreliable alignment regions. Detecting such unreliable regions was previously shown to be important for downstream analyses relying on MSAs, such as the detection of positive selection. Here we developed GUIDANCE2, a new integrative methodology that accounts for: (i) uncertainty in the process of indel formation, (ii) uncertainty in the assumed guide tree and (iii) co-optimal solutions in the pairwise alignments, used as building blocks in progressive alignment algorithms. We compared GUIDANCE2 with seven methodologies to detect unreliable MSA regions using extensive simulations and empirical benchmarks. We show that GUIDANCE2 outperforms all previously developed methodologies. Furthermore, GUIDANCE2 also provides a set of alternative MSAs which can be useful for downstream analyses. The novel algorithm is implemented as a web-server, available at: http://guidance.tau.ac.il. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  20. Acceleration of the Smith-Waterman algorithm using single and multiple graphics processors

    NASA Astrophysics Data System (ADS)

    Khajeh-Saeed, Ali; Poole, Stephen; Blair Perot, J.

    2010-06-01

    Finding regions of similarity between two very long data streams is a computationally intensive problem referred to as sequence alignment. Alignment algorithms must allow for imperfect sequence matching with different starting locations and some gaps and errors between the two data sequences. Perhaps the most well known application of sequence matching is the testing of DNA or protein sequences against genome databases. The Smith-Waterman algorithm is a method for precisely characterizing how well two sequences can be aligned and for determining the optimal alignment of those two sequences. Like many applications in computational science, the Smith-Waterman algorithm is constrained by the memory access speed and can be accelerated significantly by using graphics processors (GPUs) as the compute engine. In this work we show that effective use of the GPU requires a novel reformulation of the Smith-Waterman algorithm. The performance of this new version of the algorithm is demonstrated using the SSCA#1 (Bioinformatics) benchmark running on one GPU and on up to four GPUs executing in parallel. The results indicate that for large problems a single GPU is up to 45 times faster than a CPU for this application, and the parallel implementation shows linear speed up on up to 4 GPUs.

  1. Evaluation of Eight Methods for Aligning Orientation of Two Coordinate Systems.

    PubMed

    Mecheri, Hakim; Robert-Lachaine, Xavier; Larue, Christian; Plamondon, André

    2016-08-01

    The aim of this study was to evaluate eight methods for aligning the orientation of two different local coordinate systems. Alignment is very important when combining two different systems of motion analysis. Two of the methods were developed specifically for biomechanical studies, and because there have been at least three decades of algorithm development in robotics, it was decided to include six methods from this field. To compare these methods, an Xsens sensor and two Optotrak clusters were attached to a Plexiglas plate. The first optical marker cluster was fixed on the sensor and 20 trials were recorded. The error of alignment was calculated for each trial, and the mean, the standard deviation, and the maximum values of this error over all trials were reported. One-way repeated measures analysis of variance revealed that the alignment error differed significantly across the eight methods. Post-hoc tests showed that the alignment error from the methods based on angular velocities was significantly lower than for the other methods. The method using angular velocities performed the best, with an average error of 0.17 ± 0.08 deg. We therefore recommend this method, which is easy to perform and provides accurate alignment.

  2. Fiducial marker application method for position alignment of in situ multimodal X-ray experiments and reconstructions

    DOE PAGES

    Shade, Paul A.; Menasche, David B.; Bernier, Joel V.; ...

    2016-03-01

    An evolving suite of X-ray characterization methods are presently available to the materials community, providing a great opportunity to gain new insight into material behavior and provide critical validation data for materials models. Two critical and related issues are sample repositioning during anin situexperiment and registration of multiple data sets after the experiment. To address these issues, a method is described which utilizes a focused ion-beam scanning electron microscope equipped with a micromanipulator to apply gold fiducial markers to samples for X-ray measurements. The method is demonstrated with a synchrotron X-ray experiment involvingin situloading of a titanium alloy tensile specimen.

  3. Alignment-free inference of hierarchical and reticulate phylogenomic relationships.

    PubMed

    Bernard, Guillaume; Chan, Cheong Xin; Chan, Yao-Ban; Chua, Xin-Yi; Cong, Yingnan; Hogan, James M; Maetschke, Stefan R; Ragan, Mark A

    2017-06-30

    We are amidst an ongoing flood of sequence data arising from the application of high-throughput technologies, and a concomitant fundamental revision in our understanding of how genomes evolve individually and within the biosphere. Workflows for phylogenomic inference must accommodate data that are not only much larger than before, but often more error prone and perhaps misassembled, or not assembled in the first place. Moreover, genomes of microbes, viruses and plasmids evolve not only by tree-like descent with modification but also by incorporating stretches of exogenous DNA. Thus, next-generation phylogenomics must address computational scalability while rethinking the nature of orthogroups, the alignment of multiple sequences and the inference and comparison of trees. New phylogenomic workflows have begun to take shape based on so-called alignment-free (AF) approaches. Here, we review the conceptual foundations of AF phylogenetics for the hierarchical (vertical) and reticulate (lateral) components of genome evolution, focusing on methods based on k-mers. We reflect on what seems to be successful, and on where further development is needed. © The Author 2017. Published by Oxford University Press.

  4. PIV measurements of airflow past multiple cylinders

    NASA Astrophysics Data System (ADS)

    Wodziak, Waldemar; Sobczyk, Jacek

    2018-06-01

    Flow characteristics in vicinity of six circular cylinders aligned inline was investigated experimentally by means of PIV method. Experiments were conducted in a low speed closed circuit wind tunnel. Inflow velocity was 1.2 m/s which corresponds to Re=1600 based on the cylinder diameter. Spacing ratio between cylinders L/D was 1.5. Instantaneous and averaged velocity fields were presented. Experiments were designed in order to use their results as a test case for future numerical calculations.

  5. Formation of bulk refractive index structures

    DOEpatents

    Potter, Jr., Barrett George; Potter, Kelly Simmons; Wheeler, David R.; Jamison, Gregory M.

    2003-07-15

    A method of making a stacked three-dimensional refractive index structure in photosensitive materials using photo-patterning where first determined is the wavelength at which a photosensitive material film exhibits a change in refractive index upon exposure to optical radiation, a portion of the surfaces of the photosensitive material film is optically irradiated, the film is marked to produce a registry mark. Multiple films are produced and aligned using the registry marks to form a stacked three-dimensional refractive index structure.

  6. Underwater Multi-Vehicle Trajectory Alignment and Mapping Using Acoustic and Optical Constraints

    PubMed Central

    Campos, Ricard; Gracias, Nuno; Ridao, Pere

    2016-01-01

    Multi-robot formations are an important advance in recent robotic developments, as they allow a group of robots to merge their capacities and perform surveys in a more convenient way. With the aim of keeping the costs and acoustic communications to a minimum, cooperative navigation of multiple underwater vehicles is usually performed at the control level. In order to maintain the desired formation, individual robots just react to simple control directives extracted from range measurements or ultra-short baseline (USBL) systems. Thus, the robots are unaware of their global positioning, which presents a problem for the further processing of the collected data. The aim of this paper is two-fold. First, we present a global alignment method to correct the dead reckoning trajectories of multiple vehicles to resemble the paths followed during the mission using the acoustic messages passed between vehicles. Second, we focus on the optical mapping application of these types of formations and extend the optimization framework to allow for multi-vehicle geo-referenced optical 3D mapping using monocular cameras. The inclusion of optical constraints is not performed using the common bundle adjustment techniques, but in a form improving the computational efficiency of the resulting optimization problem and presenting a generic process to fuse optical reconstructions with navigation data. We show the performance of the proposed method on real datasets collected within the Morph EU-FP7 project. PMID:26999144

  7. Independent Metrics for Protein Backbone and Side-Chain Flexibility: Time Scales and Effects of Ligand Binding.

    PubMed

    Fuchs, Julian E; Waldner, Birgit J; Huber, Roland G; von Grafenstein, Susanne; Kramer, Christian; Liedl, Klaus R

    2015-03-10

    Conformational dynamics are central for understanding biomolecular structure and function, since biological macromolecules are inherently flexible at room temperature and in solution. Computational methods are nowadays capable of providing valuable information on the conformational ensembles of biomolecules. However, analysis tools and intuitive metrics that capture dynamic information from in silico generated structural ensembles are limited. In standard work-flows, flexibility in a conformational ensemble is represented through residue-wise root-mean-square fluctuations or B-factors following a global alignment. Consequently, these approaches relying on global alignments discard valuable information on local dynamics. Results inherently depend on global flexibility, residue size, and connectivity. In this study we present a novel approach for capturing positional fluctuations based on multiple local alignments instead of one single global alignment. The method captures local dynamics within a structural ensemble independent of residue type by splitting individual local and global degrees of freedom of protein backbone and side-chains. Dependence on residue type and size in the side-chains is removed via normalization with the B-factors of the isolated residue. As a test case, we demonstrate its application to a molecular dynamics simulation of bovine pancreatic trypsin inhibitor (BPTI) on the millisecond time scale. This allows for illustrating different time scales of backbone and side-chain flexibility. Additionally, we demonstrate the effects of ligand binding on side-chain flexibility of three serine proteases. We expect our new methodology for quantifying local flexibility to be helpful in unraveling local changes in biomolecular dynamics.

  8. Cloud-Coffee: implementation of a parallel consistency-based multiple alignment algorithm in the T-Coffee package and its benchmarking on the Amazon Elastic-Cloud.

    PubMed

    Di Tommaso, Paolo; Orobitg, Miquel; Guirado, Fernando; Cores, Fernado; Espinosa, Toni; Notredame, Cedric

    2010-08-01

    We present the first parallel implementation of the T-Coffee consistency-based multiple aligner. We benchmark it on the Amazon Elastic Cloud (EC2) and show that the parallelization procedure is reasonably effective. We also conclude that for a web server with moderate usage (10K hits/month) the cloud provides a cost-effective alternative to in-house deployment. T-Coffee is a freeware open source package available from http://www.tcoffee.org/homepage.html

  9. A Kalman Filter for SINS Self-Alignment Based on Vector Observation.

    PubMed

    Xu, Xiang; Xu, Xiaosu; Zhang, Tao; Li, Yao; Tong, Jinwu

    2017-01-29

    In this paper, a self-alignment method for strapdown inertial navigation systems based on the q -method is studied. In addition, an improved method based on integrating gravitational apparent motion to form apparent velocity is designed, which can reduce the random noises of the observation vectors. For further analysis, a novel self-alignment method using a Kalman filter based on adaptive filter technology is proposed, which transforms the self-alignment procedure into an attitude estimation using the observation vectors. In the proposed method, a linear psuedo-measurement equation is adopted by employing the transfer method between the quaternion and the observation vectors. Analysis and simulation indicate that the accuracy of the self-alignment is improved. Meanwhile, to improve the convergence rate of the proposed method, a new method based on parameter recognition and a reconstruction algorithm for apparent gravitation is devised, which can reduce the influence of the random noises of the observation vectors. Simulations and turntable tests are carried out, and the results indicate that the proposed method can acquire sound alignment results with lower standard variances, and can obtain higher alignment accuracy and a faster convergence rate.

  10. A Kalman Filter for SINS Self-Alignment Based on Vector Observation

    PubMed Central

    Xu, Xiang; Xu, Xiaosu; Zhang, Tao; Li, Yao; Tong, Jinwu

    2017-01-01

    In this paper, a self-alignment method for strapdown inertial navigation systems based on the q-method is studied. In addition, an improved method based on integrating gravitational apparent motion to form apparent velocity is designed, which can reduce the random noises of the observation vectors. For further analysis, a novel self-alignment method using a Kalman filter based on adaptive filter technology is proposed, which transforms the self-alignment procedure into an attitude estimation using the observation vectors. In the proposed method, a linear psuedo-measurement equation is adopted by employing the transfer method between the quaternion and the observation vectors. Analysis and simulation indicate that the accuracy of the self-alignment is improved. Meanwhile, to improve the convergence rate of the proposed method, a new method based on parameter recognition and a reconstruction algorithm for apparent gravitation is devised, which can reduce the influence of the random noises of the observation vectors. Simulations and turntable tests are carried out, and the results indicate that the proposed method can acquire sound alignment results with lower standard variances, and can obtain higher alignment accuracy and a faster convergence rate. PMID:28146059

  11. Automated Stitching of Microtubule Centerlines across Serial Electron Tomograms

    PubMed Central

    Weber, Britta; Tranfield, Erin M.; Höög, Johanna L.; Baum, Daniel; Antony, Claude; Hyman, Tony; Verbavatz, Jean-Marc; Prohaska, Steffen

    2014-01-01

    Tracing microtubule centerlines in serial section electron tomography requires microtubules to be stitched across sections, that is lines from different sections need to be aligned, endpoints need to be matched at section boundaries to establish a correspondence between neighboring sections, and corresponding lines need to be connected across multiple sections. We present computational methods for these tasks: 1) An initial alignment is computed using a distance compatibility graph. 2) A fine alignment is then computed with a probabilistic variant of the iterative closest points algorithm, which we extended to handle the orientation of lines by introducing a periodic random variable to the probabilistic formulation. 3) Endpoint correspondence is established by formulating a matching problem in terms of a Markov random field and computing the best matching with belief propagation. Belief propagation is not generally guaranteed to converge to a minimum. We show how convergence can be achieved, nonetheless, with minimal manual input. In addition to stitching microtubule centerlines, the correspondence is also applied to transform and merge the electron tomograms. We applied the proposed methods to samples from the mitotic spindle in C. elegans, the meiotic spindle in X. laevis, and sub-pellicular microtubule arrays in T. brucei. The methods were able to stitch microtubules across section boundaries in good agreement with experts' opinions for the spindle samples. Results, however, were not satisfactory for the microtubule arrays. For certain experiments, such as an analysis of the spindle, the proposed methods can replace manual expert tracing and thus enable the analysis of microtubules over long distances with reasonable manual effort. PMID:25438148

  12. Automated stitching of microtubule centerlines across serial electron tomograms.

    PubMed

    Weber, Britta; Tranfield, Erin M; Höög, Johanna L; Baum, Daniel; Antony, Claude; Hyman, Tony; Verbavatz, Jean-Marc; Prohaska, Steffen

    2014-01-01

    Tracing microtubule centerlines in serial section electron tomography requires microtubules to be stitched across sections, that is lines from different sections need to be aligned, endpoints need to be matched at section boundaries to establish a correspondence between neighboring sections, and corresponding lines need to be connected across multiple sections. We present computational methods for these tasks: 1) An initial alignment is computed using a distance compatibility graph. 2) A fine alignment is then computed with a probabilistic variant of the iterative closest points algorithm, which we extended to handle the orientation of lines by introducing a periodic random variable to the probabilistic formulation. 3) Endpoint correspondence is established by formulating a matching problem in terms of a Markov random field and computing the best matching with belief propagation. Belief propagation is not generally guaranteed to converge to a minimum. We show how convergence can be achieved, nonetheless, with minimal manual input. In addition to stitching microtubule centerlines, the correspondence is also applied to transform and merge the electron tomograms. We applied the proposed methods to samples from the mitotic spindle in C. elegans, the meiotic spindle in X. laevis, and sub-pellicular microtubule arrays in T. brucei. The methods were able to stitch microtubules across section boundaries in good agreement with experts' opinions for the spindle samples. Results, however, were not satisfactory for the microtubule arrays. For certain experiments, such as an analysis of the spindle, the proposed methods can replace manual expert tracing and thus enable the analysis of microtubules over long distances with reasonable manual effort.

  13. Improved, ACMG-Compliant, in silico prediction of pathogenicity for missense substitutions encoded by TP53 variants.

    PubMed

    Fortuno, Cristina; James, Paul A; Young, Erin L; Feng, Bing; Olivier, Magali; Pesaran, Tina; Tavtigian, Sean V; Spurdle, Amanda B

    2018-05-18

    Clinical interpretation of germline missense variants represents a major challenge, including those in the TP53 Li-Fraumeni syndrome gene. Bioinformatic prediction is a key part of variant classification strategies. We aimed to optimize the performance of the Align-GVGD tool used for p53 missense variant prediction, and compare its performance to other bioinformatic tools (SIFT, PolyPhen-2) and ensemble methods (REVEL, BayesDel). Reference sets of assumed pathogenic and assumed benign variants were defined using functional and/or clinical data. Area under the curve and Matthews correlation coefficient (MCC) values were used as objective functions to select an optimized protein multi-sequence alignment with best performance for Align-GVGD. MCC comparison of tools using binary categories showed optimized Align-GVGD (C15 cut-off) combined with BayesDel (0.16 cut-off), or with REVEL (0.5 cut-off), to have the best overall performance. Further, a semi-quantitative approach using multiple tiers of bioinformatic prediction, validated using an independent set of non-functional and functional variants, supported use of Align-GVGD and BayesDel prediction for different strength of evidence levels in ACMG/AMP rules. We provide rationale for bioinformatic tool selection for TP53 variant classification, and have also computed relevant bioinformatic predictions for every possible p53 missense variant to facilitate their use by the scientific and medical community. This article is protected by copyright. All rights reserved. This article is protected by copyright. All rights reserved.

  14. Multiple alignment-free sequence comparison

    PubMed Central

    Ren, Jie; Song, Kai; Sun, Fengzhu; Deng, Minghua; Reinert, Gesine

    2013-01-01

    Motivation: Recently, a range of new statistics have become available for the alignment-free comparison of two sequences based on k-tuple word content. Here, we extend these statistics to the simultaneous comparison of more than two sequences. Our suite of statistics contains, first, and , extensions of statistics for pairwise comparison of the joint k-tuple content of all the sequences, and second, , and , averages of sums of pairwise comparison statistics. The two tasks we consider are, first, to identify sequences that are similar to a set of target sequences, and, second, to measure the similarity within a set of sequences. Results: Our investigation uses both simulated data as well as cis-regulatory module data where the task is to identify cis-regulatory modules with similar transcription factor binding sites. We find that although for real data, all of our statistics show a similar performance, on simulated data the Shepp-type statistics are in some instances outperformed by star-type statistics. The multiple alignment-free statistics are more sensitive to contamination in the data than the pairwise average statistics. Availability: Our implementation of the five statistics is available as R package named ‘multiAlignFree’ at be http://www-rcf.usc.edu/∼fsun/Programs/multiAlignFree/multiAlignFreemain.html. Contact: reinert@stats.ox.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23990418

  15. Genetic Algorithm Phase Retrieval for the Systematic Image-Based Optical Alignment Testbed

    NASA Technical Reports Server (NTRS)

    Rakoczy, John; Steincamp, James; Taylor, Jaime

    2003-01-01

    A reduced surrogate, one point crossover genetic algorithm with random rank-based selection was used successfully to estimate the multiple phases of a segmented optical system modeled on the seven-mirror Systematic Image-Based Optical Alignment testbed located at NASA's Marshall Space Flight Center.

  16. Using structure to explore the sequence alignment space of remote homologs.

    PubMed

    Kuziemko, Andrew; Honig, Barry; Petrey, Donald

    2011-10-01

    Protein structure modeling by homology requires an accurate sequence alignment between the query protein and its structural template. However, sequence alignment methods based on dynamic programming (DP) are typically unable to generate accurate alignments for remote sequence homologs, thus limiting the applicability of modeling methods. A central problem is that the alignment that is "optimal" in terms of the DP score does not necessarily correspond to the alignment that produces the most accurate structural model. That is, the correct alignment based on structural superposition will generally have a lower score than the optimal alignment obtained from sequence. Variations of the DP algorithm have been developed that generate alternative alignments that are "suboptimal" in terms of the DP score, but these still encounter difficulties in detecting the correct structural alignment. We present here a new alternative sequence alignment method that relies heavily on the structure of the template. By initially aligning the query sequence to individual fragments in secondary structure elements and combining high-scoring fragments that pass basic tests for "modelability", we can generate accurate alignments within a small ensemble. Our results suggest that the set of sequences that can currently be modeled by homology can be greatly extended.

  17. Multiple main-belt asteroid mission options for a Mariner Mark II spacecraft

    NASA Astrophysics Data System (ADS)

    Sauer, Carl G., Jr.; Yen, Chen-Wan L.

    This paper presents the trajectory options available for a MMII spacecraft mission to asteroids and introduces systematic methods of uncovering attractive mission opportunities. The analysis presented considers multiple synchronous gravity assists of Mars and introduces a terminal resonant or phasing orbit; a concept useful for both increasing the number of asteroid rendezvous targets attainable during a launch opportunity, and also in increasing the number of potential asteroid flybys. Systematic examinations of the requirements for superior asteroidal alignments are made and a comprehensive set of asteroid rendezvous opportunities for the 1998 to 2010 period are presented. Examples of candidate missions involving one or more rendezvous and several flybys are also presented.

  18. Multiple main-belt asteroid mission options for a Mariner Mark II spacecraft

    NASA Technical Reports Server (NTRS)

    Sauer, Carl G., Jr.; Yen, Chen-Wan L.

    1990-01-01

    This paper presents the trajectory options available for a MMII spacecraft mission to asteroids and introduces systematic methods of uncovering attractive mission opportunities. The analysis presented considers multiple synchronous gravity assists of Mars and introduces a terminal resonant or phasing orbit; a concept useful for both increasing the number of asteroid rendezvous targets attainable during a launch opportunity, and also in increasing the number of potential asteroid flybys. Systematic examinations of the requirements for superior asteroidal alignments are made and a comprehensive set of asteroid rendezvous opportunities for the 1998 to 2010 period are presented. Examples of candidate missions involving one or more rendezvous and several flybys are also presented.

  19. PSO-based methods for medical image registration and change assessment of pigmented skin

    NASA Astrophysics Data System (ADS)

    Kacenjar, Steve; Zook, Matthew; Balint, Michael

    2011-03-01

    There are various scientific and technological areas in which it is imperative to rapidly detect and quantify changes in imagery over time. In fields such as earth remote sensing, aerospace systems, and medical imaging, searching for timedependent, regional changes across deformable topographies is complicated by varying camera acquisition geometries, lighting environments, background clutter conditions, and occlusion. Under these constantly-fluctuating conditions, the use of standard, rigid-body registration approaches often fail to provide sufficient fidelity to overlay image scenes together. This is problematic because incorrect assessments of the underlying changes of high-level topography can result in systematic errors in the quantification and classification of interested areas. For example, in the current naked-eye detection strategies of melanoma, a dermatologist often uses static morphological attributes to identify suspicious skin lesions for biopsy. This approach does not incorporate temporal changes which suggest malignant degeneration. By performing the co-registration of time-separated skin imagery, a dermatologist may more effectively detect and identify early morphological changes in pigmented lesions; enabling the physician to detect cancers at an earlier stage resulting in decreased morbidity and mortality. This paper describes an image processing system which will be used to detect changes in the characteristics of skin lesions over time. The proposed system consists of three main functional elements: 1.) coarse alignment of timesequenced imagery, 2.) refined alignment of local skin topographies, and 3.) assessment of local changes in lesion size. During the coarse alignment process, various approaches can be used to obtain a rough alignment, including: 1.) a manual landmark/intensity-based registration method1, and 2.) several flavors of autonomous optical matched filter methods2. These procedures result in the rough alignment of a patient's back topography. Since the skin is a deformable membrane, this process only provides an initial condition for subsequent refinements in aligning the localized topography of the skin. To achieve a refined enhancement, a Particle Swarm Optimizer (PSO) is used to optimally determine the local camera models associated with a generalized geometric transform. Here the optimization process is driven using the minimization of entropy between the multiple time-separated images. Once the camera models are corrected for local skin deformations, the images are compared using both pixel-based and regional-based methods. Limits on the detectability of change are established by the fidelity to which the algorithm corrects for local skin deformation and background alterations. These limits provide essential information in establishing early-warning thresholds for Melanoma detection. Key to this work is the development of a PSO alignment algorithm to perform the refined alignment in local skin topography between the time sequenced imagery (TSI). Test and validation of this alignment process is achieved using a forward model producing known geometric artifacts in the images and afterwards using a PSO algorithm to demonstrate the ability to identify and correct for these artifacts. Specifically, the forward model introduces local translational, rotational, and magnification changes within the image. These geometric modifiers are expected during TSI acquisition because of logistical issues to precisely align the patient to the image recording geometry and is therefore of paramount importance to any viable image registration system. This paper shows that the PSO alignment algorithm is effective in autonomously determining and mitigating these geometric modifiers. The degree of efficacy is measured by several statistically and morphologically based pre-image filtering operations applied to the TSI imagery before applying the PSO alignment algorithm. These trade studies show that global image threshold binarization provides rapid and superior convergence characteristics relative to that of morphologically based methods.

  20. ChromA: signal-based retention time alignment for chromatography–mass spectrometry data

    PubMed Central

    Hoffmann, Nils; Stoye, Jens

    2009-01-01

    Summary: We describe ChromA, a web-based alignment tool for chromatography–mass spectrometry data from the metabolomics and proteomics domains. Users can supply their data in open and standardized file formats for retention time alignment using dynamic time warping with different configurable local distance and similarity functions. Additionally, user-defined anchors can be used to constrain and speedup the alignment. A neighborhood around each anchor can be added to increase the flexibility of the constrained alignment. ChromA offers different visualizations of the alignment for easier qualitative interpretation and comparison of the data. For the multiple alignment of more than two data files, the center-star approximation is applied to select a reference among input files to align to. Availability: ChromA is available at http://bibiserv.techfak.uni-bielefeld.de/chroma. Executables and source code under the L-GPL v3 license are provided for download at the same location. Contact: stoye@techfak.uni-bielefeld.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:19505941

  1. Multispectra CWT-based algorithm (MCWT) in mass spectra for peak extraction.

    PubMed

    Hsueh, Huey-Miin; Kuo, Hsun-Chih; Tsai, Chen-An

    2008-01-01

    An important objective in mass spectrometry (MS) is to identify a set of biomarkers that can be used to potentially distinguish patients between distinct treatments (or conditions) from tens or hundreds of spectra. A common two-step approach involving peak extraction and quantification is employed to identify the features of scientific interest. The selected features are then used for further investigation to understand underlying biological mechanism of individual protein or for development of genomic biomarkers to early diagnosis. However, the use of inadequate or ineffective peak detection and peak alignment algorithms in peak extraction step may lead to a high rate of false positives. Also, it is crucial to reduce the false positive rate in detecting biomarkers from ten or hundreds of spectra. Here a new procedure is introduced for feature extraction in mass spectrometry data that extends the continuous wavelet transform-based (CWT-based) algorithm to multiple spectra. The proposed multispectra CWT-based algorithm (MCWT) not only can perform peak detection for multiple spectra but also carry out peak alignment at the same time. The author' MCWT algorithm constructs a reference, which integrates information of multiple raw spectra, for feature extraction. The algorithm is applied to a SELDI-TOF mass spectra data set provided by CAMDA 2006 with known polypeptide m/z positions. This new approach is easy to implement and it outperforms the existing peak extraction method from the Bioconductor PROcess package.

  2. Automatic frequency and phase alignment of in vivo J-difference-edited MR spectra by frequency domain correlation.

    PubMed

    Wiegers, Evita C; Philips, Bart W J; Heerschap, Arend; van der Graaf, Marinette

    2017-12-01

    J-difference editing is often used to select resonances of compounds with coupled spins in 1 H-MR spectra. Accurate phase and frequency alignment prior to subtracting J-difference-edited MR spectra is important to avoid artefactual contributions to the edited resonance. In-vivo J-difference-edited MR spectra were aligned by maximizing the normalized scalar product between two spectra (i.e., the correlation over a spectral region). The performance of our correlation method was compared with alignment by spectral registration and by alignment of the highest point in two spectra. The correlation method was tested at different SNR levels and for a broad range of phase and frequency shifts. In-vivo application of the proposed correlation method showed reduced subtraction errors and increased fit reliability in difference spectra as compared with conventional peak alignment. The correlation method and the spectral registration method generally performed equally well. However, better alignment using the correlation method was obtained for spectra with a low SNR (down to ~2) and for relatively large frequency shifts. Our correlation method for simultaneously phase and frequency alignment is able to correct both small and large phase and frequency drifts and also performs well at low SNR levels.

  3. A pseudo-penalized quasi-likelihood approach to the spatial misalignment problem with non-normal data.

    PubMed

    Lopiano, Kenneth K; Young, Linda J; Gotway, Carol A

    2014-09-01

    Spatially referenced datasets arising from multiple sources are routinely combined to assess relationships among various outcomes and covariates. The geographical units associated with the data, such as the geographical coordinates or areal-level administrative units, are often spatially misaligned, that is, observed at different locations or aggregated over different geographical units. As a result, the covariate is often predicted at the locations where the response is observed. The method used to align disparate datasets must be accounted for when subsequently modeling the aligned data. Here we consider the case where kriging is used to align datasets in point-to-point and point-to-areal misalignment problems when the response variable is non-normally distributed. If the relationship is modeled using generalized linear models, the additional uncertainty induced from using the kriging mean as a covariate introduces a Berkson error structure. In this article, we develop a pseudo-penalized quasi-likelihood algorithm to account for the additional uncertainty when estimating regression parameters and associated measures of uncertainty. The method is applied to a point-to-point example assessing the relationship between low-birth weights and PM2.5 levels after the onset of the largest wildfire in Florida history, the Bugaboo scrub fire. A point-to-areal misalignment problem is presented where the relationship between asthma events in Florida's counties and PM2.5 levels after the onset of the fire is assessed. Finally, the method is evaluated using a simulation study. Our results indicate the method performs well in terms of coverage for 95% confidence intervals and naive methods that ignore the additional uncertainty tend to underestimate the variability associated with parameter estimates. The underestimation is most profound in Poisson regression models. © 2014, The International Biometric Society.

  4. Self-Alignment MEMS IMU Method Based on the Rotation Modulation Technique on a Swing Base

    PubMed Central

    Chen, Zhiyong; Yang, Haotian; Wang, Chengbin; Lin, Zhihui; Guo, Meifeng

    2018-01-01

    The micro-electro-mechanical-system (MEMS) inertial measurement unit (IMU) has been widely used in the field of inertial navigation due to its small size, low cost, and light weight, but aligning MEMS IMUs remains a challenge for researchers. MEMS IMUs have been conventionally aligned on a static base, requiring other sensors, such as magnetometers or satellites, to provide auxiliary information, which limits its application range to some extent. Therefore, improving the alignment accuracy of MEMS IMU as much as possible under swing conditions is of considerable value. This paper proposes an alignment method based on the rotation modulation technique (RMT), which is completely self-aligned, unlike the existing alignment techniques. The effect of the inertial sensor errors is mitigated by rotating the IMU. Then, inertial frame-based alignment using the rotation modulation technique (RMT-IFBA) achieved coarse alignment on the swing base. The strong tracking filter (STF) further improved the alignment accuracy. The performance of the proposed method was validated with a physical experiment, and the results of the alignment showed that the standard deviations of pitch, roll, and heading angle were 0.0140°, 0.0097°, and 0.91°, respectively, which verified the practicality and efficacy of the proposed method for the self-alignment of the MEMS IMU on a swing base. PMID:29649150

  5. PrimerDesign-M: A multiple-alignment based multiple-primer design tool for walking across variable genomes

    DOE PAGES

    Yoon, Hyejin; Leitner, Thomas

    2014-12-17

    Analyses of entire viral genomes or mtDNA requires comprehensive design of many primers across their genomes. In addition, simultaneous optimization of several DNA primer design criteria may improve overall experimental efficiency and downstream bioinformatic processing. To achieve these goals, we developed PrimerDesign-M. It includes several options for multiple-primer design, allowing researchers to efficiently design walking primers that cover long DNA targets, such as entire HIV-1 genomes, and that optimizes primers simultaneously informed by genetic diversity in multiple alignments and experimental design constraints given by the user. PrimerDesign-M can also design primers that include DNA barcodes and minimize primer dimerization. PrimerDesign-Mmore » finds optimal primers for highly variable DNA targets and facilitates design flexibility by suggesting alternative designs to adapt to experimental conditions.« less

  6. Accurate prediction of protein–protein interactions from sequence alignments using a Bayesian method

    PubMed Central

    Burger, Lukas; van Nimwegen, Erik

    2008-01-01

    Accurate and large-scale prediction of protein–protein interactions directly from amino-acid sequences is one of the great challenges in computational biology. Here we present a new Bayesian network method that predicts interaction partners using only multiple alignments of amino-acid sequences of interacting protein domains, without tunable parameters, and without the need for any training examples. We first apply the method to bacterial two-component systems and comprehensively reconstruct two-component signaling networks across all sequenced bacteria. Comparisons of our predictions with known interactions show that our method infers interaction partners genome-wide with high accuracy. To demonstrate the general applicability of our method we show that it also accurately predicts interaction partners in a recent dataset of polyketide synthases. Analysis of the predicted genome-wide two-component signaling networks shows that cognates (interacting kinase/regulator pairs, which lie adjacent on the genome) and orphans (which lie isolated) form two relatively independent components of the signaling network in each genome. In addition, while most genes are predicted to have only a small number of interaction partners, we find that 10% of orphans form a separate class of ‘hub' nodes that distribute and integrate signals to and from up to tens of different interaction partners. PMID:18277381

  7. Overhead drilling: Comparing three bases for aligning a drilling jig to vertical

    PubMed Central

    Rempel, David; Star, Demetra; Barr, Alan; Janowitz, Ira

    2010-01-01

    Problem Drilling overhead into concrete or metal ceilings is a strenuous task done by construction workers to hang ductwork, piping, and electrical equipment. The task is associated with upper body pain and musculoskeletal disorders. Previously, we described a field usability evaluation of a foot lever and inverted drill press intervention devices that were compared to the usual method for overhead drilling. Both interventions were rated as inferior to the usual method based on poor setup time and mobility. Method Three new interventions, which differed on the design used for aligning the drilling column to vertical, were compared to the usual method for overhead drilling by commercial construction workers (n=16). Results The usual method was associated with the highest levels of regional body fatigue and the poorest usability ratings when compared to the three interventions. Conclusion Overall, the ‘Collar Base’ intervention design received the best usability ratings. Impact on Industry Intervention designs developed for overhead drilling may reduce shoulder fatigue and prevent subsequent musculoskeletal disorders. These designs may also be useful for other overhead work such as lifting and supporting materials (e.g., piping, ducts) that are installed near the ceiling. Workplace health and safety interventions may require multiple rounds of field-testing prior to achieving acceptable usability ratings by the end users. PMID:20630276

  8. A Method for the Alignment of Heterogeneous Macromolecules from Electron Microscopy

    PubMed Central

    Shatsky, Maxim; Hall, Richard J.; Brenner, Steven E.; Glaeser, Robert M.

    2009-01-01

    We propose a feature-based image alignment method for single-particle electron microscopy that is able to accommodate various similarity scoring functions while efficiently sampling the two-dimensional transformational space. We use this image alignment method to evaluate the performance of a scoring function that is based on the Mutual Information (MI) of two images rather than one that is based on the cross-correlation function. We show that alignment using MI for the scoring function has far less model-dependent bias than is found with cross-correlation based alignment. We also demonstrate that MI improves the alignment of some types of heterogeneous data, provided that the signal to noise ratio is relatively high. These results indicate, therefore, that use of MI as the scoring function is well suited for the alignment of class-averages computed from single particle images. Our method is tested on data from three model structures and one real dataset. PMID:19166941

  9. Automated interferometric alignment system for paraboloidal mirrors

    DOEpatents

    Maxey, L.C.

    1993-09-28

    A method is described for a systematic method of interpreting interference fringes obtained by using a corner cube retroreflector as an alignment aid when aligning a paraboloid to a spherical wavefront. This is applicable to any general case where such alignment is required, but is specifically applicable in the case of aligning an autocollimating test using a diverging beam wavefront. In addition, the method provides information which can be systematically interpreted such that independent information about pitch, yaw and focus errors can be obtained. Thus, the system lends itself readily to automation. Finally, although the method is developed specifically for paraboloids, it can be seen to be applicable to a variety of other aspheric optics when applied in combination with a wavefront corrector that produces a wavefront which, when reflected from the correctly aligned aspheric surface will produce a collimated wavefront like that obtained from the paraboloid when it is correctly aligned to a spherical wavefront. 14 figures.

  10. An alignment method for mammographic X-ray spectroscopy under clinical conditions.

    PubMed

    Miyajima, S; Imagawa, K; Matsumoto, M

    2002-09-01

    This paper describes an alignment method for mammographic X-ray spectroscopy under clinical conditions. A pinhole, a fluorescent screen, a laser device and the case for a detector are used for alignment of the focal spot, a collimator and a detector. The method determines the line between the focal spot and the point of interest in an X-ray field radiographically. The method allows alignment for both central axis and off-axis directions.

  11. Dynamics of submicron aerosol droplets in a robust optical trap formed by multiple Bessel beams

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Thanopulos, Ioannis; Theoretical and Physical Chemistry Institute, National Hellenic Research Foundation, Athens 11635; Luckhaus, David

    In this paper, we model the three-dimensional escape dynamics of single submicron-sized aerosol droplets in optical multiple Bessel beam traps. Trapping in counter-propagating Bessel beams (CPBBs) is compared with a newly proposed quadruple Bessel beam (QBB) trap, which consists of two perpendicularly arranged CPBB traps. Calculations are performed for perfectly and imperfectly aligned traps. Mie-theory and finite-difference time-domain methods are used to calculate the optical forces. The droplet escape kinetics are obtained from the solution of the Langevin equation using a Verlet algorithm. Provided the traps are perfectly aligned, the calculations indicate very long lifetimes for droplets trapped either inmore » the CPBB or in the QBB trap. However, minor misalignments that are hard to control experimentally already severely diminish the stability of the CPBB trap. By contrast, such minor misalignments hardly affect the extended droplet lifetimes in a QBB trap. The QBB trap is found to be a stable, robust optical trap, which should enable the experimental investigation of submicron droplets with radii down to 100 nm. Optical binding between two droplets and its potential role in preventing coagulation when loading a CPBB trap is briefly addressed.« less

  12. Augmenting the spectral efficiency of enhanced PAM-DMT-based optical wireless communications.

    PubMed

    Islim, Mohamed Sufyan; Haas, Harald

    2016-05-30

    The energy efficiency of pulse-amplitude-modulated discrete multitone modulation (PAM-DMT) decreases as the modulation order of M-PAM modulation increases. Enhanced PAM-DMT (ePAM-DMT) was proposed as a solution to the reduced energy efficiency of PAM-DMT. This was achieved by allowing multiple streams of PAM-DMT to be superimposed and successively demodulated at the receiver side. In order to maintain a distortion-free unipolar ePAM-DMT system, the multiple time-domain PAM-DMT streams are required to be aligned. However, aligning the antisymmetry in ePAM-DMT is complex and results in efficiency losses. In this paper, a novel simplified method to apply the superposition modulation on M-PAM modulated discrete multitone (DMT) is introduced. Contrary to ePAM-DMT, the signal generation of the proposed system, termed augmented spectral efficiency discrete multitone (ASE-DMT), occurs in the frequency domain. This results in an improved spectral and energy efficiency. The analytical bit error rate (BER) performance bound of the proposed system is derived and compared with Monte-Carlo simulations. The system performance is shown to offer significant electrical and optical energy savings compared with ePAM-DMT and DC-biased optical orthogonal frequency division multiplexing (DCO-OFDM).

  13. Are the gyro-ages of field stars underestimated?

    NASA Astrophysics Data System (ADS)

    Kovács, Géza

    2015-09-01

    By using the current photometric rotational data on eight galactic open clusters, we show that the evolutionary stellar model (isochrone) ages of these clusters are tightly correlated with the period shifts applied to the (B - V)0-Prot ridges that optimally align these ridges to the one defined by Praesepe and the Hyades. On the other hand, when the traditional Skumanich-type multiplicative transformation is used, the ridges become far less aligned due to the age-dependent slope change introduced by the period multiplication. Therefore, we employ our simple additive gyro-age calibration on various datasets of Galactic field stars to test its applicability. We show that, in the overall sense, the gyro-ages are systematically greater than the isochrone ages. The difference could exceed several giga years, depending on the stellar parameters. Although the age overlap between the open clusters used in the calibration and the field star samples is only partial, the systematic difference indicates the limitation of the currently available gyro-age methods and suggests that the rotation of field stars slows down with a considerably lower speed than we would expect from the simple extrapolation of the stellar rotation rates in open clusters.

  14. Optimal network alignment with graphlet degree vectors.

    PubMed

    Milenković, Tijana; Ng, Weng Leong; Hayes, Wayne; Przulj, Natasa

    2010-06-30

    Important biological information is encoded in the topology of biological networks. Comparative analyses of biological networks are proving to be valuable, as they can lead to transfer of knowledge between species and give deeper insights into biological function, disease, and evolution. We introduce a new method that uses the Hungarian algorithm to produce optimal global alignment between two networks using any cost function. We design a cost function based solely on network topology and use it in our network alignment. Our method can be applied to any two networks, not just biological ones, since it is based only on network topology. We use our new method to align protein-protein interaction networks of two eukaryotic species and demonstrate that our alignment exposes large and topologically complex regions of network similarity. At the same time, our alignment is biologically valid, since many of the aligned protein pairs perform the same biological function. From the alignment, we predict function of yet unannotated proteins, many of which we validate in the literature. Also, we apply our method to find topological similarities between metabolic networks of different species and build phylogenetic trees based on our network alignment score. The phylogenetic trees obtained in this way bear a striking resemblance to the ones obtained by sequence alignments. Our method detects topologically similar regions in large networks that are statistically significant. It does this independent of protein sequence or any other information external to network topology.

  15. Automated and Adaptable Quantification of Cellular Alignment from Microscopic Images for Tissue Engineering Applications

    PubMed Central

    Xu, Feng; Beyazoglu, Turker; Hefner, Evan; Gurkan, Umut Atakan

    2011-01-01

    Cellular alignment plays a critical role in functional, physical, and biological characteristics of many tissue types, such as muscle, tendon, nerve, and cornea. Current efforts toward regeneration of these tissues include replicating the cellular microenvironment by developing biomaterials that facilitate cellular alignment. To assess the functional effectiveness of the engineered microenvironments, one essential criterion is quantification of cellular alignment. Therefore, there is a need for rapid, accurate, and adaptable methodologies to quantify cellular alignment for tissue engineering applications. To address this need, we developed an automated method, binarization-based extraction of alignment score (BEAS), to determine cell orientation distribution in a wide variety of microscopic images. This method combines a sequenced application of median and band-pass filters, locally adaptive thresholding approaches and image processing techniques. Cellular alignment score is obtained by applying a robust scoring algorithm to the orientation distribution. We validated the BEAS method by comparing the results with the existing approaches reported in literature (i.e., manual, radial fast Fourier transform-radial sum, and gradient based approaches). Validation results indicated that the BEAS method resulted in statistically comparable alignment scores with the manual method (coefficient of determination R2=0.92). Therefore, the BEAS method introduced in this study could enable accurate, convenient, and adaptable evaluation of engineered tissue constructs and biomaterials in terms of cellular alignment and organization. PMID:21370940

  16. PF2fit: Polar Fast Fourier Matched Alignment of Atomistic Structures with 3D Electron Microscopy Maps.

    PubMed

    Bettadapura, Radhakrishna; Rasheed, Muhibur; Vollrath, Antje; Bajaj, Chandrajit

    2015-10-01

    There continue to be increasing occurrences of both atomistic structure models in the PDB (possibly reconstructed from X-ray diffraction or NMR data), and 3D reconstructed cryo-electron microscopy (3D EM) maps (albeit at coarser resolution) of the same or homologous molecule or molecular assembly, deposited in the EMDB. To obtain the best possible structural model of the molecule at the best achievable resolution, and without any missing gaps, one typically aligns (match and fits) the atomistic structure model with the 3D EM map. We discuss a new algorithm and generalized framework, named PF(2) fit (Polar Fast Fourier Fitting) for the best possible structural alignment of atomistic structures with 3D EM. While PF(2) fit enables only a rigid, six dimensional (6D) alignment method, it augments prior work on 6D X-ray structure and 3D EM alignment in multiple ways: Scoring. PF(2) fit includes a new scoring scheme that, in addition to rewarding overlaps between the volumes occupied by the atomistic structure and 3D EM map, rewards overlaps between the volumes complementary to them. We quantitatively demonstrate how this new complementary scoring scheme improves upon existing approaches. PF(2) fit also includes two scoring functions, the non-uniform exterior penalty and the skeleton-secondary structure score, and implements the scattering potential score as an alternative to traditional Gaussian blurring. Search. PF(2) fit utilizes a fast polar Fourier search scheme, whose main advantage is the ability to search over uniformly and adaptively sampled subsets of the space of rigid-body motions. PF(2) fit also implements a new reranking search and scoring methodology that considerably improves alignment metrics in results obtained from the initial search.

  17. Capabilities and challenges in transferring the wavefront-based alignment approach to small aperture multi-element optical systems

    NASA Astrophysics Data System (ADS)

    Krappig, Reik; Schmitt, Robert

    2017-02-01

    Present alignment methods already have an accuracy of some microns, allowing in general the fairly precise assembly of multi element optical systems. Nevertheless, they suffer decisive drawbacks, such as the necessity of an iterative process, stepping through all optical surfaces of the system when using autocollimation telescopes. In contrast to these limitations, the wavefront based alignment offers an elegant approach to potentially reach sub-µm accuracy in the alignment within a highly efficient process, that simultaneously acquires and evaluates the best optical solution possible. However, the practical use of these capabilities in corresponding alignment devices needs to take real sensor behavior into account. This publication will especially elaborate on the influence of the sensor properties in relation to the alignment process. The first dominant requirement is a highly stable measurement, since tiny perturbations in the optical system will have an also tiny influence on the wavefront. Secondly, the lateral sampling of the measured wavefront is supposed to be as high as possible, in order to be able to extract higher order Zernike coefficients reliable. The resulting necessity of using the largest sensor area possible conflicts with the requirement to allow a certain lateral displacement of the measured spot, indicating a perturbation. A movement of the sensor with suitable stages in turn leads to additional uncertainties connected to the actuators. Further factors include the SNR-ratio of the sensor as well as multiple measurements, in order to improve data repeatability. This publication will present a procedure of dealing with these relevant influence factors. Depending on the optical system and its properties the optimal adjustment of these parameters is derived.

  18. PF2 fit: Polar Fast Fourier Matched Alignment of Atomistic Structures with 3D Electron Microscopy Maps

    PubMed Central

    Bettadapura, Radhakrishna; Rasheed, Muhibur; Vollrath, Antje; Bajaj, Chandrajit

    2015-01-01

    There continue to be increasing occurrences of both atomistic structure models in the PDB (possibly reconstructed from X-ray diffraction or NMR data), and 3D reconstructed cryo-electron microscopy (3D EM) maps (albeit at coarser resolution) of the same or homologous molecule or molecular assembly, deposited in the EMDB. To obtain the best possible structural model of the molecule at the best achievable resolution, and without any missing gaps, one typically aligns (match and fits) the atomistic structure model with the 3D EM map. We discuss a new algorithm and generalized framework, named PF2 fit (Polar Fast Fourier Fitting) for the best possible structural alignment of atomistic structures with 3D EM. While PF2 fit enables only a rigid, six dimensional (6D) alignment method, it augments prior work on 6D X-ray structure and 3D EM alignment in multiple ways: Scoring. PF2 fit includes a new scoring scheme that, in addition to rewarding overlaps between the volumes occupied by the atomistic structure and 3D EM map, rewards overlaps between the volumes complementary to them. We quantitatively demonstrate how this new complementary scoring scheme improves upon existing approaches. PF2 fit also includes two scoring functions, the non-uniform exterior penalty and the skeleton-secondary structure score, and implements the scattering potential score as an alternative to traditional Gaussian blurring. Search. PF2 fit utilizes a fast polar Fourier search scheme, whose main advantage is the ability to search over uniformly and adaptively sampled subsets of the space of rigid-body motions. PF2 fit also implements a new reranking search and scoring methodology that considerably improves alignment metrics in results obtained from the initial search. PMID:26469938

  19. Super-resolution with an SLM and two intensity images

    NASA Astrophysics Data System (ADS)

    Alcalá Ochoa, Noé; de León, Y. Ponce

    2018-06-01

    It is reported a method which may simplify the optical setups used to achieve super-resolution through the amplitude multiplication of two waves. For this end we decompose a super-resolving pupil into two complex masks and with the aid of a Spatial Light Modulator (LCoS) we obtain two intensity images that are subtracted. With this proposal, the traditional experimental optical setups are considerably simplified, with the additional benefit that different masks can be utilized without needing to perform the setup alignment each time.

  20. Fibre Optic Connections And Method For Using Same

    DOEpatents

    Chan, Benson; Cohen, Mitchell S.; Fortier, Paul F.; Freitag, Ladd W.; Hall, Richard R.; Johnson, Glen W.; Lin, How Tzu; Sherman, John H.

    2004-03-30

    A package is described that couples a twelve channel wide fiber optic cable to a twelve channel Vertical Cavity Surface Emitting Laser (VCSEL) transmitter and a multiple channel Perpendicularly Aligned Integrated Die (PAID) receiver. The package allows for reduction in the height of the assembly package by vertically orienting certain dies parallel to the fiber optic cable and horizontally orienting certain other dies. The assembly allows the vertically oriented optoelectronic dies to be perpendicularly attached to the horizontally oriented laminate via a flexible circuit.

  1. Reconstructing evolutionary trees in parallel for massive sequences.

    PubMed

    Zou, Quan; Wan, Shixiang; Zeng, Xiangxiang; Ma, Zhanshan Sam

    2017-12-14

    Building the evolutionary trees for massive unaligned DNA sequences is challenging and crucial. However, reconstructing evolutionary tree for ultra-large sequences is hard. Massive multiple sequence alignment is also challenging and time/space consuming. Hadoop and Spark are developed recently, which bring spring light for the classical computational biology problems. In this paper, we tried to solve the multiple sequence alignment and evolutionary reconstruction in parallel. HPTree, which is developed in this paper, can deal with big DNA sequence files quickly. It works well on the >1GB files, and gets better performance than other evolutionary reconstruction tools. Users could use HPTree for reonstructing evolutioanry trees on the computer clusters or cloud platform (eg. Amazon Cloud). HPTree could help on population evolution research and metagenomics analysis. In this paper, we employ the Hadoop and Spark platform and design an evolutionary tree reconstruction software tool for unaligned massive DNA sequences. Clustering and multiple sequence alignment are done in parallel. Neighbour-joining model was employed for the evolutionary tree building. We opened our software together with source codes via http://lab.malab.cn/soft/HPtree/ .

  2. Parental alignments and rejection: an empirical study of alienation in children of divorce.

    PubMed

    Johnston, Janet R

    2003-01-01

    This study of family relationships after divorce examined the frequency and extent of child-parent alignments and correlates of children's rejection of a parent, these being basic components of the controversial idea of "parental alienation syndrome." The sample consisted of 215 children from the family courts and general community two to three years after parental separation. The findings indicate that children's attitudes toward their parents range from positive to negative, with relatively few being extremely aligned or rejecting. Rejection of a parent has multiple determinants, with both the aligned and rejected parents contributing to the problem, in addition to vulnerabilities within children themselves.

  3. Increased alignment sensitivity improves the usage of genome alignments for comparative gene annotation.

    PubMed

    Sharma, Virag; Hiller, Michael

    2017-08-21

    Genome alignments provide a powerful basis to transfer gene annotations from a well-annotated reference genome to many other aligned genomes. The completeness of these annotations crucially depends on the sensitivity of the underlying genome alignment. Here, we investigated the impact of the genome alignment parameters and found that parameters with a higher sensitivity allow the detection of thousands of novel alignments between orthologous exons that have been missed before. In particular, comparisons between species separated by an evolutionary distance of >0.75 substitutions per neutral site, like human and other non-placental vertebrates, benefit from increased sensitivity. To systematically test if increased sensitivity improves comparative gene annotations, we built a multiple alignment of 144 vertebrate genomes and used this alignment to map human genes to the other 143 vertebrates with CESAR. We found that higher alignment sensitivity substantially improves the completeness of comparative gene annotations by adding on average 2382 and 7440 novel exons and 117 and 317 novel genes for mammalian and non-mammalian species, respectively. Our results suggest a more sensitive alignment strategy that should generally be used for genome alignments between distantly-related species. Our 144-vertebrate genome alignment and the comparative gene annotations (https://bds.mpi-cbg.de/hillerlab/144VertebrateAlignment_CESAR/) are a valuable resource for comparative genomics. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  4. The Caterpillar Game: A SW-PBIS Aligned Classroom Management System

    ERIC Educational Resources Information Center

    Floress, Margaret T.; Jacoby, Amber L.

    2017-01-01

    The Caterpillar Game is a classroom management system that is aligned with School-wide Positive Behavioral Interventions and Supports standards. A single-case, multiple-baseline design was used to evaluate the effects of the Caterpillar Game on disruptive student behavior and teacher praise. Three classrooms were included in the study (preschool,…

  5. Instructional Alignment as a Measure of Teaching Quality

    ERIC Educational Resources Information Center

    Polikoff, Morgan S.; Porter, Andrew C.

    2014-01-01

    Recent years have seen the convergence of two major policy streams in U.S. K-12 education: standards/accountability and teacher quality reforms. Work in these areas has led to the creation of multiple measures of teacher quality, including measures of their instructional alignment to standards/assessments, observational and student survey measures…

  6. SEAN: SNP prediction and display program utilizing EST sequence clusters.

    PubMed

    Huntley, Derek; Baldo, Angela; Johri, Saurabh; Sergot, Marek

    2006-02-15

    SEAN is an application that predicts single nucleotide polymorphisms (SNPs) using multiple sequence alignments produced from expressed sequence tag (EST) clusters. The algorithm uses rules of sequence identity and SNP abundance to determine the quality of the prediction. A Java viewer is provided to display the EST alignments and predicted SNPs.

  7. Sequence Diversity Diagram for comparative analysis of multiple sequence alignments.

    PubMed

    Sakai, Ryo; Aerts, Jan

    2014-01-01

    The sequence logo is a graphical representation of a set of aligned sequences, commonly used to depict conservation of amino acid or nucleotide sequences. Although it effectively communicates the amount of information present at every position, this visual representation falls short when the domain task is to compare between two or more sets of aligned sequences. We present a new visual presentation called a Sequence Diversity Diagram and validate our design choices with a case study. Our software was developed using the open-source program called Processing. It loads multiple sequence alignment FASTA files and a configuration file, which can be modified as needed to change the visualization. The redesigned figure improves on the visual comparison of two or more sets, and it additionally encodes information on sequential position conservation. In our case study of the adenylate kinase lid domain, the Sequence Diversity Diagram reveals unexpected patterns and new insights, for example the identification of subgroups within the protein subfamily. Our future work will integrate this visual encoding into interactive visualization tools to support higher level data exploration tasks.

  8. Multi-resolution Shape Analysis via Non-Euclidean Wavelets: Applications to Mesh Segmentation and Surface Alignment Problems.

    PubMed

    Kim, Won Hwa; Chung, Moo K; Singh, Vikas

    2013-01-01

    The analysis of 3-D shape meshes is a fundamental problem in computer vision, graphics, and medical imaging. Frequently, the needs of the application require that our analysis take a multi-resolution view of the shape's local and global topology, and that the solution is consistent across multiple scales. Unfortunately, the preferred mathematical construct which offers this behavior in classical image/signal processing, Wavelets, is no longer applicable in this general setting (data with non-uniform topology). In particular, the traditional definition does not allow writing out an expansion for graphs that do not correspond to the uniformly sampled lattice (e.g., images). In this paper, we adapt recent results in harmonic analysis, to derive Non-Euclidean Wavelets based algorithms for a range of shape analysis problems in vision and medical imaging. We show how descriptors derived from the dual domain representation offer native multi-resolution behavior for characterizing local/global topology around vertices. With only minor modifications, the framework yields a method for extracting interest/key points from shapes, a surprisingly simple algorithm for 3-D shape segmentation (competitive with state of the art), and a method for surface alignment (without landmarks). We give an extensive set of comparison results on a large shape segmentation benchmark and derive a uniqueness theorem for the surface alignment problem.

  9. Sparse alignment for robust tensor learning.

    PubMed

    Lai, Zhihui; Wong, Wai Keung; Xu, Yong; Zhao, Cairong; Sun, Mingming

    2014-10-01

    Multilinear/tensor extensions of manifold learning based algorithms have been widely used in computer vision and pattern recognition. This paper first provides a systematic analysis of the multilinear extensions for the most popular methods by using alignment techniques, thereby obtaining a general tensor alignment framework. From this framework, it is easy to show that the manifold learning based tensor learning methods are intrinsically different from the alignment techniques. Based on the alignment framework, a robust tensor learning method called sparse tensor alignment (STA) is then proposed for unsupervised tensor feature extraction. Different from the existing tensor learning methods, L1- and L2-norms are introduced to enhance the robustness in the alignment step of the STA. The advantage of the proposed technique is that the difficulty in selecting the size of the local neighborhood can be avoided in the manifold learning based tensor feature extraction algorithms. Although STA is an unsupervised learning method, the sparsity encodes the discriminative information in the alignment step and provides the robustness of STA. Extensive experiments on the well-known image databases as well as action and hand gesture databases by encoding object images as tensors demonstrate that the proposed STA algorithm gives the most competitive performance when compared with the tensor-based unsupervised learning methods.

  10. A parallel approach of COFFEE objective function to multiple sequence alignment

    NASA Astrophysics Data System (ADS)

    Zafalon, G. F. D.; Visotaky, J. M. V.; Amorim, A. R.; Valêncio, C. R.; Neves, L. A.; de Souza, R. C. G.; Machado, J. M.

    2015-09-01

    The computational tools to assist genomic analyzes show even more necessary due to fast increasing of data amount available. With high computational costs of deterministic algorithms for sequence alignments, many works concentrate their efforts in the development of heuristic approaches to multiple sequence alignments. However, the selection of an approach, which offers solutions with good biological significance and feasible execution time, is a great challenge. Thus, this work aims to show the parallelization of the processing steps of MSA-GA tool using multithread paradigm in the execution of COFFEE objective function. The standard objective function implemented in the tool is the Weighted Sum of Pairs (WSP), which produces some distortions in the final alignments when sequences sets with low similarity are aligned. Then, in studies previously performed we implemented the COFFEE objective function in the tool to smooth these distortions. Although the nature of COFFEE objective function implies in the increasing of execution time, this approach presents points, which can be executed in parallel. With the improvements implemented in this work, we can verify the execution time of new approach is 24% faster than the sequential approach with COFFEE. Moreover, the COFFEE multithreaded approach is more efficient than WSP, because besides it is slightly fast, its biological results are better.

  11. Multilayer Microfluidic Devices Created From A Single Photomask

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kelly, Ryan T.; Sheen, Allison M.; Jambovane, Sachin R.

    2013-08-28

    The time and expense associated with high quality photomask production can discourage the creation of multilayer microfluidic devices, as each layer currently requires a separate photomask. Here we describe an approach in which multilayer microfabricated devices can be created from a single photomask. The separate layers and their corresponding alignment marks are arranged in separate halves of the mask for two layer devices or quadrants for four layer devices. Selective exposure of the photomask features and rotation of the device substrate between exposures result in multiple copies of the devices on each wafer. Subsequent layers are aligned to patterned featuresmore » on the substrate with the same alignment accuracy as when multiple photomasks are used. We demonstrate this approach for fabricating devices employing multilayer soft lithography (MSL) for pneumatic valving. MSL devices containing as many as 5 layers (4 aligned fluidic layers plus a manually aligned control layer) were successfully created using this approach. Device design is also modularized, enabling the presence or absence of features as well as channel heights to be selected independently from one another. The use of a single photomask to create multilayer devices results in a dramatic savings of time and/or money required to advance from device design to completed prototype.« less

  12. DAMBE7: New and Improved Tools for Data Analysis in Molecular Biology and Evolution.

    PubMed

    Xia, Xuhua

    2018-06-01

    DAMBE is a comprehensive software package for genomic and phylogenetic data analysis on Windows, Linux, and Macintosh computers. New functions include imputing missing distances and phylogeny simultaneously (paving the way to build large phage and transposon trees), new bootstrapping/jackknifing methods for PhyPA (phylogenetics from pairwise alignments), and an improved function for fast and accurate estimation of the shape parameter of the gamma distribution for fitting rate heterogeneity over sites. Previous method corrects multiple hits for each site independently. DAMBE's new method uses all sites simultaneously for correction. DAMBE, featuring a user-friendly graphic interface, is freely available from http://dambe.bio.uottawa.ca (last accessed, April 17, 2018).

  13. A comparison of radiographic anatomic axis knee alignment measurements and cross-sectional associations with knee osteoarthritis

    PubMed Central

    Goulston, L.M.; Sanchez-Santos, M.T.; D'Angelo, S.; Leyland, K.M.; Hart, D.J.; Spector, T.D.; Cooper, C.; Dennison, E.M.; Hunter, D.; Arden, N.K.

    2016-01-01

    Summary Objective Malalignment is associated with knee osteoarthritis (KOA), however, the optimal anatomic axis (AA) knee alignment measurement on a standard limb radiograph (SLR) is unknown. This study compares one-point (1P) and two-point (2P) AA methods using three knee joint centre locations and examines cross-sectional associations with symptomatic radiographic knee osteoarthritis (SRKOA), radiographic knee osteoarthritis (RKOA) and knee pain. Methods AA alignment was measured six different ways using the KneeMorf software on 1058 SLRs from 584 women in the Chingford Study. Cross-sectional associations with principal outcome SRKOA combined with greatest reproducibility determined the optimal 1P and 2P AA method. Appropriate varus/neutral/valgus alignment categories were established using logistic regression with generalised estimating equation models fitted with restricted cubic spline function. Results The tibial plateau centre displayed greatest reproducibility and associations with SRKOA. As mean 1P and 2P values differed by >2°, new alignment categories were generated for 1P: varus <178°, neutral 178–182°, valgus >182° and for 2P methods: varus <180°, neutral 180–185°, valgus >185°. Varus vs neutral alignment was associated with a near 2-fold increase in SRKOA and RKOA, and valgus vs neutral for RKOA using 2P method. Nonsignificant associations were seen for 1P method for SRKOA, RKOA and knee pain. Conclusions AA alignment was associated with SRKOA and the tibial plateau centre had the strongest association. Differences in AA alignment when 1P vs 2P methods were compared indicated bespoke alignment categories were necessary. Further replication and validation with mechanical axis alignment comparison is required. PMID:26700504

  14. HubAlign: an accurate and efficient method for global alignment of protein-protein interaction networks.

    PubMed

    Hashemifar, Somaye; Xu, Jinbo

    2014-09-01

    High-throughput experimental techniques have produced a large amount of protein-protein interaction (PPI) data. The study of PPI networks, such as comparative analysis, shall benefit the understanding of life process and diseases at the molecular level. One way of comparative analysis is to align PPI networks to identify conserved or species-specific subnetwork motifs. A few methods have been developed for global PPI network alignment, but it still remains challenging in terms of both accuracy and efficiency. This paper presents a novel global network alignment algorithm, denoted as HubAlign, that makes use of both network topology and sequence homology information, based upon the observation that topologically important proteins in a PPI network usually are much more conserved and thus, more likely to be aligned. HubAlign uses a minimum-degree heuristic algorithm to estimate the topological and functional importance of a protein from the global network topology information. Then HubAlign aligns topologically important proteins first and gradually extends the alignment to the whole network. Extensive tests indicate that HubAlign greatly outperforms several popular methods in terms of both accuracy and efficiency, especially in detecting functionally similar proteins. HubAlign is available freely for non-commercial purposes at http://ttic.uchicago.edu/∼hashemifar/software/HubAlign.zip. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.

  15. StralSV: assessment of sequence variability within similar 3D structures and application to polio RNA-dependent RNA polymerase.

    PubMed

    Zemla, Adam T; Lang, Dorothy M; Kostova, Tanya; Andino, Raul; Ecale Zhou, Carol L

    2011-06-02

    Most of the currently used methods for protein function prediction rely on sequence-based comparisons between a query protein and those for which a functional annotation is provided. A serious limitation of sequence similarity-based approaches for identifying residue conservation among proteins is the low confidence in assigning residue-residue correspondences among proteins when the level of sequence identity between the compared proteins is poor. Multiple sequence alignment methods are more satisfactory--still, they cannot provide reliable results at low levels of sequence identity. Our goal in the current work was to develop an algorithm that could help overcome these difficulties by facilitating the identification of structurally (and possibly functionally) relevant residue-residue correspondences between compared protein structures. Here we present StralSV (structure-alignment sequence variability), a new algorithm for detecting closely related structure fragments and quantifying residue frequency from tight local structure alignments. We apply StralSV in a study of the RNA-dependent RNA polymerase of poliovirus, and we demonstrate that the algorithm can be used to determine regions of the protein that are relatively unique, or that share structural similarity with proteins that would be considered distantly related. By quantifying residue frequencies among many residue-residue pairs extracted from local structural alignments, one can infer potential structural or functional importance of specific residues that are determined to be highly conserved or that deviate from a consensus. We further demonstrate that considerable detailed structural and phylogenetic information can be derived from StralSV analyses. StralSV is a new structure-based algorithm for identifying and aligning structure fragments that have similarity to a reference protein. StralSV analysis can be used to quantify residue-residue correspondences and identify residues that may be of particular structural or functional importance, as well as unusual or unexpected residues at a given sequence position. StralSV is provided as a web service at http://proteinmodel.org/AS2TS/STRALSV/.

  16. Using Variable-Length Aligned Fragment Pairs and an Improved Transition Function for Flexible Protein Structure Alignment.

    PubMed

    Cao, Hu; Lu, Yonggang

    2017-01-01

    With the rapid growth of known protein 3D structures in number, how to efficiently compare protein structures becomes an essential and challenging problem in computational structural biology. At present, many protein structure alignment methods have been developed. Among all these methods, flexible structure alignment methods are shown to be superior to rigid structure alignment methods in identifying structure similarities between proteins, which have gone through conformational changes. It is also found that the methods based on aligned fragment pairs (AFPs) have a special advantage over other approaches in balancing global structure similarities and local structure similarities. Accordingly, we propose a new flexible protein structure alignment method based on variable-length AFPs. Compared with other methods, the proposed method possesses three main advantages. First, it is based on variable-length AFPs. The length of each AFP is separately determined to maximally represent a local similar structure fragment, which reduces the number of AFPs. Second, it uses local coordinate systems, which simplify the computation at each step of the expansion of AFPs during the AFP identification. Third, it decreases the number of twists by rewarding the situation where nonconsecutive AFPs share the same transformation in the alignment, which is realized by dynamic programming with an improved transition function. The experimental data show that compared with FlexProt, FATCAT, and FlexSnap, the proposed method can achieve comparable results by introducing fewer twists. Meanwhile, it can generate results similar to those of the FATCAT method in much less running time due to the reduced number of AFPs.

  17. A new method and device of aligning patient setup lasers in radiation therapy.

    PubMed

    Hwang, Ui-Jung; Jo, Kwanghyun; Lim, Young Kyung; Kwak, Jung Won; Choi, Sang Hyuon; Jeong, Chiyoung; Kim, Mi Young; Jeong, Jong Hwi; Shin, Dongho; Lee, Se Byeong; Park, Jeong-Hoon; Park, Sung Yong; Kim, Siyong

    2016-01-08

    The aim of this study is to develop a new method to align the patient setup lasers in a radiation therapy treatment room and examine its validity and efficiency. The new laser alignment method is realized by a device composed of both a metallic base plate and a few acrylic transparent plates. Except one, every plate has either a crosshair line (CHL) or a single vertical line that is used for alignment. Two holders for radiochromic film insertion are prepared in the device to find a radiation isocenter. The right laser positions can be found optically by matching the shadows of all the CHLs in the gantry head and the device. The reproducibility, accuracy, and efficiency of laser alignment and the dependency on the position error of the light source were evaluated by comparing the means and the standard deviations of the measured laser positions. After the optical alignment of the lasers, the radiation isocenter was found by the gantry and collimator star shots, and then the lasers were translated parallel to the isocenter. In the laser position reproducibility test, the mean and standard deviation on the wall of treatment room were 32.3 ± 0.93 mm for the new method whereas they were 33.4 ± 1.49 mm for the conventional method. The mean alignment accuracy was 1.4 mm for the new method, and 2.1 mm for the conventional method on the walls. In the test of the dependency on the light source position error, the mean laser position was shifted just by a similar amount of the shift of the light source in the new method, but it was greatly magnified in the conventional method. In this study, a new laser alignment method was devised and evaluated successfully. The new method provided more accurate, more reproducible, and faster alignment of the lasers than the conventional method.

  18. Electrospun fibrinogen-PLA nanofibres for vascular tissue engineering.

    PubMed

    Gugutkov, D; Gustavsson, J; Cantini, M; Salmeron-Sánchez, M; Altankov, G

    2017-10-01

    Here we report on the development of a new type of hybrid fibrinogen-polylactic acid (FBG-PLA) nanofibres (NFs) with improved stiffness, combining the good mechanical properties of PLA with the excellent cell recognition properties of native FBG. We were particularly interested in the dorsal and ventral cell response to the nanofibres' organization (random or aligned), using human umbilical endothelial cells (HUVECs) as a model system. Upon ventral contact with random NFs, the cells developed a stellate-like morphology with multiple projections. The well-developed focal adhesion complexes suggested a successful cellular interaction. However, time-lapse analysis shows significantly lowered cell movements, resulting in the cells traversing a relatively short distance in multiple directions. Conversely, an elongated cell shape and significantly increased cell mobility were observed in aligned NFs. To follow the dorsal cell response, artificial wounds were created on confluent cell layers previously grown on glass slides and covered with either random or aligned NFs. Time-lapse analysis showed significantly faster wound coverage (within 12 h) of HUVECs on aligned samples vs. almost absent directional migration on random ones. However, nitric oxide (NO) release shows that endothelial cells possess lowered functionality on aligned NFs compared to random ones, where significantly higher NO production was found. Collectively, our studies show that randomly organized NFs could support the endothelization of implants while aligned NFs would rather direct cell locomotion for guided neovascularization. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.

  19. A Novel Adaptive H∞ Filtering Method with Delay Compensation for the Transfer Alignment of Strapdown Inertial Navigation Systems.

    PubMed

    Lyu, Weiwei; Cheng, Xianghong

    2017-11-28

    Transfer alignment is always a key technology in a strapdown inertial navigation system (SINS) because of its rapidity and accuracy. In this paper a transfer alignment model is established, which contains the SINS error model and the measurement model. The time delay in the process of transfer alignment is analyzed, and an H∞ filtering method with delay compensation is presented. Then the H∞ filtering theory and the robust mechanism of H∞ filter are deduced and analyzed in detail. In order to improve the transfer alignment accuracy in SINS with time delay, an adaptive H∞ filtering method with delay compensation is proposed. Since the robustness factor plays an important role in the filtering process and has effect on the filtering accuracy, the adaptive H∞ filter with delay compensation can adjust the value of robustness factor adaptively according to the dynamic external environment. The vehicle transfer alignment experiment indicates that by using the adaptive H∞ filtering method with delay compensation, the transfer alignment accuracy and the pure inertial navigation accuracy can be dramatically improved, which demonstrates the superiority of the proposed filtering method.

  20. Depth image super-resolution via semi self-taught learning framework

    NASA Astrophysics Data System (ADS)

    Zhao, Furong; Cao, Zhiguo; Xiao, Yang; Zhang, Xiaodi; Xian, Ke; Li, Ruibo

    2017-06-01

    Depth images have recently attracted much attention in computer vision and high-quality 3D content for 3DTV and 3D movies. In this paper, we present a new semi self-taught learning application framework for enhancing resolution of depth maps without making use of ancillary color images data at the target resolution, or multiple aligned depth maps. Our framework consists of cascade random forests reaching from coarse to fine results. We learn the surface information and structure transformations both from a small high-quality depth exemplars and the input depth map itself across different scales. Considering that edge plays an important role in depth map quality, we optimize an effective regularized objective that calculates on output image space and input edge space in random forests. Experiments show the effectiveness and superiority of our method against other techniques with or without applying aligned RGB information

  1. Apparatus and method for variable angle slant hole collimator

    DOEpatents

    Lee, Seung Joon; Kross, Brian J.; McKisson, John E.

    2017-07-18

    A variable angle slant hole (VASH) collimator for providing collimation of high energy photons such as gamma rays during radiological imaging of humans. The VASH collimator includes a stack of multiple collimator leaves and a means of quickly aligning each leaf to provide various projection angles. Rather than rotate the detector around the subject, the VASH collimator enables the detector to remain stationary while the projection angle of the collimator is varied for tomographic acquisition. High collimator efficiency is achieved by maintaining the leaves in accurate alignment through the various projection angles. Individual leaves include unique angled cuts to maintain a precise target collimation angle. Matching wedge blocks driven by two actuators with twin-lead screws accurately position each leaf in the stack resulting in the precise target collimation angle. A computer interface with the actuators enables precise control of the projection angle of the collimator.

  2. Robust image registration for multiple exposure high dynamic range image synthesis

    NASA Astrophysics Data System (ADS)

    Yao, Susu

    2011-03-01

    Image registration is an important preprocessing technique in high dynamic range (HDR) image synthesis. This paper proposed a robust image registration method for aligning a group of low dynamic range images (LDR) that are captured with different exposure times. Illumination change and photometric distortion between two images would result in inaccurate registration. We propose to transform intensity image data into phase congruency to eliminate the effect of the changes in image brightness and use phase cross correlation in the Fourier transform domain to perform image registration. Considering the presence of non-overlapped regions due to photometric distortion, evolutionary programming is applied to search for the accurate translation parameters so that the accuracy of registration is able to be achieved at a hundredth of a pixel level. The proposed algorithm works well for under and over-exposed image registration. It has been applied to align LDR images for synthesizing high quality HDR images..

  3. High efficiency x-ray nanofocusing by the blazed stacking of binary zone plates

    NASA Astrophysics Data System (ADS)

    Mohacsi, I.; Karvinen, P.; Vartiainen, I.; Diaz, A.; Somogyi, A.; Kewish, C. M.; Mercere, P.; David, C.

    2013-09-01

    The focusing efficiency of binary Fresnel zone plate lenses is fundamentally limited and higher efficiency requires a multi step lens profile. To overcome the manufacturing problems of high resolution and high efficiency multistep zone plates, we investigate the concept of stacking two different binary zone plates in each other's optical near-field. We use a coarse zone plate with π phase shift and a double density fine zone plate with π/2 phase shift to produce an effective 4- step profile. Using a compact experimental setup with piezo actuators for alignment, we demonstrated 47.1% focusing efficiency at 6.5 keV using a pair of 500 μm diameter and 200 nm smallest zone width. Furthermore, we present a spatially resolved characterization method using multiple diffraction orders to identify manufacturing errors, alignment errors and pattern distortions and their effect on diffraction efficiency.

  4. 'Compromise position' image alignment to accommodate independent motion of multiple clinical target volumes during radiotherapy: A high risk prostate cancer example.

    PubMed

    Rosewall, Tara; Yan, Jing; Alasti, Hamideh; Cerase, Carla; Bayley, Andrew

    2017-04-01

    Inclusion of multiple independently moving clinical target volumes (CTVs) in the irradiated volume causes an image guidance conundrum. The purpose of this research was to use high risk prostate cancer as a clinical example to evaluate a 'compromise' image alignment strategy. The daily pre-treatment orthogonal EPI for 14 consecutive patients were included in this analysis. Image matching was performed by aligning to the prostate only, the bony pelvis only and using the 'compromise' strategy. Residual CTV surrogate displacements were quantified for each of the alignment strategies. Analysis of the 388 daily fractions indicated surrogate displacements were well-correlated in all directions (r 2  = 0.95 (LR), 0.67 (AP) and 0.59 (SI). Differences between the surrogates displacements (95% range) were -0.4 to 1.8 mm (LR), -1.2 to 5.2 mm (SI) and -1.2 to 5.2 mm (AP). The distribution of the residual displacements was significantly smaller using the 'compromise' strategy, compared to the other strategies (p 0.005). The 'compromise' strategy ensured the CTV was encompassed by the PTV in all fractions, compared to 47 PTV violations when aligned to prostate only. This study demonstrated the feasibility of a compromise position image guidance strategy to accommodate simultaneous displacements of two independently moving CTVs. Application of this strategy was facilitated by correlation between the CTV displacements and resulted in no geometric excursions of the CTVs beyond standard sized PTVs. This simple image guidance strategy may also be applicable to other disease sites that concurrently irradiate multiple CTVs, such as head and neck, lung and cervix cancer. © 2016 The Royal Australian and New Zealand College of Radiologists.

  5. Fabrication Process for Large Size Mold and Alignment Method for Nanoimprint System

    NASA Astrophysics Data System (ADS)

    Ishibashi, Kentaro; Kokubo, Mitsunori; Goto, Hiroshi; Mizuno, Jun; Shoji, Shuichi

    Nanoimprint technology is considered one of the mass production methods of the display for cellular phone or notebook computer, with Anti-Reflection Structures (ARS) pattern and so on. In this case, the large size mold with nanometer order pattern is very important. Then, we describe the fabrication process for large size mold, and the alignment method for UV nanoimprint system. We developed the original mold fabrication process using nanoimprint method and etching techniques. In 66 × 45 mm2 area, 200nm period seamless patterns were formed using this process. And, we constructed original alignment system that consists of the CCD-camera system, X-Y-θ table, method of moiré fringe, and image processing system, because the accuracy of pattern connection depends on the alignment method. This alignment system accuracy was within 20nm.

  6. Automatic alignment method for calibration of hydrometers

    NASA Astrophysics Data System (ADS)

    Lee, Y. J.; Chang, K. H.; Chon, J. C.; Oh, C. Y.

    2004-04-01

    This paper presents a new method to automatically align specific scale-marks for the calibration of hydrometers. A hydrometer calibration system adopting the new method consists of a vision system, a stepping motor, and software to control the system. The vision system is composed of a CCD camera and a frame grabber, and is used to acquire images. The stepping motor moves the camera, which is attached to the vessel containing a reference liquid, along the hydrometer. The operating program has two main functions: to process images from the camera to find the position of the horizontal plane and to control the stepping motor for the alignment of the horizontal plane with a particular scale-mark. Any system adopting this automatic alignment method is a convenient and precise means of calibrating a hydrometer. The performance of the proposed method is illustrated by comparing the calibration results using the automatic alignment method with those obtained using the manual method.

  7. Feature-based three-dimensional registration for repetitive geometry in machine vision

    PubMed Central

    Gong, Yuanzheng; Seibel, Eric J.

    2016-01-01

    As an important step in three-dimensional (3D) machine vision, 3D registration is a process of aligning two or multiple 3D point clouds that are collected from different perspectives together into a complete one. The most popular approach to register point clouds is to minimize the difference between these point clouds iteratively by Iterative Closest Point (ICP) algorithm. However, ICP does not work well for repetitive geometries. To solve this problem, a feature-based 3D registration algorithm is proposed to align the point clouds that are generated by vision-based 3D reconstruction. By utilizing texture information of the object and the robustness of image features, 3D correspondences can be retrieved so that the 3D registration of two point clouds is to solve a rigid transformation. The comparison of our method and different ICP algorithms demonstrates that our proposed algorithm is more accurate, efficient and robust for repetitive geometry registration. Moreover, this method can also be used to solve high depth uncertainty problem caused by little camera baseline in vision-based 3D reconstruction. PMID:28286703

  8. Sensor assembly method using silicon interposer with trenches for three-dimensional binocular range sensors

    NASA Astrophysics Data System (ADS)

    Nakajima, Kazuhiro; Yamamoto, Yuji; Arima, Yutaka

    2018-04-01

    To easily assemble a three-dimensional binocular range sensor, we devised an alignment method for two image sensors using a silicon interposer with trenches. The trenches were formed using deep reactive ion etching (RIE) equipment. We produced a three-dimensional (3D) range sensor using the method and experimentally confirmed that sufficient alignment accuracy was realized. It was confirmed that the alignment accuracy of the two image sensors when using the proposed method is more than twice that of the alignment assembly method on a conventional board. In addition, as a result of evaluating the deterioration of the detection performance caused by the alignment accuracy, it was confirmed that the vertical deviation between the corresponding pixels in the two image sensors is substantially proportional to the decrease in detection performance. Therefore, we confirmed that the proposed method can realize more than twice the detection performance of the conventional method. Through these evaluations, the effectiveness of the 3D binocular range sensor aligned by the silicon interposer with the trenches was confirmed.

  9. Algorithm for Aligning an Array of Receiving Radio Antennas

    NASA Technical Reports Server (NTRS)

    Rogstad, David

    2006-01-01

    A digital-signal-processing algorithm (somewhat arbitrarily) called SUMPLE has been devised as a means of aligning the outputs of multiple receiving radio antennas in a large array for the purpose of receiving a weak signal transmitted by a single distant source. As used here, aligning signifies adjusting the delays and phases of the outputs from the various antennas so that their relatively weak replicas of the desired signal can be added coherently to increase the signal-to-noise ratio (SNR) for improved reception, as though one had a single larger antenna. The method was devised to enhance spacecraft-tracking and telemetry operations in NASA's Deep Space Network (DSN); the method could also be useful in such other applications as both satellite and terrestrial radio communications and radio astronomy. Heretofore, most commonly, alignment has been effected by a process that involves correlation of signals in pairs. This approach necessitates the use of a large amount of hardware most notably, the N(N - 1)/2 correlators needed to process signals from all possible pairs of N antennas. Moreover, because the incoming signals typically have low SNRs, the delay and phase adjustments are poorly determined from the pairwise correlations. SUMPLE also involves correlations, but the correlations are not performed in pairs. Instead, in a partly iterative process, each signal is appropriately weighted and then correlated with a composite signal equal to the sum of the other signals (see Figure 1). One benefit of this approach is that only N correlators are needed; in an array of N much greater than 1 antennas, this results in a significant reduction of the amount of hardware. Another benefit is that once the array achieves coherence, the correlation SNR is N - 1 times that of a pair of antennas.

  10. Relationships between static foot alignment and dynamic plantar loads in runners with acute and chronic stages of plantar fasciitis: a cross-sectional study

    PubMed Central

    Ribeiro, Ana P.; Sacco, Isabel C. N.; Dinato, Roberto C.; João, Silvia M. A.

    2016-01-01

    BACKGROUND: The risk factors for the development of plantar fasciitis (PF) have been associated with the medial longitudinal arch (MLA), rearfoot alignment and calcaneal overload. However, the relationships between the biomechanical variables have yet to be determined. OBJECTIVE: The goal of this study was to investigate the relationships between the MLA, rearfoot alignment, and dynamic plantar loads in runners with unilateral PF in acute and chronic phases. METHOD: Cross-sectional study which thirty-five runners with unilateral PF were evaluated: 20 in the acute phase (with pain) and 15 with previous chronic PF (without pain). The MLA index and rearfoot alignment were calculated using digital images. The contact area, maximum force, peak pressure, and force-time integral over three plantar areas were acquired with Pedar X insoles while running at 12 km/h, and the loading rates were calculated from the vertical forces. RESULTS: The multiple regression analyses indicated that both the force-time integral (R 2=0.15 for acute phase PF; R 2=0.17 for chronic PF) and maximum force (R 2=0.35 for chronic PF) over the forefoot were predicted by an elevated MLA index. The rearfoot valgus alignment predicted the maximum force over the rearfoot in both PF groups: acute (R 2=0.18) and chronic (R 2=0.45). The rearfoot valgus alignment also predicted higher loading rates in the PF groups: acute (R 2=0.19) and chronic (R 2=0.40). CONCLUSION: The MLA index and the rearfoot alignment were good predictors of plantar loads over the forefoot and rearfoot areas in runners with PF. However, rearfoot valgus was demonstrated to be an important clinical measure, since it was able to predict the maximum force and both loading rates over the rearfoot. PMID:26786073

  11. Vertical epitaxial wire-on-wire growth of Ge/Si on Si(100) substrate.

    PubMed

    Shimizu, Tomohiro; Zhang, Zhang; Shingubara, Shoso; Senz, Stephan; Gösele, Ulrich

    2009-04-01

    Vertically aligned epitaxial Ge/Si heterostructure nanowire arrays on Si(100) substrates were prepared by a two-step chemical vapor deposition method in anodic aluminum oxide templates. n-Butylgermane vapor was employed as new safer precursor for Ge nanowire growth instead of germane. First a Si nanowire was grown by the vapor liquid solid growth mechanism using Au as catalyst and silane. The second step was the growth of Ge nanowires on top of the Si nanowires. The method presented will allow preparing epitaxially grown vertical heterostructure nanowires consisting of multiple materials on an arbitrary substrate avoiding undesired lateral growth.

  12. Alignment of Standards and Assessment: A Theoretical and Empirical Study of Methods for Alignment

    ERIC Educational Resources Information Center

    Nasstrom, Gunilla; Henriksson, Widar

    2008-01-01

    Introduction: In a standards-based school-system alignment of policy documents with standards and assessment is important. To be able to evaluate whether schools and students have reached the standards, the assessment should focus on the standards. Different models and methods can be used for measuring alignment, i.e. the correspondence between…

  13. Multiple reaction monitoring-ion pair finder: a systematic approach to transform nontargeted mode to pseudotargeted mode for metabolomics study based on liquid chromatography-mass spectrometry.

    PubMed

    Luo, Ping; Dai, Weidong; Yin, Peiyuan; Zeng, Zhongda; Kong, Hongwei; Zhou, Lina; Wang, Xiaolin; Chen, Shili; Lu, Xin; Xu, Guowang

    2015-01-01

    Pseudotargeted metabolic profiling is a novel strategy combining the advantages of both targeted and untargeted methods. The strategy obtains metabolites and their product ions from quadrupole time-of-flight (Q-TOF) MS by information-dependent acquisition (IDA) and then picks targeted ion pairs and measures them on a triple-quadrupole MS by multiple reaction monitoring (MRM). The picking of ion pairs from thousands of candidates is the most time-consuming step of the pseudotargeted strategy. Herein, a systematic and automated approach and software (MRM-Ion Pair Finder) were developed to acquire characteristic MRM ion pairs by precursor ions alignment, MS(2) spectrum extraction and reduction, characteristic product ion selection, and ion fusion. To test the reliability of the approach, a mixture of 15 metabolite standards was first analyzed; the representative ion pairs were correctly picked out. Then, pooled serum samples were further studied, and the results were confirmed by the manual selection. Finally, a comparison with a commercial peak alignment software was performed, and a good characteristic ion coverage of metabolites was obtained. As a proof of concept, the proposed approach was applied to a metabolomics study of liver cancer; 854 metabolite ion pairs were defined in the positive ion mode from serum. Our approach provides a high throughput method which is reliable to acquire MRM ion pairs for pseudotargeted metabolomics with improved metabolite coverage and facilitate more reliable biomarkers discoveries.

  14. A comparison of radiographic anatomic axis knee alignment measurements and cross-sectional associations with knee osteoarthritis.

    PubMed

    Goulston, L M; Sanchez-Santos, M T; D'Angelo, S; Leyland, K M; Hart, D J; Spector, T D; Cooper, C; Dennison, E M; Hunter, D; Arden, N K

    2016-04-01

    Malalignment is associated with knee osteoarthritis (KOA), however, the optimal anatomic axis (AA) knee alignment measurement on a standard limb radiograph (SLR) is unknown. This study compares one-point (1P) and two-point (2P) AA methods using three knee joint centre locations and examines cross-sectional associations with symptomatic radiographic knee osteoarthritis (SRKOA), radiographic knee osteoarthritis (RKOA) and knee pain. AA alignment was measured six different ways using the KneeMorf software on 1058 SLRs from 584 women in the Chingford Study. Cross-sectional associations with principal outcome SRKOA combined with greatest reproducibility determined the optimal 1P and 2P AA method. Appropriate varus/neutral/valgus alignment categories were established using logistic regression with generalised estimating equation models fitted with restricted cubic spline function. The tibial plateau centre displayed greatest reproducibility and associations with SRKOA. As mean 1P and 2P values differed by >2°, new alignment categories were generated for 1P: varus <178°, neutral 178-182°, valgus >182° and for 2P methods: varus <180°, neutral 180-185°, valgus >185°. Varus vs neutral alignment was associated with a near 2-fold increase in SRKOA and RKOA, and valgus vs neutral for RKOA using 2P method. Nonsignificant associations were seen for 1P method for SRKOA, RKOA and knee pain. AA alignment was associated with SRKOA and the tibial plateau centre had the strongest association. Differences in AA alignment when 1P vs 2P methods were compared indicated bespoke alignment categories were necessary. Further replication and validation with mechanical axis alignment comparison is required. Crown Copyright © 2015. Published by Elsevier Ltd. All rights reserved.

  15. Evolutionary profiles derived from the QR factorization of multiple structural alignments gives an economy of information.

    PubMed

    O'Donoghue, Patrick; Luthey-Schulten, Zaida

    2005-02-25

    We present a new algorithm, based on the multidimensional QR factorization, to remove redundancy from a multiple structural alignment by choosing representative protein structures that best preserve the phylogenetic tree topology of the homologous group. The classical QR factorization with pivoting, developed as a fast numerical solution to eigenvalue and linear least-squares problems of the form Ax=b, was designed to re-order the columns of A by increasing linear dependence. Removing the most linear dependent columns from A leads to the formation of a minimal basis set which well spans the phase space of the problem at hand. By recasting the problem of redundancy in multiple structural alignments into this framework, in which the matrix A now describes the multiple alignment, we adapted the QR factorization to produce a minimal basis set of protein structures which best spans the evolutionary (phase) space. The non-redundant and representative profiles obtained from this procedure, termed evolutionary profiles, are shown in initial results to outperform well-tested profiles in homology detection searches over a large sequence database. A measure of structural similarity between homologous proteins, Q(H), is presented. By properly accounting for the effect and presence of gaps, a phylogenetic tree computed using this metric is shown to be congruent with the maximum-likelihood sequence-based phylogeny. The results indicate that evolutionary information is indeed recoverable from the comparative analysis of protein structure alone. Applications of the QR ordering and this structural similarity metric to analyze the evolution of structure among key, universally distributed proteins involved in translation, and to the selection of representatives from an ensemble of NMR structures are also discussed.

  16. Automated registration of multispectral MR vessel wall images of the carotid artery

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Klooster, R. van 't; Staring, M.; Reiber, J. H. C.

    2013-12-15

    Purpose: Atherosclerosis is the primary cause of heart disease and stroke. The detailed assessment of atherosclerosis of the carotid artery requires high resolution imaging of the vessel wall using multiple MR sequences with different contrast weightings. These images allow manual or automated classification of plaque components inside the vessel wall. Automated classification requires all sequences to be in alignment, which is hampered by patient motion. In clinical practice, correction of this motion is performed manually. Previous studies applied automated image registration to correct for motion using only nondeformable transformation models and did not perform a detailed quantitative validation. The purposemore » of this study is to develop an automated accurate 3D registration method, and to extensively validate this method on a large set of patient data. In addition, the authors quantified patient motion during scanning to investigate the need for correction. Methods: MR imaging studies (1.5T, dedicated carotid surface coil, Philips) from 55 TIA/stroke patients with ipsilateral <70% carotid artery stenosis were randomly selected from a larger cohort. Five MR pulse sequences were acquired around the carotid bifurcation, each containing nine transverse slices: T1-weighted turbo field echo, time of flight, T2-weighted turbo spin-echo, and pre- and postcontrast T1-weighted turbo spin-echo images (T1W TSE). The images were manually segmented by delineating the lumen contour in each vessel wall sequence and were manually aligned by applying throughplane and inplane translations to the images. To find the optimal automatic image registration method, different masks, choice of the fixed image, different types of the mutual information image similarity metric, and transformation models including 3D deformable transformation models, were evaluated. Evaluation of the automatic registration results was performed by comparing the lumen segmentations of the fixed image and moving image after registration. Results: The average required manual translation per image slice was 1.33 mm. Translations were larger as the patient was longer inside the scanner. Manual alignment took 187.5 s per patient resulting in a mean surface distance of 0.271 ± 0.127 mm. After minimal user interaction to generate the mask in the fixed image, the remaining sequences are automatically registered with a computation time of 52.0 s per patient. The optimal registration strategy used a circular mask with a diameter of 10 mm, a 3D B-spline transformation model with a control point spacing of 15 mm, mutual information as image similarity metric, and the precontrast T1W TSE as fixed image. A mean surface distance of 0.288 ± 0.128 mm was obtained with these settings, which is very close to the accuracy of the manual alignment procedure. The exact registration parameters and software were made publicly available. Conclusions: An automated registration method was developed and optimized, only needing two mouse clicks to mark the start and end point of the artery. Validation on a large group of patients showed that automated image registration has similar accuracy as the manual alignment procedure, substantially reducing the amount of user interactions needed, and is multiple times faster. In conclusion, the authors believe that the proposed automated method can replace the current manual procedure, thereby reducing the time to analyze the images.« less

  17. Efficient and robust model-to-image alignment using 3D scale-invariant features.

    PubMed

    Toews, Matthew; Wells, William M

    2013-04-01

    This paper presents feature-based alignment (FBA), a general method for efficient and robust model-to-image alignment. Volumetric images, e.g. CT scans of the human body, are modeled probabilistically as a collage of 3D scale-invariant image features within a normalized reference space. Features are incorporated as a latent random variable and marginalized out in computing a maximum a posteriori alignment solution. The model is learned from features extracted in pre-aligned training images, then fit to features extracted from a new image to identify a globally optimal locally linear alignment solution. Novel techniques are presented for determining local feature orientation and efficiently encoding feature intensity in 3D. Experiments involving difficult magnetic resonance (MR) images of the human brain demonstrate FBA achieves alignment accuracy similar to widely-used registration methods, while requiring a fraction of the memory and computation resources and offering a more robust, globally optimal solution. Experiments on CT human body scans demonstrate FBA as an effective system for automatic human body alignment where other alignment methods break down. Copyright © 2012 Elsevier B.V. All rights reserved.

  18. Efficient and Robust Model-to-Image Alignment using 3D Scale-Invariant Features

    PubMed Central

    Toews, Matthew; Wells, William M.

    2013-01-01

    This paper presents feature-based alignment (FBA), a general method for efficient and robust model-to-image alignment. Volumetric images, e.g. CT scans of the human body, are modeled probabilistically as a collage of 3D scale-invariant image features within a normalized reference space. Features are incorporated as a latent random variable and marginalized out in computing a maximum a-posteriori alignment solution. The model is learned from features extracted in pre-aligned training images, then fit to features extracted from a new image to identify a globally optimal locally linear alignment solution. Novel techniques are presented for determining local feature orientation and efficiently encoding feature intensity in 3D. Experiments involving difficult magnetic resonance (MR) images of the human brain demonstrate FBA achieves alignment accuracy similar to widely-used registration methods, while requiring a fraction of the memory and computation resources and offering a more robust, globally optimal solution. Experiments on CT human body scans demonstrate FBA as an effective system for automatic human body alignment where other alignment methods break down. PMID:23265799

  19. Automated interferometric alignment system for paraboloidal mirrors

    DOEpatents

    Maxey, L. Curtis

    1993-01-01

    A method is described for a systematic method of interpreting interference fringes obtained by using a corner cube retroreflector as an alignment aid when aigning a paraboloid to a spherical wavefront. This is applicable to any general case where such alignment is required, but is specifically applicable in the case of aligning an autocollimating test using a diverging beam wavefront. In addition, the method provides information which can be systematically interpreted such that independent information about pitch, yaw and focus errors can be obtained. Thus, the system lends itself readily to automation. Finally, although the method is developed specifically for paraboloids, it can be seen to be applicable to a variety of other aspheric optics when applied in combination with a wavefront corrector that produces a wavefront which, when reflected from the correctly aligned aspheric surface will produce a collimated wavefront like that obtained from the paraboloid when it is correctly aligned to a spherical wavefront.

  20. Polar cap particle precipitation and aurora: Review and commentary

    NASA Astrophysics Data System (ADS)

    Newell, Patrick T.; Liou, Kan; Wilson, Gordon R.

    2009-02-01

    Polar rain has a beautiful set of symmetry properties, individually established, but not previously discussed collectively, which can be organized by a single unifying principle. The key polar rain properties are favored hemisphere (controlled by the interplanetary magnetic field Bx), dawn/dusk gradient (IMF By), merging rate (IMF Bz or more generally d[Phi]MP/dt), nightside/dayside gradient, and seasonal effect. We argue that all five properties involve variants on a single theme: the further downstream a field line exits the magnetosphere (or less directly points toward the solar wind electron heat flux), the weaker the polar rain. This effect is the result of the requirements of charge quasi-neutrality, and because the ion thermal velocity declines and the tailward ion bulk flow velocity rises moving down tail from the frontside magnetopause. Polar cap arcs (or more properly, high-latitude sun-aligned arcs) are largely complementary to the polar rain, occurring most frequently when the dayside merging rate is low, and thus when polar rain is weak. Sun-aligned arcs are often considered as originating either in the polar rain or the expansion of the plasma sheet into the polar cap. In fact three quite distinct types of sun-aligned high-latitude arcs exist, two common, and one rare. One type of arc occurs as intensifications of the polar rain, and is common, but weak, typically <0.1 ergs/cm2 s, and lacks associated ion precipitation. A second category of Sun-aligned arcs with energy flux >0.1 ergs/cm2 s usually occurs adjacent to the auroral oval, and includes ion precipitation. The plasma regime of these common, and at times intense, arcs is often distinct from the oval which they abut. Convection alone does not specify the open/closed nature of these arcs, because multiple narrow convection reversals are common around such arcs, and the arcs themselves can be embedded within flows that are either sunward or anti-sunward. These observational facts do not neatly fit into either a plasma sheet origin or a polar rain origin (e.g., the necessity to abut the auroral oval, and the presence of ions does not fit the properties of polar rain, which can in any event be nearly absent for northward interplanetary magnetic field). One theory is that such arcs are associated with merging tailward of the cusp. Both of these common types of sun-aligned arcs fade within about 30 min of a southward IMF turning. The third, and rarest, category of sun-aligned arcs are intense, well detached from the auroral oval, contain plasma sheet origin ion precipitation as well as electrons, and persist for hours after a southward turning. These intense detached sun-aligned arcs can rapidly cross the polar cap, sometimes multiple times. Most events discussed in the literature as "theta-aurora" do not fit into this category (for example, although they may appear detached in images, they abut the oval in particle data, and do not have the persistence of detached events under southward IMF turnings). It is possible that no single theory can account for all three types of sun-aligned arcs. Solar energetic particle (SEP) events are at times used to demarcate polar cap open/closed boundaries. Although this works at times, examples exist where this method fails (e.g., very quiet conditions for which SEP reaches below L=4), and the method should be used with caution. Finally, it is shown that, although it is rare, the polar cap can at times completely close.

  1. Effect of Capacitive and Resistive electric transfer on changes in muscle flexibility and lumbopelvic alignment after fatiguing exercise.

    PubMed

    Yokota, Yuki; Sonoda, Takuya; Tashiro, Yuto; Suzuki, Yusuke; Kajiwara, Yu; Zeidan, Hala; Nakayama, Yasuaki; Kawagoe, Mirei; Shimoura, Kanako; Tatsumi, Masataka; Nakai, Kengo; Nishida, Yuichi; Bito, Tsubasa; Yoshimi, Soyoka; Aoyama, Tomoki

    2018-05-01

    [Purpose] This study aimed to clarify the effects of Capacitive and Resistive electric transfer (CRet) on changes in muscle flexibility and lumbopelvic alignment after fatiguing exercise. [Subjects and Methods] Twenty-two healthy males were assigned into either the CRet (n=11) or control (n=11) group. Fatiguing exercise and CRet intervention were applied at the quadriceps muscle of the participants' dominant legs. The Ely test, pelvic tilt, lumbar lordosis, and superficial temperature were measured before and after exercise and for 30 minutes after intervention. Statistical analysis was performed using one-way analysis of variance, with Tukey's post-hoc multiple comparison test to clarify within-group changes and Student's t-test to clarify between-group differences. [Results] The Ely test and pelvic tilt were significantly different in both groups after exercise, but there was no difference in the CRet group after intervention. Superficial temperature significantly increased in the CRet group for 30 minutes after intervention, in contrast to after the exercise and intervention in the control group. There was no significant between-group difference at any timepoint, except in superficial temperature. [Conclusion] CRet could effectively improve muscle flexibility and lumbopelvic alignment after fatiguing exercise.

  2. Unidirectionally aligned line patterns driven by entropic effects on faceted surfaces

    PubMed Central

    Hong, Sung Woo; Huh, June; Gu, Xiaodan; Lee, Dong Hyun; Jo, Won Ho; Park, Soojin; Xu, Ting; Russell, Thomas P.

    2012-01-01

    A simple, versatile approach to the directed self-assembly of block copolymers into a macroscopic array of unidirectionally aligned cylindrical microdomains on reconstructed faceted single crystal surfaces or on flexible, inexpensive polymeric replicas was discovered. High fidelity transfer of the line pattern generated from the microdomains to a master mold is also shown. A single-grained line patterns over arbitrarily large surface areas without the use of top-down techniques is demonstrated, which has an order parameter typically in excess of 0.97 and a slope error of 1.1 deg. This degree of perfection, produced in a short time period, has yet to be achieved by any other methods. The exceptional alignment arises from entropic penalties of chain packing in the facets coupled with the bending modulus of the cylindrical microdomains. This is shown, theoretically, to be the lowest energy state. The atomic crystalline ordering of the substrate is transferred, over multiple length scales, to the block copolymer microdomains, opening avenues to large-scale roll-to-roll type and nanoimprint processing of perfectly patterned surfaces and as templates and scaffolds for magnetic storage media, polarizing devices, and nanowire arrays. PMID:22307591

  3. Effect of Capacitive and Resistive electric transfer on changes in muscle flexibility and lumbopelvic alignment after fatiguing exercise

    PubMed Central

    Yokota, Yuki; Sonoda, Takuya; Tashiro, Yuto; Suzuki, Yusuke; Kajiwara, Yu; Zeidan, Hala; Nakayama, Yasuaki; Kawagoe, Mirei; Shimoura, Kanako; Tatsumi, Masataka; Nakai, Kengo; Nishida, Yuichi; Bito, Tsubasa; Yoshimi, Soyoka; Aoyama, Tomoki

    2018-01-01

    [Purpose] This study aimed to clarify the effects of Capacitive and Resistive electric transfer (CRet) on changes in muscle flexibility and lumbopelvic alignment after fatiguing exercise. [Subjects and Methods] Twenty-two healthy males were assigned into either the CRet (n=11) or control (n=11) group. Fatiguing exercise and CRet intervention were applied at the quadriceps muscle of the participants’ dominant legs. The Ely test, pelvic tilt, lumbar lordosis, and superficial temperature were measured before and after exercise and for 30 minutes after intervention. Statistical analysis was performed using one-way analysis of variance, with Tukey’s post-hoc multiple comparison test to clarify within-group changes and Student’s t-test to clarify between-group differences. [Results] The Ely test and pelvic tilt were significantly different in both groups after exercise, but there was no difference in the CRet group after intervention. Superficial temperature significantly increased in the CRet group for 30 minutes after intervention, in contrast to after the exercise and intervention in the control group. There was no significant between-group difference at any timepoint, except in superficial temperature. [Conclusion] CRet could effectively improve muscle flexibility and lumbopelvic alignment after fatiguing exercise. PMID:29765189

  4. Cellular and Nuclear Alignment Analysis for Determining Epithelial Cell Chirality

    PubMed Central

    Raymond, Michael J.; Ray, Poulomi; Kaur, Gurleen; Singh, Ajay V.; Wan, Leo Q.

    2015-01-01

    Left-right (LR) asymmetry is a biologically conserved property in living organisms that can be observed in the asymmetrical arrangement of organs and tissues and in tissue morphogenesis, such as the directional looping of the gastrointestinal tract and heart. The expression of LR asymmetry in embryonic tissues can be appreciated in biased cell alignment. Previously an in vitro chirality assay was reported by patterning multiple cells on microscale defined geometries and quantified the cell phenotype–dependent LR asymmetry, or cell chirality. However, morphology and chirality of individual cells on micropatterned surfaces has not been well characterized. Here, a Python-based algorithm was developed to identify and quantify immunofluorescence stained individual epithelial cells on multicellular patterns. This approach not only produces results similar to the image intensity gradient-based method reported previously, but also can capture properties of single cells such as area and aspect ratio. We also found that cell nuclei exhibited biased alignment. Around 35% cells were misaligned and were typically smaller and less elongated. This new imaging analysis approach is an effective tool for measuring single cell chirality inside multicellular structures and can potentially help unveil biophysical mechanisms underlying cellular chiral bias both in vitro and in vivo. PMID:26294010

  5. NetMHCpan-3.0; improved prediction of binding to MHC class I molecules integrating information from multiple receptor and peptide length datasets.

    PubMed

    Nielsen, Morten; Andreatta, Massimo

    2016-03-30

    Binding of peptides to MHC class I molecules (MHC-I) is essential for antigen presentation to cytotoxic T-cells. Here, we demonstrate how a simple alignment step allowing insertions and deletions in a pan-specific MHC-I binding machine-learning model enables combining information across both multiple MHC molecules and peptide lengths. This pan-allele/pan-length algorithm significantly outperforms state-of-the-art methods, and captures differences in the length profile of binders to different MHC molecules leading to increased accuracy for ligand identification. Using this model, we demonstrate that percentile ranks in contrast to affinity-based thresholds are optimal for ligand identification due to uniform sampling of the MHC space. We have developed a neural network-based machine-learning algorithm leveraging information across multiple receptor specificities and ligand length scales, and demonstrated how this approach significantly improves the accuracy for prediction of peptide binding and identification of MHC ligands. The method is available at www.cbs.dtu.dk/services/NetMHCpan-3.0 .

  6. Bioanalysis of antibody-drug conjugates: American Association of Pharmaceutical Scientists Antibody-Drug Conjugate Working Group position paper.

    PubMed

    Gorovits, Boris; Alley, Stephen C; Bilic, Sanela; Booth, Brian; Kaur, Surinder; Oldfield, Phillip; Purushothama, Shobha; Rao, Chetana; Shord, Stacy; Siguenza, Patricia

    2013-05-01

    Antibody-drug conjugates (ADCs) typically consist of a cytotoxic drug covalently bound to an antibody by a linker. These conjugates have the potential to substantially improve efficacy and reduce toxicity compared with cytotoxic small-molecule drugs. Since ADCs are generally complex heterogeneous mixtures of multiple species, these novel therapeutic products present unique bioanalytical challenges. The growing number of ADCs being developed across the industry suggests the need for alignment of the bioanalytical methods or approaches used to assess the multiple species and facilitate consistent interpretation of the bioanalytical data. With limited clinical data, the current strategies that can be used to provide insight into the relationship between the multiple species and the observed clinical safety and efficacy are still evolving. Considerations of the bioanalytical strategies for ADCs based on the current industry practices that take into account the complexity and heterogeneity of ADCs are discussed.

  7. EAPhy: A Flexible Tool for High-throughput Quality Filtering of Exon-alignments and Data Processing for Phylogenetic Methods.

    PubMed

    Blom, Mozes P K

    2015-08-05

    Recently developed molecular methods enable geneticists to target and sequence thousands of orthologous loci and infer evolutionary relationships across the tree of life. Large numbers of genetic markers benefit species tree inference but visual inspection of alignment quality, as traditionally conducted, is challenging with thousands of loci. Furthermore, due to the impracticality of repeated visual inspection with alternative filtering criteria, the potential consequences of using datasets with different degrees of missing data remain nominally explored in most empirical phylogenomic studies. In this short communication, I describe a flexible high-throughput pipeline designed to assess alignment quality and filter exonic sequence data for subsequent inference. The stringency criteria for alignment quality and missing data can be adapted based on the expected level of sequence divergence. Each alignment is automatically evaluated based on the stringency criteria specified, significantly reducing the number of alignments that require visual inspection. By developing a rapid method for alignment filtering and quality assessment, the consistency of phylogenetic estimation based on exonic sequence alignments can be further explored across distinct inference methods, while accounting for different degrees of missing data.

  8. A method to align a bent crystal for channeling experiments by using quasichanneling oscillations

    NASA Astrophysics Data System (ADS)

    Sytov, A. I.; Guidi, V.; Tikhomirov, V. V.; Bandiera, L.; Bagli, E.; Germogli, G.; Mazzolari, A.; Romagnoni, M.

    2018-04-01

    A method to calculate both the bent crystal angle of alignment and radius of curvature by using only one distribution of deflection angles has been developed. The method is based on measuring of the angular position of recently predicted and observed quasichanneling oscillations in the deflection angle distribution and consequent fitting of both the radius and angular alignment by analytic formulae. In this paper this method is applied on the example of simulated angular distributions over a wide range of values of both radius and alignment for electrons. It is carried out through the example of (111) nonequidistant planes though this technique is general and could be applied to any kind of planes. In addition, the method application constraints are also discussed. It is shown by simulations that this method, being in fact a sort of beam diagnostics, allows one in a certain case to increase the crystal alignment accuracy as well as to control precisely the radius of curvature inside an accelerator tube without vacuum breaking. In addition, it speeds up the procedure of crystal alignment in channeling experiments, reducing beamtime consuming.

  9. A Novel Adaptive H∞ Filtering Method with Delay Compensation for the Transfer Alignment of Strapdown Inertial Navigation Systems

    PubMed Central

    Lyu, Weiwei

    2017-01-01

    Transfer alignment is always a key technology in a strapdown inertial navigation system (SINS) because of its rapidity and accuracy. In this paper a transfer alignment model is established, which contains the SINS error model and the measurement model. The time delay in the process of transfer alignment is analyzed, and an H∞ filtering method with delay compensation is presented. Then the H∞ filtering theory and the robust mechanism of H∞ filter are deduced and analyzed in detail. In order to improve the transfer alignment accuracy in SINS with time delay, an adaptive H∞ filtering method with delay compensation is proposed. Since the robustness factor plays an important role in the filtering process and has effect on the filtering accuracy, the adaptive H∞ filter with delay compensation can adjust the value of robustness factor adaptively according to the dynamic external environment. The vehicle transfer alignment experiment indicates that by using the adaptive H∞ filtering method with delay compensation, the transfer alignment accuracy and the pure inertial navigation accuracy can be dramatically improved, which demonstrates the superiority of the proposed filtering method. PMID:29182592

  10. R3D-2-MSA: the RNA 3D structure-to-multiple sequence alignment server

    PubMed Central

    Cannone, Jamie J.; Sweeney, Blake A.; Petrov, Anton I.; Gutell, Robin R.; Zirbel, Craig L.; Leontis, Neocles

    2015-01-01

    The RNA 3D Structure-to-Multiple Sequence Alignment Server (R3D-2-MSA) is a new web service that seamlessly links RNA three-dimensional (3D) structures to high-quality RNA multiple sequence alignments (MSAs) from diverse biological sources. In this first release, R3D-2-MSA provides manual and programmatic access to curated, representative ribosomal RNA sequence alignments from bacterial, archaeal, eukaryal and organellar ribosomes, using nucleotide numbers from representative atomic-resolution 3D structures. A web-based front end is available for manual entry and an Application Program Interface for programmatic access. Users can specify up to five ranges of nucleotides and 50 nucleotide positions per range. The R3D-2-MSA server maps these ranges to the appropriate columns of the corresponding MSA and returns the contents of the columns, either for display in a web browser or in JSON format for subsequent programmatic use. The browser output page provides a 3D interactive display of the query, a full list of sequence variants with taxonomic information and a statistical summary of distinct sequence variants found. The output can be filtered and sorted in the browser. Previous user queries can be viewed at any time by resubmitting the output URL, which encodes the search and re-generates the results. The service is freely available with no login requirement at http://rna.bgsu.edu/r3d-2-msa. PMID:26048960

  11. Physician-Hospital Alignment in Orthopedic Surgery.

    PubMed

    Bushnell, Brandon D

    2015-09-01

    The concept of "alignment" between physicians and hospitals is a popular buzzword in the age of health care reform. Despite their often tumultuous histories, physicians and hospitals find themselves under increasing pressures to work together toward common goals. However, effective alignment is more than just simple cooperation between parties. The process of achieving alignment does not have simple, universal steps. Alignment will differ based on individual situational factors and the type of specialty involved. Ultimately, however, there are principles that underlie the concept of alignment and should be a part of any physician-hospital alignment efforts. In orthopedic surgery, alignment involves the clinical, administrative, financial, and even personal aspects of a surgeon's practice. It must be based on the principles of financial interest, clinical authority, administrative participation, transparency, focus on the patient, and mutual necessity. Alignment can take on various forms as well, with popular models consisting of shared governance and comanagement, gainsharing, bundled payments, accountable care organizations, and other methods. As regulatory and financial pressures continue to motivate physicians and hospitals to develop alignment relationships, new and innovative methods of alignment will also appear. Existing models will mature and evolve, with individual variability based on local factors. However, certain trends seem to be appearing as time progresses and alignment relationships deepen, including regional and national collaboration, population management, and changes in the legal system. This article explores the history, principles, and specific methods of physician-hospital alignment and its critical importance for the future of health care delivery. Copyright 2015, SLACK Incorporated.

  12. Magnet-assisted device-level alignment for the fabrication of membrane-sandwiched polydimethylsiloxane microfluidic devices

    NASA Astrophysics Data System (ADS)

    Lu, J.-C.; Liao, W.-H.; Tung, Y.-C.

    2012-07-01

    Polydimethylsiloxane (PDMS) microfluidic device is one of the most essential techniques that advance microfluidics research in recent decades. PDMS is broadly exploited to construct microfluidic devices due to its unique and advantageous material properties. To realize more functionalities, PDMS microfluidic devices with multi-layer architectures, especially those with sandwiched membranes, have been developed for various applications. However, existing alignment methods for device fabrication are mainly based on manual observations, which are time consuming, inaccurate and inconsistent. This paper develops a magnet-assisted alignment method to enhance device-level alignment accuracy and precision without complicated fabrication processes. In the developed alignment method, magnets are embedded into PDMS layers at the corners of the device. The paired magnets are arranged in symmetric positions at each PDMS layer, and the magnetic attraction force automatically pulls the PDMS layers into the aligned position during assembly. This paper also applies the method to construct a practical microfluidic device, a tunable chaotic micromixer. The results demonstrate the successful operation of the device without failure, which suggests the accurate alignment and reliable bonding achieved by the method. Consequently, the fabrication method developed in this paper is promising to be exploited to construct various membrane-sandwiched PDMS microfluidic devices with more integrated functionalities to advance microfluidics research.

  13. Efficient pairwise RNA structure prediction using probabilistic alignment constraints in Dynalign

    PubMed Central

    2007-01-01

    Background Joint alignment and secondary structure prediction of two RNA sequences can significantly improve the accuracy of the structural predictions. Methods addressing this problem, however, are forced to employ constraints that reduce computation by restricting the alignments and/or structures (i.e. folds) that are permissible. In this paper, a new methodology is presented for the purpose of establishing alignment constraints based on nucleotide alignment and insertion posterior probabilities. Using a hidden Markov model, posterior probabilities of alignment and insertion are computed for all possible pairings of nucleotide positions from the two sequences. These alignment and insertion posterior probabilities are additively combined to obtain probabilities of co-incidence for nucleotide position pairs. A suitable alignment constraint is obtained by thresholding the co-incidence probabilities. The constraint is integrated with Dynalign, a free energy minimization algorithm for joint alignment and secondary structure prediction. The resulting method is benchmarked against the previous version of Dynalign and against other programs for pairwise RNA structure prediction. Results The proposed technique eliminates manual parameter selection in Dynalign and provides significant computational time savings in comparison to prior constraints in Dynalign while simultaneously providing a small improvement in the structural prediction accuracy. Savings are also realized in memory. In experiments over a 5S RNA dataset with average sequence length of approximately 120 nucleotides, the method reduces computation by a factor of 2. The method performs favorably in comparison to other programs for pairwise RNA structure prediction: yielding better accuracy, on average, and requiring significantly lesser computational resources. Conclusion Probabilistic analysis can be utilized in order to automate the determination of alignment constraints for pairwise RNA structure prediction methods in a principled fashion. These constraints can reduce the computational and memory requirements of these methods while maintaining or improving their accuracy of structural prediction. This extends the practical reach of these methods to longer length sequences. The revised Dynalign code is freely available for download. PMID:17445273

  14. A new method and device of aligning patient setup lasers in radiation therapy

    PubMed Central

    Hwang, Ui‐Jung; Jo, Kwanghyun; Kwak, Jung Won; Choi, Sang Hyoun; Jeong, Chiyoung; Kim, Mi Young; Jeong, Jong Hwi; Shin, Dongho; Lee, Se Byeong; Park, Jeong‐Hoon; Park, Sung Yong; Kim, Siyong

    2016-01-01

    The aim of this study is to develop a new method to align the patient setup lasers in a radiation therapy treatment room and examine its validity and efficiency. The new laser alignment method is realized by a device composed of both a metallic base plate and a few acrylic transparent plates. Except one, every plate has either a crosshair line (CHL) or a single vertical line that is used for alignment. Two holders for radiochromic film insertion are prepared in the device to find a radiation isocenter. The right laser positions can be found optically by matching the shadows of all the CHLs in the gantry head and the device. The reproducibility, accuracy, and efficiency of laser alignment and the dependency on the position error of the light source were evaluated by comparing the means and the standard deviations of the measured laser positions. After the optical alignment of the lasers, the radiation isocenter was found by the gantry and collimator star shots, and then the lasers were translated parallel to the isocenter. In the laser position reproducibility test, the mean and standard deviation on the wall of treatment room were 32.3±0.93 mm for the new method whereas they were 33.4±1.49 mm for the conventional method. The mean alignment accuracy was 1.4 mm for the new method, and 2.1 mm for the conventional method on the walls. In the test of the dependency on the light source position error, the mean laser position was shifted just by a similar amount of the shift of the light source in the new method, but it was greatly magnified in the conventional method. In this study, a new laser alignment method was devised and evaluated successfully. The new method provided more accurate, more reproducible, and faster alignment of the lasers than the conventional method. PACS numbers: 87.56.Fc, 87.53.Bn, 87.53.Kn, 87.53.Ly, 87.55.Gh PMID:26894331

  15. Hidden Markov models of biological primary sequence information.

    PubMed Central

    Baldi, P; Chauvin, Y; Hunkapiller, T; McClure, M A

    1994-01-01

    Hidden Markov model (HMM) techniques are used to model families of biological sequences. A smooth and convergent algorithm is introduced to iteratively adapt the transition and emission parameters of the models from the examples in a given family. The HMM approach is applied to three protein families: globins, immunoglobulins, and kinases. In all cases, the models derived capture the important statistical characteristics of the family and can be used for a number of tasks, including multiple alignments, motif detection, and classification. For K sequences of average length N, this approach yields an effective multiple-alignment algorithm which requires O(KN2) operations, linear in the number of sequences. PMID:8302831

  16. Gemi: PCR Primers Prediction from Multiple Alignments

    PubMed Central

    Sobhy, Haitham; Colson, Philippe

    2012-01-01

    Designing primers and probes for polymerase chain reaction (PCR) is a preliminary and critical step that requires the identification of highly conserved regions in a given set of sequences. This task can be challenging if the targeted sequences display a high level of diversity, as frequently encountered in microbiologic studies. We developed Gemi, an automated, fast, and easy-to-use bioinformatics tool with a user-friendly interface to design primers and probes based on multiple aligned sequences. This tool can be used for the purpose of real-time and conventional PCR and can deal efficiently with large sets of sequences of a large size. PMID:23316117

  17. Open-Phylo: a customizable crowd-computing platform for multiple sequence alignment

    PubMed Central

    2013-01-01

    Citizen science games such as Galaxy Zoo, Foldit, and Phylo aim to harness the intelligence and processing power generated by crowds of online gamers to solve scientific problems. However, the selection of the data to be analyzed through these games is under the exclusive control of the game designers, and so are the results produced by gamers. Here, we introduce Open-Phylo, a freely accessible crowd-computing platform that enables any scientist to enter our system and use crowds of gamers to assist computer programs in solving one of the most fundamental problems in genomics: the multiple sequence alignment problem. PMID:24148814

  18. A Bayesian taxonomic classification method for 16S rRNA gene sequences with improved species-level accuracy.

    PubMed

    Gao, Xiang; Lin, Huaiying; Revanna, Kashi; Dong, Qunfeng

    2017-05-10

    Species-level classification for 16S rRNA gene sequences remains a serious challenge for microbiome researchers, because existing taxonomic classification tools for 16S rRNA gene sequences either do not provide species-level classification, or their classification results are unreliable. The unreliable results are due to the limitations in the existing methods which either lack solid probabilistic-based criteria to evaluate the confidence of their taxonomic assignments, or use nucleotide k-mer frequency as the proxy for sequence similarity measurement. We have developed a method that shows significantly improved species-level classification results over existing methods. Our method calculates true sequence similarity between query sequences and database hits using pairwise sequence alignment. Taxonomic classifications are assigned from the species to the phylum levels based on the lowest common ancestors of multiple database hits for each query sequence, and further classification reliabilities are evaluated by bootstrap confidence scores. The novelty of our method is that the contribution of each database hit to the taxonomic assignment of the query sequence is weighted by a Bayesian posterior probability based upon the degree of sequence similarity of the database hit to the query sequence. Our method does not need any training datasets specific for different taxonomic groups. Instead only a reference database is required for aligning to the query sequences, making our method easily applicable for different regions of the 16S rRNA gene or other phylogenetic marker genes. Reliable species-level classification for 16S rRNA or other phylogenetic marker genes is critical for microbiome research. Our software shows significantly higher classification accuracy than the existing tools and we provide probabilistic-based confidence scores to evaluate the reliability of our taxonomic classification assignments based on multiple database matches to query sequences. Despite its higher computational costs, our method is still suitable for analyzing large-scale microbiome datasets for practical purposes. Furthermore, our method can be applied for taxonomic classification of any phylogenetic marker gene sequences. Our software, called BLCA, is freely available at https://github.com/qunfengdong/BLCA .

  19. K2 and K2*: efficient alignment-free sequence similarity measurement based on Kendall statistics.

    PubMed

    Lin, Jie; Adjeroh, Donald A; Jiang, Bing-Hua; Jiang, Yue

    2018-05-15

    Alignment-free sequence comparison methods can compute the pairwise similarity between a huge number of sequences much faster than sequence-alignment based methods. We propose a new non-parametric alignment-free sequence comparison method, called K2, based on the Kendall statistics. Comparing to the other state-of-the-art alignment-free comparison methods, K2 demonstrates competitive performance in generating the phylogenetic tree, in evaluating functionally related regulatory sequences, and in computing the edit distance (similarity/dissimilarity) between sequences. Furthermore, the K2 approach is much faster than the other methods. An improved method, K2*, is also proposed, which is able to determine the appropriate algorithmic parameter (length) automatically, without first considering different values. Comparative analysis with the state-of-the-art alignment-free sequence similarity methods demonstrates the superiority of the proposed approaches, especially with increasing sequence length, or increasing dataset sizes. The K2 and K2* approaches are implemented in the R language as a package and is freely available for open access (http://community.wvu.edu/daadjeroh/projects/K2/K2_1.0.tar.gz). yueljiang@163.com. Supplementary data are available at Bioinformatics online.

  20. Feature-based Alignment of Volumetric Multi-modal Images

    PubMed Central

    Toews, Matthew; Zöllei, Lilla; Wells, William M.

    2014-01-01

    This paper proposes a method for aligning image volumes acquired from different imaging modalities (e.g. MR, CT) based on 3D scale-invariant image features. A novel method for encoding invariant feature geometry and appearance is developed, based on the assumption of locally linear intensity relationships, providing a solution to poor repeatability of feature detection in different image modalities. The encoding method is incorporated into a probabilistic feature-based model for multi-modal image alignment. The model parameters are estimated via a group-wise alignment algorithm, that iteratively alternates between estimating a feature-based model from feature data, then realigning feature data to the model, converging to a stable alignment solution with few pre-processing or pre-alignment requirements. The resulting model can be used to align multi-modal image data with the benefits of invariant feature correspondence: globally optimal solutions, high efficiency and low memory usage. The method is tested on the difficult RIRE data set of CT, T1, T2, PD and MP-RAGE brain images of subjects exhibiting significant inter-subject variability due to pathology. PMID:24683955

  1. Physically motivated global alignment method for electron tomography

    DOE PAGES

    Sanders, Toby; Prange, Micah; Akatay, Cem; ...

    2015-04-08

    Electron tomography is widely used for nanoscale determination of 3-D structures in many areas of science. Determining the 3-D structure of a sample from electron tomography involves three major steps: acquisition of sequence of 2-D projection images of the sample with the electron microscope, alignment of the images to a common coordinate system, and 3-D reconstruction and segmentation of the sample from the aligned image data. The resolution of the 3-D reconstruction is directly influenced by the accuracy of the alignment, and therefore, it is crucial to have a robust and dependable alignment method. In this paper, we develop amore » new alignment method which avoids the use of markers and instead traces the computed paths of many identifiable ‘local’ center-of-mass points as the sample is rotated. Compared with traditional correlation schemes, the alignment method presented here is resistant to cumulative error observed from correlation techniques, has very rigorous mathematical justification, and is very robust since many points and paths are used, all of which inevitably improves the quality of the reconstruction and confidence in the scientific results.« less

  2. Line Up, Line Up: Using Technology to Align and Enhance Peer Learning and Assessment in a Student Centred Foundation Organic Chemistry Module

    ERIC Educational Resources Information Center

    Ryan, Barry J.

    2013-01-01

    This paper describes how three technologies were utilised in combination to align student learning and assessment as part of a case study. Multiple choice questions (MCQs) were central to all these technologies. The peer learning technologies; Personal Response Devices (a.k.a. "Clickers") and "PeerWise"…

  3. Diffractive optics for precision alignment of Euclid space telescope optics (Conference Presentation)

    NASA Astrophysics Data System (ADS)

    Asfour, Jean-Michel; Weidner, Frank; Bodendorf, Christof; Bode, Andreas; Poleshchuk, Alexander G.; Nasyrov, Ruslan K.; Grupp, Frank; Bender, Ralf

    2017-09-01

    We present a method for precise alignment of lens elements using specific Computer Generated Hologram (CGH) with an integrated Fizeau reference flat surface and a Fizeau interferometer. The method is used for aligning the so called Camera Lens Assembly for ESAs Euclid telescope. Each lens has a corresponding annular area on the diffractive optics, which is used to control the position of each lens. The lenses are subsequently positioned using individual annular rings of the CGH. The overall alignment accuracy is below 1 µm, the alignment sensitivity is in the range of 0.1 µm. The achieved alignment accuracy of the lenses relative to each other is mainly depending on the stability in time of the alignment tower. Error budgets when using computer generated holograms and physical limitations are explained. Calibration measurements of the alignment system and the typically reached alignment accuracies will be shown and discussed.

  4. GalaxyTBM: template-based modeling by building a reliable core and refining unreliable local regions.

    PubMed

    Ko, Junsu; Park, Hahnbeom; Seok, Chaok

    2012-08-10

    Protein structures can be reliably predicted by template-based modeling (TBM) when experimental structures of homologous proteins are available. However, it is challenging to obtain structures more accurate than the single best templates by either combining information from multiple templates or by modeling regions that vary among templates or are not covered by any templates. We introduce GalaxyTBM, a new TBM method in which the more reliable core region is modeled first from multiple templates and less reliable, variable local regions, such as loops or termini, are then detected and re-modeled by an ab initio method. This TBM method is based on "Seok-server," which was tested in CASP9 and assessed to be amongst the top TBM servers. The accuracy of the initial core modeling is enhanced by focusing on more conserved regions in the multiple-template selection and multiple sequence alignment stages. Additional improvement is achieved by ab initio modeling of up to 3 unreliable local regions in the fixed framework of the core structure. Overall, GalaxyTBM reproduced the performance of Seok-server, with GalaxyTBM and Seok-server resulting in average GDT-TS of 68.1 and 68.4, respectively, when tested on 68 single-domain CASP9 TBM targets. For application to multi-domain proteins, GalaxyTBM must be combined with domain-splitting methods. Application of GalaxyTBM to CASP9 targets demonstrates that accurate protein structure prediction is possible by use of a multiple-template-based approach, and ab initio modeling of variable regions can further enhance the model quality.

  5. A direct method for computing extreme value (Gumbel) parameters for gapped biological sequence alignments.

    PubMed

    Quinn, Terrance; Sinkala, Zachariah

    2014-01-01

    We develop a general method for computing extreme value distribution (Gumbel, 1958) parameters for gapped alignments. Our approach uses mixture distribution theory to obtain associated BLOSUM matrices for gapped alignments, which in turn are used for determining significance of gapped alignment scores for pairs of biological sequences. We compare our results with parameters already obtained in the literature.

  6. Prediction of MHC class II binding affinity using SMM-align, a novel stabilization matrix alignment method

    PubMed Central

    Nielsen, Morten; Lundegaard, Claus; Lund, Ole

    2007-01-01

    Background Antigen presenting cells (APCs) sample the extra cellular space and present peptides from here to T helper cells, which can be activated if the peptides are of foreign origin. The peptides are presented on the surface of the cells in complex with major histocompatibility class II (MHC II) molecules. Identification of peptides that bind MHC II molecules is thus a key step in rational vaccine design and developing methods for accurate prediction of the peptide:MHC interactions play a central role in epitope discovery. The MHC class II binding groove is open at both ends making the correct alignment of a peptide in the binding groove a crucial part of identifying the core of an MHC class II binding motif. Here, we present a novel stabilization matrix alignment method, SMM-align, that allows for direct prediction of peptide:MHC binding affinities. The predictive performance of the method is validated on a large MHC class II benchmark data set covering 14 HLA-DR (human MHC) and three mouse H2-IA alleles. Results The predictive performance of the SMM-align method was demonstrated to be superior to that of the Gibbs sampler, TEPITOPE, SVRMHC, and MHCpred methods. Cross validation between peptide data set obtained from different sources demonstrated that direct incorporation of peptide length potentially results in over-fitting of the binding prediction method. Focusing on amino terminal peptide flanking residues (PFR), we demonstrate a consistent gain in predictive performance by favoring binding registers with a minimum PFR length of two amino acids. Visualizing the binding motif as obtained by the SMM-align and TEPITOPE methods highlights a series of fundamental discrepancies between the two predicted motifs. For the DRB1*1302 allele for instance, the TEPITOPE method favors basic amino acids at most anchor positions, whereas the SMM-align method identifies a preference for hydrophobic or neutral amino acids at the anchors. Conclusion The SMM-align method was shown to outperform other state of the art MHC class II prediction methods. The method predicts quantitative peptide:MHC binding affinity values, making it ideally suited for rational epitope discovery. The method has been trained and evaluated on the, to our knowledge, largest benchmark data set publicly available and covers the nine HLA-DR supertypes suggested as well as three mouse H2-IA allele. Both the peptide benchmark data set, and SMM-align prediction method (NetMHCII) are made publicly available. PMID:17608956

  7. Prediction of MHC class II binding affinity using SMM-align, a novel stabilization matrix alignment method.

    PubMed

    Nielsen, Morten; Lundegaard, Claus; Lund, Ole

    2007-07-04

    Antigen presenting cells (APCs) sample the extra cellular space and present peptides from here to T helper cells, which can be activated if the peptides are of foreign origin. The peptides are presented on the surface of the cells in complex with major histocompatibility class II (MHC II) molecules. Identification of peptides that bind MHC II molecules is thus a key step in rational vaccine design and developing methods for accurate prediction of the peptide:MHC interactions play a central role in epitope discovery. The MHC class II binding groove is open at both ends making the correct alignment of a peptide in the binding groove a crucial part of identifying the core of an MHC class II binding motif. Here, we present a novel stabilization matrix alignment method, SMM-align, that allows for direct prediction of peptide:MHC binding affinities. The predictive performance of the method is validated on a large MHC class II benchmark data set covering 14 HLA-DR (human MHC) and three mouse H2-IA alleles. The predictive performance of the SMM-align method was demonstrated to be superior to that of the Gibbs sampler, TEPITOPE, SVRMHC, and MHCpred methods. Cross validation between peptide data set obtained from different sources demonstrated that direct incorporation of peptide length potentially results in over-fitting of the binding prediction method. Focusing on amino terminal peptide flanking residues (PFR), we demonstrate a consistent gain in predictive performance by favoring binding registers with a minimum PFR length of two amino acids. Visualizing the binding motif as obtained by the SMM-align and TEPITOPE methods highlights a series of fundamental discrepancies between the two predicted motifs. For the DRB1*1302 allele for instance, the TEPITOPE method favors basic amino acids at most anchor positions, whereas the SMM-align method identifies a preference for hydrophobic or neutral amino acids at the anchors. The SMM-align method was shown to outperform other state of the art MHC class II prediction methods. The method predicts quantitative peptide:MHC binding affinity values, making it ideally suited for rational epitope discovery. The method has been trained and evaluated on the, to our knowledge, largest benchmark data set publicly available and covers the nine HLA-DR supertypes suggested as well as three mouse H2-IA allele. Both the peptide benchmark data set, and SMM-align prediction method (NetMHCII) are made publicly available.

  8. Development of robust and multi-mode control of tearing in DIII-D

    DOE PAGES

    Welander, A. S.; La Haye, R.J.; Humphreys, D. A.; ...

    2016-06-02

    Neoclassical tearing modes (NTMs) are instabilities that can produce undesirable magnetic islands in tokamak plasmas. They can be stabilized by applying electron cyclotron current drive (ECCD) at the island. The NTM control system on DIII-D can now control multiple modes. Each of 6 mirrors that reflect ECCD beams into the plasma can be assigned to different surfaces in the plasma where NTMs are unstable. The control system then steers the mirrors to keep the beams aimed at the surfaces. The system routinely stabilizes one NTM preemptively and has now also been used to control two modes in the same discharge.more » With the “catch-and-subdue” function, ECCD-generating gyrotrons can be turned on when NTMs appear and off after suppression. Newly triggered NTMs can be promptly suppressed if mode onset is detected early and ECCD immediately applied. Early mode detection is achieved in this paper by spectral analysis of Mirnov probes with a band-pass filter for the expected mode frequency. Targeted surfaces are tracked by equilibrium reconstructions (that include measurements of the motional Stark effect). The ECCD position is tracked by ray-tracing using the TORBEAM code. Several techniques are being explored for fine-tuning alignment when NTMs occur. One method adjusts ECCD alignment in steps until the island decays fast enough. A second method sweeps the alignment to find the optimum. A third method pulses gyrotrons and uses electron cyclotron emission to compare where the resulting temperature pulses are relative to temperature fluctuations from a rotating NTM. NTM control in ITER is expected to use active profile regulation to maximize controllability, followed by repeated catch-and-subdue actions if modes are retriggered, in order to maintain island size below the disruptive threshold while maximizing confinement and fusion gain. Between events, real-time tracking will be performed to maintain alignment and readiness for subsequent catch-andsubdue actions. Methods for active probing of stability boundaries will be studied as possible diagnostics for the profile regulation. Finally, selected elements of this ITER NTM control vision will be discussed and assessed.« less

  9. A Multispectral Image Creating Method for a New Airborne Four-Camera System with Different Bandpass Filters

    PubMed Central

    Li, Hanlun; Zhang, Aiwu; Hu, Shaoxing

    2015-01-01

    This paper describes an airborne high resolution four-camera multispectral system which mainly consists of four identical monochrome cameras equipped with four interchangeable bandpass filters. For this multispectral system, an automatic multispectral data composing method was proposed. The homography registration model was chosen, and the scale-invariant feature transform (SIFT) and random sample consensus (RANSAC) were used to generate matching points. For the difficult registration problem between visible band images and near-infrared band images in cases lacking manmade objects, we presented an effective method based on the structural characteristics of the system. Experiments show that our method can acquire high quality multispectral images and the band-to-band alignment error of the composed multiple spectral images is less than 2.5 pixels. PMID:26205264

  10. Detecting false positive sequence homology: a machine learning approach.

    PubMed

    Fujimoto, M Stanley; Suvorov, Anton; Jensen, Nicholas O; Clement, Mark J; Bybee, Seth M

    2016-02-24

    Accurate detection of homologous relationships of biological sequences (DNA or amino acid) amongst organisms is an important and often difficult task that is essential to various evolutionary studies, ranging from building phylogenies to predicting functional gene annotations. There are many existing heuristic tools, most commonly based on bidirectional BLAST searches that are used to identify homologous genes and combine them into two fundamentally distinct classes: orthologs and paralogs. Due to only using heuristic filtering based on significance score cutoffs and having no cluster post-processing tools available, these methods can often produce multiple clusters constituting unrelated (non-homologous) sequences. Therefore sequencing data extracted from incomplete genome/transcriptome assemblies originated from low coverage sequencing or produced by de novo processes without a reference genome are susceptible to high false positive rates of homology detection. In this paper we develop biologically informative features that can be extracted from multiple sequence alignments of putative homologous genes (orthologs and paralogs) and further utilized in context of guided experimentation to verify false positive outcomes. We demonstrate that our machine learning method trained on both known homology clusters obtained from OrthoDB and randomly generated sequence alignments (non-homologs), successfully determines apparent false positives inferred by heuristic algorithms especially among proteomes recovered from low-coverage RNA-seq data. Almost ~42 % and ~25 % of predicted putative homologies by InParanoid and HaMStR respectively were classified as false positives on experimental data set. Our process increases the quality of output from other clustering algorithms by providing a novel post-processing method that is both fast and efficient at removing low quality clusters of putative homologous genes recovered by heuristic-based approaches.

  11. Triangular Alignment (TAME). A Tensor-based Approach for Higher-order Network Alignment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mohammadi, Shahin; Gleich, David F.; Kolda, Tamara G.

    2015-11-01

    Network alignment is an important tool with extensive applications in comparative interactomics. Traditional approaches aim to simultaneously maximize the number of conserved edges and the underlying similarity of aligned entities. We propose a novel formulation of the network alignment problem that extends topological similarity to higher-order structures and provide a new objective function that maximizes the number of aligned substructures. This objective function corresponds to an integer programming problem, which is NP-hard. Consequently, we approximate this objective function as a surrogate function whose maximization results in a tensor eigenvalue problem. Based on this formulation, we present an algorithm called Triangularmore » AlignMEnt (TAME), which attempts to maximize the number of aligned triangles across networks. We focus on alignment of triangles because of their enrichment in complex networks; however, our formulation and resulting algorithms can be applied to general motifs. Using a case study on the NAPABench dataset, we show that TAME is capable of producing alignments with up to 99% accuracy in terms of aligned nodes. We further evaluate our method by aligning yeast and human interactomes. Our results indicate that TAME outperforms the state-of-art alignment methods both in terms of biological and topological quality of the alignments.« less

  12. Biopython: freely available Python tools for computational molecular biology and bioinformatics.

    PubMed

    Cock, Peter J A; Antao, Tiago; Chang, Jeffrey T; Chapman, Brad A; Cox, Cymon J; Dalke, Andrew; Friedberg, Iddo; Hamelryck, Thomas; Kauff, Frank; Wilczynski, Bartek; de Hoon, Michiel J L

    2009-06-01

    The Biopython project is a mature open source international collaboration of volunteer developers, providing Python libraries for a wide range of bioinformatics problems. Biopython includes modules for reading and writing different sequence file formats and multiple sequence alignments, dealing with 3D macro molecular structures, interacting with common tools such as BLAST, ClustalW and EMBOSS, accessing key online databases, as well as providing numerical methods for statistical learning. Biopython is freely available, with documentation and source code at (www.biopython.org) under the Biopython license.

  13. KinView: A visual comparative sequence analysis tool for integrated kinome research

    PubMed Central

    McSkimming, Daniel Ian; Dastgheib, Shima; Baffi, Timothy R.; Byrne, Dominic P.; Ferries, Samantha; Scott, Steven Thomas; Newton, Alexandra C.; Eyers, Claire E.; Kochut, Krzysztof J.; Eyers, Patrick A.

    2017-01-01

    Multiple sequence alignments (MSAs) are a fundamental analysis tool used throughout biology to investigate relationships between protein sequence, structure, function, evolutionary history, and patterns of disease-associated variants. However, their widespread application in systems biology research is currently hindered by the lack of user-friendly tools to simultaneously visualize, manipulate and query the information conceptualized in large sequence alignments, and the challenges in integrating MSAs with multiple orthogonal data such as cancer variants and post-translational modifications, which are often stored in heterogeneous data sources and formats. Here, we present the Multiple Sequence Alignment Ontology (MSAOnt), which represents a profile or consensus alignment in an ontological format. Subsets of the alignment are easily selected through the SPARQL Protocol and RDF Query Language for downstream statistical analysis or visualization. We have also created the Kinome Viewer (KinView), an interactive integrative visualization that places eukaryotic protein kinase cancer variants in the context of natural sequence variation and experimentally determined post-translational modifications, which play central roles in the regulation of cellular signaling pathways. Using KinView, we identified differential phosphorylation patterns between tyrosine and serine/threonine kinases in the activation segment, a major kinase regulatory region that is often mutated in proliferative diseases. We discuss cancer variants that disrupt phosphorylation sites in the activation segment, and show how KinView can be used as a comparative tool to identify differences and similarities in natural variation, cancer variants and post-translational modifications between kinase groups, families and subfamilies. Based on KinView comparisons, we identify and experimentally characterize a regulatory tyrosine (Y177PLK4) in the PLK4 C-terminal activation segment region termed the P+1 loop. To further demonstrate the application of KinView in hypothesis generation and testing, we formulate and validate a hypothesis explaining a novel predicted loss-of-function variant (D523NPKCβ) in the regulatory spine of PKCβ, a recently identified tumor suppressor kinase. KinView provides a novel, extensible interface for performing comparative analyses between subsets of kinases and for integrating multiple types of residue specific annotations in user friendly formats. PMID:27731453

  14. From Principal Component to Direct Coupling Analysis of Coevolution in Proteins: Low-Eigenvalue Modes are Needed for Structure Prediction

    PubMed Central

    Cocco, Simona; Monasson, Remi; Weigt, Martin

    2013-01-01

    Various approaches have explored the covariation of residues in multiple-sequence alignments of homologous proteins to extract functional and structural information. Among those are principal component analysis (PCA), which identifies the most correlated groups of residues, and direct coupling analysis (DCA), a global inference method based on the maximum entropy principle, which aims at predicting residue-residue contacts. In this paper, inspired by the statistical physics of disordered systems, we introduce the Hopfield-Potts model to naturally interpolate between these two approaches. The Hopfield-Potts model allows us to identify relevant ‘patterns’ of residues from the knowledge of the eigenmodes and eigenvalues of the residue-residue correlation matrix. We show how the computation of such statistical patterns makes it possible to accurately predict residue-residue contacts with a much smaller number of parameters than DCA. This dimensional reduction allows us to avoid overfitting and to extract contact information from multiple-sequence alignments of reduced size. In addition, we show that low-eigenvalue correlation modes, discarded by PCA, are important to recover structural information: the corresponding patterns are highly localized, that is, they are concentrated in few sites, which we find to be in close contact in the three-dimensional protein fold. PMID:23990764

  15. Derivation of Multiple Covarying Material and Process Parameters Using Physics-Based Modeling of X-ray Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Khaira, Gurdaman; Doxastakis, Manolis; Bowen, Alec

    There is considerable interest in developing multimodal characterization frameworks capable of probing critical properties of complex materials by relying on distinct, complementary methods or tools. Any such framework should maximize the amount of information that is extracted from any given experiment and should be sufficiently powerful and efficient to enable on-the-fly analysis of multiple measurements in a self-consistent manner. Such a framework is demonstrated in this work in the context of self-assembling polymeric materials, where theory and simulations provide the language to seamlessly mesh experimental data from two different scattering measurements. Specifically, the samples considered here consist of diblock copolymersmore » (BCP) that are self-assembled on chemically nanopatterned surfaces. The copolymers microphase separate into ordered lamellae with characteristic dimensions on the scale of tens of nanometers that are perfectly aligned by the substrate over macroscopic areas. These aligned lamellar samples provide ideal standards with which to develop the formalism introduced in this work and, more generally, the concept of high-information-content, multimodal experimentation. The outcomes of the proposed analysis are then compared to images generated by 3D scanning electron microscopy tomography, serving to validate the merit of the framework and ideas proposed here.« less

  16. FEAST: sensitive local alignment with multiple rates of evolution.

    PubMed

    Hudek, Alexander K; Brown, Daniel G

    2011-01-01

    We present a pairwise local aligner, FEAST, which uses two new techniques: a sensitive extension algorithm for identifying homologous subsequences, and a descriptive probabilistic alignment model. We also present a new procedure for training alignment parameters and apply it to the human and mouse genomes, producing a better parameter set for these sequences. Our extension algorithm identifies homologous subsequences by considering all evolutionary histories. It has higher maximum sensitivity than Viterbi extensions, and better balances specificity. We model alignments with several submodels, each with unique statistical properties, describing strongly similar and weakly similar regions of homologous DNA. Training parameters using two submodels produces superior alignments, even when we align with only the parameters from the weaker submodel. Our extension algorithm combined with our new parameter set achieves sensitivity 0.59 on synthetic tests. In contrast, LASTZ with default settings achieves sensitivity 0.35 with the same false positive rate. Using the weak submodel as parameters for LASTZ increases its sensitivity to 0.59 with high error. FEAST is available at http://monod.uwaterloo.ca/feast/.

  17. Alignment of Tractograms As Graph Matching.

    PubMed

    Olivetti, Emanuele; Sharmin, Nusrat; Avesani, Paolo

    2016-01-01

    The white matter pathways of the brain can be reconstructed as 3D polylines, called streamlines, through the analysis of diffusion magnetic resonance imaging (dMRI) data. The whole set of streamlines is called tractogram and represents the structural connectome of the brain. In multiple applications, like group-analysis, segmentation, or atlasing, tractograms of different subjects need to be aligned. Typically, this is done with registration methods, that transform the tractograms in order to increase their similarity. In contrast with transformation-based registration methods, in this work we propose the concept of tractogram correspondence, whose aim is to find which streamline of one tractogram corresponds to which streamline in another tractogram, i.e., a map from one tractogram to another. As a further contribution, we propose to use the relational information of each streamline, i.e., its distances from the other streamlines in its own tractogram, as the building block to define the optimal correspondence. We provide an operational procedure to find the optimal correspondence through a combinatorial optimization problem and we discuss its similarity to the graph matching problem. In this work, we propose to represent tractograms as graphs and we adopt a recent inexact sub-graph matching algorithm to approximate the solution of the tractogram correspondence problem. On tractograms generated from the Human Connectome Project dataset, we report experimental evidence that tractogram correspondence, implemented as graph matching, provides much better alignment than affine registration and comparable if not better results than non-linear registration of volumes.

  18. An Accurate Scalable Template-based Alignment Algorithm

    PubMed Central

    Gardner, David P.; Xu, Weijia; Miranker, Daniel P.; Ozer, Stuart; Cannone, Jamie J.; Gutell, Robin R.

    2013-01-01

    The rapid determination of nucleic acid sequences is increasing the number of sequences that are available. Inherent in a template or seed alignment is the culmination of structural and functional constraints that are selecting those mutations that are viable during the evolution of the RNA. While we might not understand these structural and functional, template-based alignment programs utilize the patterns of sequence conservation to encapsulate the characteristics of viable RNA sequences that are aligned properly. We have developed a program that utilizes the different dimensions of information in rCAD, a large RNA informatics resource, to establish a profile for each position in an alignment. The most significant include sequence identity and column composition in different phylogenetic taxa. We have compared our methods with a maximum of eight alternative alignment methods on different sets of 16S and 23S rRNA sequences with sequence percent identities ranging from 50% to 100%. The results showed that CRWAlign outperformed the other alignment methods in both speed and accuracy. A web-based alignment server is available at http://www.rna.ccbb.utexas.edu/SAE/2F/CRWAlign. PMID:24772376

  19. SVM-dependent pairwise HMM: an application to protein pairwise alignments.

    PubMed

    Orlando, Gabriele; Raimondi, Daniele; Khan, Taushif; Lenaerts, Tom; Vranken, Wim F

    2017-12-15

    Methods able to provide reliable protein alignments are crucial for many bioinformatics applications. In the last years many different algorithms have been developed and various kinds of information, from sequence conservation to secondary structure, have been used to improve the alignment performances. This is especially relevant for proteins with highly divergent sequences. However, recent works suggest that different features may have different importance in diverse protein classes and it would be an advantage to have more customizable approaches, capable to deal with different alignment definitions. Here we present Rigapollo, a highly flexible pairwise alignment method based on a pairwise HMM-SVM that can use any type of information to build alignments. Rigapollo lets the user decide the optimal features to align their protein class of interest. It outperforms current state of the art methods on two well-known benchmark datasets when aligning highly divergent sequences. A Python implementation of the algorithm is available at http://ibsquare.be/rigapollo. wim.vranken@vub.be. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  20. Improving the quality of biomarker candidates in untargeted metabolomics via peak table-based alignment of comprehensive two-dimensional gas chromatography-mass spectrometry data

    PubMed Central

    Bean, Heather D.; Hill, Jane E.; Dimandja, Jean-Marie D.

    2015-01-01

    The potential of high-resolution analytical technologies like GC×GC/TOF MS in untargeted metabolomics and biomarker discovery has been limited by the development of fully automated software that can efficiently align and extract information from multiple chromatographic data sets. In this work we report the first investigation on a peak-by-peak basis of the chromatographic factors that impact GC×GC data alignment. A representative set of 16 compounds of different chromatographic characteristics were followed through the alignment of 63 GC×GC chromatograms. We found that varying the mass spectral match parameter had a significant influence on the alignment for poorly- resolved peaks, especially those at the extremes of the detector linear range, and no influence on well- chromatographed peaks. Therefore, optimized chromatography is required for proper GC×GC data alignment. Based on these observations, a workflow is presented for the conservative selection of biomarker candidates from untargeted metabolomics analyses. PMID:25857541

  1. Hal: an automated pipeline for phylogenetic analyses of genomic data.

    PubMed

    Robbertse, Barbara; Yoder, Ryan J; Boyd, Alex; Reeves, John; Spatafora, Joseph W

    2011-02-07

    The rapid increase in genomic and genome-scale data is resulting in unprecedented levels of discrete sequence data available for phylogenetic analyses. Major analytical impasses exist, however, prior to analyzing these data with existing phylogenetic software. Obstacles include the management of large data sets without standardized naming conventions, identification and filtering of orthologous clusters of proteins or genes, and the assembly of alignments of orthologous sequence data into individual and concatenated super alignments. Here we report the production of an automated pipeline, Hal that produces multiple alignments and trees from genomic data. These alignments can be produced by a choice of four alignment programs and analyzed by a variety of phylogenetic programs. In short, the Hal pipeline connects the programs BLASTP, MCL, user specified alignment programs, GBlocks, ProtTest and user specified phylogenetic programs to produce species trees. The script is available at sourceforge (http://sourceforge.net/projects/bio-hal/). The results from an example analysis of Kingdom Fungi are briefly discussed.

  2. Flexible, fast and accurate sequence alignment profiling on GPGPU with PaSWAS.

    PubMed

    Warris, Sven; Yalcin, Feyruz; Jackson, Katherine J L; Nap, Jan Peter

    2015-01-01

    To obtain large-scale sequence alignments in a fast and flexible way is an important step in the analyses of next generation sequencing data. Applications based on the Smith-Waterman (SW) algorithm are often either not fast enough, limited to dedicated tasks or not sufficiently accurate due to statistical issues. Current SW implementations that run on graphics hardware do not report the alignment details necessary for further analysis. With the Parallel SW Alignment Software (PaSWAS) it is possible (a) to have easy access to the computational power of NVIDIA-based general purpose graphics processing units (GPGPUs) to perform high-speed sequence alignments, and (b) retrieve relevant information such as score, number of gaps and mismatches. The software reports multiple hits per alignment. The added value of the new SW implementation is demonstrated with two test cases: (1) tag recovery in next generation sequence data and (2) isotype assignment within an immunoglobulin 454 sequence data set. Both cases show the usability and versatility of the new parallel Smith-Waterman implementation.

  3. GCALIGNER 1.0: an alignment program to compute a multiple sample comparison data matrix from large eco-chemical datasets obtained by GC.

    PubMed

    Dellicour, Simon; Lecocq, Thomas

    2013-10-01

    GCALIGNER 1.0 is a computer program designed to perform a preliminary data comparison matrix of chemical data obtained by GC without MS information. The alignment algorithm is based on the comparison between the retention times of each detected compound in a sample. In this paper, we test the GCALIGNER efficiency on three datasets of the chemical secretions of bumble bees. The algorithm performs the alignment with a low error rate (<3%). GCALIGNER 1.0 is a useful, simple and free program based on an algorithm that enables the alignment of table-type data from GC. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  4. Viewing multiple sequence alignments with the JavaScript Sequence Alignment Viewer (JSAV)

    PubMed Central

    Martin, Andrew C. R.

    2014-01-01

    The JavaScript Sequence Alignment Viewer (JSAV) is designed as a simple-to-use JavaScript component for displaying sequence alignments on web pages. The display of sequences is highly configurable with options to allow alternative coloring schemes, sorting of sequences and ’dotifying’ repeated amino acids. An option is also available to submit selected sequences to another web site, or to other JavaScript code. JSAV is implemented purely in JavaScript making use of the JQuery and JQuery-UI libraries. It does not use any HTML5-specific options to help with browser compatibility. The code is documented using JSDOC and is available from http://www.bioinf.org.uk/software/jsav/. PMID:25653836

  5. Viewing multiple sequence alignments with the JavaScript Sequence Alignment Viewer (JSAV).

    PubMed

    Martin, Andrew C R

    2014-01-01

    The JavaScript Sequence Alignment Viewer (JSAV) is designed as a simple-to-use JavaScript component for displaying sequence alignments on web pages. The display of sequences is highly configurable with options to allow alternative coloring schemes, sorting of sequences and 'dotifying' repeated amino acids. An option is also available to submit selected sequences to another web site, or to other JavaScript code. JSAV is implemented purely in JavaScript making use of the JQuery and JQuery-UI libraries. It does not use any HTML5-specific options to help with browser compatibility. The code is documented using JSDOC and is available from http://www.bioinf.org.uk/software/jsav/.

  6. GIRAF: a method for fast search and flexible alignment of ligand binding interfaces in proteins at atomic resolution

    PubMed Central

    Kinjo, Akira R.; Nakamura, Haruki

    2012-01-01

    Comparison and classification of protein structures are fundamental means to understand protein functions. Due to the computational difficulty and the ever-increasing amount of structural data, however, it is in general not feasible to perform exhaustive all-against-all structure comparisons necessary for comprehensive classifications. To efficiently handle such situations, we have previously proposed a method, now called GIRAF. We herein describe further improvements in the GIRAF protein structure search and alignment method. The GIRAF method achieves extremely efficient search of similar structures of ligand binding sites of proteins by exploiting database indexing of structural features of local coordinate frames. In addition, it produces refined atom-wise alignments by iterative applications of the Hungarian method to the bipartite graph defined for a pair of superimposed structures. By combining the refined alignments based on different local coordinate frames, it is made possible to align structures involving domain movements. We provide detailed accounts for the database design, the search and alignment algorithms as well as some benchmark results. PMID:27493524

  7. Genome alignment with graph data structures: a comparison

    PubMed Central

    2014-01-01

    Background Recent advances in rapid, low-cost sequencing have opened up the opportunity to study complete genome sequences. The computational approach of multiple genome alignment allows investigation of evolutionarily related genomes in an integrated fashion, providing a basis for downstream analyses such as rearrangement studies and phylogenetic inference. Graphs have proven to be a powerful tool for coping with the complexity of genome-scale sequence alignments. The potential of graphs to intuitively represent all aspects of genome alignments led to the development of graph-based approaches for genome alignment. These approaches construct a graph from a set of local alignments, and derive a genome alignment through identification and removal of graph substructures that indicate errors in the alignment. Results We compare the structures of commonly used graphs in terms of their abilities to represent alignment information. We describe how the graphs can be transformed into each other, and identify and classify graph substructures common to one or more graphs. Based on previous approaches, we compile a list of modifications that remove these substructures. Conclusion We show that crucial pieces of alignment information, associated with inversions and duplications, are not visible in the structure of all graphs. If we neglect vertex or edge labels, the graphs differ in their information content. Still, many ideas are shared among all graph-based approaches. Based on these findings, we outline a conceptual framework for graph-based genome alignment that can assist in the development of future genome alignment tools. PMID:24712884

  8. Versatile alignment layer method for new types of liquid crystal photonic devices

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Finnemeyer, V.; Bryant, D.; Lu, L.

    2015-07-21

    Liquid crystal photonic devices are becoming increasingly popular. These devices often present a challenge when it comes to creating a robust alignment layer in pre-assembled cells. In this paper, we describe a method of infusing a dye into a microcavity to produce an effective photo-definable alignment layer. However, previous research on such alignment layers has shown that they have limited stability, particularly against subsequent light exposure. As such, we further describe a method of utilizing a pre-polymer, infused into the microcavity along with the liquid crystal, to provide photostability. We demonstrate that the polymer layer, formed under ultraviolet irradiation ofmore » liquid crystal cells, has been effectively localized to a thin region near the substrate surface and provides a significant improvement in the photostability of the liquid crystal alignment. This versatile alignment layer method, capable of being utilized in devices from the described microcavities to displays, offers significant promise for new photonics applications.« less

  9. Ligand Binding Site Detection by Local Structure Alignment and Its Performance Complementarity

    PubMed Central

    Lee, Hui Sun; Im, Wonpil

    2013-01-01

    Accurate determination of potential ligand binding sites (BS) is a key step for protein function characterization and structure-based drug design. Despite promising results of template-based BS prediction methods using global structure alignment (GSA), there is a room to improve the performance by properly incorporating local structure alignment (LSA) because BS are local structures and often similar for proteins with dissimilar global folds. We present a template-based ligand BS prediction method using G-LoSA, our LSA tool. A large benchmark set validation shows that G-LoSA predicts drug-like ligands’ positions in single-chain protein targets more precisely than TM-align, a GSA-based method, while the overall success rate of TM-align is better. G-LoSA is particularly efficient for accurate detection of local structures conserved across proteins with diverse global topologies. Recognizing the performance complementarity of G-LoSA to TM-align and a non-template geometry-based method, fpocket, a robust consensus scoring method, CMCS-BSP (Complementary Methods and Consensus Scoring for ligand Binding Site Prediction), is developed and shows improvement on prediction accuracy. The G-LoSA source code is freely available at http://im.bioinformatics.ku.edu/GLoSA. PMID:23957286

  10. Cryo-EM image alignment based on nonuniform fast Fourier transform.

    PubMed

    Yang, Zhengfan; Penczek, Pawel A

    2008-08-01

    In single particle analysis, two-dimensional (2-D) alignment is a fundamental step intended to put into register various particle projections of biological macromolecules collected at the electron microscope. The efficiency and quality of three-dimensional (3-D) structure reconstruction largely depends on the computational speed and alignment accuracy of this crucial step. In order to improve the performance of alignment, we introduce a new method that takes advantage of the highly accurate interpolation scheme based on the gridding method, a version of the nonuniform fast Fourier transform, and utilizes a multi-dimensional optimization algorithm for the refinement of the orientation parameters. Using simulated data, we demonstrate that by using less than half of the sample points and taking twice the runtime, our new 2-D alignment method achieves dramatically better alignment accuracy than that based on quadratic interpolation. We also apply our method to image to volume registration, the key step in the single particle EM structure refinement protocol. We find that in this case the accuracy of the method not only surpasses the accuracy of the commonly used real-space implementation, but results are achieved in much shorter time, making gridding-based alignment a perfect candidate for efficient structure determination in single particle analysis.

  11. Cryo-EM Image Alignment Based on Nonuniform Fast Fourier Transform

    PubMed Central

    Yang, Zhengfan; Penczek, Pawel A.

    2008-01-01

    In single particle analysis, two-dimensional (2-D) alignment is a fundamental step intended to put into register various particle projections of biological macromolecules collected at the electron microscope. The efficiency and quality of three-dimensional (3-D) structure reconstruction largely depends on the computational speed and alignment accuracy of this crucial step. In order to improve the performance of alignment, we introduce a new method that takes advantage of the highly accurate interpolation scheme based on the gridding method, a version of the nonuniform Fast Fourier Transform, and utilizes a multi-dimensional optimization algorithm for the refinement of the orientation parameters. Using simulated data, we demonstrate that by using less than half of the sample points and taking twice the runtime, our new 2-D alignment method achieves dramatically better alignment accuracy than that based on quadratic interpolation. We also apply our method to image to volume registration, the key step in the single particle EM structure refinement protocol. We find that in this case the accuracy of the method not only surpasses the accuracy of the commonly used real-space implementation, but results are achieved in much shorter time, making gridding-based alignment a perfect candidate for efficient structure determination in single particle analysis. PMID:18499351

  12. Silicon Alignment Pins: An Easy Way to Realize a Wafer-To-Wafer Alignment

    NASA Technical Reports Server (NTRS)

    Peralta, Alejandro (Inventor); Gill, John J. (Inventor); Toda, Risaku (Inventor); Lin, Robert H. (Inventor); Jung-Kubiak, Cecile (Inventor); Reck, Theodore (Inventor); Thomas, Bertrand (Inventor); Siles, Jose V. (Inventor); Lee, Choonsup (Inventor); Chattopadhyay, Goutam (Inventor)

    2016-01-01

    A silicon alignment pin is used to align successive layers of components made in semiconductor chips and/or metallic components to make easier the assembly of devices having a layered structure. The pin is made as a compressible structure which can be squeezed to reduce its outer diameter, have one end fit into a corresponding alignment pocket or cavity defined in a layer of material to be assembled into a layered structure, and then allowed to expand to produce an interference fit with the cavity. The other end can then be inserted into a corresponding cavity defined in a surface of a second layer of material that mates with the first layer. The two layers are in registry when the pin is mated to both. Multiple layers can be assembled to create a multilayer structure. Examples of such devices are presented.

  13. Patterned growth of individual and multiple vertically aligned carbon nanofibers

    NASA Astrophysics Data System (ADS)

    Merkulov, V. I.; Lowndes, D. H.; Wei, Y. Y.; Eres, G.; Voelkl, E.

    2000-06-01

    The results of studies of patterned growth of vertically aligned carbon nanofibers (VACNFs) prepared by plasma-enhanced chemical vapor deposition are reported. Nickel (Ni) dots of various diameters and Ni lines with variable widths and shapes were fabricated using electron beam lithography and evaporation, and served for catalytic growth of VACNFs whose structure was determined by high resolution transmission electron microscopy. It is found that upon plasma pre-etching and heating up to 600-700 °C, thin films of Ni break into droplets which initiate the growth of VACNFs. Above a critical dot size multiple droplets are formed, and consequently multiple VACNFs grow from a single evaporated dot. For dot sizes smaller than the critical size only one droplet is formed, resulting in a single VACNF. In the case of a patterned line, the growth mechanism is similar to that from a dot. VACNFs grow along the line, and above a critical linewidth multiple VACNFs are produced across the line. The mechanism of the formation of single and multiple catalyst droplets and subsequently of VACNFs is discussed.

  14. Creating a Culture of Continuous Assessment to Improve Student Learning through Curriculum Review

    ERIC Educational Resources Information Center

    Kalu, Frances; Dyjur, Patti

    2018-01-01

    This chapter describes a curriculum review framework that fosters continuous assessment through collaboration with multiple stakeholders, alignment with program level learning outcomes, evaluation based on multiple sources of evidence, and facilitated development of action plans to improve student learning.

  15. Analyzing the Curriculum Alignment of Teachers

    ERIC Educational Resources Information Center

    Turan-Özpolat, Esen; Bay, Erdal

    2017-01-01

    The purpose of this research was to analyze the curriculum alignment of teachers in secondary education 5th grade Science course. Alignment levels of teachers in dimensions of acquisition, content, teaching methods and techniques, activity, material and measurement - assessment, and the reasons for their alignment/non-alignment to the curriculum…

  16. Sustainable Survival for adolescents living with HIV: do SDG-aligned provisions reduce potential mortality risk?

    PubMed

    Cluver, Lucie; Pantelic, Marija; Orkin, Mark; Toska, Elona; Medley, Sally; Sherr, Lorraine

    2018-02-01

    The Sustainable Development Goals (SDGs) present a groundbreaking global development agenda to protect the most vulnerable. Adolescents living with HIV in Sub-Saharan Africa continue to experience extreme health vulnerabilities, but we know little about the impacts of SDG-aligned provisions on their health. This study tests associations of provisions aligned with five SDGs with potential mortality risks. Clinical and interview data were gathered from N = 1060 adolescents living with HIV in rural and urban South Africa in 2014 to 2015. All ART-initiated adolescents from 53 government health facilities were identified, and traced in their communities to include those defaulting and lost-to-follow-up. Potential mortality risk was assessed as either: viral suppression failure (1000+ copies/ml) using patient file records, or adolescent self-report of diagnosed but untreated tuberculosis or symptomatic pulmonary tuberculosis. SDG-aligned provisions were measured through adolescent interviews. Provisions aligned with SDGs 1&2 (no poverty and zero hunger) were operationalized as access to basic necessities, social protection and food security; An SDG 3-aligned provision (ensure healthy lives) was having a healthy primary caregiver; An SDG 8-aligned provision (employment for all) was employment of a household member; An SDG 16-aligned provision (protection from violence) was protection from physical, sexual or emotional abuse. Research partners included the South African national government, UNICEF and Pediatric and Adolescent Treatment for Africa. 20.8% of adolescents living with HIV had potential mortality risk - i.e. viral suppression failure, symptomatic untreated TB, or both. All SDG-aligned provisions were significantly associated with reduced potential mortality risk: SDG 1&2 (OR 0.599 CI 0.361 to 0.994); SDG 3 (OR 0.577 CI 0.411 to 0.808); SDG 8 (OR 0.602 CI 0.440 to 0.823) and SDG 16 (OR 0.686 CI 0.505 to 0.933). Access to multiple SDG-aligned provisions showed a strongly graded reduction in potential mortality risk: Among adolescents living with HIV, potential mortality risk was 38.5% with access to no SDG-aligned provisions, and 9.3% with access to all four. SDG-aligned provisions across a range of SDGs were associated with reduced potential mortality risk among adolescents living with HIV. Access to multiple provisions has the potential to substantially improve survival, suggesting the value of connecting and combining SDGs in our response to paediatric and adolescent HIV. © 2018 The Authors. Journal of the International AIDS Society published by John Wiley & sons Ltd on behalf of the International AIDS Society.

  17. Twisted nematic liquid crystal cells with rubbed poly(3,4-ethylenedioxythiophene)/poly(styrenesulfonate) films for active polarization control of terahertz waves

    NASA Astrophysics Data System (ADS)

    Sasaki, Tomoyuki; Okuyama, Hiroki; Sakamoto, Moritsugu; Noda, Kohei; Okamoto, Hiroyuki; Kawatsuki, Nobuhiro; Ono, Hiroshi

    2017-04-01

    We fabricated a terahertz (THz) polarization converter using a twisted nematic (TN) liquid crystal (LC) cell. Poly(3,4-ethylenedioxythiophene)/poly(styrenesulfonate) (PEDOT/PSS) films coated on quartz glass substrates were used as electrode layers in the TN LC cell. The PEDOT/PSS films were rubbed unidirectionally using a rayon cloth to align the nematic LC, thereby also serving as an alignment layer. The azimuthal surface anchoring strength of the PEDOT/PSS films was measured to be 5 × 10-4 J/m2 using the Néel wall method, which is similar to that of typical polymeric alignment layers. The optical constants of the PEDOT/PSS film in the THz range were also characterized using the Drude-Smith model, and the results indicated that the PEDOT/PSS films could be used both as transparent electrodes in the THz range and as alignment layers for the LC. The electro-optical properties of the fabricated TN LC cell were also investigated using a polarized visible laser and THz time-domain spectroscopic system. In particular, the transmission spectra and polarization conversion property of the TN LC cell in the THz range were theoretically analyzed based on a stratified model that considers optical anisotropy, absorption, and multiple interference. This work substantiates the advantages of TN LC cells with rubbed PEDOT/PSS films useful for THz polarization converters with electrical tunability.

  18. Overcoming Sequence Misalignments with Weighted Structural Superposition

    PubMed Central

    Khazanov, Nickolay A.; Damm-Ganamet, Kelly L.; Quang, Daniel X.; Carlson, Heather A.

    2012-01-01

    An appropriate structural superposition identifies similarities and differences between homologous proteins that are not evident from sequence alignments alone. We have coupled our Gaussian-weighted RMSD (wRMSD) tool with a sequence aligner and seed extension (SE) algorithm to create a robust technique for overlaying structures and aligning sequences of homologous proteins (HwRMSD). HwRMSD overcomes errors in the initial sequence alignment that would normally propagate into a standard RMSD overlay. SE can generate a corrected sequence alignment from the improved structural superposition obtained by wRMSD. HwRMSD’s robust performance and its superiority over standard RMSD are demonstrated over a range of homologous proteins. Its better overlay results in corrected sequence alignments with good agreement to HOMSTRAD. Finally, HwRMSD is compared to established structural alignment methods: FATCAT, SSM, CE, and Dalilite. Most methods are comparable at placing residue pairs within 2 Å, but HwRMSD places many more residue pairs within 1 Å, providing a clear advantage. Such high accuracy is essential in drug design, where small distances can have a large impact on computational predictions. This level of accuracy is also needed to correct sequence alignments in an automated fashion, especially for omics-scale analysis. HwRMSD can align homologs with low sequence identity and large conformational differences, cases where both sequence-based and structural-based methods may fail. The HwRMSD pipeline overcomes the dependency of structural overlays on initial sequence pairing and removes the need to determine the best sequence-alignment method, substitution matrix, and gap parameters for each unique pair of homologs. PMID:22733542

  19. Multilevel, multicomponent microarchitectures of vertically-aligned carbon nanotubes for diverse applications.

    PubMed

    Qu, Liangti; Vaia, Rich A; Dai, Liming

    2011-02-22

    A simple multiple contact transfer technique has been developed for controllable fabrication of multilevel, multicomponent microarchitectures of vertically aligned carbon nanotubes (VA-CNTs). Three dimensional (3-D) multicomponent micropatterns of aligned single-walled carbon nanotubes (SWNTs) and multiwalled carbon nanotubes (MWNTs) have been fabricated, which can be used to develop a newly designed touch sensor with reversible electrical responses for potential applications in electronic devices, as demonstrated in this study. The demonstrated dependence of light diffraction on structural transfiguration of the resultant CNT micropattern also indicates their potential for optical devices. Further introduction of various components with specific properties (e.g., ZnO nanorods) into the CNT micropatterns enabled us to tailor such surface characteristics as wettability and light response. Owing to the highly generic nature of the multiple contact transfer strategy, the methodology developed here could provide a general approach for interposing a large variety of multicomponent elements (e.g., nanotubes, nanorods/wires, photonic crystals, etc.) onto a single chip for multifunctional device applications.

  20. Design and implementation of a hybrid MPI-CUDA model for the Smith-Waterman algorithm.

    PubMed

    Khaled, Heba; Faheem, Hossam El Deen Mostafa; El Gohary, Rania

    2015-01-01

    This paper provides a novel hybrid model for solving the multiple pair-wise sequence alignment problem combining message passing interface and CUDA, the parallel computing platform and programming model invented by NVIDIA. The proposed model targets homogeneous cluster nodes equipped with similar Graphical Processing Unit (GPU) cards. The model consists of the Master Node Dispatcher (MND) and the Worker GPU Nodes (WGN). The MND distributes the workload among the cluster working nodes and then aggregates the results. The WGN performs the multiple pair-wise sequence alignments using the Smith-Waterman algorithm. We also propose a modified implementation to the Smith-Waterman algorithm based on computing the alignment matrices row-wise. The experimental results demonstrate a considerable reduction in the running time by increasing the number of the working GPU nodes. The proposed model achieved a performance of about 12 Giga cell updates per second when we tested against the SWISS-PROT protein knowledge base running on four nodes.

  1. Comparison of landmark-based and automatic methods for cortical surface registration

    PubMed Central

    Pantazis, Dimitrios; Joshi, Anand; Jiang, Jintao; Shattuck, David; Bernstein, Lynne E.; Damasio, Hanna; Leahy, Richard M.

    2009-01-01

    Group analysis of structure or function in cerebral cortex typically involves as a first step the alignment of the cortices. A surface based approach to this problem treats the cortex as a convoluted surface and coregisters across subjects so that cortical landmarks or features are aligned. This registration can be performed using curves representing sulcal fundi and gyral crowns to constrain the mapping. Alternatively, registration can be based on the alignment of curvature metrics computed over the entire cortical surface. The former approach typically involves some degree of user interaction in defining the sulcal and gyral landmarks while the latter methods can be completely automated. Here we introduce a cortical delineation protocol consisting of 26 consistent landmarks spanning the entire cortical surface. We then compare the performance of a landmark-based registration method that uses this protocol with that of two automatic methods implemented in the software packages FreeSurfer and BrainVoyager. We compare performance in terms of discrepancy maps between the different methods, the accuracy with which regions of interest are aligned, and the ability of the automated methods to correctly align standard cortical landmarks. Our results show similar performance for ROIs in the perisylvian region for the landmark based method and FreeSurfer. However, the discrepancy maps showed larger variability between methods in occipital and frontal cortex and also that automated methods often produce misalignment of standard cortical landmarks. Consequently, selection of the registration approach should consider the importance of accurate sulcal alignment for the specific task for which coregistration is being performed. When automatic methods are used, the users should ensure that sulci in regions of interest in their studies are adequately aligned before proceeding with subsequent analysis. PMID:19796696

  2. A new graph-based method for pairwise global network alignment

    PubMed Central

    Klau, Gunnar W

    2009-01-01

    Background In addition to component-based comparative approaches, network alignments provide the means to study conserved network topology such as common pathways and more complex network motifs. Yet, unlike in classical sequence alignment, the comparison of networks becomes computationally more challenging, as most meaningful assumptions instantly lead to NP-hard problems. Most previous algorithmic work on network alignments is heuristic in nature. Results We introduce the graph-based maximum structural matching formulation for pairwise global network alignment. We relate the formulation to previous work and prove NP-hardness of the problem. Based on the new formulation we build upon recent results in computational structural biology and present a novel Lagrangian relaxation approach that, in combination with a branch-and-bound method, computes provably optimal network alignments. The Lagrangian algorithm alone is a powerful heuristic method, which produces solutions that are often near-optimal and – unlike those computed by pure heuristics – come with a quality guarantee. Conclusion Computational experiments on the alignment of protein-protein interaction networks and on the classification of metabolic subnetworks demonstrate that the new method is reasonably fast and has advantages over pure heuristics. Our software tool is freely available as part of the LISA library. PMID:19208162

  3. A Self-Alignment Algorithm for SINS Based on Gravitational Apparent Motion and Sensor Data Denoising

    PubMed Central

    Liu, Yiting; Xu, Xiaosu; Liu, Xixiang; Yao, Yiqing; Wu, Liang; Sun, Jin

    2015-01-01

    Initial alignment is always a key topic and difficult to achieve in an inertial navigation system (INS). In this paper a novel self-initial alignment algorithm is proposed using gravitational apparent motion vectors at three different moments and vector-operation. Simulation and analysis showed that this method easily suffers from the random noise contained in accelerometer measurements which are used to construct apparent motion directly. Aiming to resolve this problem, an online sensor data denoising method based on a Kalman filter is proposed and a novel reconstruction method for apparent motion is designed to avoid the collinearity among vectors participating in the alignment solution. Simulation, turntable tests and vehicle tests indicate that the proposed alignment algorithm can fulfill initial alignment of strapdown INS (SINS) under both static and swinging conditions. The accuracy can either reach or approach the theoretical values determined by sensor precision under static or swinging conditions. PMID:25923932

  4. A facile method to align carbon nanotubes on polymeric membrane substrate

    PubMed Central

    Zhao, Haiyang; Zhou, Zhijun; Dong, Hang; Zhang, Lin; Chen, Huanlin; Hou, Lian

    2013-01-01

    The alignment of carbon nanotubes (CNT) is the fundamental requirement to ensure their excellent functions but seems to be desolated in recent years. A facile method, hot-press combined with peel-off (HPPO), is introduced here, through which CNT can be successfully vertically aligned on the polymeric membrane substrate. Shear force and mechanical stretch are proposed to be the main forces to align the tubes perpendicular to the substrate surface during the peel-off process. The alignment of CNT keeps its orientation in a thin hybrid membrane by dip-coating cellulose acetate dope solution. It is expected that the stable alignment of CNT by HPPO would contribute to the realization of its potential applications. PMID:24326297

  5. A Systolic Array-Based FPGA Parallel Architecture for the BLAST Algorithm

    PubMed Central

    Guo, Xinyu; Wang, Hong; Devabhaktuni, Vijay

    2012-01-01

    A design of systolic array-based Field Programmable Gate Array (FPGA) parallel architecture for Basic Local Alignment Search Tool (BLAST) Algorithm is proposed. BLAST is a heuristic biological sequence alignment algorithm which has been used by bioinformatics experts. In contrast to other designs that detect at most one hit in one-clock-cycle, our design applies a Multiple Hits Detection Module which is a pipelining systolic array to search multiple hits in a single-clock-cycle. Further, we designed a Hits Combination Block which combines overlapping hits from systolic array into one hit. These implementations completed the first and second step of BLAST architecture and achieved significant speedup comparing with previously published architectures. PMID:25969747

  6. The Virtual Space Telescope: A New Class of Science Missions

    NASA Technical Reports Server (NTRS)

    Shah, Neerav; Calhoun, Philip

    2016-01-01

    Many science investigations proposed by GSFC require two spacecraft alignment across a long distance to form a virtual space telescope. Forming a Virtual Space telescope requires advances in Guidance, Navigation, and Control (GNC) enabling the distribution of monolithic telescopes across multiple space platforms. The capability to align multiple spacecraft to an intertial target is at a low maturity state and we present a roadmap to advance the system-level capability to be flight ready in preparation of various science applications. An engineering proof of concept, called the CANYVAL-X CubeSat MIssion is presented. CANYVAL-X's advancement will decrease risk for a potential starshade mission that would fly with WFIRST.

  7. Generating Models of Surgical Procedures using UMLS Concepts and Multiple Sequence Alignment

    PubMed Central

    Meng, Frank; D’Avolio, Leonard W.; Chen, Andrew A.; Taira, Ricky K.; Kangarloo, Hooshang

    2005-01-01

    Surgical procedures can be viewed as a process composed of a sequence of steps performed on, by, or with the patient’s anatomy. This sequence is typically the pattern followed by surgeons when generating surgical report narratives for documenting surgical procedures. This paper describes a methodology for semi-automatically deriving a model of conducted surgeries, utilizing a sequence of derived Unified Medical Language System (UMLS) concepts for representing surgical procedures. A multiple sequence alignment was computed from a collection of such sequences and was used for generating the model. These models have the potential of being useful in a variety of informatics applications such as information retrieval and automatic document generation. PMID:16779094

  8. R3D-2-MSA: the RNA 3D structure-to-multiple sequence alignment server.

    PubMed

    Cannone, Jamie J; Sweeney, Blake A; Petrov, Anton I; Gutell, Robin R; Zirbel, Craig L; Leontis, Neocles

    2015-07-01

    The RNA 3D Structure-to-Multiple Sequence Alignment Server (R3D-2-MSA) is a new web service that seamlessly links RNA three-dimensional (3D) structures to high-quality RNA multiple sequence alignments (MSAs) from diverse biological sources. In this first release, R3D-2-MSA provides manual and programmatic access to curated, representative ribosomal RNA sequence alignments from bacterial, archaeal, eukaryal and organellar ribosomes, using nucleotide numbers from representative atomic-resolution 3D structures. A web-based front end is available for manual entry and an Application Program Interface for programmatic access. Users can specify up to five ranges of nucleotides and 50 nucleotide positions per range. The R3D-2-MSA server maps these ranges to the appropriate columns of the corresponding MSA and returns the contents of the columns, either for display in a web browser or in JSON format for subsequent programmatic use. The browser output page provides a 3D interactive display of the query, a full list of sequence variants with taxonomic information and a statistical summary of distinct sequence variants found. The output can be filtered and sorted in the browser. Previous user queries can be viewed at any time by resubmitting the output URL, which encodes the search and re-generates the results. The service is freely available with no login requirement at http://rna.bgsu.edu/r3d-2-msa. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  9. Unsupervised parameter optimization for automated retention time alignment of severely shifted gas chromatographic data using the piecework alignment algorithm.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pierce, Karisa M.; Wright, Bob W.; Synovec, Robert E.

    2007-02-02

    First, simulated chromatographic separations with declining retention time precision were used to study the performance of the piecewise retention time alignment algorithm and to demonstrate an unsupervised parameter optimization method. The average correlation coefficient between the first chromatogram and every other chromatogram in the data set was used to optimize the alignment parameters. This correlation method does not require a training set, so it is unsupervised and automated. This frees the user from needing to provide class information and makes the alignment algorithm more generally applicable to classifying completely unknown data sets. For a data set of simulated chromatograms wheremore » the average chromatographic peak was shifted past two neighboring peaks between runs, the average correlation coefficient of the raw data was 0.46 ± 0.25. After automated, optimized piecewise alignment, the average correlation coefficient was 0.93 ± 0.02. Additionally, a relative shift metric and principal component analysis (PCA) were used to independently quantify and categorize the alignment performance, respectively. The relative shift metric was defined as four times the standard deviation of a given peak’s retention time in all of the chromatograms, divided by the peak-width-at-base. The raw simulated data sets that were studied contained peaks with average relative shifts ranging between 0.3 and 3.0. Second, a “real” data set of gasoline separations was gathered using three different GC methods to induce severe retention time shifting. In these gasoline separations, retention time precision improved ~8 fold following alignment. Finally, piecewise alignment and the unsupervised correlation optimization method were applied to severely shifted GC separations of reformate distillation fractions. The effect of piecewise alignment on peak heights and peak areas is also reported. Piecewise alignment either did not change the peak height, or caused it to slightly decrease. The average relative difference in peak height after piecewise alignment was –0.20%. Piecewise alignment caused the peak areas to either stay the same, slightly increase, or slightly decrease. The average absolute relative difference in area after piecewise alignment was 0.15%.« less

  10. MultiPhyl: a high-throughput phylogenomics webserver using distributed computing

    PubMed Central

    Keane, Thomas M.; Naughton, Thomas J.; McInerney, James O.

    2007-01-01

    With the number of fully sequenced genomes increasing steadily, there is greater interest in performing large-scale phylogenomic analyses from large numbers of individual gene families. Maximum likelihood (ML) has been shown repeatedly to be one of the most accurate methods for phylogenetic construction. Recently, there have been a number of algorithmic improvements in maximum-likelihood-based tree search methods. However, it can still take a long time to analyse the evolutionary history of many gene families using a single computer. Distributed computing refers to a method of combining the computing power of multiple computers in order to perform some larger overall calculation. In this article, we present the first high-throughput implementation of a distributed phylogenetics platform, MultiPhyl, capable of using the idle computational resources of many heterogeneous non-dedicated machines to form a phylogenetics supercomputer. MultiPhyl allows a user to upload hundreds or thousands of amino acid or nucleotide alignments simultaneously and perform computationally intensive tasks such as model selection, tree searching and bootstrapping of each of the alignments using many desktop machines. The program implements a set of 88 amino acid models and 56 nucleotide maximum likelihood models and a variety of statistical methods for choosing between alternative models. A MultiPhyl webserver is available for public use at: http://www.cs.nuim.ie/distributed/multiphyl.php. PMID:17553837

  11. Robust, Globally Consistent, and Fully-automatic Multi-image Registration and Montage Synthesis for 3-D Multi-channel Images

    PubMed Central

    Tsai, Chia-Ling; Lister, James P.; Bjornsson, Christopher J; Smith, Karen; Shain, William; Barnes, Carol A.; Roysam, Badrinath

    2013-01-01

    The need to map regions of brain tissue that are much wider than the field of view of the microscope arises frequently. One common approach is to collect a series of overlapping partial views, and align them to synthesize a montage covering the entire region of interest. We present a method that advances this approach in multiple ways. Our method (1) produces a globally consistent joint registration of an unorganized collection of 3-D multi-channel images with or without stage micrometer data; (2) produces accurate registrations withstanding changes in scale, rotation, translation and shear by using a 3-D affine transformation model; (3) achieves complete automation, and does not require any parameter settings; (4) handles low and variable overlaps (5 – 15%) between adjacent images, minimizing the number of images required to cover a tissue region; (5) has the self-diagnostic ability to recognize registration failures instead of delivering incorrect results; (6) can handle a broad range of biological images by exploiting generic alignment cues from multiple fluorescence channels without requiring segmentation; and (7) is computationally efficient enough to run on desktop computers regardless of the number of images. The algorithm was tested with several tissue samples of at least 50 image tiles, involving over 5,000 image pairs. It correctly registered all image pairs with an overlap greater than 7%, correctly recognized all failures, and successfully joint-registered all images for all tissue samples studied. This algorithm is disseminated freely to the community as included with the FARSIGHT toolkit for microscopy (www.farsight-toolkit.org). PMID:21361958

  12. Base-By-Base: single nucleotide-level analysis of whole viral genome alignments.

    PubMed

    Brodie, Ryan; Smith, Alex J; Roper, Rachel L; Tcherepanov, Vasily; Upton, Chris

    2004-07-14

    With ever increasing numbers of closely related virus genomes being sequenced, it has become desirable to be able to compare two genomes at a level more detailed than gene content because two strains of an organism may share the same set of predicted genes but still differ in their pathogenicity profiles. For example, detailed comparison of multiple isolates of the smallpox virus genome (each approximately 200 kb, with 200 genes) is not feasible without new bioinformatics tools. A software package, Base-By-Base, has been developed that provides visualization tools to enable researchers to 1) rapidly identify and correct alignment errors in large, multiple genome alignments; and 2) generate tabular and graphical output of differences between the genomes at the nucleotide level. Base-By-Base uses detailed annotation information about the aligned genomes and can list each predicted gene with nucleotide differences, display whether variations occur within promoter regions or coding regions and whether these changes result in amino acid substitutions. Base-By-Base can connect to our mySQL database (Virus Orthologous Clusters; VOCs) to retrieve detailed annotation information about the aligned genomes or use information from text files. Base-By-Base enables users to quickly and easily compare large viral genomes; it highlights small differences that may be responsible for important phenotypic differences such as virulence. It is available via the Internet using Java Web Start and runs on Macintosh, PC and Linux operating systems with the Java 1.4 virtual machine.

  13. ZnO-based multiple channel and multiple gate FinMOSFETs

    NASA Astrophysics Data System (ADS)

    Lee, Ching-Ting; Huang, Hung-Lin; Tseng, Chun-Yen; Lee, Hsin-Ying

    2016-02-01

    In recent years, zinc oxide (ZnO)-based metal-oxide-semiconductor field-effect transistors (MOSFETs) have attracted much attention, because ZnO-based semiconductors possess several advantages, including large exciton binding energy, nontoxicity, biocompatibility, low material cost, and wide direct bandgap. Moreover, the ZnO-based MOSFET is one of most potential devices, due to the applications in microwave power amplifiers, logic circuits, large scale integrated circuits, and logic swing. In this study, to enhance the performances of the ZnO-based MOSFETs, the ZnObased multiple channel and multiple gate structured FinMOSFETs were fabricated using the simple laser interference photolithography method and the self-aligned photolithography method. The multiple channel structure possessed the additional sidewall depletion width control ability to improve the channel controllability, because the multiple channel sidewall portions were surrounded by the gate electrode. Furthermore, the multiple gate structure had a shorter distance between source and gate and a shorter gate length between two gates to enhance the gate operating performances. Besides, the shorter distance between source and gate could enhance the electron velocity in the channel fin structure of the multiple gate structure. In this work, ninety one channels and four gates were used in the FinMOSFETs. Consequently, the drain-source saturation current (IDSS) and maximum transconductance (gm) of the ZnO-based multiple channel and multiple gate structured FinFETs operated at a drain-source voltage (VDS) of 10 V and a gate-source voltage (VGS) of 0 V were respectively improved from 11.5 mA/mm to 13.7 mA/mm and from 4.1 mS/mm to 6.9 mS/mm in comparison with that of the conventional ZnO-based single channel and single gate MOSFETs.

  14. Thin concentrator photovoltaic module with micro-solar cells which are mounted by self-align method using surface tension of melted solder

    NASA Astrophysics Data System (ADS)

    Hayashi, Nobuhiko; Terauchi, Masaharu; Aya, Youichirou; Kanayama, Shutetsu; Nishitani, Hikaru; Nakagawa, Tohru; Takase, Michihiko

    2017-09-01

    We are developing a thin and lightweight CPV module using small size lens system made from poly methyl methacrylate (PMMA) with a short focal length and micro-solar cells to decrease the transporting and the installing costs of CPV systems. In order to achieve high conversion efficiency in CPV modules using micro-solar cells, the micro-solar cells need to be mounted accurately to the irradiated region of the concentrated sunlight. In this study, we have successfully developed self-align method thanks to the surface tension of the melted solder even utilizing commercially available surface-mounting technology (SMT). Solar cells were self-aligned to the specified positions of the circuit board by this self-align method with accuracy within ±10 µm. We actually fabricated CPV modules using this self-align method and demonstrated high conversion efficiency of our CPV module.

  15. Phylo: A Citizen Science Approach for Improving Multiple Sequence Alignment

    PubMed Central

    Kam, Alfred; Kwak, Daniel; Leung, Clarence; Wu, Chu; Zarour, Eleyine; Sarmenta, Luis; Blanchette, Mathieu; Waldispühl, Jérôme

    2012-01-01

    Background Comparative genomics, or the study of the relationships of genome structure and function across different species, offers a powerful tool for studying evolution, annotating genomes, and understanding the causes of various genetic disorders. However, aligning multiple sequences of DNA, an essential intermediate step for most types of analyses, is a difficult computational task. In parallel, citizen science, an approach that takes advantage of the fact that the human brain is exquisitely tuned to solving specific types of problems, is becoming increasingly popular. There, instances of hard computational problems are dispatched to a crowd of non-expert human game players and solutions are sent back to a central server. Methodology/Principal Findings We introduce Phylo, a human-based computing framework applying “crowd sourcing” techniques to solve the Multiple Sequence Alignment (MSA) problem. The key idea of Phylo is to convert the MSA problem into a casual game that can be played by ordinary web users with a minimal prior knowledge of the biological context. We applied this strategy to improve the alignment of the promoters of disease-related genes from up to 44 vertebrate species. Since the launch in November 2010, we received more than 350,000 solutions submitted from more than 12,000 registered users. Our results show that solutions submitted contributed to improving the accuracy of up to 70% of the alignment blocks considered. Conclusions/Significance We demonstrate that, combined with classical algorithms, crowd computing techniques can be successfully used to help improving the accuracy of MSA. More importantly, we show that an NP-hard computational problem can be embedded in casual game that can be easily played by people without significant scientific training. This suggests that citizen science approaches can be used to exploit the billions of “human-brain peta-flops” of computation that are spent every day playing games. Phylo is available at: http://phylo.cs.mcgill.ca. PMID:22412834

  16. Population entropies estimates of proteins

    NASA Astrophysics Data System (ADS)

    Low, Wai Yee

    2017-05-01

    The Shannon entropy equation provides a way to estimate variability of amino acids sequences in a multiple sequence alignment of proteins. Knowledge of protein variability is useful in many areas such as vaccine design, identification of antibody binding sites, and exploration of protein 3D structural properties. In cases where the population entropies of a protein are of interest but only a small sample size can be obtained, a method based on linear regression and random subsampling can be used to estimate the population entropy. This method is useful for comparisons of entropies where the actual sequence counts differ and thus, correction for alignment size bias is needed. In the current work, an R based package named EntropyCorrect that enables estimation of population entropy is presented and an empirical study on how well this new algorithm performs on simulated dataset of various combinations of population and sample sizes is discussed. The package is available at https://github.com/lloydlow/EntropyCorrect. This article, which was originally published online on 12 May 2017, contained an error in Eq. (1), where the summation sign was missing. The corrected equation appears in the Corrigendum attached to the pdf.

  17. Quantiprot - a Python package for quantitative analysis of protein sequences.

    PubMed

    Konopka, Bogumił M; Marciniak, Marta; Dyrka, Witold

    2017-07-17

    The field of protein sequence analysis is dominated by tools rooted in substitution matrices and alignments. A complementary approach is provided by methods of quantitative characterization. A major advantage of the approach is that quantitative properties defines a multidimensional solution space, where sequences can be related to each other and differences can be meaningfully interpreted. Quantiprot is a software package in Python, which provides a simple and consistent interface to multiple methods for quantitative characterization of protein sequences. The package can be used to calculate dozens of characteristics directly from sequences or using physico-chemical properties of amino acids. Besides basic measures, Quantiprot performs quantitative analysis of recurrence and determinism in the sequence, calculates distribution of n-grams and computes the Zipf's law coefficient. We propose three main fields of application of the Quantiprot package. First, quantitative characteristics can be used in alignment-free similarity searches, and in clustering of large and/or divergent sequence sets. Second, a feature space defined by quantitative properties can be used in comparative studies of protein families and organisms. Third, the feature space can be used for evaluating generative models, where large number of sequences generated by the model can be compared to actually observed sequences.

  18. Controllable growth of vertically aligned graphene on C-face SiC

    DOE PAGES

    Liu, Yu; Chen, Lianlian; Hilliard, Donovan; ...

    2016-10-06

    We investigated how to control the growth of vertically aligned graphene on C-face SiC by varying the processing conditions. It is found that, the growth rate scales with the annealing temperature and the graphene height is proportional to the annealing time. Temperature gradient and crystalline quality of the SiC substrates influence their vaporization. The partial vapor pressure is crucial as it can interfere with further vaporization. A growth mechanism is proposed in terms of physical vapor transport. The monolayer character of vertically aligned graphene is verified by Raman and X-ray absorption spectroscopy. With the processed samples, d 0 magnetism ismore » realized and negative magnetoresistance is observed after Cu implantation. We also prove that multiple carriers exist in vertically aligned graphene.« less

  19. Controllable growth of vertically aligned graphene on C-face SiC

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Liu, Yu; Chen, Lianlian; Hilliard, Donovan

    We investigated how to control the growth of vertically aligned graphene on C-face SiC by varying the processing conditions. It is found that, the growth rate scales with the annealing temperature and the graphene height is proportional to the annealing time. Temperature gradient and crystalline quality of the SiC substrates influence their vaporization. The partial vapor pressure is crucial as it can interfere with further vaporization. A growth mechanism is proposed in terms of physical vapor transport. The monolayer character of vertically aligned graphene is verified by Raman and X-ray absorption spectroscopy. With the processed samples, d 0 magnetism ismore » realized and negative magnetoresistance is observed after Cu implantation. We also prove that multiple carriers exist in vertically aligned graphene.« less

  20. Pediatric distal femur fixation by proximal humeral plate.

    PubMed

    Abdelgawad, Amr Atef; Kanlic, Enes M

    2013-12-01

    Distal femoral metaphyseal fractures are common injuries in children. Multiple treatment options have been described for this type of injury. For older children with distal metaphyseal fracture, there is still no optimal method of fixation. We propose that the commonly used proximal humeral plate can provide good method of fixation for this fracture in adolescents. Two children (12 and 14 years old) with distal metaphyseal femoral fracture were treated with proximal humeral plate. We describe the surgical technique and postoperative management. The two children healed with good alignment and full range of motion of the knee. No external immobilization (other than knee immobilizer for the first 2 weeks) was used. We concluded that proximal humeral plate can provide adequate fixation for teenagers with distal femoral metaphyseal fracture. It is readily available; provide multiple options for screw fixation in the distal part of the fracture and fits easily on the distal part of the femur proximal to the physis. Thieme Medical Publishers 333 Seventh Avenue, New York, NY 10001, USA.

Top