multiple structure alignment: Topics by Science.gov

Sample records for multiple structure alignment

mTM-align: a server for fast protein structure database search and multiple protein structure alignment.

PubMed

Dong, Runze; Pan, Shuo; Peng, Zhenling; Zhang, Yang; Yang, Jianyi

2018-05-21

With the rapid increase in the number of protein structures in the Protein Data Bank, it has become urgent to develop algorithms for efficient protein structure comparison. In this article, we present the mTM-align server, which consists of two closely related modules: one for structure database search and the other for multiple structure alignment. The database search is accelerated by a heuristic algorithm and a hierarchical organization of the structures in the database. The multiple structure alignment is performed using the recently developed algorithm mTM-align. Benchmark tests demonstrate that our algorithms outperform peer methods for both modules in terms of speed and accuracy. One of the unique features of the server is the interplay between database search and multiple structure alignment. The server not only performs fast database searches but also builds accurate multiple structure alignments from the structures found by the search. The database search takes about 2-5 min for a medium-sized structure (∼300 residues). The multiple structure alignment takes a few seconds for ∼10 medium-sized structures. The server is freely available at: http://yanglab.nankai.edu.cn/mTM-align/.
SARA-Coffee web server, a tool for the computation of RNA sequence and structure multiple alignments

PubMed Central

Di Tommaso, Paolo; Bussotti, Giovanni; Kemena, Carsten; Capriotti, Emidio; Chatzou, Maria; Prieto, Pablo; Notredame, Cedric

2014-01-01

This article introduces the SARA-Coffee web server, a service allowing the online computation of 3D structure-based multiple RNA sequence alignments. The server makes it possible to combine sequences with and without known 3D structures. Given a set of sequences, SARA-Coffee outputs a multiple sequence alignment along with a reliability index for every sequence, column and aligned residue. SARA-Coffee combines SARA, a pairwise structural RNA aligner, with the R-Coffee multiple RNA aligner in a way that has been shown to improve alignment accuracy over most sequence aligners when enough structural data is available. The server can be accessed from http://tcoffee.crg.cat/apps/tcoffee/do:saracoffee. PMID:24972831
Progressive structure-based alignment of homologous proteins: Adopting sequence comparison strategies.

PubMed

Joseph, Agnel Praveen; Srinivasan, Narayanaswamy; de Brevern, Alexandre G

2012-09-01

Comparison of multiple protein structures has a broad range of applications in the analysis of protein structure, function and evolution. Multiple structure alignment tools (MSTAs) are necessary to obtain a simultaneous comparison of a family of related folds. In this study, we have developed a method for multiple structure comparison largely based on sequence alignment techniques. A widely used structural alphabet named Protein Blocks (PBs) was used to transform the information on 3D protein backbone conformation into a 1D sequence string. A progressive alignment strategy similar to CLUSTALW was adopted for multiple PB sequence alignment (mulPBA). Highly similar stretches identified by the pairwise alignments are given higher weights during the alignment. The residue equivalences from PB-based alignments are used to obtain a three-dimensional fit of the structures, followed by an iterative refinement of the structural superposition. Systematic comparisons using benchmark datasets of MSTAs underline that the alignment quality is better than that of MULTIPROT, MUSTANG and the alignments in HOMSTRAD in more than 85% of the cases. Comparison with other rigid-body and flexible MSTAs also indicates that mulPBA alignments are superior to most of the rigid-body MSTAs and highly comparable to the flexible alignment methods.
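The pairwise step that a progressive strategy builds on can be sketched as a global alignment of two PB strings. This is a minimal illustration and not the mulPBA implementation: the function name `global_align_pb` and the identity-based match/mismatch/gap scores are assumptions, whereas a real PB aligner would use a PB substitution matrix and the stretch weighting described above.

```python
def global_align_pb(a, b, match=2, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment of two Protein Block strings.

    Scores are illustrative placeholders (assumption), not the
    PB substitution matrix used by published PB aligners.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + s,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal path.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        s = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + s:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append('-')
            i -= 1
        else:
            out_a.append('-'); out_b.append(b[j - 1])
            j -= 1
    return ''.join(reversed(out_a)), ''.join(reversed(out_b)), score[n][m]


# Two short, made-up PB strings (letters a-p), for illustration only.
aln_a, aln_b, total = global_align_pb("mmmklm", "mmmlm")
print(aln_a, aln_b, total)
```

In a progressive scheme, pairwise scores of this kind would feed a guide tree, and sequences or profiles would then be merged in tree order.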
Matt: local flexibility aids protein multiple structure alignment.

PubMed

Menke, Matthew; Berger, Bonnie; Cowen, Lenore

2008-01-01

Even when there is agreement on what measure a protein multiple structure alignment should be optimizing, finding the optimal alignment is computationally prohibitive. One approach used by many previous methods is aligned fragment pair chaining, where short structural fragments from all the proteins are aligned against each other optimally, and the final alignment chains these together in geometrically consistent ways. Ye and Godzik have recently suggested that adding geometric flexibility may help better model protein structures in a variety of contexts. We introduce the program Matt (Multiple Alignment with Translations and Twists), an aligned fragment pair chaining algorithm that, in intermediate steps, allows local flexibility between fragments: small translations and rotations are temporarily allowed to bring sets of aligned fragments closer, even if they are physically impossible under rigid body transformations. After a dynamic programming assembly guided by these "bent" alignments, geometric consistency is restored in the final step before the alignment is output. Matt is tested against other recent multiple protein structure alignment programs on the popular Homstrad and SABmark benchmark datasets. Matt's global performance is competitive with the other programs on Homstrad, but outperforms the other programs on SABmark, a benchmark of multiple structure alignments of proteins with more distant homology. On both datasets, Matt demonstrates an ability to better align the ends of alpha-helices and beta-strands, an important characteristic of any structure alignment program intended to help construct a structural template library for threading approaches to the inverse protein-folding problem. The related question of whether Matt alignments can be used to distinguish distantly homologous structure pairs from pairs of proteins that are not homologous is also considered. 
For this purpose, a p-value score based on the length of the common core and average root mean squared deviation (RMSD) of Matt alignments is shown to largely separate decoys from homologous protein structures in the SABmark benchmark dataset. We postulate that Matt's strong performance comes from its ability to model proteins in different conformational states and, perhaps even more important, its ability to model backbone distortions in more distantly related proteins.
CombAlign: a code for generating a one-to-many sequence alignment from a set of pairwise structure-based sequence alignments.

PubMed

Zhou, Carol L Ecale

2015-01-01

In order to better define regions of similarity among related protein structures, it is useful to identify the residue-residue correspondences among proteins. Few codes exist for constructing a one-to-many multiple sequence alignment derived from a set of structure or sequence alignments, and a need was evident for creating such a tool for combining pairwise structure alignments that would allow for insertion of gaps in the reference structure. This report describes a new Python code, CombAlign, which takes as input a set of pairwise sequence alignments (which may be structure based) and generates a one-to-many, gapped, multiple structure- or sequence-based sequence alignment (MSSA). The use and utility of CombAlign was demonstrated by generating gapped MSSAs using sets of pairwise structure-based sequence alignments between structure models of the matrix protein (VP40) and pre-small/secreted glycoprotein (sGP) of Reston Ebolavirus and the corresponding proteins of several other filoviruses. The gapped MSSAs revealed structure-based residue-residue correspondences, which enabled identification of structurally similar versus differing regions in the Reston proteins compared to each of the other corresponding proteins. CombAlign is a new Python code that generates a one-to-many, gapped, multiple structure- or sequence-based sequence alignment (MSSA) given a set of pairwise sequence alignments (which may be structure based). CombAlign has utility in assisting the user in distinguishing structurally conserved versus divergent regions on a reference protein structure relative to other closely related proteins. CombAlign was developed in Python 2.6, and the source code is available for download from the GitHub code repository.
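The idea of merging pairwise alignments against a common reference, while letting the reference acquire gaps, can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not CombAlign's code: the function name `combine_pairwise`, the input shape (pairs of gapped strings) and the left-justified placement of insertions are all hypothetical choices.

```python
def combine_pairwise(ref, pairwise):
    """Merge pairwise alignments (ref_aln, other_aln) into one gapped
    one-to-many alignment. The first returned row is the reference."""
    length = len(ref)
    parsed = []
    for ref_aln, other_aln in pairwise:
        if len(ref_aln) != len(other_aln) or ref_aln.replace('-', '') != ref:
            raise ValueError("each pair must align the same reference")
        inserts = [''] * (length + 1)  # residues inserted before ref position k
        matched = []                   # character aligned to each ref residue
        k = 0
        for r, o in zip(ref_aln, other_aln):
            if r == '-':
                inserts[k] += o
            else:
                matched.append(o)
                k += 1
        parsed.append((inserts, matched))

    # Each insertion slot must be wide enough for the longest insertion seen.
    widths = [max((len(ins[k]) for ins, _ in parsed), default=0)
              for k in range(length + 1)]

    rows = [''.join('-' * widths[k] + ref[k] for k in range(length))
            + '-' * widths[length]]
    for inserts, matched in parsed:
        body = ''.join(inserts[k].ljust(widths[k], '-') + matched[k]
                       for k in range(length))
        rows.append(body + inserts[length].ljust(widths[length], '-'))
    return rows


# Made-up sequences: the first pair has an insertion relative to the
# reference, the second has a deletion.
for row in combine_pairwise("ACDE", [("AC-DE", "ACWDE"), ("ACDE", "A-DE")]):
    print(row)
```

Residues inserted at the same reference position by different structures share a slot here but are not aligned to each other, which matches the one-to-many (rather than all-to-all) nature of the output.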
MultiSETTER: web server for multiple RNA structure comparison.

PubMed

Čech, Petr; Hoksza, David; Svozil, Daniel

2015-08-12

Understanding the architecture and function of RNA molecules requires methods for comparing and analyzing their tertiary and quaternary structures. While structural superposition of short RNAs is achievable in a reasonable time, large structures represent a much bigger challenge. Therefore, we have developed a fast and accurate algorithm for RNA pairwise structure superposition called SETTER and implemented it in the SETTER web server. However, though biological relationships can be inferred from a pairwise structure alignment, key features preserved by evolution can be identified only from a multiple structure alignment. Thus, we extended the SETTER algorithm to the alignment of multiple RNA structures and developed the MultiSETTER algorithm. In this paper, we present the updated version of the SETTER web server, which implements a user-friendly interface to the MultiSETTER algorithm. The server accepts RNA structures either as a list of PDB IDs or as user-defined PDB files. After the superposition is computed, structures are visualized in 3D and several reports and statistics are generated. To the best of our knowledge, the MultiSETTER web server is the first publicly available tool for multiple RNA structure alignment. The MultiSETTER server offers visual inspection of an alignment in 3D space, which may reveal structural and functional relationships not captured by other multiple alignment methods based either on sequence or on secondary structure motifs.
A novel approach to multiple sequence alignment using hadoop data grids.

PubMed

Sudha Sadasivam, G; Baktavatchalam, G

2010-01-01

Multiple alignment of protein sequences helps to determine evolutionary linkage and to predict molecular structures. The factors to be considered while aligning multiple sequences are the speed and accuracy of alignment. Although dynamic programming algorithms produce accurate alignments, they are computationally intensive. In this paper we propose a time-efficient approach to sequence alignment that also produces quality alignments. The dynamic nature of the algorithm, coupled with the data and computational parallelism of Hadoop data grids, improves the accuracy and speed of sequence alignment. The principle of block splitting in Hadoop, coupled with its scalability, facilitates alignment of very large sequences.
FASMA: a service to format and analyze sequences in multiple alignments.

PubMed

Costantini, Susan; Colonna, Giovanni; Facchiano, Angelo M

2007-12-01

Multiple sequence alignments are successfully applied in many studies for understanding the structural and functional relations among single nucleic acids and protein sequences as well as whole families. Because of the rapid growth of sequence databases, multiple sequence alignments can often be very large and difficult to visualize and analyze. We offer a new service aimed at visualizing and analyzing the multiple alignments obtained with different external algorithms, with new features useful for the comparison of the aligned sequences as well as for the creation of a final image of the alignment. The service is named FASMA and is available at http://bioinformatica.isa.cnr.it/FASMA/.
Optimizing multiple sequence alignments using a genetic algorithm based on three objectives: structural information, non-gaps percentage and totally conserved columns.

PubMed

Ortuño, Francisco M; Valenzuela, Olga; Rojas, Fernando; Pomares, Hector; Florido, Javier P; Urquiza, Jose M; Rojas, Ignacio

2013-09-01

Multiple sequence alignments (MSAs) are widely used in bioinformatics as the basis for other tasks such as structure prediction, biological function analysis or phylogenetic modeling. However, current tools usually provide partially optimal alignments, as each one is focused on specific biological features. Thus, the same set of sequences can produce different alignments, above all when sequences are less similar. Consequently, researchers and biologists do not agree about which is the most suitable way to evaluate MSAs. Recent evaluations tend to use more complex scores including further biological features. Among them, 3D structures are increasingly being used to evaluate alignments. Because structures are more conserved in proteins than sequences, scores with structural information are better suited to evaluate more distant relationships between sequences. The proposed multiobjective algorithm, based on the non-dominated sorting genetic algorithm, aims to jointly optimize three objectives: STRIKE score, non-gaps percentage and totally conserved columns. It was assessed on the BAliBASE benchmark, with statistical significance according to the Kruskal-Wallis test (P < 0.01). This algorithm also outperforms other aligners, such as ClustalW, Multiple Sequence Alignment Genetic Algorithm (MSA-GA), PRRP, DIALIGN, Hidden Markov Model Training (HMMT), Pattern-Induced Multi-sequence Alignment (PIMA), MULTIALIGN, Sequence Alignment Genetic Algorithm (SAGA), PILEUP, Rubber Band Technique Genetic Algorithm (RBT-GA) and Vertical Decomposition Genetic Algorithm (VDGA), according to the Wilcoxon signed-rank test (P < 0.05), whereas its results are not significantly different from those of 3D-COFFEE (P > 0.05), with the advantage of being able to use fewer structures. Structural information is included within the objective function to evaluate the obtained alignments more accurately. The source code is available at http://www.ugr.es/~fortuno/MOSAStrE/MO-SAStrE.zip.
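Two of the three objectives named above are simple to compute directly from an alignment. The sketch below shows one plausible reading of them. The function names and exact definitions (percentage of non-gap cells, and columns with a single residue type and no gaps) are assumptions, and the STRIKE score is omitted because it requires 3D structures.

```python
def non_gap_percentage(msa):
    """Percentage of alignment cells that hold a residue, not a gap."""
    total = sum(len(row) for row in msa)
    non_gaps = sum(ch != '-' for row in msa for ch in row)
    return 100.0 * non_gaps / total


def totally_conserved_columns(msa):
    """Number of columns where every sequence has the same residue."""
    return sum(1 for col in zip(*msa)
               if '-' not in col and len(set(col)) == 1)


# A made-up three-sequence alignment, for illustration only.
msa = ["AC-GT",
       "ACAGT",
       "AC-GA"]
print(non_gap_percentage(msa), totally_conserved_columns(msa))
```

A multiobjective optimizer would treat these values, together with a structure-based score, as a vector to be compared by Pareto dominance instead of collapsing them into a single number.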
CAB-Align: A Flexible Protein Structure Alignment Method Based on the Residue-Residue Contact Area.

PubMed

Terashi, Genki; Takeda-Shitaka, Mayuko

2015-01-01

Proteins are flexible, and this flexibility has an essential functional role. Flexibility can be observed in loop regions, rearrangements between secondary structure elements, and conformational changes between entire domains. However, most protein structure alignment methods treat protein structures as rigid bodies. Thus, these methods fail to identify the equivalences of residue pairs in regions with flexibility. In this study, we considered that the evolutionary relationship between proteins corresponds directly to the residue-residue physical contacts rather than the three-dimensional (3D) coordinates of proteins. Thus, we developed a new protein structure alignment method, contact area-based alignment (CAB-align), which uses the residue-residue contact area to identify regions of similarity. The main purpose of CAB-align is to identify homologous relationships at the residue level between related protein structures. The CAB-align procedure comprises two main steps: First, a rigid-body alignment method based on local and global 3D structure superposition is employed to generate a sufficient number of initial alignments. Then, iterative dynamic programming is executed to find the optimal alignment. We evaluated the performance and advantages of CAB-align based on four main points: (1) agreement with the gold standard alignment, (2) alignment quality based on an evolutionary relationship without 3D coordinate superposition, (3) consistency of the multiple alignments, and (4) classification agreement with the gold standard classification. 
Comparisons of CAB-align with other state-of-the-art protein structure alignment methods (TM-align, FATCAT, and DaliLite) using our benchmark dataset showed that CAB-align performed robustly in obtaining high-quality alignments and generating consistent multiple alignments with high coverage and accuracy rates, and it performed extremely well when discriminating between homologous and nonhomologous pairs of proteins in both single and multi-domain comparisons. The CAB-align software is freely available to academic users as stand-alone software at http://www.pharm.kitasato-u.ac.jp/bmd/bmd/Publications.html.
PFAAT version 2.0: a tool for editing, annotating, and analyzing multiple sequence alignments.

PubMed

Caffrey, Daniel R; Dana, Paul H; Mathur, Vidhya; Ocano, Marco; Hong, Eun-Jong; Wang, Yaoyu E; Somaroo, Shyamal; Caffrey, Brian E; Potluri, Shobha; Huang, Enoch S

2007-10-11

By virtue of their shared ancestry, homologous sequences are similar in their structure and function. Consequently, multiple sequence alignments are routinely used to identify trends that relate to function. This type of analysis is particularly productive when it is combined with structural and phylogenetic analysis. Here we describe the release of PFAAT version 2.0, a tool for editing, analyzing, and annotating multiple sequence alignments. Support for multiple annotations is a key component of this release as it provides a framework for most of the new functionalities. The sequence annotations are accessible from the alignment and tree, where they are typically used to label sequences or hyperlink them to related databases. Sequence annotations can be created manually or extracted automatically from UniProt entries. Once a multiple sequence alignment is populated with sequence annotations, sequences can be easily selected and sorted through a sophisticated search dialog. The selected sequences can be further analyzed using statistical methods that explicitly model relationships between the sequence annotations and residue properties. Residue annotations are accessible from the alignment viewer and are typically used to designate binding sites or properties for a particular residue. Residue annotations are also searchable, and allow one to quickly select alignment columns for further sequence analysis, e.g. computing percent identities. Other features include novel algorithms to compute sequence conservation, mapping of conservation scores to a 3D structure in Jmol, display of secondary structure elements, and sorting of sequences by residue composition. PFAAT provides a framework whereby end-users can specify knowledge for a protein family in the form of annotations. The annotations can be combined with sophisticated analysis to test hypotheses that relate to sequence, structure and function.
Development and application of an algorithm to compute weighted multiple glycan alignments.

PubMed

Hosoda, Masae; Akune, Yukie; Aoki-Kinoshita, Kiyoko F

2017-05-01

A glycan consists of monosaccharides linked by glycosidic bonds, has branches and forms complex molecular structures. Databases have been developed to store large amounts of data from glycan-binding experiments, including glycan arrays with glycan-binding proteins. However, there are few bioinformatics techniques to analyze large amounts of glycan data because there are few tools that can handle the complexity of glycan structures. Thus, we have developed the MCAW (Multiple Carbohydrate Alignment with Weights) tool, which can align multiple glycan structures, to aid in the understanding of their function as binding recognition molecules. We have described in detail the first algorithm to perform multiple glycan alignments by modeling glycans as trees. To test our tool, we prepared several data sets, and as a result, we found that the glycan motif could be successfully aligned without any prior knowledge applied to the tool, and the known recognition binding sites of glycans could be aligned at a high rate amongst all our datasets tested. We thus claim that our tool is able to find meaningful glycan recognition and binding patterns using data obtained by glycan-binding experiments. The development and availability of an effective multiple glycan alignment tool opens possibilities for many other glycoinformatics analyses, making this work a big step towards furthering glycomics analysis. Availability: http://www.rings.t.soka.ac.jp. Contact: kkiyoko@soka.ac.jp. Supplementary data are available at Bioinformatics online.
Parallel seed-based approach to multiple protein structure similarities detection

DOE PAGES

Chapuis, Guillaume; Le Boudic-Jamin, Mathilde; Andonov, Rumen; ...

2015-01-01

Finding similarities between protein structures is a crucial task in molecular biology. Most of the existing tools require proteins to be aligned in an order-preserving way and only find single alignments even when multiple similar regions exist. We propose a new seed-based approach that discovers multiple pairs of similar regions. Its computational complexity is polynomial and it comes with a quality guarantee: the returned alignments have both root mean squared deviations (coordinate-based as well as internal-distances based) lower than a given threshold, if such alignments exist. We do not require the alignments to be order preserving (i.e., we consider nonsequential alignments), which makes our algorithm suitable for detecting similar domains when comparing multidomain proteins as well as for detecting structural repetitions within a single protein. Because the search space for nonsequential alignments is much larger than for sequential ones, the computational burden is addressed by extensive use of parallel computing techniques: a coarse-grain level parallelism making use of available CPU cores for computation and a fine-grain level parallelism exploiting bit-level concurrency as well as vector instructions.
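The two deviation measures mentioned above differ in what they compare, which a small sketch can make concrete. The function names are hypothetical, the coordinate-based version assumes the two point sets are already superposed (no optimal fitting is performed here), and the coordinates are made up.

```python
import math


def coordinate_rmsd(a, b):
    """RMSD over matched points, assuming a and b are already superposed."""
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))


def distance_rmsd(a, b):
    """RMSD between the internal distance matrices of a and b.

    Invariant to rotation and translation, so no superposition is needed.
    """
    n = len(a)
    squared, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            diff = math.dist(a[i], a[j]) - math.dist(b[i], b[j])
            squared += diff * diff
            pairs += 1
    return math.sqrt(squared / pairs)


# Made-up coordinates: b is a translated by (5, 5, 5).
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
b = [(x + 5.0, y + 5.0, z + 5.0) for x, y, z in a]
print(coordinate_rmsd(a, b), distance_rmsd(a, b))
```

The translated copy has a large coordinate RMSD without refitting but an internal-distance RMSD of zero, which illustrates why a guarantee on both measures is stronger than a guarantee on either alone.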
Prediction of β-turns in proteins from multiple alignment using neural network

PubMed Central

Kaur, Harpreet; Raghava, Gajendra Pal Singh

2003-01-01

A neural network-based method has been developed for the prediction of β-turns in proteins by using multiple sequence alignment. Two feed-forward back-propagation networks with a single hidden layer are used where the first-sequence structure network is trained with the multiple sequence alignment in the form of PSI-BLAST–generated position-specific scoring matrices. The initial predictions from the first network and PSIPRED-predicted secondary structure are used as input to the second structure-structure network to refine the predictions obtained from the first net. A significant improvement in prediction accuracy has been achieved by using evolutionary information contained in the multiple sequence alignment. The final network yields an overall prediction accuracy of 75.5% when tested by sevenfold cross-validation on a set of 426 nonhomologous protein chains. The corresponding Qpred, Qobs, and Matthews correlation coefficient values are 49.8%, 72.3%, and 0.43, respectively, and are the best among all the previously published β-turn prediction methods. The Web server BetaTPred2 (http://www.imtech.res.in/raghava/betatpred2/) has been developed based on this approach. PMID:12592033
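The four figures quoted above are all functions of a per-residue confusion matrix. The sketch below uses the definitions commonly used for β-turn prediction, which is an assumption about this paper's exact formulas: overall accuracy over all residues, Qpred as the percentage of predicted turns that are correct, and Qobs as the percentage of observed turns that are recovered. The counts in the example are invented and are not taken from the study.

```python
import math


def turn_metrics(tp, tn, fp, fn):
    """Return (Qtotal, Qpred, Qobs, MCC) from residue-level counts.

    tp/fn: observed turn residues predicted as turn / non-turn
    fp/tn: observed non-turn residues predicted as turn / non-turn
    """
    q_total = 100.0 * (tp + tn) / (tp + tn + fp + fn)
    q_pred = 100.0 * tp / (tp + fp)
    q_obs = 100.0 * tp / (tp + fn)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return q_total, q_pred, q_obs, mcc


# Illustrative counts only.
print(turn_metrics(tp=50, tn=30, fp=10, fn=10))
```

Because β-turn residues are a minority class, overall accuracy alone can look high for a predictor that rarely predicts turns, which is why Qpred, Qobs and MCC are reported alongside it.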
Robust prediction of consensus secondary structures using averaged base pairing probability matrices.

PubMed

Kiryu, Hisanori; Kin, Taishin; Asai, Kiyoshi

2007-02-15

Recent transcriptomic studies have revealed the existence of a considerable number of non-protein-coding RNA transcripts in higher eukaryotic cells. To investigate the functional roles of these transcripts, it is of great interest to find conserved secondary structures from multiple alignments on a genomic scale. Since multiple alignments are often created using alignment programs that neglect the special conservation patterns of RNA secondary structures for computational efficiency, alignment failures can cause potential risks of overlooking conserved stem structures. We investigated the dependence of the accuracy of secondary structure prediction on the quality of alignments. We compared three algorithms that maximize the expected accuracy of secondary structures as well as other frequently used algorithms. We found that one of our algorithms, called McCaskill-MEA, was more robust against alignment failures than others. The McCaskill-MEA method first computes the base pairing probability matrices for all the sequences in the alignment and then obtains the base pairing probability matrix of the alignment by averaging over these matrices. The consensus secondary structure is predicted from this matrix such that the expected accuracy of the prediction is maximized. We show that the McCaskill-MEA method performs better than other methods, particularly when the alignment quality is low and when the alignment consists of many sequences. Our model has a parameter that controls the sensitivity and specificity of predictions. We discussed the uses of that parameter for multi-step screening procedures to search for conserved secondary structures and for assigning confidence values to the predicted base pairs. The C++ source code that implements the McCaskill-MEA algorithm and the test dataset used in this paper are available at http://www.ncrna.org/papers/McCaskillMEA/. Supplementary data are available at Bioinformatics online.
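The averaging-then-decoding idea can be sketched in a few lines. This is a simplified stand-in and not the McCaskill-MEA implementation: it assumes the per-sequence matrices are already expressed in alignment-column coordinates, and it replaces the paper's expected-accuracy objective with a plain gain of `p - theta` per nested pair, where `theta` is a hypothetical threshold playing the role of the sensitivity/specificity parameter. All function names and probabilities are made up.

```python
from functools import lru_cache


def matrix_from_pairs(n, probs):
    """Build an n x n symmetric matrix from {(i, j): probability}."""
    m = [[0.0] * n for _ in range(n)]
    for (i, j), p in probs.items():
        m[i][j] = m[j][i] = p
    return m


def average_matrices(matrices):
    """Element-wise mean of equally sized base pairing probability matrices."""
    n = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) / len(matrices) for j in range(n)]
            for i in range(n)]


def consensus_pairs(p, theta=0.5, min_loop=3):
    """Nussinov-style DP: choose nested pairs maximizing sum(p[i][k] - theta)."""
    n = len(p)

    @lru_cache(maxsize=None)
    def solve(i, j):
        # Best (score, pairs) for columns i..j inclusive.
        if j - i <= min_loop:
            return 0.0, ()
        best = solve(i + 1, j)  # column i left unpaired
        for k in range(i + min_loop + 1, j + 1):
            gain = p[i][k] - theta
            if gain <= 0:
                continue
            inner, outer = solve(i + 1, k - 1), solve(k + 1, j)
            candidate = gain + inner[0] + outer[0]
            if candidate > best[0]:
                best = (candidate, ((i, k),) + inner[1] + outer[1])
        return best

    return sorted(solve(0, n - 1)[1])


seq1 = matrix_from_pairs(8, {(0, 7): 0.9, (1, 6): 0.8})
seq2 = matrix_from_pairs(8, {(0, 7): 0.7, (1, 6): 0.2, (2, 6): 0.6})
avg = average_matrices([seq1, seq2])
print(consensus_pairs(avg, theta=0.4))
print(consensus_pairs(avg, theta=0.6))
```

Raising `theta` keeps only pairs with strong averaged support, which mirrors how a single parameter can trade sensitivity for specificity. The recursive formulation is for readability and would need an iterative table for long alignments.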
Fine-tuning structural RNA alignments in the twilight zone.

PubMed

Bremges, Andreas; Schirmer, Stefanie; Giegerich, Robert

2010-04-30

A widely used method to find conserved secondary structure in RNA is to first construct a multiple sequence alignment, and then fold the alignment, optimizing a score based on thermodynamics and covariance. This method works best around 75% sequence similarity. However, in a "twilight zone" below 55% similarity, the sequence alignment tends to obscure the covariance signal used in the second phase. Therefore, while the overall shape of the consensus structure may still be found, the degree of conservation cannot be estimated reliably. Based on a combination of available methods, we present a method named planACstar for improving structure conservation in structural alignments in the twilight zone. After constructing a consensus structure by alignment folding, planACstar abandons the original sequence alignment, refolds the sequences individually but consistently with the consensus, aligns the structures, irrespective of sequence, by a pure structure alignment method, and derives an improved sequence alignment from the alignment of structures, to be re-submitted to alignment folding, etc. This cycle may be iterated as long as structural conservation improves, but normally, one step suffices. Employing the tools ClustalW, RNAalifold, and RNAforester, we find that for sequences with 30-55% sequence identity, structural conservation can be improved by 10% on average, with a large variation, measured in terms of RNAalifold's own criterion, the structure conservation index.
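The control flow described above (refine, rescore, stop when conservation no longer improves) can be sketched generically. The structure conservation index is taken here in its usual form, the consensus folding energy divided by the mean folding energy of the individual sequences. That definition, the function names and the callables `refine_once` and `score` are assumptions for illustration, and no external tool is invoked.

```python
def structure_conservation_index(consensus_energy, single_energies):
    """SCI as consensus energy over the mean single-sequence energy.

    Energies are assumed to be supplied by external folding tools.
    """
    mean_single = sum(single_energies) / len(single_energies)
    if mean_single == 0:
        raise ValueError("mean single-sequence energy is zero")
    return consensus_energy / mean_single


def refine_while_improving(alignment, refine_once, score, max_rounds=10):
    """Repeat refine_once while score strictly improves; return the best."""
    best, best_score = alignment, score(alignment)
    for _ in range(max_rounds):
        candidate = refine_once(best)
        candidate_score = score(candidate)
        if candidate_score <= best_score:
            break
        best, best_score = candidate, candidate_score
    return best, best_score


# Made-up energies in kcal/mol.
print(structure_conservation_index(-10.0, [-20.0, -20.0]))

# Toy stand-ins: the "alignment" is an integer and the score peaks at 3.
print(refine_while_improving(0, lambda x: x + 1, lambda x: -(x - 3) ** 2))
```

In a real pipeline, `refine_once` would wrap the refold, structure-align and re-derive steps, and `score` would fold the candidate alignment and compute its SCI.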
Differential evolution-simulated annealing for multiple sequence alignment

NASA Astrophysics Data System (ADS)

Addawe, R. C.; Addawe, J. M.; Sueño, M. R. K.; Magadia, J. C.

2017-10-01

Multiple sequence alignments (MSA) are used in the analysis of molecular evolution and sequence structure relationships. In this paper, a hybrid algorithm, Differential Evolution - Simulated Annealing (DESA) is applied in optimizing multiple sequence alignments (MSAs) based on structural information, non-gaps percentage and totally conserved columns. DESA is a robust algorithm characterized by self-organization, mutation, crossover, and SA-like selection scheme of the strategy parameters. Here, the MSA problem is treated as a multi-objective optimization problem of the hybrid evolutionary algorithm, DESA. Thus, we name the algorithm as DESA-MSA. Simulated sequences and alignments were generated to evaluate the accuracy and efficiency of DESA-MSA using different indel sizes, sequence lengths, deletion rates and insertion rates. The proposed hybrid algorithm obtained acceptable solutions particularly for the MSA problem evaluated based on the three objectives.
Sequence harmony: detecting functional specificity from alignments

PubMed Central

Feenstra, K. Anton; Pirovano, Walter; Krab, Klaas; Heringa, Jaap

2007-01-01

Multiple sequence alignments are often used for the identification of key specificity-determining residues within protein families. We present a web server implementation of the Sequence Harmony (SH) method previously introduced. SH accurately detects subfamily specific positions from a multiple alignment by scoring compositional differences between subfamilies, without imposing conservation. The SH web server allows a quick selection of subtype specific sites from a multiple alignment given a subfamily grouping. In addition, it allows the predicted sites to be directly mapped onto a protein structure and displayed. We demonstrate the use of the SH server using the family of plant mitochondrial alternative oxidases (AOX). In addition, we illustrate the usefulness of combining sequence and structural information by showing that the predicted sites are clustered into a few distinct regions in an AOX homology model. The SH web server can be accessed at www.ibi.vu.nl/programs/seqharmwww. PMID:17584793
Fine-tuning structural RNA alignments in the twilight zone

PubMed Central

2010-01-01

Background: A widely used method to find conserved secondary structure in RNA is to first construct a multiple sequence alignment, and then fold the alignment, optimizing a score based on thermodynamics and covariance. This method works best around 75% sequence similarity. However, in a "twilight zone" below 55% similarity, the sequence alignment tends to obscure the covariance signal used in the second phase. Therefore, while the overall shape of the consensus structure may still be found, the degree of conservation cannot be estimated reliably. Results: Based on a combination of available methods, we present a method named planACstar for improving structure conservation in structural alignments in the twilight zone. After constructing a consensus structure by alignment folding, planACstar abandons the original sequence alignment, refolds the sequences individually but consistently with the consensus, aligns the structures, irrespective of sequence, by a pure structure alignment method, and derives an improved sequence alignment from the alignment of structures, to be re-submitted to alignment folding, etc. This cycle may be iterated as long as structural conservation improves, but normally, one step suffices. Conclusions: Employing the tools ClustalW, RNAalifold, and RNAforester, we find that for sequences with 30-55% sequence identity, structural conservation can be improved by 10% on average, with a large variation, measured in terms of RNAalifold's own criterion, the structure conservation index. PMID:20433706
The Application of the Weighted k-Partite Graph Problem to the Multiple Alignment for Metabolic Pathways.

PubMed

Chen, Wenbin; Hendrix, William; Samatova, Nagiza F

2017-12-01

The problem of aligning multiple metabolic pathways is one of the most challenging problems in computational biology. A metabolic pathway consists of three types of entities: reactions, compounds, and enzymes. Based on similarities between enzymes, Tohsato et al. gave an algorithm for aligning multiple metabolic pathways. However, the algorithm given by Tohsato et al. neglects the similarities among reactions, compounds, enzymes, and pathway topology. Designing algorithms that align multiple metabolic pathways based on the similarity of reactions, compounds, and enzymes is a difficult computational problem. In this article, we propose an algorithm for the problem of aligning multiple metabolic pathways based on the similarities among reactions, compounds, enzymes, and pathway topology. First, we compute a weight between each pair of like entities in different input pathways based on the entities' similarity score and topological structure using Ay et al.'s methods. We then construct a weighted k-partite graph for the reactions, compounds, and enzymes. We extract a mapping between these entities by solving the maximum-weighted k-partite matching problem with a novel heuristic algorithm. By analyzing the alignment results of multiple pathways in different organisms, we show that the alignments found by our algorithm correctly identify common subnetworks among multiple pathways.

GeneSilico protein structure prediction meta-server.

PubMed

Kurowski, Michal A; Bujnicki, Janusz M

2003-07-01

Rigorous assessments of protein structure prediction have demonstrated that fold recognition methods can identify remote similarities between proteins when standard sequence search methods fail. It has been shown that the accuracy of predictions is improved when refined multiple sequence alignments are used instead of single sequences and if different methods are combined to generate a consensus model. There are several meta-servers available that integrate protein structure predictions performed by various methods, but they do not allow for submission of user-defined multiple sequence alignments and they seldom offer confidentiality of the results. We developed a novel WWW gateway for protein structure prediction, which combines the useful features of other meta-servers available, but with much greater flexibility of the input. The user may submit an amino acid sequence or a multiple sequence alignment to a set of methods for primary, secondary and tertiary structure prediction. Fold-recognition results (target-template alignments) are converted into full-atom 3D models and the quality of these models is uniformly assessed. A consensus between different FR methods is also inferred. The results are conveniently presented on-line on a single web page over a secure, password-protected connection. The GeneSilico protein structure prediction meta-server is freely available for academic users at http://genesilico.pl/meta.
GeneSilico protein structure prediction meta-server

PubMed Central

Kurowski, Michal A.; Bujnicki, Janusz M.

2003-01-01

Rigorous assessments of protein structure prediction have demonstrated that fold recognition methods can identify remote similarities between proteins when standard sequence search methods fail. It has been shown that the accuracy of predictions is improved when refined multiple sequence alignments are used instead of single sequences and if different methods are combined to generate a consensus model. There are several meta-servers available that integrate protein structure predictions performed by various methods, but they do not allow for submission of user-defined multiple sequence alignments and they seldom offer confidentiality of the results. We developed a novel WWW gateway for protein structure prediction, which combines the useful features of other meta-servers available, but with much greater flexibility of the input. The user may submit an amino acid sequence or a multiple sequence alignment to a set of methods for primary, secondary and tertiary structure prediction. Fold-recognition results (target-template alignments) are converted into full-atom 3D models and the quality of these models is uniformly assessed. A consensus between different FR methods is also inferred. The results are conveniently presented on-line on a single web page over a secure, password-protected connection. The GeneSilico protein structure prediction meta-server is freely available for academic users at http://genesilico.pl/meta. PMID:12824313
Evolutionary profiles derived from the QR factorization of multiple structural alignments gives an economy of information.

PubMed

O'Donoghue, Patrick; Luthey-Schulten, Zaida

2005-02-25

We present a new algorithm, based on the multidimensional QR factorization, to remove redundancy from a multiple structural alignment by choosing representative protein structures that best preserve the phylogenetic tree topology of the homologous group. The classical QR factorization with pivoting, developed as a fast numerical solution to eigenvalue and linear least-squares problems of the form Ax=b, was designed to re-order the columns of A by increasing linear dependence. Removing the most linearly dependent columns from A leads to the formation of a minimal basis set that spans the phase space of the problem at hand well. By recasting the problem of redundancy in multiple structural alignments into this framework, in which the matrix A now describes the multiple alignment, we adapted the QR factorization to produce a minimal basis set of protein structures which best spans the evolutionary (phase) space. The non-redundant and representative profiles obtained from this procedure, termed evolutionary profiles, are shown in initial results to outperform well-tested profiles in homology detection searches over a large sequence database. A measure of structural similarity between homologous proteins, Q(H), is presented. By properly accounting for the effect and presence of gaps, a phylogenetic tree computed using this metric is shown to be congruent with the maximum-likelihood sequence-based phylogeny. The results indicate that evolutionary information is indeed recoverable from the comparative analysis of protein structure alone. Applications of the QR ordering and this structural similarity metric to analyze the evolution of structure among key, universally distributed proteins involved in translation, and to the selection of representatives from an ensemble of NMR structures are also discussed.
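The column re-ordering idea behind the classical pivoted QR factorization can be sketched as below. This is a plain modified Gram-Schmidt procedure with column pivoting, shown for intuition only; it is not the multidimensional QR factorization of the article, and the toy matrix and function name are assumptions made for the example.

```python
import math


def pivoted_column_order(columns, tol=1e-9):
    """Order columns from most independent to most redundant.

    At each step the column with the largest remaining (residual) norm
    is chosen as pivot, and its direction is projected out of all
    columns not yet chosen. Columns that end up late in the order are
    nearly linear combinations of earlier ones.
    """
    residual = [[float(x) for x in col] for col in columns]
    remaining = list(range(len(columns)))
    order = []
    while remaining:
        norms = {j: math.sqrt(sum(x * x for x in residual[j])) for j in remaining}
        pivot = max(remaining, key=lambda j: norms[j])
        order.append(pivot)
        remaining.remove(pivot)
        if norms[pivot] <= tol:
            # Everything left is numerically dependent on chosen columns.
            order.extend(remaining)
            break
        q = [x / norms[pivot] for x in residual[pivot]]
        for j in remaining:
            dot = sum(a * b for a, b in zip(q, residual[j]))
            residual[j] = [a - dot * b for a, b in zip(residual[j], q)]
    return order


# Column 0 is almost a copy of column 1, so it is ranked last.
columns = [[1, 0, 0], [1, 0.01, 0], [0, 1, 0], [0, 0, 2]]
print(pivoted_column_order(columns))
```

Keeping only the first few entries of the returned order corresponds to dropping the most redundant columns, which is the sense in which a minimal basis set is retained.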
Statistical discovery of site inter-dependencies in sub-molecular hierarchical protein structuring

PubMed Central

2012-01-01

Background Much progress has been made in understanding the 3D structure of proteins using methods such as NMR and X-ray crystallography. The resulting 3D structures are extremely informative, but do not always reveal which sites and residues within the structure are of special importance. Recently, there have been indications that multiple-residue, sub-domain structural relationships within the larger 3D consensus structure of a protein can be inferred from the analysis of the multiple sequence alignment data of a protein family. These intra-dependent clusters of associated sites are used to indicate hierarchical inter-residue relationships within the 3D structure. To reveal the patterns of associations among individual amino acids or sub-domain components within the structure, we apply a k-modes attribute (aligned site) clustering algorithm to the ubiquitin and transthyretin families in order to discover associations among groups of sites within the multiple sequence alignment. We then observe what these associations imply within the 3D structure of these two protein families. Results The k-modes site clustering algorithm we developed maximizes the intra-group interdependencies based on a normalized mutual information measure. The clusters formed correspond to sub-structural components or binding and interface locations. Applying this data-directed method to the ubiquitin and transthyretin protein family multiple sequence alignments as a test bed, we located numerous interesting associations of interdependent sites. These clusters were then arranged into cluster tree diagrams which revealed four structural sub-domains within the single domain structure of ubiquitin and a single large sub-domain within transthyretin associated with the interface among transthyretin monomers. In addition, several clusters of mutually interdependent sites were discovered for each protein family, each of which appears to play an important role in the molecular structure and/or function. Conclusions Our results demonstrate that the method we present here, using a k-modes site clustering algorithm based on interdependency evaluation among sites obtained from a sequence alignment of homologous proteins, can provide significant insights into the complex, hierarchical inter-residue structural relationships within the 3D structure of a protein family. PMID:22793672
Statistical discovery of site inter-dependencies in sub-molecular hierarchical protein structuring.

PubMed

Durston, Kirk K; Chiu, David Ky; Wong, Andrew Kc; Li, Gary Cl

2012-07-13

Much progress has been made in understanding the 3D structure of proteins using methods such as NMR and X-ray crystallography. The resulting 3D structures are extremely informative, but do not always reveal which sites and residues within the structure are of special importance. Recently, there have been indications that multiple-residue, sub-domain structural relationships within the larger 3D consensus structure of a protein can be inferred from the analysis of the multiple sequence alignment data of a protein family. These intra-dependent clusters of associated sites are used to indicate hierarchical inter-residue relationships within the 3D structure. To reveal the patterns of associations among individual amino acids or sub-domain components within the structure, we apply a k-modes attribute (aligned site) clustering algorithm to the ubiquitin and transthyretin families in order to discover associations among groups of sites within the multiple sequence alignment. We then observe what these associations imply within the 3D structure of these two protein families. The k-modes site clustering algorithm we developed maximizes the intra-group interdependencies based on a normalized mutual information measure. The clusters formed correspond to sub-structural components or binding and interface locations. Applying this data-directed method to the ubiquitin and transthyretin protein family multiple sequence alignments as a test bed, we located numerous interesting associations of interdependent sites. These clusters were then arranged into cluster tree diagrams which revealed four structural sub-domains within the single domain structure of ubiquitin and a single large sub-domain within transthyretin associated with the interface among transthyretin monomers. In addition, several clusters of mutually interdependent sites were discovered for each protein family, each of which appears to play an important role in the molecular structure and/or function. Our results demonstrate that the method we present here, using a k-modes site clustering algorithm based on interdependency evaluation among sites obtained from a sequence alignment of homologous proteins, can provide significant insights into the complex, hierarchical inter-residue structural relationships within the 3D structure of a protein family.
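To make the interdependency measure concrete, the following sketch computes a normalized mutual information between two aligned sites (alignment columns). The normalization used here, mutual information divided by the joint entropy, is an assumption chosen for the example; the abstract does not state which normalization the authors use, and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(symbols):
    """Shannon entropy in bits of a sequence of hashable symbols."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())


def normalized_mutual_information(col_x, col_y):
    """I(X;Y) / H(X,Y) for two alignment columns of equal length.

    Returns 0.0 for independent columns and 1.0 when each column fully
    determines the other. Two fully conserved columns carry no
    information and are scored 0.0 by convention.
    """
    if len(col_x) != len(col_y):
        raise ValueError("columns must have the same number of sequences")
    h_xy = entropy(list(zip(col_x, col_y)))
    if h_xy == 0:
        return 0.0
    return (entropy(col_x) + entropy(col_y) - h_xy) / h_xy


# Two perfectly co-varying sites and two independent sites.
print(normalized_mutual_information("AAGG", "LLVV"))
print(normalized_mutual_information("AAGG", "LVLV"))
```

A site clustering method of the kind described would evaluate such a score for pairs of sites and group the sites whose mutual dependence is highest.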
AlignMe—a membrane protein sequence alignment web server

PubMed Central

Stamm, Marcus; Staritzbichler, René; Khafizov, Kamil; Forrest, Lucy R.

2014-01-01

We present a web server for pair-wise alignment of membrane protein sequences, using the program AlignMe. The server makes available two operational modes of AlignMe: (i) sequence to sequence alignment, taking two sequences in fasta format as input, combining information about each sequence from multiple sources and producing a pair-wise alignment (PW mode); and (ii) alignment of two multiple sequence alignments to create family-averaged hydropathy profile alignments (HP mode). For the PW sequence alignment mode, four different optimized parameter sets are provided, each suited to pairs of sequences with a specific similarity level. These settings utilize different types of inputs: (position-specific) substitution matrices, secondary structure predictions and transmembrane propensities from transmembrane predictions or hydrophobicity scales. In the second (HP) mode, each input multiple sequence alignment is converted into a hydrophobicity profile averaged over the provided set of sequence homologs; the two profiles are then aligned. The HP mode enables qualitative comparison of transmembrane topologies (and therefore potentially of 3D folds) of two membrane proteins, which can be useful if the proteins have low sequence similarity. In summary, the AlignMe web server provides user-friendly access to a set of tools for analysis and comparison of membrane protein sequences. Access is available at http://www.bioinfo.mpg.de/AlignMe PMID:24753425
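The family-averaged profile of the HP mode can be illustrated with a short sketch: each alignment column is replaced by the mean hydropathy of the residues it contains. The Kyte-Doolittle scale is used here as one common choice; which scale, smoothing window, or gap treatment AlignMe itself applies is not stated in the abstract, so those details are assumptions, as is the function name.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def average_hydropathy_profile(alignment):
    """Mean hydropathy per column of a multiple sequence alignment.

    alignment: list of equal-length aligned sequences ('-' marks a gap).
    Gaps and non-standard residues are skipped; a column with no
    scorable residue yields None.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for i in range(length):
        values = [KYTE_DOOLITTLE[row[i]] for row in alignment if row[i] in KYTE_DOOLITTLE]
        profile.append(sum(values) / len(values) if values else None)
    return profile


print(average_hydropathy_profile(["AL-", "IL-"]))
```

Two such profiles, one per input alignment, would then be aligned against each other; that second step is not shown here.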
Accurate multiple sequence-structure alignment of RNA sequences using combinatorial optimization.

PubMed

Bauer, Markus; Klau, Gunnar W; Reinert, Knut

2007-07-27

The discovery of functional non-coding RNA sequences has led to an increasing interest in algorithms related to RNA analysis. Traditional sequence alignment algorithms, however, fail at computing reliable alignments of low-homology RNA sequences. The spatial conformation of RNA sequences largely determines their function, and therefore RNA alignment algorithms have to take structural information into account. We present a graph-based representation for sequence-structure alignments, which we model as an integer linear program (ILP). We sketch how we compute an optimal or near-optimal solution to the ILP using methods from combinatorial optimization, and present results on a recently published benchmark set for RNA alignments. The implementation of our algorithm yields better alignments in terms of two published scores than the other programs that we tested: This is especially the case with an increasing number of input sequences. Our program LARA is freely available for academic purposes from http://www.planet-lisa.net.
SATCHMO-JS: a webserver for simultaneous protein multiple sequence alignment and phylogenetic tree construction.

PubMed

Hagopian, Raffi; Davidson, John R; Datta, Ruchira S; Samad, Bushra; Jarvis, Glen R; Sjölander, Kimmen

2010-07-01

We present the jump-start simultaneous alignment and tree construction using hidden Markov models (SATCHMO-JS) web server for simultaneous estimation of protein multiple sequence alignments (MSAs) and phylogenetic trees. The server takes as input a set of sequences in FASTA format, and outputs a phylogenetic tree and MSA; these can be viewed online or downloaded from the website. SATCHMO-JS is an extension of the SATCHMO algorithm, and employs a divide-and-conquer strategy to jump-start SATCHMO at a higher point in the phylogenetic tree, reducing the computational complexity of the progressive all-versus-all HMM-HMM scoring and alignment. Results on a benchmark dataset of 983 structurally aligned pairs from the PREFAB benchmark dataset show that SATCHMO-JS provides a statistically significant improvement in alignment accuracy over MUSCLE, Multiple Alignment using Fast Fourier Transform (MAFFT), ClustalW and the original SATCHMO algorithm. The SATCHMO-JS webserver is available at http://phylogenomics.berkeley.edu/satchmo-js. The datasets used in these experiments are available for download at http://phylogenomics.berkeley.edu/satchmo-js/supplementary/.
Accelerated probabilistic inference of RNA structure evolution

PubMed Central

Holmes, Ian

2005-01-01

Background Pairwise stochastic context-free grammars (Pair SCFGs) are powerful tools for evolutionary analysis of RNA, including simultaneous RNA sequence alignment and secondary structure prediction, but the associated algorithms are intensive in both CPU and memory usage. The same problem is faced by other RNA alignment-and-folding algorithms based on Sankoff's 1985 algorithm. It is therefore desirable to constrain such algorithms, by pre-processing the sequences and using this first pass to limit the range of structures and/or alignments that can be considered. Results We demonstrate how flexible classes of constraint can be imposed, greatly reducing the computational costs while maintaining a high quality of structural homology prediction. Any score-attributed context-free grammar (e.g. energy-based scoring schemes, or conditionally normalized Pair SCFGs) is amenable to this treatment. It is now possible to combine independent structural and alignment constraints of unprecedented general flexibility in Pair SCFG alignment algorithms. We outline several applications to the bioinformatics of RNA sequence and structure, including Waterman-Eggert N-best alignments and progressive multiple alignment. We evaluate the performance of the algorithm on test examples from the RFAM database. Conclusion A program, Stemloc, that implements these algorithms for efficient RNA sequence alignment and structure prediction is available under the GNU General Public License. PMID:15790387
Simple chained guide trees give high-quality protein multiple sequence alignments

PubMed Central

Boyce, Kieran; Sievers, Fabian; Higgins, Desmond G.

2014-01-01

Guide trees are used to decide the order of sequence alignment in the progressive multiple sequence alignment heuristic. These guide trees are often the limiting factor in making large alignments, and considerable effort has been expended over the years in making these quickly or accurately. In this article we show that, at least for protein families with large numbers of sequences that can be benchmarked with known structures, simple chained guide trees give the most accurate alignments. These also happen to be the fastest and simplest guide trees to construct, computationally. Such guide trees have a striking effect on the accuracy of alignments produced by some of the most widely used alignment packages. There is a marked increase in accuracy and a marked decrease in computational time, once the number of sequences goes much above a few hundred. This is true, even if the order of sequences in the guide tree is random. PMID:25002495
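A chained guide tree is simple to construct: each sequence is joined in turn to the tree built so far, giving a fully unbalanced, caterpillar-like topology. The sketch below writes such a tree in Newick format for a given sequence order. The function name is hypothetical, and how a particular alignment package accepts a user-supplied guide tree is not covered.

```python
def chained_guide_tree(names):
    """Return a Newick string for a fully chained (caterpillar) tree.

    The first two names form the innermost pair; every further name is
    attached one level higher, so sequences are added to the growing
    alignment one at a time in the order given.
    """
    if not names:
        raise ValueError("need at least one sequence name")
    tree = names[0]
    for name in names[1:]:
        tree = f"({tree},{name})"
    return tree + ";"


print(chained_guide_tree(["seqA", "seqB", "seqC", "seqD"]))
# prints "(((seqA,seqB),seqC),seqD);"
```

Because no pairwise distances are needed, building the tree takes time linear in the number of sequences, which is consistent with the observation above that these are the fastest guide trees to construct.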
Evaluation of sequence alignments and oligonucleotide probes with respect to three-dimensional structure of ribosomal RNA using ARB software package

PubMed Central

Kumar, Yadhu; Westram, Ralf; Kipfer, Peter; Meier, Harald; Ludwig, Wolfgang

2006-01-01

Background Availability of high-resolution RNA crystal structures for the 30S and 50S ribosomal subunits and the subsequent validation of comparative secondary structure models have prompted biologists to use the three-dimensional structure of ribosomal RNA (rRNA) for evaluating sequence alignments of rRNA genes. Furthermore, the secondary and tertiary structural features of rRNA are highly useful and successfully employed in designing rRNA targeted oligonucleotide probes intended for in situ hybridization experiments. RNA3D, a program to combine sequence alignment information with the three-dimensional structure of rRNA, was developed. Its integration into the ARB software package, which is used extensively by the scientific community for phylogenetic analysis and molecular probe design, has substantially extended the functionality of the ARB software suite with a 3D environment. Results The three-dimensional structure of rRNA is visualized in an OpenGL 3D environment with the ability to change the display and overlay information onto the molecule dynamically. Phylogenetic information derived from the multiple sequence alignments can be overlaid onto the molecule structure in real time. Superimposition of both statistical and non-statistical sequence associated information onto the rRNA 3D structure can be done using a customizable color scheme, which is also applied to a textual sequence alignment for reference. Oligonucleotide probes designed by ARB probe design tools can be mapped onto the 3D structure along with the probe accessibility models for evaluation with respect to secondary and tertiary structural conformations of rRNA. Conclusion Visualization of the three-dimensional structure of rRNA in an intuitive display gives biologists greater possibilities to carry out structure based phylogenetic analysis. Coupled with secondary structure models of rRNA, the RNA3D program aids in validating the sequence alignments of rRNA genes and evaluating probe target sites. Superimposition of the information derived from the multiple sequence alignment onto the molecule dynamically allows researchers to observe any sequence inherited characteristics (phylogenetic information) in a real-time environment. The extended ARB software package is made freely available for the scientific community via . PMID:16672074
Silicon Alignment Pins: An Easy Way to Realize a Wafer-To-Wafer Alignment

NASA Technical Reports Server (NTRS)

Peralta, Alejandro (Inventor); Gill, John J. (Inventor); Toda, Risaku (Inventor); Lin, Robert H. (Inventor); Jung-Kubiak, Cecile (Inventor); Reck, Theodore (Inventor); Thomas, Bertrand (Inventor); Siles, Jose V. (Inventor); Lee, Choonsup (Inventor); Chattopadhyay, Goutam (Inventor)

2016-01-01

A silicon alignment pin is used to align successive layers of components made in semiconductor chips and/or metallic components to make easier the assembly of devices having a layered structure. The pin is made as a compressible structure which can be squeezed to reduce its outer diameter, have one end fit into a corresponding alignment pocket or cavity defined in a layer of material to be assembled into a layered structure, and then allowed to expand to produce an interference fit with the cavity. The other end can then be inserted into a corresponding cavity defined in a surface of a second layer of material that mates with the first layer. The two layers are in registry when the pin is mated to both. Multiple layers can be assembled to create a multilayer structure. Examples of such devices are presented.
A greedy, graph-based algorithm for the alignment of multiple homologous gene lists.

PubMed

Fostier, Jan; Proost, Sebastian; Dhoedt, Bart; Saeys, Yvan; Demeester, Piet; Van de Peer, Yves; Vandepoele, Klaas

2011-03-15

Many comparative genomics studies rely on the correct identification of homologous genomic regions using accurate alignment tools. In such cases, the alphabet of the input sequences consists of complete genes, rather than nucleotides or amino acids. As optimal multiple sequence alignment is computationally impractical, a progressive alignment strategy is often employed. However, such an approach is susceptible to the propagation of alignment errors in early pairwise alignment steps, especially when dealing with strongly diverged genomic regions. In this article, we present a novel accurate and efficient greedy, graph-based algorithm for the alignment of multiple homologous genomic segments, represented as ordered gene lists. Based on provable properties of the graph structure, several heuristics are developed to resolve local alignment conflicts that occur due to gene duplication and/or rearrangement events on the different genomic segments. The performance of the algorithm is assessed by comparing the alignment results of homologous genomic segments in Arabidopsis thaliana to those obtained by using both a progressive alignment method and an earlier graph-based implementation. Especially for datasets that contain strongly diverged segments, the proposed method achieves a substantially higher alignment accuracy, and proves to be sufficiently fast for large datasets including a few dozen eukaryotic genomes. The algorithm is implemented as a part of the i-ADHoRe 3.0 package, available at http://bioinformatics.psb.ugent.be/software.
Retrieving transient conformational molecular structure information from inner-shell photoionization of laser-aligned molecules

PubMed Central

Wang, Xu; Le, Anh-Thu; Yu, Chao; Lucchese, R. R.; Lin, C. D.

2016-01-01

We discuss a scheme to retrieve transient conformational molecular structure information using photoelectron angular distributions (PADs) that have averaged over partial alignments of isolated molecules. The photoelectron is pulled out from a localized inner-shell molecular orbital by an X-ray photon. We show that a transient change in the atomic positions from their equilibrium will lead to a sensitive change in the alignment-averaged PADs, which can be measured and used to retrieve the former. Exploiting the experimental convenience of changing the photon polarization direction, we show that it is advantageous to use PADs obtained from multiple photon polarization directions. A simple single-scattering model is proposed and benchmarked to describe the photoionization process and to do the retrieval using a multiple-parameter fitting method. PMID:27025410
Retrieving transient conformational molecular structure information from inner-shell photoionization of laser-aligned molecules

NASA Astrophysics Data System (ADS)

Wang, Xu; Le, Anh-Thu; Yu, Chao; Lucchese, R. R.; Lin, C. D.

2016-03-01

We discuss a scheme to retrieve transient conformational molecular structure information using photoelectron angular distributions (PADs) that have averaged over partial alignments of isolated molecules. The photoelectron is pulled out from a localized inner-shell molecular orbital by an X-ray photon. We show that a transient change in the atomic positions from their equilibrium will lead to a sensitive change in the alignment-averaged PADs, which can be measured and used to retrieve the former. Exploiting the experimental convenience of changing the photon polarization direction, we show that it is advantageous to use PADs obtained from multiple photon polarization directions. A simple single-scattering model is proposed and benchmarked to describe the photoionization process and to do the retrieval using a multiple-parameter fitting method.
Phylo-mLogo: an interactive and hierarchical multiple-logo visualization tool for alignment of many sequences

PubMed Central

Shih, Arthur Chun-Chieh; Lee, DT; Peng, Chin-Lin; Wu, Yu-Wei

2007-01-01

Background When aligning several hundreds or thousands of sequences, such as epidemic virus sequences or homologous/orthologous sequences of some big gene families, to reconstruct the epidemiological history or their phylogenies, how to analyze and visualize the alignment results of many sequences has become a new challenge for computational biologists. Although there are several tools available for visualization of very long sequence alignments, few of them are applicable to the alignments of many sequences. Results A multiple-logo alignment visualization tool, called Phylo-mLogo, is presented in this paper. Phylo-mLogo calculates the variabilities and homogeneities of alignment sequences by base frequencies or entropies. Different from the traditional representations of sequence logos, Phylo-mLogo not only displays the global logo patterns of the whole alignment of multiple sequences, but also demonstrates their local homologous logos for each clade hierarchically. In addition, Phylo-mLogo also allows the user to focus only on the analysis of some important, structurally or functionally constrained sites in the alignment selected by the user or by built-in automatic calculation. Conclusion With Phylo-mLogo, the user can symbolically and hierarchically visualize hundreds of aligned sequences simultaneously and easily check the changes of their amino acid sites when analyzing many homologous/orthologous or influenza virus sequences. More information of Phylo-mLogo can be found at URL . PMID:17319966
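The entropy-based variability measure mentioned in this abstract can be sketched as follows: each alignment column gets a Shannon entropy, and a logo-style information content is obtained by subtracting it from the maximum possible entropy of the alphabet. This is a generic sequence-logo calculation without small-sample correction, not Phylo-mLogo's own code; the function names and the toy alignment are assumptions.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy in bits of one alignment column, ignoring gaps."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return None
    n = len(residues)
    return -sum((k / n) * math.log2(k / n) for k in Counter(residues).values())


def information_profile(alignment, alphabet_size=4):
    """Information content (bits) per column: log2(alphabet) - entropy.

    alphabet_size is 4 for nucleotides and 20 for amino acids.
    Columns made only of gaps yield None.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    max_bits = math.log2(alphabet_size)
    profile = []
    for i in range(length):
        h = column_entropy([row[i] for row in alignment])
        profile.append(None if h is None else max_bits - h)
    return profile


alignment = ["ACGT", "ACGA", "ACCA", "ATCA"]
print(information_profile(alignment))
```

Calling `information_profile` on the whole alignment and again on the subset of sequences belonging to one clade gives the global and per-clade views that a hierarchical logo display would contrast.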
Dinucleotide controlled null models for comparative RNA gene prediction.

PubMed

Gesell, Tanja; Washietl, Stefan

2008-05-27

Comparative prediction of RNA structures can be used to identify functional noncoding RNAs in genomic screens. It was shown recently by Babak et al. [BMC Bioinformatics. 8:33] that RNA gene prediction programs can be biased by the genomic dinucleotide content, in particular those programs using a thermodynamic folding model including stacking energies. As a consequence, there is a need for dinucleotide-preserving control strategies to assess the significance of such predictions. While there have been randomization algorithms for single sequences for many years, the problem has remained challenging for multiple alignments and there is currently no algorithm available. We present a program called SISSIz that simulates multiple alignments of a given average dinucleotide content. Meeting additional requirements of an accurate null model, the randomized alignments are on average of the same sequence diversity and preserve local conservation and gap patterns. We make use of a phylogenetic substitution model that includes overlapping dependencies and site-specific rates. Using fast heuristics and a distance based approach, a tree is estimated under this model which is used to guide the simulations. The new algorithm is tested on vertebrate genomic alignments and the effect on RNA structure predictions is studied. In addition, we directly combined the new null model with the RNAalifold consensus folding algorithm, giving a new variant of a thermodynamic structure based RNA gene finding program that is not biased by the dinucleotide content. SISSIz implements an efficient algorithm to randomize multiple alignments preserving dinucleotide content. It can be used to get more accurate estimates of false positive rates of existing programs, to produce negative controls for the training of machine learning based programs, or as a standalone RNA gene finding program. Other applications in comparative genomics that require randomization of multiple alignments can be considered. SISSIz is available as open source C code that can be compiled for every major platform and downloaded here: http://sourceforge.net/projects/sissiz.
Evolutionary profiles from the QR factorization of multiple sequence alignments

PubMed Central

Sethi, Anurag; O'Donoghue, Patrick; Luthey-Schulten, Zaida

2005-01-01

We present an algorithm to generate complete evolutionary profiles that represent the topology of the molecular phylogenetic tree of the homologous group. The method, based on the multidimensional QR factorization of numerically encoded multiple sequence alignments, removes redundancy from the alignments and orders the protein sequences by increasing linear dependence, resulting in the identification of a minimal basis set of sequences that spans the evolutionary space of the homologous group of proteins. We observe a general trend that these smaller, more evolutionarily balanced profiles have comparable and, in many cases, better performance in database searches than conventional profiles containing hundreds of sequences, constructed in an iterative and computationally intensive procedure. For more diverse families or superfamilies, with sequence identity <30%, structural alignments, based purely on the geometry of the protein structures, provide better alignments than pure sequence-based methods. Merging the structure and sequence information allows the construction of accurate profiles for distantly related groups. These structure-based profiles outperformed other sequence-based methods for finding distant homologs and were used to identify a putative class II cysteinyl-tRNA synthetase (CysRS) in several archaea that eluded previous annotation studies. Phylogenetic analysis showed the putative class II CysRSs to be a monophyletic group and homology modeling revealed a constellation of active site residues similar to that in the known class I CysRS. PMID:15741270
RNA-TVcurve: a Web server for RNA secondary structure comparison based on a multi-scale similarity of its triple vector curve representation.

PubMed

Li, Ying; Shi, Xiaohu; Liang, Yanchun; Xie, Juan; Zhang, Yu; Ma, Qin

2017-01-21

RNAs have been found to carry diverse functionalities in nature. Inferring the similarity between two given RNAs is a fundamental step to understand and interpret their functional relationship. The majority of functional RNAs show conserved secondary structures, rather than sequence conservation. Those algorithms relying on sequence-based features usually have limitations in their prediction performance. Hence, integrating RNA structure features is very critical for RNA analysis. Existing algorithms mainly fall into two categories: alignment-based and alignment-free. The alignment-free algorithms of RNA comparison usually have lower time complexity than alignment-based algorithms. An alignment-free RNA comparison algorithm was proposed, in which novel numerical representations RNA-TVcurve (triple vector curve representation) of RNA sequence and corresponding secondary structure features are provided. Then a multi-scale similarity score of two given RNAs was designed based on wavelet decomposition of their numerical representation. In support of RNA mutation and phylogenetic analysis, a web server (RNA-TVcurve) was designed based on this alignment-free RNA comparison algorithm. It provides three functional modules: 1) visualization of numerical representation of RNA secondary structure; 2) detection of single-point mutation based on secondary structure; and 3) comparison of pairwise and multiple RNA secondary structures. The inputs of the web server require RNA primary sequences, while corresponding secondary structures are optional. For the primary sequences alone, the web server can compute the secondary structures using free energy minimization algorithm in terms of RNAfold tool from Vienna RNA package. RNA-TVcurve is the first integrated web server, based on an alignment-free method, to deliver a suite of RNA analysis functions, including visualization, mutation analysis and multiple RNAs structure comparison. 
Comparison with two popular RNA comparison tools, RNApdist and RNAdistance, showed that RNA-TVcurve can efficiently capture subtle relationships among RNAs for mutation detection and non-coding RNA classification. All the relevant results are shown in an intuitive graphical manner, and can be freely downloaded from this server. RNA-TVcurve, along with test examples and detailed documents, is available at: http://ml.jlu.edu.cn/tvcurve/.
R3D-2-MSA: the RNA 3D structure-to-multiple sequence alignment server

PubMed Central

Cannone, Jamie J.; Sweeney, Blake A.; Petrov, Anton I.; Gutell, Robin R.; Zirbel, Craig L.; Leontis, Neocles

2015-01-01

The RNA 3D Structure-to-Multiple Sequence Alignment Server (R3D-2-MSA) is a new web service that seamlessly links RNA three-dimensional (3D) structures to high-quality RNA multiple sequence alignments (MSAs) from diverse biological sources. In this first release, R3D-2-MSA provides manual and programmatic access to curated, representative ribosomal RNA sequence alignments from bacterial, archaeal, eukaryal and organellar ribosomes, using nucleotide numbers from representative atomic-resolution 3D structures. A web-based front end is available for manual entry and an Application Program Interface for programmatic access. Users can specify up to five ranges of nucleotides and 50 nucleotide positions per range. The R3D-2-MSA server maps these ranges to the appropriate columns of the corresponding MSA and returns the contents of the columns, either for display in a web browser or in JSON format for subsequent programmatic use. The browser output page provides a 3D interactive display of the query, a full list of sequence variants with taxonomic information and a statistical summary of distinct sequence variants found. The output can be filtered and sorted in the browser. Previous user queries can be viewed at any time by resubmitting the output URL, which encodes the search and re-generates the results. The service is freely available with no login requirement at http://rna.bgsu.edu/r3d-2-msa. PMID:26048960
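The range-to-column mapping described in this abstract can be illustrated with a small sketch. It does not call the R3D-2-MSA API, whose endpoints and JSON layout are not given here. It assumes, for simplicity, that nucleotide numbers are consecutive 1-based positions along one reference row of the alignment; the names `column_index` and `extract_columns` are hypothetical, and only the two limits (five ranges, 50 positions per range) come from the abstract.

```python
MAX_RANGES = 5                 # limit stated in the abstract
MAX_POSITIONS_PER_RANGE = 50   # limit stated in the abstract


def column_index(aligned_row, gap="-"):
    """Map 1-based residue numbers of one aligned row to 0-based MSA columns."""
    mapping = {}
    residue_number = 0
    for col, ch in enumerate(aligned_row):
        if ch != gap:
            residue_number += 1
            mapping[residue_number] = col
    return mapping


def extract_columns(msa, ref_row, ranges):
    """Return, for every row, the characters in the columns selected by `ranges`.

    `ranges` is a list of inclusive (start, stop) residue numbers on the
    reference row `msa[ref_row]` (hypothetical input format).
    """
    if len(ranges) > MAX_RANGES:
        raise ValueError("at most five ranges are allowed")
    col_of = column_index(msa[ref_row])
    cols = []
    for start, stop in ranges:
        if stop - start + 1 > MAX_POSITIONS_PER_RANGE:
            raise ValueError("at most 50 positions per range are allowed")
        cols.extend(col_of[pos] for pos in range(start, stop + 1))
    return ["".join(row[c] for c in cols) for row in msa]


if __name__ == "__main__":
    msa = ["AC-GU", "ACAGU", "A--GC"]
    print(extract_columns(msa, ref_row=0, ranges=[(2, 3)]))
```

Real 3D structures often use non-consecutive numbering with insertion codes, so a production mapping would need an explicit table from structure numbering to alignment columns rather than the counting shown here.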

JDet: interactive calculation and visualization of function-related conservation patterns in multiple sequence alignments and structures.

PubMed

Muth, Thilo; García-Martín, Juan A; Rausell, Antonio; Juan, David; Valencia, Alfonso; Pazos, Florencio

2012-02-15

We have implemented in a single package all the features required for extracting, visualizing and manipulating fully conserved positions as well as those with a family-dependent conservation pattern in multiple sequence alignments. The program allows users, among other things, to run different methods for extracting these positions, combine the results and visualize them in protein 3D structures and sequence spaces. JDet is a multiplatform application written in Java. It is freely available, including the source code, at http://csbg.cnb.csic.es/JDet. The package includes two of our recently developed programs for detecting functional positions in protein alignments (Xdet and S3Det), and support for other methods can be added as plug-ins. A help file and a guided tutorial for JDet are also available.
Retrieving transient conformational molecular structure information from inner-shell photoionization of laser-aligned molecules

DOE PAGES

Wang, Xu; Le, Anh-Thu; Yu, Chao; ...

2016-03-30

We discuss a scheme to retrieve transient conformational molecular structure information using photoelectron angular distributions (PADs) that have averaged over partial alignments of isolated molecules. The photoelectron is pulled out from a localized inner-shell molecular orbital by an X-ray photon. We show that a transient change in the atomic positions from their equilibrium will lead to a sensitive change in the alignment-averaged PADs, which can be measured and used to retrieve the former. Exploiting the experimental convenience of changing the photon polarization direction, we show that it is advantageous to use PADs obtained from multiple photon polarization directions. Lastly, a simple single-scattering model is proposed and benchmarked to describe the photoionization process and to do the retrieval using a multiple-parameter fitting method.
Aligning the unalignable: bacteriophage whole genome alignments.

PubMed

Bérard, Sèverine; Chateau, Annie; Pompidor, Nicolas; Guertin, Paul; Bergeron, Anne; Swenson, Krister M

2016-01-13

In recent years, many studies have focused on the description and comparison of large sets of related bacteriophage genomes. Due to the peculiar mosaic structure of these genomes, few informative approaches for comparing whole genomes exist: dot plot diagrams give a mostly qualitative assessment of the similarity/dissimilarity between two or more genomes, and clustering techniques are used to classify genomes. Multiple alignments are conspicuously absent from this scene. Indeed, whole genome aligners interpret lack of similarity between sequences as an indication of rearrangements, insertions, or losses. This behavior makes them ill-prepared to align bacteriophage genomes, where even closely related strains can accomplish the same biological function with highly dissimilar sequences. In this paper, we propose a multiple alignment strategy that exploits functional collinearity shared by related strains of bacteriophages, and uses partial orders to capture mosaicism of sets of genomes. As classical alignments do, the computed alignments can be used to predict that genes have the same biological function, even in the absence of detectable similarity. The Alpha aligner implements these ideas in visual interactive displays, and is used to compute several examples of alignments of Staphylococcus aureus and Mycobacterium bacteriophages, involving up to 29 genomes. Using these datasets, we prove that Alpha alignments are at least as good as those computed by standard aligners. Comparison with the progressive Mauve aligner - which implements a partial order strategy, but whose alignments are linearized - shows a greatly improved interactive graphic display, while avoiding misalignments. Multiple alignments of whole bacteriophage genomes work, and will become an important conceptual and visual tool in comparative genomics of sets of related strains.
A python implementation of Alpha, along with installation instructions for Ubuntu and OSX, is available on bitbucket (https://bitbucket.org/thekswenson/alpha).
Structural phylogeny by profile extraction and multiple superimposition using electrostatic congruence as a discriminator

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chakraborty, Sandeep; Rao, Basuthkar J.; Baker, Nathan A.

2013-04-01

Phylogenetic analysis of proteins using multiple sequence alignment (MSA) assumes an underlying evolutionary relationship in these proteins which occasionally remains undetected due to considerable sequence divergence. Structural alignment programs have been developed to unravel such fuzzy relationships. However, none of these structure based methods have used electrostatic properties to discriminate between spatially equivalent residues. We present a methodology for MSA of a set of related proteins with known structures using electrostatic properties as an additional discriminator (STEEP). STEEP first extracts a profile, then generates a multiple structural superimposition providing a consolidated spatial framework for comparing residues and finally emits the MSA. Residues that are aligned differently by including or excluding electrostatic properties can be targeted by directed evolution experiments to transform the enzymatic properties of one protein into another. We have compared STEEP results to those obtained from a MSA program (ClustalW) and a structural alignment method (MUSTANG) for chymotrypsin serine proteases. Subsequently, we used PhyML to generate phylogenetic trees for the serine and metallo-β-lactamase superfamilies from the STEEP generated MSA, and corroborated the accepted relationships in these superfamilies. We have observed that STEEP acts as a functional classifier when electrostatic congruence is used as a discriminator, and thus identifies potential targets for directed evolution experiments. In summary, STEEP is unique among phylogenetic methods for its ability to use electrostatic congruence to specify mutations that might be the source of the functional divergence in a protein family. Based on our results, we also hypothesize that the active site and its close vicinity contains enough information to infer the correct phylogeny for related proteins.
Sequence to Structure (S2S): display, manipulate and interconnect RNA data from sequence to structure.

PubMed

Jossinet, Fabrice; Westhof, Eric

2005-08-01

Efficient RNA sequence manipulations (such as multiple alignments) need to be constrained by rules of RNA structure folding. The structural knowledge has increased dramatically in recent years with the accumulation of several large RNA structures similar to those of the bacterial ribosome subunits. However, no tool in the RNA community provides an easy way to link and integrate progress made at the sequence level using the available three-dimensional information. Sequence to Structure (S2S) proposes a framework in which a user can easily display, manipulate and interconnect heterogeneous RNA data, such as multiple sequence alignments, secondary and tertiary structures. S2S has been implemented using the Java language and has been developed and tested under UNIX systems, such as Linux and MacOSX. S2S is available at http://bioinformatics.org/S2S/.
AntiClustal: Multiple Sequence Alignment by antipole clustering and linear approximate 1-median computation.

PubMed

Di Pietro, C; Di Pietro, V; Emmanuele, G; Ferro, A; Maugeri, T; Modica, E; Pigola, G; Pulvirenti, A; Purrello, M; Ragusa, M; Scalia, M; Shasha, D; Travali, S; Zimmitti, V

2003-01-01

In this paper we present a new Multiple Sequence Alignment (MSA) algorithm called AntiClusAl. The method makes use of the commonly used idea of aligning homologous sequences belonging to classes generated by some clustering algorithm, and then continuing the alignment process in a bottom-up way along a suitable tree structure. The final result is then read at the root of the tree. Multiple sequence alignment in each cluster makes use of the progressive alignment with the 1-median (center) of the cluster. The 1-median of a set S of sequences is the element of S which minimizes the average distance from any other sequence in S. Its exact computation requires quadratic time. The basic idea of our proposed algorithm is to make use of a simple and natural algorithmic technique based on randomized tournaments which has been successfully applied to large size search problems in general metric spaces. In particular a clustering algorithm called Antipole tree and an approximate linear 1-median computation are used. Compared with Clustal W, a widely used tool for MSA, our algorithm shows better running time with fully comparable alignment quality. A successful biological application showing high amino acid conservation during evolution of Xenopus laevis SOD2 is also cited.
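The 1-median definition and the randomized-tournament idea in this abstract can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the Hamming distance stands in for a real sequence distance, and the function names, the `group_size` parameter and the fixed random seed are assumptions made for the example.

```python
import random


def hamming(a, b):
    """Toy distance: number of mismatches between equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def exact_one_median(seqs, dist):
    """Element minimizing the summed (hence average) distance to all others.

    Needs O(n^2) distance evaluations for n sequences.
    """
    return min(seqs, key=lambda s: sum(dist(s, t) for t in seqs))


def tournament_one_median(seqs, dist, group_size=3, rng=None):
    """Approximate 1-median via a randomized tournament.

    Sequences are shuffled into small groups, the exact 1-median of each
    group advances, and rounds repeat until one small group remains. For a
    fixed group size the number of distance evaluations grows linearly in n.
    """
    if group_size < 2:
        raise ValueError("group_size must be at least 2")
    rng = rng or random.Random(0)
    pool = list(seqs)
    while len(pool) > group_size:
        rng.shuffle(pool)
        pool = [exact_one_median(pool[i:i + group_size], dist)
                for i in range(0, len(pool), group_size)]
    return exact_one_median(pool, dist)


if __name__ == "__main__":
    seqs = ["AAAA", "AAAT", "AATT", "ATTT", "TTTT"]
    print(exact_one_median(seqs, hamming))
    print(tournament_one_median(seqs, hamming))
```

The tournament result is always a member of the input set but is not guaranteed to equal the exact 1-median; that is the accuracy traded for the reduced running time.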
YAHA: fast and flexible long-read alignment with optimal breakpoint detection.

PubMed

Faust, Gregory G; Hall, Ira M

2012-10-01

With improved short-read assembly algorithms and the recent development of long-read sequencers, split mapping will soon be the preferred method for structural variant (SV) detection. Yet, current alignment tools are not well suited for this. We present YAHA, a fast and flexible hash-based aligner. YAHA is as fast and accurate as BWA-SW at finding the single best alignment per query and is dramatically faster and more sensitive than both SSAHA2 and MegaBLAST at finding all possible alignments. Unlike other aligners that report all, or one, alignment per query, or that use simple heuristics to select alignments, YAHA uses a directed acyclic graph to find the optimal set of alignments that cover a query using a biologically relevant breakpoint penalty. YAHA can also report multiple mappings per defined segment of the query. We show that YAHA detects more breakpoints in less time than BWA-SW across all SV classes, and especially excels at complex SVs comprising multiple breakpoints. YAHA is currently supported on 64-bit Linux systems. Binaries and sample data are freely available for download from http://faculty.virginia.edu/irahall/YAHA. imh4y@virginia.edu.
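The abstract does not give YAHA's graph construction or its penalty function, so the following is only a generic sketch of the underlying idea: treat candidate alignments as nodes of a directed acyclic graph, connect alignments that can follow one another on the query, and find the highest-scoring path while charging a fixed cost per breakpoint. The `Aln` record, `best_chain` and the constant `breakpoint_penalty` are hypothetical names chosen for illustration.

```python
from typing import NamedTuple


class Aln(NamedTuple):
    q_start: int   # query start, inclusive
    q_end: int     # query end, exclusive
    score: float   # alignment score


def best_chain(alns, breakpoint_penalty):
    """Highest-scoring chain of non-overlapping alignments along the query.

    Sorting by query start gives a topological order of the DAG, because an
    edge i -> j exists only when alignment i ends at or before the start of j.
    """
    order = sorted(alns, key=lambda a: (a.q_start, a.q_end))
    if not order:
        return 0.0, []
    best = [a.score for a in order]   # best chain score ending at each node
    prev = [-1] * len(order)          # back-pointers for chain recovery
    for j, cur in enumerate(order):
        for i in range(j):
            if order[i].q_end <= cur.q_start:
                cand = best[i] - breakpoint_penalty + cur.score
                if cand > best[j]:
                    best[j], prev[j] = cand, i
    end = max(range(len(order)), key=best.__getitem__)
    chain = []
    while end != -1:
        chain.append(order[end])
        end = prev[end]
    return best[chain and order.index(chain[0])], chain[::-1]


if __name__ == "__main__":
    alns = [Aln(0, 50, 40), Aln(50, 100, 45), Aln(0, 100, 60), Aln(40, 100, 50)]
    print(best_chain(alns, breakpoint_penalty=10))
    print(best_chain(alns, breakpoint_penalty=30))
```

With a small penalty the split mapping of two partial alignments wins; with a larger penalty the single full-length alignment is preferred, which is the trade-off a breakpoint penalty is meant to control.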
SNAPPI-DB: a database and API of Structures, iNterfaces and Alignments for Protein–Protein Interactions

PubMed Central

Jefferson, Emily R.; Walsh, Thomas P.; Roberts, Timothy J.; Barton, Geoffrey J.

2007-01-01

SNAPPI-DB, a high performance database of Structures, iNterfaces and Alignments of Protein–Protein Interactions, and its associated Java Application Programming Interface (API) is described. SNAPPI-DB contains structural data, down to the level of atom co-ordinates, for each structure in the Protein Data Bank (PDB) together with associated data including SCOP, CATH, Pfam, SWISSPROT, InterPro, GO terms, Protein Quaternary Structures (PQS) and secondary structure information. Domain–domain interactions are stored for multiple domain definitions and are classified by their Superfamily/Family pair and interaction interface. Each set of classified domain–domain interactions has an associated multiple structure alignment for each partner. The API facilitates data access via PDB entries, domains and domain–domain interactions. Rapid development, fast database access and the ability to perform advanced queries without the requirement for complex SQL statements are provided via an object oriented database and the Java Data Objects (JDO) API. SNAPPI-DB contains many features which are not available in other databases of structural protein–protein interactions. It has been applied in three studies on the properties of protein–protein interactions and is currently being employed to train a protein–protein interaction predictor and a functional residue predictor. The database, API and manual are available for download at: . PMID:17202171
R3D-2-MSA: the RNA 3D structure-to-multiple sequence alignment server.

PubMed

Cannone, Jamie J; Sweeney, Blake A; Petrov, Anton I; Gutell, Robin R; Zirbel, Craig L; Leontis, Neocles

2015-07-01

The RNA 3D Structure-to-Multiple Sequence Alignment Server (R3D-2-MSA) is a new web service that seamlessly links RNA three-dimensional (3D) structures to high-quality RNA multiple sequence alignments (MSAs) from diverse biological sources. In this first release, R3D-2-MSA provides manual and programmatic access to curated, representative ribosomal RNA sequence alignments from bacterial, archaeal, eukaryal and organellar ribosomes, using nucleotide numbers from representative atomic-resolution 3D structures. A web-based front end is available for manual entry and an Application Program Interface for programmatic access. Users can specify up to five ranges of nucleotides and 50 nucleotide positions per range. The R3D-2-MSA server maps these ranges to the appropriate columns of the corresponding MSA and returns the contents of the columns, either for display in a web browser or in JSON format for subsequent programmatic use. The browser output page provides a 3D interactive display of the query, a full list of sequence variants with taxonomic information and a statistical summary of distinct sequence variants found. The output can be filtered and sorted in the browser. Previous user queries can be viewed at any time by resubmitting the output URL, which encodes the search and re-generates the results. The service is freely available with no login requirement at http://rna.bgsu.edu/r3d-2-msa. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Information Compression, Multiple Alignment, and the Representation and Processing of Knowledge in the Brain

PubMed Central

Wolff, J. Gerard

2016-01-01

The SP theory of intelligence, with its realization in the SP computer model, aims to simplify and integrate observations and concepts across artificial intelligence, mainstream computing, mathematics, and human perception and cognition, with information compression as a unifying theme. This paper describes how abstract structures and processes in the theory may be realized in terms of neurons, their interconnections, and the transmission of signals between neurons. This part of the SP theory—SP-neural—is a tentative and partial model for the representation and processing of knowledge in the brain. Empirical support for the SP theory—outlined in the paper—provides indirect support for SP-neural. In the abstract part of the SP theory (SP-abstract), all kinds of knowledge are represented with patterns, where a pattern is an array of atomic symbols in one or two dimensions. In SP-neural, the concept of a “pattern” is realized as an array of neurons called a pattern assembly, similar to Hebb's concept of a “cell assembly” but with important differences. Central to the processing of information in SP-abstract is information compression via the matching and unification of patterns (ICMUP) and, more specifically, information compression via the powerful concept of multiple alignment, borrowed and adapted from bioinformatics. Processes such as pattern recognition, reasoning and problem solving are achieved via the building of multiple alignments, while unsupervised learning is achieved by creating patterns from sensory information and also by creating patterns from multiple alignments in which there is a partial match between one pattern and another. It is envisaged that, in SP-neural, short-lived neural structures equivalent to multiple alignments will be created via an inter-play of excitatory and inhibitory neural signals. 
It is also envisaged that unsupervised learning will be achieved by the creation of pattern assemblies from sensory information and from the neural equivalents of multiple alignments, much as in the non-neural SP theory—and significantly different from the “Hebbian” kinds of learning which are widely used in the kinds of artificial neural network that are popular in computer science. The paper discusses several associated issues, with relevant empirical evidence. PMID:27857695
Information Compression, Multiple Alignment, and the Representation and Processing of Knowledge in the Brain.

PubMed

Wolff, J Gerard

2016-01-01

The SP theory of intelligence, with its realization in the SP computer model, aims to simplify and integrate observations and concepts across artificial intelligence, mainstream computing, mathematics, and human perception and cognition, with information compression as a unifying theme. This paper describes how abstract structures and processes in the theory may be realized in terms of neurons, their interconnections, and the transmission of signals between neurons. This part of the SP theory, SP-neural, is a tentative and partial model for the representation and processing of knowledge in the brain. Empirical support for the SP theory, outlined in the paper, provides indirect support for SP-neural. In the abstract part of the SP theory (SP-abstract), all kinds of knowledge are represented with patterns, where a pattern is an array of atomic symbols in one or two dimensions. In SP-neural, the concept of a "pattern" is realized as an array of neurons called a pattern assembly, similar to Hebb's concept of a "cell assembly" but with important differences. Central to the processing of information in SP-abstract is information compression via the matching and unification of patterns (ICMUP) and, more specifically, information compression via the powerful concept of multiple alignment, borrowed and adapted from bioinformatics. Processes such as pattern recognition, reasoning and problem solving are achieved via the building of multiple alignments, while unsupervised learning is achieved by creating patterns from sensory information and also by creating patterns from multiple alignments in which there is a partial match between one pattern and another. It is envisaged that, in SP-neural, short-lived neural structures equivalent to multiple alignments will be created via an inter-play of excitatory and inhibitory neural signals.
It is also envisaged that unsupervised learning will be achieved by the creation of pattern assemblies from sensory information and from the neural equivalents of multiple alignments, much as in the non-neural SP theory, and significantly different from the "Hebbian" kinds of learning which are widely used in the kinds of artificial neural network that are popular in computer science. The paper discusses several associated issues, with relevant empirical evidence.
Genome alignment with graph data structures: a comparison

PubMed Central

2014-01-01

Background: Recent advances in rapid, low-cost sequencing have opened up the opportunity to study complete genome sequences. The computational approach of multiple genome alignment allows investigation of evolutionarily related genomes in an integrated fashion, providing a basis for downstream analyses such as rearrangement studies and phylogenetic inference. Graphs have proven to be a powerful tool for coping with the complexity of genome-scale sequence alignments. The potential of graphs to intuitively represent all aspects of genome alignments led to the development of graph-based approaches for genome alignment. These approaches construct a graph from a set of local alignments, and derive a genome alignment through identification and removal of graph substructures that indicate errors in the alignment. Results: We compare the structures of commonly used graphs in terms of their abilities to represent alignment information. We describe how the graphs can be transformed into each other, and identify and classify graph substructures common to one or more graphs. Based on previous approaches, we compile a list of modifications that remove these substructures. Conclusion: We show that crucial pieces of alignment information, associated with inversions and duplications, are not visible in the structure of all graphs. If we neglect vertex or edge labels, the graphs differ in their information content. Still, many ideas are shared among all graph-based approaches. Based on these findings, we outline a conceptual framework for graph-based genome alignment that can assist in the development of future genome alignment tools. PMID:24712884
Improved measurements of RNA structure conservation with generalized centroid estimators.

PubMed

Okada, Yohei; Saito, Yutaka; Sato, Kengo; Sakakibara, Yasubumi

2011-01-01

Identification of non-protein-coding RNAs (ncRNAs) in genomes is a crucial task for not only molecular cell biology but also bioinformatics. Secondary structures of ncRNAs are employed as a key feature of ncRNA analysis since biological functions of ncRNAs are deeply related to their secondary structures. Although the minimum free energy (MFE) structure of an RNA sequence is regarded as the most stable structure, MFE alone is not an appropriate measure for identifying ncRNAs since the free energy is heavily biased by the nucleotide composition. Therefore, instead of MFE itself, several alternative measures for identifying ncRNAs have been proposed such as the structure conservation index (SCI) and the base pair distance (BPD), both of which employ MFE structures. However, these measurements are unfortunately not suitable for identifying ncRNAs in some cases, including genome-wide search, and incur a high false discovery rate. In this study, we propose improved measurements based on SCI and BPD, applying generalized centroid estimators to incorporate robustness against low quality multiple alignments. Our experiments show that our proposed methods achieve higher accuracy than the original SCI and BPD for not only human-curated structural alignments but also low quality alignments produced by CLUSTAL W. Furthermore, the centroid-based SCI on CLUSTAL W alignments is more accurate than or comparable with the original SCI on structural alignments generated with RAF, a high quality structural aligner that requires twice the computation time on average. We conclude that our methods are more suitable than the original SCI and BPD for genome-wide alignments, which are of low quality from the point of view of secondary structures.
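The two baseline measures named in this abstract can be illustrated with a short sketch. It uses the commonly cited definitions, which the abstract itself does not spell out: the base pair distance between two structures is the number of base pairs present in one but not the other, and the SCI is the consensus folding energy divided by the mean of the individual MFEs. The energies are assumed to come from an external folding tool, the function names are hypothetical, and the centroid-based variants proposed in the paper are not reproduced here.

```python
def base_pairs(dot_bracket):
    """Parse a dot-bracket string into a set of (i, j) base pairs."""
    stack, pairs = [], set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            pairs.add((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return pairs


def base_pair_distance(s1, s2):
    """Number of base pairs found in exactly one of the two structures."""
    if len(s1) != len(s2):
        raise ValueError("structures must have equal length")
    return len(base_pairs(s1) ^ base_pairs(s2))


def structure_conservation_index(consensus_energy, single_energies):
    """SCI: consensus folding energy over the mean single-sequence MFE.

    Energies are negative in kcal/mol, so values near 1 suggest a conserved
    structure and values near 0 suggest none.
    """
    mean_mfe = sum(single_energies) / len(single_energies)
    return consensus_energy / mean_mfe


if __name__ == "__main__":
    print(base_pair_distance("((..))", "(....)"))
    print(structure_conservation_index(-10.0, [-12.0, -8.0]))
```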
Evolutionarily conserved regions and hydrophobic contacts at the superfamily level: The case of the fold-type I, pyridoxal-5′-phosphate-dependent enzymes

PubMed Central

Paiardini, Alessandro; Bossa, Francesco; Pascarella, Stefano

2004-01-01

The wealth of biological information provided by structural and genomic projects opens new prospects of understanding life and evolution at the molecular level. In this work, it is shown how computational approaches can be exploited to pinpoint protein structural features that remain invariant over long evolutionary periods in the fold-type I, PLP-dependent enzymes. A nonredundant set of 23 superposed crystallographic structures belonging to this superfamily was built. Members of this family typically display high structural conservation despite low sequence identity. For each structure, a multiple-sequence alignment of orthologous sequences was obtained, and the 23 alignments were merged using the structural information to obtain a comprehensive multiple alignment of 921 sequences of fold-type I enzymes. The structurally conserved regions (SCRs), the evolutionarily conserved residues, and the conserved hydrophobic contacts (CHCs) were extracted from this data set, using both sequence and structural information. The results of this study identified a structural pattern of hydrophobic contacts shared by all of the superfamily members of fold-type I enzymes and involved in native interactions. This profile highlights the presence of a nucleus for this fold, in which residues participating in the most conserved native interactions exhibit preferential evolutionary conservation, that correlates significantly (r = 0.70) with the extent of mean hydrophobic contact value of their apolar fraction. PMID:15498941
Design of multiple sequence alignment algorithms on parallel, distributed memory supercomputers.

PubMed

Church, Philip C; Goscinski, Andrzej; Holt, Kathryn; Inouye, Michael; Ghoting, Amol; Makarychev, Konstantin; Reumann, Matthias

2011-01-01

The challenge of comparing two or more genomes that have undergone recombination and substantial amounts of segmental loss and gain has recently been addressed for small numbers of genomes. However, datasets of hundreds of genomes are now common and their sizes will only increase in the future. Multiple sequence alignment of hundreds of genomes remains an intractable problem due to quadratic increases in compute time and memory footprint. To date, most alignment algorithms are designed for commodity clusters without parallelism. Hence, we propose the design of a multiple sequence alignment algorithm on massively parallel, distributed memory supercomputers to enable research into comparative genomics on large data sets. Following the methodology of the sequential progressiveMauve algorithm, we design data structures including sequences and sorted k-mer lists on the IBM Blue Gene/P supercomputer (BG/P). Preliminary results show that we can reduce the memory footprint so that we can potentially align over 250 bacterial genomes on a single BG/P compute node. We verify our results on a dataset of E. coli, Shigella and S. pneumoniae genomes. Our implementation returns results matching those of the original algorithm but in 1/2 the time and with 1/4 the memory footprint for scaffold building. In this study, we have laid the basis for multiple sequence alignment of large-scale datasets on a massively parallel, distributed memory supercomputer, thus enabling comparison of hundreds instead of a few genome sequences within reasonable time.
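A sorted k-mer list, one of the data structures mentioned in this abstract, can be sketched in a few lines. This is a plain single-machine illustration with contiguous k-mers; it makes no attempt to reproduce the seed patterns, memory layout or Blue Gene/P parallelization of the implementation described, and the function names are hypothetical.

```python
def sorted_kmer_list(seq, k):
    """All (k-mer, position) pairs of `seq`, sorted lexicographically."""
    return sorted((seq[i:i + k], i) for i in range(len(seq) - k + 1))


def shared_kmers(list_a, list_b):
    """Merge two sorted k-mer lists and report every shared k-mer occurrence.

    Because both lists are sorted, one linear pass with two cursors finds
    all matches without comparing every k-mer against every other.
    """
    matches = []
    i = j = 0
    while i < len(list_a) and j < len(list_b):
        kmer_a, kmer_b = list_a[i][0], list_b[j][0]
        if kmer_a < kmer_b:
            i += 1
        elif kmer_a > kmer_b:
            j += 1
        else:
            # Collect the full run of this k-mer in each list.
            i_end = i
            while i_end < len(list_a) and list_a[i_end][0] == kmer_a:
                i_end += 1
            j_end = j
            while j_end < len(list_b) and list_b[j_end][0] == kmer_a:
                j_end += 1
            for _, pos_a in list_a[i:i_end]:
                for _, pos_b in list_b[j:j_end]:
                    matches.append((kmer_a, pos_a, pos_b))
            i, j = i_end, j_end
    return matches


if __name__ == "__main__":
    a = sorted_kmer_list("ACGTAC", 3)
    b = sorted_kmer_list("TTACGT", 3)
    print(shared_kmers(a, b))
```

Shared k-mers found this way are the kind of exact matches that seed-and-extend aligners use as anchors before building longer local alignments.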
Sub-Diffraction Limited Writing based on Laser Induced Periodic Surface Structures (LIPSS).

PubMed

He, Xiaolong; Datta, Anurup; Nam, Woongsik; Traverso, Luis M; Xu, Xianfan

2016-10-10

Controlled fabrication of single and multiple nanostructures far below the diffraction limit using a method based on laser induced periodic surface structure (LIPSS) is presented. In typical LIPSS, multiple lines with a certain spatial periodicity, but often not well-aligned, were produced. In this work, well-controlled and aligned nanowires and nanogrooves with widths as small as 40 nm and 60 nm with desired orientation and length are fabricated. Moreover, single nanowire and nanogroove were fabricated based on the same mechanism for forming multiple, periodic structures. Combining numerical modeling and AFM/SEM analyses, it was found these nanostructures were formed through the interference between the incident laser radiation and the surface plasmons, the mechanism for forming LIPSS on a dielectric surface using a high power femtosecond laser. We expect that our method, in particular, the fabrication of single nanowires and nanogrooves could be a promising alternative for fabrication of nanoscale devices due to its simplicity, flexibility, and versatility.
Sub-Diffraction Limited Writing based on Laser Induced Periodic Surface Structures (LIPSS)

PubMed Central

He, Xiaolong; Datta, Anurup; Nam, Woongsik; Traverso, Luis M.; Xu, Xianfan

2016-01-01

Controlled fabrication of single and multiple nanostructures far below the diffraction limit using a method based on laser induced periodic surface structure (LIPSS) is presented. In typical LIPSS, multiple lines with a certain spatial periodicity, but often not well-aligned, were produced. In this work, well-controlled and aligned nanowires and nanogrooves with widths as small as 40 nm and 60 nm with desired orientation and length are fabricated. Moreover, single nanowire and nanogroove were fabricated based on the same mechanism for forming multiple, periodic structures. Combining numerical modeling and AFM/SEM analyses, it was found these nanostructures were formed through the interference between the incident laser radiation and the surface plasmons, the mechanism for forming LIPSS on a dielectric surface using a high power femtosecond laser. We expect that our method, in particular, the fabrication of single nanowires and nanogrooves could be a promising alternative for fabrication of nanoscale devices due to its simplicity, flexibility, and versatility. PMID:27721428
A statistically harmonized alignment-classification in image space enables accurate and robust alignment of noisy images in single particle analysis.

PubMed

Kawata, Masaaki; Sato, Chikara

2007-06-01

In determining the three-dimensional (3D) structure of macromolecular assemblies in single particle analysis, a large representative dataset of two-dimensional (2D) average images from a huge number of raw images is key to high resolution. Because alignments prior to averaging are computationally intensive, currently available multireference alignment (MRA) software does not survey every possible alignment. This leads to misaligned images, creating blurred averages and reducing the quality of the final 3D reconstruction. We present a new method, in which multireference alignment is harmonized with classification (multireference multiple alignment: MRMA). This method enables a statistical comparison of multiple alignment peaks, reflecting the similarities between each raw image and a set of reference images. Among the selected alignment candidates for each raw image, misaligned images are statistically excluded, based on the principle that aligned raw images of similar projections have a dense distribution around the correctly aligned coordinates in image space. This newly developed method was examined for accuracy and speed using model image sets with various signal-to-noise ratios, and with electron microscope images of the Transient Receptor Potential C3 and the sodium channel. In every data set, the newly developed method outperformed conventional methods in robustness against noise and in speed, creating 2D average images of higher quality. This statistically harmonized alignment-classification combination should greatly improve the quality of single particle analysis.
Four RNA families with functional transient structures

PubMed Central

Zhu, Jing Yun A; Meyer, Irmtraud M

2015-01-01

Protein-coding and non-coding RNA transcripts perform a wide variety of cellular functions in diverse organisms. Several of their functional roles are expressed and modulated via RNA structure. A given transcript, however, can have more than a single functional RNA structure throughout its life, a fact which has been previously overlooked. Transient RNA structures, for example, are only present during specific time intervals and cellular conditions. We here introduce four RNA families with transient RNA structures that play distinct and diverse functional roles. Moreover, we show that these transient RNA structures are structurally well-defined and evolutionarily conserved. Since Rfam annotates one structure for each family, there is either no annotation for these transient structures or no such family. Thus, our alignments either significantly update and extend the existing Rfam families or introduce a new RNA family to Rfam. For each of the four RNA families, we compile a multiple-sequence alignment based on experimentally verified transient and dominant (dominant in terms of either the thermodynamic stability and/or attention received so far) RNA secondary structures using a combination of automated search via covariance model and manual curation. The first alignment is the Trp operon leader which regulates the operon transcription in response to tryptophan abundance through alternative structures. The second alignment is the HDV ribozyme which we extend to the 5′ flanking sequence. This flanking sequence is involved in the regulation of the transcript's self-cleavage activity. The third alignment is the 5′ UTR of the maturation protein from Levivirus which contains a transient structure that temporarily postpones the formation of the final inhibitory structure to allow translation of maturation protein. The fourth and last alignment is the SAM riboswitch which regulates the downstream gene expression by assuming alternative structures upon binding of SAM. 
All transient and dominant structures are mapped to our new alignments introduced here. PMID:25751035
Four RNA families with functional transient structures.

PubMed

Zhu, Jing Yun A; Meyer, Irmtraud M

2015-01-01

Protein-coding and non-coding RNA transcripts perform a wide variety of cellular functions in diverse organisms. Several of their functional roles are expressed and modulated via RNA structure. A given transcript, however, can have more than a single functional RNA structure throughout its life, a fact which has been previously overlooked. Transient RNA structures, for example, are only present during specific time intervals and cellular conditions. We here introduce four RNA families with transient RNA structures that play distinct and diverse functional roles. Moreover, we show that these transient RNA structures are structurally well-defined and evolutionarily conserved. Since Rfam annotates one structure for each family, there is either no annotation for these transient structures or no such family. Thus, our alignments either significantly update and extend the existing Rfam families or introduce a new RNA family to Rfam. For each of the four RNA families, we compile a multiple-sequence alignment based on experimentally verified transient and dominant (dominant in terms of either the thermodynamic stability and/or attention received so far) RNA secondary structures using a combination of automated search via covariance model and manual curation. The first alignment is the Trp operon leader which regulates the operon transcription in response to tryptophan abundance through alternative structures. The second alignment is the HDV ribozyme which we extend to the 5' flanking sequence. This flanking sequence is involved in the regulation of the transcript's self-cleavage activity. The third alignment is the 5' UTR of the maturation protein from Levivirus which contains a transient structure that temporarily postpones the formation of the final inhibitory structure to allow translation of maturation protein. The fourth and last alignment is the SAM riboswitch which regulates the downstream gene expression by assuming alternative structures upon binding of SAM. 
All transient and dominant structures are mapped to our new alignments introduced here.

Diffeomorphic functional brain surface alignment: Functional demons.

PubMed

Nenning, Karl-Heinz; Liu, Hesheng; Ghosh, Satrajit S; Sabuncu, Mert R; Schwartz, Ernst; Langs, Georg

2017-08-01

Aligning brain structures across individuals is a central prerequisite for comparative neuroimaging studies. Typically, registration approaches assume a strong association between the features used for alignment, such as macro-anatomy, and the variable observed, such as functional activation or connectivity. Here, we propose to use the structure of intrinsic resting state fMRI signal correlation patterns as a basis for alignment of the cortex in functional studies. Rather than assuming the spatial correspondence of functional structures between subjects, we have identified locations with similar connectivity profiles across subjects. We mapped functional connectivity relationships within the brain into an embedding space, and aligned the resulting maps of multiple subjects. We then performed a diffeomorphic alignment of the cortical surfaces, driven by the corresponding features in the joint embedding space. Results show that functional alignment based on resting state fMRI identifies functionally homologous regions across individuals with higher accuracy than alignment based on the spatial correspondence of anatomy. Further, functional alignment enables measurement of the strength of the anatomo-functional link across the cortex, and reveals the uneven distribution of this link. Stronger anatomo-functional dissociation was found in higher association areas compared to primary sensory- and motor areas. Functional alignment based on resting state features improves group analysis of task based functional MRI data, increasing statistical power and improving the delineation of task-specific core regions. Finally, a comparison of the anatomo-functional dissociation between cohorts is demonstrated with a group of left and right handed subjects. Copyright © 2017 Elsevier Inc. All rights reserved.
A simplified implementation of edge detection in MATLAB is faster and more sensitive than fast fourier transform for actin fiber alignment quantification.

PubMed

Kemeny, Steven Frank; Clyne, Alisa Morss

2011-04-01

Fiber alignment plays a critical role in the structure and function of cells and tissues. While fiber alignment quantification is important to experimental analysis and several different methods for quantifying fiber alignment exist, many studies focus on qualitative rather than quantitative analysis, perhaps due to the complexity of current fiber alignment methods. Speed and sensitivity were compared in edge detection and fast Fourier transform (FFT) for measuring actin fiber alignment in cells exposed to shear stress. While edge detection using matrix multiplication was consistently more sensitive than FFT, image processing time was significantly longer. However, when MATLAB functions were used to implement edge detection, MATLAB's efficient element-by-element calculations and fast filtering techniques reduced computation cost 100-fold compared to the matrix multiplication edge detection method. The new computation time was comparable to the FFT method, and MATLAB edge detection produced well-distributed fiber angle distributions that statistically distinguished aligned and unaligned fibers in half as many sample images. When the FFT sensitivity was improved by dividing images into smaller subsections, processing time grew larger than the time required for MATLAB edge detection. Implementation of edge detection in MATLAB is simpler, faster, and more sensitive than FFT for fiber alignment quantification.
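As a rough illustration of gradient-based fiber orientation analysis, the following Python sketch builds a magnitude-weighted histogram of local edge orientations. It is not the authors' MATLAB implementation: the Sobel kernels, the function name `fiber_angle_histogram`, and the default 10-degree bin width are assumptions chosen for this example.

```python
import math


def fiber_angle_histogram(image, n_bins=18):
    """Magnitude-weighted histogram of edge orientations in [0, 180) degrees.

    `image` is a 2D list of pixel intensities. Hypothetical helper, not the
    published method.
    """
    rows, cols = len(image), len(image[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Sobel estimates of the horizontal and vertical intensity gradients.
            gx = (image[r - 1][c + 1] + 2 * image[r][c + 1] + image[r + 1][c + 1]
                  - image[r - 1][c - 1] - 2 * image[r][c - 1] - image[r + 1][c - 1])
            gy = (image[r + 1][c - 1] + 2 * image[r + 1][c] + image[r + 1][c + 1]
                  - image[r - 1][c - 1] - 2 * image[r - 1][c] - image[r - 1][c + 1])
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue  # flat region, no edge information
            # An edge (fiber) runs perpendicular to the intensity gradient.
            angle = (math.degrees(math.atan2(gy, gx)) + 90.0) % 180.0
            hist[int(angle // bin_width) % n_bins] += magnitude
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Synthetic image of vertical stripes, standing in for aligned fibers.
stripes = [[1 if (c // 2) % 2 == 0 else 0 for c in range(8)] for _ in range(8)]
histogram = fiber_angle_histogram(stripes)
dominant_bin = max(range(len(histogram)), key=histogram.__getitem__)
```

A strongly peaked histogram suggests aligned fibers, while a flat one suggests random orientation; any statistical test applied on top of the histogram is left out here.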
Assessment of Homology Templates and an Anesthetic Binding Site within the γ-Aminobutyric Acid Receptor

PubMed Central

Bertaccini, Edward J.; Yoluk, Ozge; Lindahl, Erik R.; Trudell, James R.

2013-01-01

Background: Anesthetics mediate portions of their activity via modulation of the γ-aminobutyric acid receptor (GABAaR). While its molecular structure remains unknown, significant progress has been made towards understanding its interactions with anesthetics via molecular modeling. Methods: The structure of the torpedo acetylcholine receptor (nAChRα), the structures of the α4 and β2 subunits of the human nAChR, the structures of the eukaryotic glutamate-gated chloride channel (GluCl), and the prokaryotic pH sensing channels, from Gloeobacter violaceus and Erwinia chrysanthemi, were aligned with the SAlign and 3DMA algorithms. A multiple sequence alignment from these structures and those of the GABAaR was performed with ClustalW. The Modeler and Rosetta algorithms independently created three-dimensional constructs of the GABAaR from the GluCl template. The CDocker algorithm docked a congeneric series of propofol derivatives into the binding pocket and scored calculated binding affinities for correlation with known GABAaR potentiation EC50s. Results: Multiple structure alignments of templates revealed a clear consensus of residue locations relevant to anesthetic effects except for torpedo nAChR. Within the GABAaR models generated from GluCl, the residues notable for modulating anesthetic action within transmembrane segments 1, 2, and 3 converged on the intersubunit interface between alpha and beta subunits. Docking scores of a propofol derivative series into this binding site showed strong linear correlation with GABAaR potentiation EC50. Conclusion: Consensus structural alignment based on homologous templates revealed an intersubunit anesthetic binding cavity within the transmembrane domain of the GABAaR, which showed correlation of ligand docking scores with experimentally measured GABAaR potentiation. PMID:23770602
Assessment of homology templates and an anesthetic binding site within the γ-aminobutyric acid receptor.

PubMed

Bertaccini, Edward J; Yoluk, Ozge; Lindahl, Erik R; Trudell, James R

2013-11-01

Anesthetics mediate portions of their activity via modulation of the γ-aminobutyric acid receptor (GABAaR). Although its molecular structure remains unknown, significant progress has been made toward understanding its interactions with anesthetics via molecular modeling. The structure of the torpedo acetylcholine receptor (nAChRα), the structures of the α4 and β2 subunits of the human nAChR, the structures of the eukaryotic glutamate-gated chloride channel (GluCl), and the prokaryotic pH-sensing channels, from Gloeobacter violaceus and Erwinia chrysanthemi, were aligned with the SAlign and 3DMA algorithms. A multiple sequence alignment from these structures and those of the GABAaR was performed with ClustalW. The Modeler and Rosetta algorithms independently created three-dimensional constructs of the GABAaR from the GluCl template. The CDocker algorithm docked a congeneric series of propofol derivatives into the binding pocket and scored calculated binding affinities for correlation with known GABAaR potentiation EC50s. Multiple structure alignments of templates revealed a clear consensus of residue locations relevant to anesthetic effects except for torpedo nAChR. Within the GABAaR models generated from GluCl, the residues notable for modulating anesthetic action within transmembrane segments 1, 2, and 3 converged on the intersubunit interface between α and β subunits. Docking scores of a propofol derivative series into this binding site showed strong linear correlation with GABAaR potentiation EC50. Consensus structural alignment based on homologous templates revealed an intersubunit anesthetic binding cavity within the transmembrane domain of the GABAaR, which showed a correlation of ligand docking scores with experimentally measured GABAaR potentiation.
Protein contact prediction by integrating deep multiple sequence alignments, coevolution and machine learning.

PubMed

Adhikari, Badri; Hou, Jie; Cheng, Jianlin

2018-03-01

In this study, we report the evaluation of the residue-residue contacts predicted by our three different methods in the CASP12 experiment, focusing on studying the impact of multiple sequence alignment, residue coevolution, and machine learning on contact prediction. The first method (MULTICOM-NOVEL) uses only traditional features (sequence profile, secondary structure, and solvent accessibility) with deep learning to predict contacts and serves as a baseline. The second method (MULTICOM-CONSTRUCT) uses our new alignment algorithm to generate deep multiple sequence alignments to derive coevolution-based features, which are integrated by a neural network method to predict contacts. The third method (MULTICOM-CLUSTER) is a consensus combination of the predictions of the first two methods. We evaluated our methods on 94 CASP12 domains. On a subset of 38 free-modeling domains, our methods achieved an average precision of up to 41.7% for top L/5 long-range contact predictions. The comparison of the three methods shows that the quality and effective depth of multiple sequence alignments, coevolution-based features, and machine learning integration of coevolution-based features and traditional features drive the quality of predicted protein contacts. On the full CASP12 dataset, the coevolution-based features alone can improve the average precision from 28.4% to 41.6%, and the machine learning integration of all the features further raises the precision to 56.3%, when top L/5 predicted long-range contacts are evaluated. The correlation between the precision of contact prediction and the logarithm of the number of effective sequences in alignments is 0.66. © 2017 Wiley Periodicals, Inc.
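The "top L/5 long-range" precision quoted above can be made concrete with a short Python sketch. The abstract does not spell out the evaluation details, so the following are assumptions for illustration only: the function name, the input layout, a minimum sequence separation of 24 residues for "long-range" (a common convention in contact assessment), and dividing by the number of contacts actually selected.

```python
def top_l5_long_range_precision(predicted, true_contacts, seq_len, min_sep=24):
    """Precision of the L/5 highest-scoring long-range predicted contacts.

    predicted     : list of (i, j, score) residue-pair predictions
    true_contacts : set of (i, j) pairs with i < j observed in the structure
    seq_len       : sequence length L
    min_sep       : assumed minimum |i - j| for a pair to count as long-range
    """
    # Keep only long-range pairs, normalised so that i < j.
    long_range = [(min(i, j), max(i, j), score)
                  for i, j, score in predicted
                  if abs(i - j) >= min_sep]
    # Rank by predicted score, highest first, and keep the top L/5.
    long_range.sort(key=lambda pair: pair[2], reverse=True)
    top = long_range[:max(1, seq_len // 5)]
    if not top:
        return 0.0
    hits = sum(1 for i, j, _ in top if (i, j) in true_contacts)
    return hits / len(top)


# Toy example with a short sequence, so the separation threshold is lowered.
toy_predictions = [(0, 9, 0.90), (1, 2, 0.95), (2, 8, 0.80), (0, 5, 0.70)]
toy_truth = {(0, 9), (0, 5)}
toy_precision = top_l5_long_range_precision(toy_predictions, toy_truth,
                                            seq_len=10, min_sep=3)
```

In the toy case the short-range pair (1, 2) is discarded despite its high score, and the two best remaining pairs are scored against the true contact set.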
De novo identification of highly diverged protein repeats by probabilistic consistency.

PubMed

Biegert, A; Söding, J

2008-03-15

An estimated 25% of all eukaryotic proteins contain repeats, which underlines the importance of duplication for evolving new protein functions. Internal repeats often correspond to structural or functional units in proteins. Methods capable of identifying diverged repeated segments or domains at the sequence level can therefore assist in predicting domain structures, inferring hypotheses about function and mechanism, and investigating the evolution of proteins from smaller fragments. We present HHrepID, a method for the de novo identification of repeats in protein sequences. It is able to detect the sequence signature of structural repeats in many proteins that have not yet been known to possess internal sequence symmetry, such as outer membrane beta-barrels. HHrepID uses HMM-HMM comparison to exploit evolutionary information in the form of multiple sequence alignments of homologs. In contrast to a previous method, the new method (1) generates a multiple alignment of repeats; (2) utilizes the transitive nature of homology through a novel merging procedure with fully probabilistic treatment of alignments; (3) improves alignment quality through an algorithm that maximizes the expected accuracy; (4) is able to identify different kinds of repeats within complex architectures by a probabilistic domain boundary detection method and (5) improves sensitivity through a new approach to assess statistical significance. Server: http://toolkit.tuebingen.mpg.de/hhrepid; Executables: ftp://ftp.tuebingen.mpg.de/pub/protevo/HHrepID
Vertical decomposition with Genetic Algorithm for Multiple Sequence Alignment

PubMed Central

2011-01-01

Background: Many bioinformatics studies begin with a multiple sequence alignment as the foundation for their research. This is because multiple sequence alignment can be a useful technique for studying molecular evolution and analyzing sequence structure relationships. Results: In this paper, we have proposed a Vertical Decomposition with Genetic Algorithm (VDGA) for Multiple Sequence Alignment (MSA). In VDGA, we divide the sequences vertically into two or more subsequences, and then solve them individually using a guide tree approach. Finally, we combine all the subsequences to generate a new multiple sequence alignment. This technique is applied to the solutions of the initial generation and of each child generation within VDGA. We have used two mechanisms to generate an initial population in this research: the first mechanism is to generate guide trees with randomly selected sequences and the second is shuffling the sequences inside such trees. Two different genetic operators have been implemented with VDGA. To test the performance of our algorithm, we have compared it with existing well-known methods, namely PRRP, CLUSTALX, DIALIGN, HMMT, SB_PIMA, ML_PIMA, MULTALIGN, and PILEUP8, and also other methods, based on Genetic Algorithms (GA), such as SAGA, MSA-GA and RBT-GA, by solving a number of benchmark datasets from BAliBase 2.0. Conclusions: The experimental results showed that the VDGA with three vertical divisions was the most successful variant for most of the test cases in comparison to other divisions considered with VDGA. The experimental results also confirmed that VDGA outperformed the other methods considered in this research. PMID:21867510
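The divide, solve, and recombine step described above can be sketched in Python as follows. This shows only the vertical decomposition and the recombination; the genetic algorithm and the guide-tree aligner are not reproduced. The proportional cut points, the function names, and the `pad_align` stand-in (which merely right-pads with gaps) are assumptions made for this sketch, and a real block aligner would be passed in through `align_block`.

```python
def split_vertically(seqs, parts):
    """Cut every sequence into `parts` consecutive pieces at proportional positions."""
    blocks = []
    for p in range(parts):
        block = []
        for s in seqs:
            start = len(s) * p // parts
            end = len(s) * (p + 1) // parts
            block.append(s[start:end])
        blocks.append(block)
    return blocks


def pad_align(block):
    """Stand-in aligner (assumption): right-pad each piece with gaps to equal width."""
    width = max(len(s) for s in block)
    return [s + "-" * (width - len(s)) for s in block]


def vertical_decomposition_align(seqs, parts, align_block=pad_align):
    """Align each vertical block independently, then concatenate the blocks row by row."""
    aligned_blocks = [align_block(b) for b in split_vertically(seqs, parts)]
    return ["".join(pieces) for pieces in zip(*aligned_blocks)]


sequences = ["ACGTAC", "ACGA", "AGTTACG"]
alignment = vertical_decomposition_align(sequences, parts=2)
```

Because each block is aligned to a common width before concatenation, the combined rows always have equal length, and removing the gap characters recovers the input sequences.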
The ConSurf-DB: pre-calculated evolutionary conservation profiles of protein structures.

PubMed

Goldenberg, Ofir; Erez, Elana; Nimrod, Guy; Ben-Tal, Nir

2009-01-01

ConSurf-DB is a repository for evolutionary conservation analysis of the proteins of known structures in the Protein Data Bank (PDB). Sequence homologues of each of the PDB entries were collected and aligned using standard methods. The evolutionary conservation of each amino acid position in the alignment was calculated using the Rate4Site algorithm, implemented in the ConSurf web server. The algorithm takes into account the phylogenetic relations between the aligned proteins and the stochastic nature of the evolutionary process explicitly. Rate4Site assigns a conservation level for each position in the multiple sequence alignment using an empirical Bayesian inference. Visual inspection of the conservation patterns on the 3D structure often enables the identification of key residues that comprise the functionally important regions of the protein. The repository is updated with the latest PDB entries on a monthly basis and will be rebuilt annually. ConSurf-DB is available online at http://consurfdb.tau.ac.il/
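To make per-column conservation scoring tangible, here is a deliberately simplified Python sketch. It is not Rate4Site: it ignores phylogeny and uses no Bayesian inference, substituting a plain Shannon-entropy score per alignment column. The function name and the normalisation by log2(20) are assumptions chosen for the example.

```python
import math
from collections import Counter


def column_conservation(alignment):
    """Return 1 - normalised Shannon entropy for each column (1.0 = invariant).

    `alignment` is a list of equal-length aligned protein sequences; '-' marks
    a gap. Simplified stand-in, not the Rate4Site algorithm.
    """
    scores = []
    for col in range(len(alignment[0])):
        residues = [row[col] for row in alignment if row[col] != "-"]
        if not residues:
            scores.append(0.0)  # all-gap column carries no information
            continue
        total = len(residues)
        entropy = 0.0
        for count in Counter(residues).values():
            p = count / total
            entropy -= p * math.log2(p)
        # Normalise by the maximum entropy over 20 amino acids.
        scores.append(1.0 - entropy / math.log2(20))
    return scores


toy_alignment = ["ACD", "ACE", "AGF", "A-G"]
toy_scores = column_conservation(toy_alignment)
```

In the toy alignment the first column is invariant and scores highest, while the third column, with four different residues, scores lowest.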
The ConSurf-DB: pre-calculated evolutionary conservation profiles of protein structures

PubMed Central

Goldenberg, Ofir; Erez, Elana; Nimrod, Guy; Ben-Tal, Nir

2009-01-01

ConSurf-DB is a repository for evolutionary conservation analysis of the proteins of known structures in the Protein Data Bank (PDB). Sequence homologues of each of the PDB entries were collected and aligned using standard methods. The evolutionary conservation of each amino acid position in the alignment was calculated using the Rate4Site algorithm, implemented in the ConSurf web server. The algorithm takes into account the phylogenetic relations between the aligned proteins and the stochastic nature of the evolutionary process explicitly. Rate4Site assigns a conservation level for each position in the multiple sequence alignment using an empirical Bayesian inference. Visual inspection of the conservation patterns on the 3D structure often enables the identification of key residues that comprise the functionally important regions of the protein. The repository is updated with the latest PDB entries on a monthly basis and will be rebuilt annually. ConSurf-DB is available online at http://consurfdb.tau.ac.il/ PMID:18971256
Rapid detection, classification and accurate alignment of up to a million or more related protein sequences.

PubMed

Neuwald, Andrew F

2009-08-01

The patterns of sequence similarity and divergence present within functionally diverse, evolutionarily related proteins contain implicit information about corresponding biochemical similarities and differences. A first step toward accessing such information is to statistically analyze these patterns, which, in turn, requires that one first identify and accurately align a very large set of protein sequences. Ideally, the set should include many distantly related, functionally divergent subgroups. Because it is extremely difficult, if not impossible for fully automated methods to align such sequences correctly, researchers often resort to manual curation based on detailed structural and biochemical information. However, multiply-aligning vast numbers of sequences in this way is clearly impractical. This problem is addressed using Multiply-Aligned Profiles for Global Alignment of Protein Sequences (MAPGAPS). The MAPGAPS program uses a set of multiply-aligned profiles both as a query to detect and classify related sequences and as a template to multiply-align the sequences. It relies on Karlin-Altschul statistics for sensitivity and on PSI-BLAST (and other) heuristics for speed. Using as input a carefully curated multiple-profile alignment for P-loop GTPases, MAPGAPS correctly aligned weakly conserved sequence motifs within 33 distantly related GTPases of known structure. By comparison, the sequence- and structurally based alignment methods hmmalign and PROMALS3D misaligned at least 11 and 23 of these regions, respectively. When applied to a dataset of 65 million protein sequences, MAPGAPS identified, classified and aligned (with comparable accuracy) nearly half a million putative P-loop GTPase sequences. A C++ implementation of MAPGAPS is available at http://mapgaps.igs.umaryland.edu. Supplementary data are available at Bioinformatics online.
Passively aligned multichannel fiber-pigtailing of planar integrated optical waveguides

NASA Astrophysics Data System (ADS)

Kremmel, Johannes; Lamprecht, Tobias; Crameri, Nino; Michler, Markus

2017-02-01

A silicon device to simplify the coupling of multiple single-mode fibers to embedded single-mode waveguides has been developed. The silicon device features alignment structures that enable a passive alignment of fibers to integrated waveguides. For passive alignment, precisely machined V-grooves on a silicon device are used and the planar lightwave circuit board features high-precision structures acting as a mechanical stop. The approach has been tested for up to eight fiber-to-waveguide connections. The alignment approach, the design, and the fabrication of the silicon device as well as the assembly process are presented. The characterization of the fiber-to-waveguide link reveals total coupling losses of (0.45 ± 0.20) dB per coupling interface, which is significantly lower than the values reported in earlier works. Subsequent climate tests reveal that the coupling losses remain stable during thermal cycling but increase significantly during an 85°C/85% RH test. All applied fabrication and bonding steps have been performed using standard MOEMS fabrication and packaging processes.
PF2fit: Polar Fast Fourier Matched Alignment of Atomistic Structures with 3D Electron Microscopy Maps.

PubMed

Bettadapura, Radhakrishna; Rasheed, Muhibur; Vollrath, Antje; Bajaj, Chandrajit

2015-10-01

There continue to be increasing occurrences of both atomistic structure models in the PDB (possibly reconstructed from X-ray diffraction or NMR data), and 3D reconstructed cryo-electron microscopy (3D EM) maps (albeit at coarser resolution) of the same or homologous molecule or molecular assembly, deposited in the EMDB. To obtain the best possible structural model of the molecule at the best achievable resolution, and without any missing gaps, one typically aligns (matches and fits) the atomistic structure model with the 3D EM map. We discuss a new algorithm and generalized framework, named PF2fit (Polar Fast Fourier Fitting) for the best possible structural alignment of atomistic structures with 3D EM. While PF2fit enables only a rigid, six dimensional (6D) alignment method, it augments prior work on 6D X-ray structure and 3D EM alignment in multiple ways: Scoring. PF2fit includes a new scoring scheme that, in addition to rewarding overlaps between the volumes occupied by the atomistic structure and 3D EM map, rewards overlaps between the volumes complementary to them. We quantitatively demonstrate how this new complementary scoring scheme improves upon existing approaches. PF2fit also includes two scoring functions, the non-uniform exterior penalty and the skeleton-secondary structure score, and implements the scattering potential score as an alternative to traditional Gaussian blurring. Search. PF2fit utilizes a fast polar Fourier search scheme, whose main advantage is the ability to search over uniformly and adaptively sampled subsets of the space of rigid-body motions. PF2fit also implements a new reranking search and scoring methodology that considerably improves alignment metrics in results obtained from the initial search.
PF2 fit: Polar Fast Fourier Matched Alignment of Atomistic Structures with 3D Electron Microscopy Maps

PubMed Central

Bettadapura, Radhakrishna; Rasheed, Muhibur; Vollrath, Antje; Bajaj, Chandrajit

2015-01-01

There continue to be increasing occurrences of both atomistic structure models in the PDB (possibly reconstructed from X-ray diffraction or NMR data), and 3D reconstructed cryo-electron microscopy (3D EM) maps (albeit at coarser resolution) of the same or homologous molecule or molecular assembly, deposited in the EMDB. To obtain the best possible structural model of the molecule at the best achievable resolution, and without any missing gaps, one typically aligns (matches and fits) the atomistic structure model with the 3D EM map. We discuss a new algorithm and generalized framework, named PF2 fit (Polar Fast Fourier Fitting) for the best possible structural alignment of atomistic structures with 3D EM. While PF2 fit enables only a rigid, six dimensional (6D) alignment method, it augments prior work on 6D X-ray structure and 3D EM alignment in multiple ways: Scoring. PF2 fit includes a new scoring scheme that, in addition to rewarding overlaps between the volumes occupied by the atomistic structure and 3D EM map, rewards overlaps between the volumes complementary to them. We quantitatively demonstrate how this new complementary scoring scheme improves upon existing approaches. PF2 fit also includes two scoring functions, the non-uniform exterior penalty and the skeleton-secondary structure score, and implements the scattering potential score as an alternative to traditional Gaussian blurring. Search. PF2 fit utilizes a fast polar Fourier search scheme, whose main advantage is the ability to search over uniformly and adaptively sampled subsets of the space of rigid-body motions. PF2 fit also implements a new reranking search and scoring methodology that considerably improves alignment metrics in results obtained from the initial search. PMID:26469938
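The idea of rewarding overlap between complementary volumes, in addition to overlap between occupied volumes, can be illustrated with a toy Python sketch on binary voxel masks. This is not the published scoring function: the flat mask representation, the function name, and the equal default weights are assumptions for illustration only.

```python
def complementary_overlap_score(model_mask, map_mask, w_inside=1.0, w_outside=1.0):
    """Score a placement of a model mask against a map mask.

    Both masks are equal-length sequences of 0/1 voxel occupancies (a flattened
    grid). The score rewards voxels occupied in both masks and, separately,
    voxels empty in both masks. Hypothetical helper, not the PF2fit score.
    """
    if len(model_mask) != len(map_mask):
        raise ValueError("masks must cover the same voxel grid")
    inside = sum(1 for a, b in zip(model_mask, map_mask) if a and b)
    outside = sum(1 for a, b in zip(model_mask, map_mask) if not a and not b)
    return w_inside * inside + w_outside * outside


em_map = [1, 1, 0, 0]
snug_model = [1, 1, 0, 0]      # occupies exactly the map volume
bloated_model = [1, 1, 1, 1]   # covers the map but also spills outside it

snug_score = complementary_overlap_score(snug_model, em_map)
bloated_score = complementary_overlap_score(bloated_model, em_map)
```

Both toy models overlap the map in the same two voxels, so an occupied-volume score alone cannot separate them; the complementary term favours the model that leaves the empty region empty.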
Protein structure modeling for CASP10 by multiple layers of global optimization.

PubMed

Joo, Keehyoung; Lee, Juyong; Sim, Sangjin; Lee, Sun Young; Lee, Kiho; Heo, Seungryong; Lee, In-Ho; Lee, Sung Jong; Lee, Jooyoung

2014-02-01

In the template-based modeling (TBM) category of the CASP10 experiment, we introduced a new protocol called protein modeling system (PMS) to generate accurate protein structures in terms of side-chains as well as backbone trace. In the new protocol, a global optimization algorithm, called conformational space annealing (CSA), is applied to the three layers of the TBM procedure: multiple sequence-structure alignment, 3D chain building, and side-chain re-modeling. For 3D chain building, we developed a new energy function which includes new distance restraint terms of Lorentzian type (derived from multiple templates), and new energy terms that combine (physical) energy terms such as dynamic fragment assembly (DFA) energy, DFIRE statistical potential energy, hydrogen bonding term, etc. These physical energy terms are expected to guide the structure modeling especially for loop regions where no template structures are available. In addition, we developed a new quality assessment method based on a random forest machine learning algorithm to screen templates, multiple alignments, and final models. For TBM targets of CASP10, we find that, due to the combination of three stages of CSA global optimizations and quality assessment, the modeling accuracy of PMS improves at each additional stage of the protocol. It is especially noteworthy that the side-chains of the final PMS models are far more accurate than the models in the intermediate steps. Copyright © 2013 Wiley Periodicals, Inc.
Simultaneous phylogeny reconstruction and multiple sequence alignment

PubMed Central

Yue, Feng; Shi, Jian; Tang, Jijun

2009-01-01

Background: A phylogeny is the evolutionary history of a group of organisms. To date, sequence data is still the most used data type for phylogenetic reconstruction. Before any sequences can be used for phylogeny reconstruction, they must be aligned, and the quality of the multiple sequence alignment has been shown to affect the quality of the inferred phylogeny. At the same time, all the current multiple sequence alignment programs use a guide tree to produce the alignment and experiments showed that good guide trees can significantly improve the multiple alignment quality. Results: We devise a new algorithm to simultaneously align multiple sequences and search for the phylogenetic tree that leads to the best alignment. We also implemented the algorithm as a C program package, which can handle both DNA and protein data and can use a simple cost model as well as complex substitution matrices, such as PAM250 or BLOSUM62. The performance of the new method is compared with that of other popular multiple sequence alignment tools, including widely used programs such as ClustalW and T-Coffee. Experimental results suggest that this method has good performance in terms of both phylogeny accuracy and alignment quality. Conclusion: We present an algorithm to align multiple sequences and reconstruct the phylogenies that minimize the alignment score, which is based on an efficient algorithm to solve the median problems for three sequences. Our extensive experiments suggest that this method is very promising and can produce high quality phylogenies and alignments. PMID:19208110
MACSIMS : multiple alignment of complete sequences information management system

PubMed Central

Thompson, Julie D; Muller, Arnaud; Waterhouse, Andrew; Procter, Jim; Barton, Geoffrey J; Plewniak, Frédéric; Poch, Olivier

2006-01-01

Background: In the post-genomic era, systems-level studies are being performed that seek to explain complex biological systems by integrating diverse resources from fields such as genomics, proteomics or transcriptomics. New information management systems are now needed for the collection, validation and analysis of the vast amount of heterogeneous data available. Multiple alignments of complete sequences provide an ideal environment for the integration of this information in the context of the protein family. Results: MACSIMS is a multiple alignment-based information management program that combines the advantages of both knowledge-based and ab initio sequence analysis methods. Structural and functional information is retrieved automatically from the public databases. In the multiple alignment, homologous regions are identified and the retrieved data is evaluated and propagated from known to unknown sequences with these reliable regions. In a large-scale evaluation, the specificity of the propagated sequence features is estimated to be >99%, i.e. very few false positive predictions are made. MACSIMS is then used to characterise mutations in a test set of 100 proteins that are known to be involved in human genetic diseases. The number of sequence features associated with these proteins was increased by 60%, compared to the features available in the public databases. An XML format output file allows automatic parsing of the MACSIMS results, while a graphical display using the JalView program allows manual analysis. Conclusion: MACSIMS is a new information management system that incorporates detailed analyses of protein families at the structural, functional and evolutionary levels. MACSIMS thus provides a unique environment that facilitates knowledge extraction and the presentation of the most pertinent information to the biologist. A web server and the source code are available at . PMID:16792820
NoFold: RNA structure clustering without folding or alignment.

PubMed

Middleton, Sarah A; Kim, Junhyong

2014-11-01

Structures that recur across multiple different transcripts, called structure motifs, often perform a similar function, for example, recruiting a specific RNA-binding protein that then regulates translation, splicing, or subcellular localization. Identifying common motifs between coregulated transcripts may therefore yield significant insight into their binding partners and mechanism of regulation. However, as most methods for clustering structures are based on folding individual sequences or doing many pairwise alignments, this results in a tradeoff between speed and accuracy that can be problematic for large-scale data sets. Here we describe a novel method for comparing and characterizing RNA secondary structures that does not require folding or pairwise alignment of the input sequences. Our method uses the idea of constructing a distance function between two objects by their respective distances to a collection of empirical examples or models, which in our case consists of 1973 Rfam family covariance models. Using this as a basis for measuring structural similarity, we developed a clustering pipeline called NoFold to automatically identify and annotate structure motifs within large sequence data sets. We demonstrate that NoFold can simultaneously identify multiple structure motifs with an average sensitivity of 0.80 and precision of 0.98 and generally exceeds the performance of existing methods. We also perform a cross-validation analysis of the entire set of Rfam families, achieving an average sensitivity of 0.57. We apply NoFold to identify motifs enriched in dendritically localized transcripts and report 213 enriched motifs, including both known and novel structures. © 2014 Middleton and Kim; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
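The idea of comparing two objects through their distances to a shared collection of reference models can be illustrated with a minimal sketch. The abstract does not state which score or metric NoFold uses, so the score values, the model names and the choice of Euclidean distance below are all assumptions made for illustration.

```python
import math

def score_profile(scores_by_model, model_ids):
    """Return one sequence's scores against each reference model, in a fixed order."""
    return [scores_by_model[m] for m in model_ids]

def profile_distance(profile_a, profile_b):
    """Euclidean distance between two score profiles of equal length (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))

# Hypothetical scores of three sequences against three reference models.
model_ids = ["model_1", "model_2", "model_3"]
scores = {
    "seq_a": {"model_1": 5.0, "model_2": 1.0, "model_3": 0.0},
    "seq_b": {"model_1": 4.0, "model_2": 1.0, "model_3": 0.0},
    "seq_c": {"model_1": 0.0, "model_2": 1.0, "model_3": 6.0},
}
profiles = {name: score_profile(s, model_ids) for name, s in scores.items()}

# Sequences with similar profiles are close; no folding or pairwise alignment is needed.
d_ab = profile_distance(profiles["seq_a"], profiles["seq_b"])
d_ac = profile_distance(profiles["seq_a"], profiles["seq_c"])
```

In this toy data seq_a and seq_b respond to the same model and end up close together, while seq_c lies far from both, which is the property a clustering step would exploit.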
BlockLogo: visualization of peptide and sequence motif conservation

PubMed Central

Olsen, Lars Rønn; Kudahl, Ulrich Johan; Simon, Christian; Sun, Jing; Schönbach, Christian; Reinherz, Ellis L.; Zhang, Guang Lan; Brusic, Vladimir

2013-01-01

BlockLogo is a web-server application for visualization of protein and nucleotide fragments, continuous protein sequence motifs, and discontinuous sequence motifs using calculation of block entropy from multiple sequence alignments. The user input consists of a multiple sequence alignment, selection of motif positions, type of sequence, and output format definition. The output has BlockLogo along with the sequence logo, and a table of motif frequencies. We deployed BlockLogo as an online application and have demonstrated its utility through examples that show visualization of T-cell epitopes and B-cell epitopes (both continuous and discontinuous). Our additional example shows a visualization and analysis of structural motifs that determine specificity of peptide binding to HLA-DR molecules. The BlockLogo server also employs selected experimentally validated prediction algorithms to enable on-the-fly prediction of MHC binding affinity to 15 common HLA class I and class II alleles as well as visual analysis of discontinuous epitopes from multiple sequence alignments. It enables the visualization and analysis of structural and functional motifs that are usually described as regular expressions. It provides a compact view of discontinuous motifs composed of distant positions within biological sequences. BlockLogo is available at: http://research4.dfci.harvard.edu/cvc/blocklogo/ and http://methilab.bu.edu/blocklogo/ PMID:24001880
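As a rough illustration of block entropy, the sketch below treats the residues at the selected alignment columns as a single block per sequence and computes the Shannon entropy of the block frequencies. The abstract gives no formula, so this definition, the function name `block_entropy` and the toy alignment are assumptions rather than BlockLogo's actual implementation.

```python
import math
from collections import Counter

def block_entropy(alignment, positions):
    """Shannon entropy (in bits) of the blocks formed by the selected columns."""
    # One block per sequence: the residues at the chosen (possibly discontinuous) columns.
    blocks = ["".join(seq[p] for p in positions) for seq in alignment]
    counts = Counter(blocks)
    total = len(blocks)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Toy alignment of four equal-length sequences (hypothetical data).
alignment = ["ACDE", "ACDE", "AGDE", "TGDK"]

# Discontinuous motif made of columns 1 and 3 (0-based).
h_block = block_entropy(alignment, [1, 3])

# A fully conserved column carries no entropy.
h_conserved = block_entropy(alignment, [2])
```

Scoring the block as a whole, rather than summing per-column entropies, keeps track of which residues co-occur across distant positions.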
Towards Long-Range RNA Structure Prediction in Eukaryotic Genes.

PubMed

Pervouchine, Dmitri D

2018-06-15

The ability to form an intramolecular structure plays a fundamental role in eukaryotic RNA biogenesis. Proximate regions in the primary transcripts fold into a local secondary structure, which is then hierarchically assembled into a tertiary structure that is stabilized by RNA-binding proteins and long-range intramolecular base pairings. While the local RNA structure can be predicted reasonably well for short sequences, long-range structure at the scale of eukaryotic genes remains problematic from the computational standpoint. The aim of this review is to list functional examples of long-range RNA structures, to summarize current comparative methods of structure prediction, and to highlight their advances and limitations in the context of long-range RNA structures. Most comparative methods implement the “first-align-then-fold” principle, i.e., they operate on multiple sequence alignments, while functional RNA structures often reside in non-conserved parts of the primary transcripts. The opposite “first-fold-then-align” approach is currently explored to a much lesser extent. Developing novel methods in both directions will improve the performance of comparative RNA structure analysis and help discover novel long-range structures, their higher-order organization, and RNA-RNA interactions across the transcriptome.
Self-assembly of vertically aligned quantum ring-dot structure by Multiple Droplet Epitaxy

NASA Astrophysics Data System (ADS)

Elborg, Martin; Noda, Takeshi; Mano, Takaaki; Kuroda, Takashi; Yao, Yuanzhao; Sakuma, Yoshiki; Sakoda, Kazuaki

2017-11-01

We successfully grow vertically aligned quantum ring-dot structures by Multiple Droplet Epitaxy technique. The growth is achieved by depositing GaAs quantum rings in a first droplet epitaxy process which are subsequently covered by a thin AlGaAs barrier. In a second droplet epitaxy process, Ga droplets preferentially position in the center indentation of the ring as well as attached to the edge of the ring in $[1\bar{1}0]$ direction. By designing the ring geometry, full selectivity for the center position of the ring is achieved where we crystallize the droplets into quantum dots. The geometry of the ring and dot as well as barrier layer can be controlled in separate growth steps. This technique offers great potential for creating complex quantum molecules for novel quantum information technologies.

HAL: a hierarchical format for storing and analyzing multiple genome alignments.

PubMed

Hickey, Glenn; Paten, Benedict; Earl, Dent; Zerbino, Daniel; Haussler, David

2013-05-15

Large multiple genome alignments and inferred ancestral genomes are ideal resources for comparative studies of molecular evolution, and advances in sequencing and computing technology are making them increasingly obtainable. These structures can provide a rich understanding of the genetic relationships between all subsets of species they contain. Current formats for storing genomic alignments, such as XMFA and MAF, are all indexed or ordered using a single reference genome, however, which limits the information that can be queried with respect to other species and clades. This loss of information grows with the number of species under comparison, as well as their phylogenetic distance. We present HAL, a compressed, graph-based hierarchical alignment format for storing multiple genome alignments and ancestral reconstructions. HAL graphs are indexed on all genomes they contain. Furthermore, they are organized phylogenetically, which allows for modular and parallel access to arbitrary subclades without fragmentation because of rearrangements that have occurred in other lineages. HAL graphs can be created or read with a comprehensive C++ API. A set of tools is also provided to perform basic operations, such as importing and exporting data, identifying mutations and coordinate mapping (liftover). All documentation and source code for the HAL API and tools are freely available at http://github.com/glennhickey/hal. Contact: hickey@soe.ucsc.edu or haussler@soe.ucsc.edu. Supplementary data are available at Bioinformatics online.
Alignment of gold nanorods by angular photothermal depletion

DOE Office of Scientific and Technical Information (OSTI.GOV)

Taylor, Adam B.; Chow, Timothy T. Y.; Chon, James W. M., E-mail: jchon@swin.edu.au

2014-02-24

In this paper, we demonstrate that a high degree of alignment can be imposed upon randomly oriented gold nanorod films by angular photothermal depletion with linearly polarized laser irradiation. The photothermal reshaping of gold nanorods is observed to follow a quadratic melting model rather than the threshold melting model, which distorts the angular and spectral hole created on the 2D distribution map of nanorods to be an open crater shape. We have accounted for these observations in the alignment procedures and demonstrated good agreement between experiment and simulations. The use of multiple laser depletion wavelengths allowed alignment criteria over a large range of aspect ratios, achieving 80% of the rods in the target angular range. We extend the technique to demonstrate post-alignment in a multilayer of randomly oriented gold nanorod films, with arbitrary control of alignment shown across the layers. Photothermal angular depletion alignment of gold nanorods is a simple, promising post-alignment method for creating future 3D or multilayer plasmonic nanorod based devices and structures.
Template-based protein structure modeling using the RaptorX web server.

PubMed

Källberg, Morten; Wang, Haipeng; Wang, Sheng; Peng, Jian; Wang, Zhiyong; Lu, Hui; Xu, Jinbo

2012-07-19

A key challenge of modern biology is to uncover the functional role of the protein entities that compose cellular proteomes. To this end, the availability of reliable three-dimensional atomic models of proteins is often crucial. This protocol presents a community-wide web-based method using RaptorX (http://raptorx.uchicago.edu/) for protein secondary structure prediction, template-based tertiary structure modeling, alignment quality assessment and sophisticated probabilistic alignment sampling. RaptorX distinguishes itself from other servers by the quality of the alignment between a target sequence and one or multiple distantly related template proteins (especially those with sparse sequence profiles) and by a novel nonlinear scoring function and a probabilistic-consistency algorithm. Consequently, RaptorX delivers high-quality structural models for many targets with only remote templates. At present, it takes RaptorX ~35 min to finish processing a sequence of 200 amino acids. Since its official release in August 2011, RaptorX has processed ~6,000 sequences submitted by ~1,600 users from around the world.
Template-based protein structure modeling using the RaptorX web server

PubMed Central

Källberg, Morten; Wang, Haipeng; Wang, Sheng; Peng, Jian; Wang, Zhiyong; Lu, Hui; Xu, Jinbo

2016-01-01

A key challenge of modern biology is to uncover the functional role of the protein entities that compose cellular proteomes. To this end, the availability of reliable three-dimensional atomic models of proteins is often crucial. This protocol presents a community-wide web-based method using RaptorX (http://raptorx.uchicago.edu/) for protein secondary structure prediction, template-based tertiary structure modeling, alignment quality assessment and sophisticated probabilistic alignment sampling. RaptorX distinguishes itself from other servers by the quality of the alignment between a target sequence and one or multiple distantly related template proteins (especially those with sparse sequence profiles) and by a novel nonlinear scoring function and a probabilistic-consistency algorithm. Consequently, RaptorX delivers high-quality structural models for many targets with only remote templates. At present, it takes RaptorX ~35 min to finish processing a sequence of 200 amino acids. Since its official release in August 2011, RaptorX has processed ~6,000 sequences submitted by ~1,600 users from around the world. PMID:22814390
MANGO: a new approach to multiple sequence alignment.

PubMed

Zhang, Zefeng; Lin, Hao; Li, Ming

2007-01-01

Multiple sequence alignment is a classical and challenging task for biological sequence analysis. The problem is NP-hard. The full dynamic programming takes too much time. The progressive alignment heuristics adopted by most state-of-the-art multiple sequence alignment programs suffer from the 'once a gap, always a gap' phenomenon. Is there a radically new way to do multiple sequence alignment? This paper introduces a novel and orthogonal multiple sequence alignment method, using multiple optimized spaced seeds and new algorithms to handle these seeds efficiently. Our new algorithm processes information of all sequences as a whole, avoiding problems caused by the popular progressive approaches. Because the optimized spaced seeds are provably significantly more sensitive than the consecutive k-mers, the new approach promises to be more accurate and reliable. To validate our new approach, we have implemented MANGO: Multiple Alignment with N Gapped Oligos. Experiments were carried out on large 16S RNA benchmarks showing that MANGO compares favorably, in both accuracy and speed, against state-of-the-art multiple sequence alignment methods, including ClustalW 1.83, MUSCLE 3.6, MAFFT 5.861, ProbConsRNA 1.11, Dialign 2.2.1, DIALIGN-T 0.2.1, T-Coffee 4.85, POA 2.0 and Kalign 2.0.
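To make the notion of a spaced seed concrete, the sketch below finds positions where two sequences agree at every '1' position of a seed pattern while ignoring the '0' positions. The pattern "1101", the function names and the toy sequences are illustrative assumptions; the abstract does not describe MANGO's optimized seeds or its algorithms for handling them.

```python
from collections import defaultdict

def spaced_kmer(seq, start, pattern):
    """Characters of seq under the '1' positions of pattern, beginning at start."""
    return "".join(seq[start + i] for i, c in enumerate(pattern) if c == "1")

def seed_hits(seq_a, seq_b, pattern):
    """Return sorted (i, j) pairs where seq_a[i:] and seq_b[j:] match at all '1' positions."""
    span = len(pattern)
    index = defaultdict(list)
    # Index every window of seq_a by its spaced k-mer.
    for i in range(len(seq_a) - span + 1):
        index[spaced_kmer(seq_a, i, pattern)].append(i)
    hits = []
    # Look up every window of seq_b in that index.
    for j in range(len(seq_b) - span + 1):
        for i in index.get(spaced_kmer(seq_b, j, pattern), []):
            hits.append((i, j))
    return sorted(hits)

# The two toy sequences differ only at position 2 (G versus C).
hits_spaced = seed_hits("ACGTAC", "ACCTAC", "1101")  # weight-3 seed with one "don't care"
hits_contig = seed_hits("ACGTAC", "ACCTAC", "111")   # consecutive 3-mer for comparison
```

Here the spaced seed reports a hit at (0, 0) that spans the mismatch, whereas the consecutive 3-mer only matches in the identical tail at (3, 3).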
Flexible and Lightweight Pressure Sensor Based on Carbon Nanotube/Thermoplastic Polyurethane-Aligned Conductive Foam with Superior Compressibility and Stability.

PubMed

Huang, Wenju; Dai, Kun; Zhai, Yue; Liu, Hu; Zhan, Pengfei; Gao, Jiachen; Zheng, Guoqiang; Liu, Chuntai; Shen, Changyu

2017-12-06

Flexible and lightweight carbon nanotube (CNT)/thermoplastic polyurethane (TPU) conductive foam with a novel aligned porous structure was fabricated. The density of the aligned porous material was as low as 0.123 g·cm⁻³. Homogeneous dispersion of CNTs was achieved through the skeleton of the foam, and an ultralow percolation threshold of 0.0023 vol % was obtained. Compared with the disordered foam, mechanical properties of the aligned foam were enhanced and the piezoresistive stability of the flexible foam was improved significantly. The compression strength of the aligned TPU foam increases by 30.7% at the strain of 50%, and the stress of the aligned foam is 22 times that of the disordered foam at the strain of 90%. Importantly, the resistance variation of the aligned foam shows a fascinating linear characteristic under the applied strain until 77%, which would benefit the application of the foam as a desired pressure sensor. During multiple cyclic compression-release measurements, the aligned conductive CNT/TPU foam represents excellent reversibility and reproducibility in terms of resistance. This nice capability benefits from the aligned porous structure composed of ladderlike cells along the orientation direction. Simultaneously, the human motion detections, such as walk, jump, squat, etc. were demonstrated by using our flexible pressure sensor. Because of the lightweight, flexibility, high compressibility, excellent reversibility, and reproducibility of the conductive aligned foam, the present study is capable of providing new insights into the fabrication of a high-performance pressure sensor.
Environmental constraints shaping constituent order in emerging communication systems: Structural iconicity, interactive alignment and conventionalization.

PubMed

Christensen, Peer; Fusaroli, Riccardo; Tylén, Kristian

2016-01-01

Where does linguistic structure come from? Recent gesture elicitation studies have indicated that constituent order (corresponding to for instance subject-verb-object, or SVO in English) may be heavily influenced by human cognitive biases constraining gesture production and transmission. Here we explore the alternative hypothesis that syntactic patterns are motivated by multiple environmental and social-interactional constraints that are external to the cognitive domain. In three experiments, we systematically investigate different motivations for structure in the gestural communication of simple transitive events. The first experiment indicates that, if participants communicate about different types of events, manipulation events (e.g. someone throwing a cake) and construction events (e.g. someone baking a cake), they spontaneously and systematically produce different constituent orders, SOV and SVO respectively, thus following the principle of structural iconicity. The second experiment shows that participants' choice of constituent order is also reliably influenced by social-interactional forces of interactive alignment, that is, the tendency to re-use an interlocutor's previous choice of constituent order, thus potentially overriding affordances for iconicity. Lastly, the third experiment finds that the relative frequency distribution of referent event types motivates the stabilization and conventionalization of a single constituent order for the communication of different types of events. Together, our results demonstrate that constituent order in emerging gestural communication systems is shaped and stabilized in response to multiple external environmental and social factors: structural iconicity, interactive alignment and distributional frequency. Copyright © 2015 Elsevier B.V. All rights reserved.
gmos: Rapid Detection of Genome Mosaicism over Short Evolutionary Distances.

PubMed

Domazet-Lošo, Mirjana; Domazet-Lošo, Tomislav

2016-01-01

Prokaryotic and viral genomes are often altered by recombination and horizontal gene transfer. The existing methods for detecting recombination are primarily aimed at viral genomes or sets of loci, since the expensive computation of underlying statistical models often hinders the comparison of complete prokaryotic genomes. As an alternative, alignment-free solutions are more efficient, but cannot map (align) a query to subject genomes. To address this problem, we have developed gmos (Genome MOsaic Structure), a new program that determines the mosaic structure of query genomes when compared to a set of closely related subject genomes. The program first computes local alignments between query and subject genomes and then reconstructs the query mosaic structure by choosing the best local alignment for each query region. To accomplish the analysis quickly, the program mostly relies on pairwise alignments and constructs multiple sequence alignments over short overlapping subject regions only when necessary. This fine-tuned implementation achieves an efficiency comparable to an alignment-free tool. The program performs well for simulated and real data sets of closely related genomes and can be used for fast recombination detection; for instance, when a new prokaryotic pathogen is discovered. As an example, gmos was used to detect genome mosaicism in a pathogenic Enterococcus faecium strain compared to seven closely related genomes. The analysis took less than two minutes on a single 2.1 GHz processor. The output is available in fasta format and can be visualized using an accessory program, gmosDraw (freely available with gmos).
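The step of reconstructing a mosaic by choosing the best local alignment for each query region might be sketched as follows. The abstract does not specify how gmos scores alignments or resolves overlaps, so the per-position highest-score rule, the tuple layout and the example values are assumptions for illustration only.

```python
def mosaic(query_length, alignments):
    """Assign each query position to the subject of its best-scoring local alignment.

    alignments: list of (start, end, subject, score) tuples, end exclusive (assumed layout).
    Returns a list of (start, end, subject) segments; subject is None where nothing aligns.
    """
    best = [None] * query_length  # best (score, subject) seen so far per position
    for start, end, subject, score in alignments:
        for pos in range(start, min(end, query_length)):
            if best[pos] is None or score > best[pos][0]:
                best[pos] = (score, subject)

    # Collapse runs of positions with the same subject into segments.
    segments = []
    for pos, entry in enumerate(best):
        subject = entry[1] if entry else None
        if segments and segments[-1][2] == subject:
            segments[-1][1] = pos + 1
        else:
            segments.append([pos, pos + 1, subject])
    return [tuple(s) for s in segments]

# Two hypothetical local alignments that overlap on query positions 4 and 5.
alignments = [(0, 6, "subject_1", 50), (4, 10, "subject_2", 80)]
result = mosaic(12, alignments)
```

The overlapping region goes to the higher-scoring alignment, so the query is reported as a mosaic of subject_1, then subject_2, then an unaligned tail.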
gmos: Rapid Detection of Genome Mosaicism over Short Evolutionary Distances

PubMed Central

Domazet-Lošo, Mirjana; Domazet-Lošo, Tomislav

2016-01-01

Prokaryotic and viral genomes are often altered by recombination and horizontal gene transfer. The existing methods for detecting recombination are primarily aimed at viral genomes or sets of loci, since the expensive computation of underlying statistical models often hinders the comparison of complete prokaryotic genomes. As an alternative, alignment-free solutions are more efficient, but cannot map (align) a query to subject genomes. To address this problem, we have developed gmos (Genome MOsaic Structure), a new program that determines the mosaic structure of query genomes when compared to a set of closely related subject genomes. The program first computes local alignments between query and subject genomes and then reconstructs the query mosaic structure by choosing the best local alignment for each query region. To accomplish the analysis quickly, the program mostly relies on pairwise alignments and constructs multiple sequence alignments over short overlapping subject regions only when necessary. This fine-tuned implementation achieves an efficiency comparable to an alignment-free tool. The program performs well for simulated and real data sets of closely related genomes and can be used for fast recombination detection; for instance, when a new prokaryotic pathogen is discovered. As an example, gmos was used to detect genome mosaicism in a pathogenic Enterococcus faecium strain compared to seven closely related genomes. The analysis took less than two minutes on a single 2.1 GHz processor. The output is available in fasta format and can be visualized using an accessory program, gmosDraw (freely available with gmos). PMID:27846272
Evolutionary trade-offs and the structure of polymorphisms.

PubMed

Sheftel, Hila; Szekely, Pablo; Mayo, Avi; Sella, Guy; Alon, Uri

2018-05-26

Populations of organisms show genetic differences called polymorphisms. Understanding the effects of polymorphisms is important for biology and medicine. Here, we ask which polymorphisms occur at high frequency when organisms evolve under trade-offs between multiple tasks. Multiple tasks present a problem, because it is not possible to be optimal at all tasks simultaneously and hence compromises are necessary. Recent work indicates that trade-offs lead to a simple geometry of phenotypes in the space of traits: phenotypes fall on the Pareto front, which is shaped as a polytope: a line, triangle, tetrahedron etc. The vertices of these polytopes are the optimal phenotypes for a single task. Up to now, work on this Pareto approach has not considered its genetic underpinnings. Here, we address this by asking how the polymorphism structure of a population is affected by evolution under trade-offs. We simulate a multi-task selection scenario, in which the population evolves to the Pareto front: the line segment between two archetypes or the triangle between three archetypes. We find that polymorphisms that become prevalent in the population have pleiotropic phenotypic effects that align with the Pareto front. Similarly, epistatic effects between prevalent polymorphisms are parallel to the front. Alignment with the front occurs also for asexual mating. Alignment is reduced when drift or linkage is strong, and is replaced by a more complex structure in which many perpendicular allele effects cancel out. Aligned polymorphism structure allows mating to produce offspring that stand a good chance of being optimal multi-taskers in at least one of the locales available to the species. This article is part of the theme issue 'Self-organization in cell biology'. © 2018 The Author(s).
Multiple DNA and protein sequence alignment on a workstation and a supercomputer.

PubMed

Tajima, K

1988-11-01

This paper describes a multiple alignment method using a workstation and supercomputer. The method is based on the alignment of a set of aligned sequences with the new sequence, and uses a recursive procedure of such alignment. The alignment is executed in a reasonable computation time on diverse levels from a workstation to a supercomputer, from the viewpoint of alignment results and computational speed by parallel processing. The application of the algorithm is illustrated by several examples of multiple alignment of 12 amino acid and DNA sequences of HIV (human immunodeficiency virus) env genes. Colour graphic programs on a workstation and parallel processing on a supercomputer are discussed.
Homeotropic alignment of multiple bent-core liquid crystal phases using a polydimethylsiloxane alignment layer

NASA Astrophysics Data System (ADS)

Carlson, Eric D.; Foley, Lee M.; Guzman, Edward; Korblova, Eva D.; Visvanathan, Rayshan; Ryu, SeongHo; Gim, Min-Jun; Tuchband, Michael R.; Yoon, Dong Ki; Clark, Noel A.; Walba, David M.

2017-08-01

The control of the molecular orientation of liquid crystals (LCs) is important in both understanding phase properties and the continuing development of new LC technologies including displays, organic transistors, and electro-optic devices. Many techniques have been developed for successfully inducing alignment of calamitic LCs, though these techniques typically do not translate to the alignment of bent-core liquid crystals (BCLCs). Some techniques have been utilized to align various phases of BCLCs, but these techniques are often unsuccessful for general alignment of multiple materials and/or multiple phases. Here, we demonstrate that glass cells treated with polydimethylsiloxane (PDMS) thin films induce high quality homeotropic alignment of multiple mesophases of four BCLCs. On cooling to the lowest temperature phase the homeotropic alignment is lost, and spherulitic growth is seen in crystal and crystal-like phases including the dark conglomerate (DC) and helical nanofilament (HNF) phases. Evidence of homeotropic alignment is observed using polarized optical microscopy. We speculate that the methyl groups on the surface of the PDMS films strongly interact with the aliphatic tails of each mesogens, resulting in homeotropic alignment.
Implied alignment: a synapomorphy-based multiple-sequence alignment method and its use in cladogram search

NASA Technical Reports Server (NTRS)

Wheeler, Ward C.

2003-01-01

A method to align sequence data based on parsimonious synapomorphy schemes generated by direct optimization (DO; earlier termed optimization alignment) is proposed. DO directly diagnoses sequence data on cladograms without an intervening multiple-alignment step, thereby creating topology-specific, dynamic homology statements. Hence, no multiple-alignment is required to generate cladograms. Unlike general and globally optimal multiple-alignment procedures, the method described here, implied alignment (IA), takes these dynamic homologies and traces them back through a single cladogram, linking the unaligned sequence positions in the terminal taxa via DO transformation series. These "lines of correspondence" link ancestor-descendent states and, when displayed as linearly arrayed columns without hypothetical ancestors, are largely indistinguishable from standard multiple alignment. Since this method is based on synapomorphy, the treatment of certain classes of insertion-deletion (indel) events may be different from that of other alignment procedures. As with all alignment methods, results are dependent on parameter assumptions such as indel cost and transversion:transition ratios. Such an IA could be used as a basis for phylogenetic search, but this would be questionable since the homologies derived from the implied alignment depend on its natal cladogram and any variance, between DO and IA + Search, due to heuristic approach. The utility of this procedure in heuristic cladogram searches using DO and the improvement of heuristic cladogram cost calculations are discussed. © 2003 The Willi Hennig Society. Published by Elsevier Science (USA). All rights reserved.
Sequence alignment visualization in HTML5 without Java.

PubMed

Gille, Christoph; Birgit, Weyand; Gille, Andreas

2014-01-01

Java has been extensively used for the visualization of biological data in the web. However, the Java runtime environment is an additional layer of software with its own set of technical problems and security risks. HTML in its new version 5 provides features that for some tasks may render Java unnecessary. Alignment-To-HTML is the first HTML-based interactive visualization for annotated multiple sequence alignments. The server side script interpreter can perform all tasks like (i) sequence retrieval, (ii) alignment computation, (iii) rendering, (iv) identification of homologous structural models and (v) communication with BioDAS-servers. The rendered alignment can be included in web pages and is displayed in all browsers on all platforms including touch screen tablets. The functionality of the user interface is similar to legacy Java applets and includes color schemes, highlighting of conserved and variable alignment positions, row reordering by drag and drop, interlinked 3D visualization and sequence groups. Novel features are (i) support for multiple overlapping residue annotations, such as chemical modifications, single nucleotide polymorphisms and mutations, (ii) mechanisms to quickly hide residue annotations, (iii) export to MS-Word and (iv) sequence icons. Alignment-To-HTML, the first interactive alignment visualization that runs in web browsers without additional software, confirms that to some extent HTML5 is already sufficient to display complex biological data. The low speed at which programs are executed in browsers is still the main obstacle. Nevertheless, we envision an increased use of HTML and JavaScript for interactive biological software. Under GPL at: http://www.bioinformatics.org/strap/toHTML/.
StralSV: assessment of sequence variability within similar 3D structures and application to polio RNA-dependent RNA polymerase.

PubMed

Zemla, Adam T; Lang, Dorothy M; Kostova, Tanya; Andino, Raul; Ecale Zhou, Carol L

2011-06-02

Most of the currently used methods for protein function prediction rely on sequence-based comparisons between a query protein and those for which a functional annotation is provided. A serious limitation of sequence similarity-based approaches for identifying residue conservation among proteins is the low confidence in assigning residue-residue correspondences among proteins when the level of sequence identity between the compared proteins is poor. Multiple sequence alignment methods are more satisfactory--still, they cannot provide reliable results at low levels of sequence identity. Our goal in the current work was to develop an algorithm that could help overcome these difficulties by facilitating the identification of structurally (and possibly functionally) relevant residue-residue correspondences between compared protein structures. Here we present StralSV (structure-alignment sequence variability), a new algorithm for detecting closely related structure fragments and quantifying residue frequency from tight local structure alignments. We apply StralSV in a study of the RNA-dependent RNA polymerase of poliovirus, and we demonstrate that the algorithm can be used to determine regions of the protein that are relatively unique, or that share structural similarity with proteins that would be considered distantly related. By quantifying residue frequencies among many residue-residue pairs extracted from local structural alignments, one can infer potential structural or functional importance of specific residues that are determined to be highly conserved or that deviate from a consensus. We further demonstrate that considerable detailed structural and phylogenetic information can be derived from StralSV analyses. StralSV is a new structure-based algorithm for identifying and aligning structure fragments that have similarity to a reference protein. 
StralSV analysis can be used to quantify residue-residue correspondences and identify residues that may be of particular structural or functional importance, as well as unusual or unexpected residues at a given sequence position. StralSV is provided as a web service at http://proteinmodel.org/AS2TS/STRALSV/.
ASPIC: a novel method to predict the exon-intron structure of a gene that is optimally compatible to a set of transcript sequences.

PubMed

Bonizzoni, Paola; Rizzi, Raffaella; Pesole, Graziano

2005-10-05

Currently available methods to predict splice sites are mainly based on the independent and progressive alignment of transcript data (mostly ESTs) to the genomic sequence. Apart from often being computationally expensive, this approach is vulnerable to several problems--hence the need to develop novel strategies. We propose a method, based on a novel multiple genome-EST alignment algorithm, for the detection of splice sites. To avoid limitations of splice site prediction (mainly, over-predictions) due to independent single EST alignments to the genomic sequence, our approach performs a multiple alignment of transcript data to the genomic sequence based on the combined analysis of all available data. We recast the problem of predicting constitutive and alternative splicing as an optimization problem, where the optimal multiple transcript alignment minimizes the number of exons and hence of splice site observations. We have implemented a splice site predictor based on this algorithm in the software tool ASPIC (Alternative Splicing PredICtion). It is distinguished from other methods based on BLAST-like tools by the incorporation of entirely new ad hoc procedures for accurate and computationally efficient transcript alignment and adopts dynamic programming for the refinement of intron boundaries. ASPIC also provides the minimal set of non-mergeable transcript isoforms compatible with the detected splicing events. The ASPIC web resource is dynamically interconnected with the Ensembl and Unigene databases and also implements an upload facility. Extensive benchmarking shows that ASPIC outperforms other existing methods in the detection of novel splicing isoforms and in the minimization of over-predictions. ASPIC also requires a lower computation time for processing a single gene and an EST cluster. The ASPIC web resource is available at http://aspic.algo.disco.unimib.it/aspic-devel/.
Using Distractor-Driven Standards-Based Multiple-Choice Assessments and Rasch Modeling to Investigate Hierarchies of Chemistry Misconceptions and Detect Structural Problems with Individual Items

ERIC Educational Resources Information Center

Herrmann-Abell, Cari F.; DeBoer, George E.

2011-01-01

Distractor-driven multiple-choice assessment items and Rasch modeling were used as diagnostic tools to investigate students' understanding of middle school chemistry ideas. Ninety-one items were developed according to a procedure that ensured content alignment to the targeted standards and construct validity. The items were administered to 13360…
Directionally Antagonistic Graphene Oxide-Polyurethane Hybrid Aerogel as a Sound Absorber.

PubMed

Oh, Jung-Hwan; Kim, Jieun; Lee, Hyeongrae; Kang, Yeonjune; Oh, Il-Kwon

2018-06-21

Innovative sound absorbers, the design of which is based on carbon nanotubes and graphene derivatives, could be used to make more efficient sound absorbing materials because of their excellent intrinsic mechanical and chemical properties. However, controlling the directional alignments of low-dimensional carbon nanomaterials, such as restacking, alignment, and dispersion, has been a challenging problem when developing sound absorbing forms. Herein, we present the directionally antagonistic graphene oxide-polyurethane hybrid aerogel we developed as a sound absorber, the physical properties of which differ according to the alignment of the microscopic graphene oxide sheets. This porous graphene sound absorber has a microporous hierarchical cellular structure with adjustable stiffness and improved sound absorption performance, thereby overcoming the restrictions of both geometric and function-orientated functions. Furthermore, by controlling the inner cell size and aligned structure of graphene oxide layers in this study, we achieved remarkable improvement of the sound absorption performance at low frequency. This improvement is attributed to multiple scattering of incident and reflection waves on the aligned porous surfaces, and air-viscous resistance damping inside interconnected structures between the urethane foam and the graphene oxide network. Two anisotropic sound absorbers based on the directionally antagonistic graphene oxide-polyurethane hybrid aerogels were fabricated. They show remarkable differences owing to the opposite alignment of graphene oxide layers inside the polyurethane foam and are expected to be appropriate for the engineering design of sound absorbers in consideration of the wave direction.
Analyses of the radiation of birnaviruses from diverse host phyla and of their evolutionary affinities with other double-stranded RNA and positive strand RNA viruses using robust structure-based multiple sequence alignments and advanced phylogenetic methods

PubMed Central

2013-01-01

Background Birnaviruses form a distinct family of double-stranded RNA viruses infecting animals as different as vertebrates, mollusks, insects and rotifers. With such a wide host range, they constitute a good model for studying the adaptation to the host. Additionally, several lines of evidence link birnaviruses to positive strand RNA viruses and suggest that phylogenetic analyses may provide clues about transition. Results We characterized the genome of a birnavirus from the rotifer Brachionus plicatilis. We used X-ray structures of RNA-dependent RNA polymerases and capsid proteins to obtain multiple structure alignments that allowed us to obtain reliable multiple sequence alignments and we employed “advanced” phylogenetic methods to study the evolutionary relationships between some positive strand and double-stranded RNA viruses. We showed that the rotifer birnavirus genome exhibited an organization remarkably similar to other birnaviruses. As this host was phylogenetically very distant from the other known species targeted by birnaviruses, we revisited the evolutionary pathways within the Birnaviridae family using phylogenetic reconstruction methods. We also applied a number of phylogenetic approaches based on structurally conserved domains/regions of the capsid and RNA-dependent RNA polymerase proteins to study the evolutionary relationships between birnaviruses, other double-stranded RNA viruses and positive strand RNA viruses. Conclusions We show that there is a good correlation between the phylogeny of the birnaviruses and that of their hosts at the phylum level using the RNA-dependent RNA polymerase (genomic segment B) on the one hand and a concatenation of the capsid protein, protease and ribonucleoprotein (genomic segment A) on the other hand. This correlation tends to vanish within phyla.
The use of advanced phylogenetic methods and robust structure-based multiple sequence alignments allowed us to obtain a more accurate picture (in terms of probability of the tree topologies) of the evolutionary affinities between double-stranded RNA and positive strand RNA viruses. In particular, we were able to show that there is good statistical support for the claims that dsRNA viruses are not monophyletic and that viruses with permuted RdRps belong to a common evolutionary lineage, as previously proposed by other groups. We also propose a tree topology with good statistical support describing the evolutionary relationships between the Picornaviridae, Caliciviridae, Flaviviridae families and a group including the Alphatetraviridae, Nodaviridae, Permutotetraviridae, Birnaviridae, and Cystoviridae families. PMID:23865988
Swarm Observation of Field-Aligned Currents Associated With Multiple Auroral Arc Systems

NASA Astrophysics Data System (ADS)

Wu, J.; Knudsen, D. J.; Gillies, D. M.; Donovan, E. F.; Burchill, J. K.

2017-10-01

Auroral arcs occur in regions of upward field-aligned currents (FACs); however, the relation is not one to one, since kinetic energy of the current-carrying electrons is also important in the production of auroral luminosity. Multiple auroral arc systems provide an opportunity to study the relation between FACs and auroral brightness in detail. In this study, we have identified two types of FAC configurations in multiple parallel arc systems using ground-based optical data from the Time History of Events and Macroscale Interactions during Substorms all-sky imagers, magnetometers and electric field instruments on board the Swarm satellites. In "unipolar FAC" events, each arc is an intensification within a broad, unipolar current sheet and downward return currents occur outside of this broad sheet. In "multipolar FAC" events, multiple arc systems represent a collection of multiple up/down current pairs. By collecting 17 events with unipolar FAC and 12 events with multipolar FACs, we find that (1) unipolar FAC events occur most frequently between 20 and 21 magnetic local time and multipolar FAC events tend to occur around local midnight and within 1 h after substorm onset. (2) Arcs in unipolar FAC systems have a typical width of 10-20 km and a spacing of 25-50 km. Arcs in multipolar FAC systems are wider and more separated. (3) Upward currents with more arcs embedded have larger intensities and widths. (4) Electric fields are strong and highly structured on the edges of multiple arc system with unipolar FAC. The fact that arcs with unipolar FAC are much more highly structured than the associated currents suggests that arc multiplicity is indicative not of a structured generator deep in the magnetosphere, but rather of the magnetosphere-ionosphere coupling process.

MICA: Multiple interval-based curve alignment

NASA Astrophysics Data System (ADS)

Mann, Martin; Kahle, Hans-Peter; Beck, Matthias; Bender, Bela Johannes; Spiecker, Heinrich; Backofen, Rolf

2018-01-01

MICA enables the automatic synchronization of discrete data curves. To this end, characteristic points of the curves' shapes are identified. These landmarks are used within a heuristic curve registration approach to align profile pairs by mapping similar characteristics onto each other. In combination with a progressive alignment scheme, this enables the computation of multiple curve alignments. Multiple curve alignments are needed to derive meaningful representative consensus data of measured time or data series. MICA was already successfully applied to generate representative profiles of tree growth data based on intra-annual wood density profiles or cell formation data. The MICA package provides a command-line and graphical user interface. The R interface enables the direct embedding of multiple curve alignment computation into larger analyses pipelines. Source code, binaries and documentation are freely available at https://github.com/BackofenLab/MICA
Mango: multiple alignment with N gapped oligos.

PubMed

Zhang, Zefeng; Lin, Hao; Li, Ming

2008-06-01

Multiple sequence alignment is a classical and challenging task. The problem is NP-hard. The full dynamic programming takes too much time. The progressive alignment heuristics adopted by most state-of-the-art works suffer from the "once a gap, always a gap" phenomenon. Is there a radically new way to do multiple sequence alignment? In this paper, we introduce a novel and orthogonal multiple sequence alignment method, using both multiple optimized spaced seeds and new algorithms to handle these seeds efficiently. Our new algorithm processes information of all sequences as a whole and tries to build the alignment vertically, avoiding problems caused by the popular progressive approaches. Because the optimized spaced seeds have proved significantly more sensitive than the consecutive k-mers, the new approach promises to be more accurate and reliable. To validate our new approach, we have implemented MANGO: Multiple Alignment with N Gapped Oligos. Experiments were carried out on large 16S RNA benchmarks, showing that MANGO compares favorably, in both accuracy and speed, against state-of-the-art multiple sequence alignment methods, including ClustalW 1.83, MUSCLE 3.6, MAFFT 5.861, ProbConsRNA 1.11, Dialign 2.2.1, DIALIGN-T 0.2.1, T-Coffee 4.85, POA 2.0, and Kalign 2.0. We have further demonstrated the scalability of MANGO on very large datasets of repeat elements. MANGO can be downloaded at http://www.bioinfo.org.cn/mango/ and is free for academic usage.
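The contrast between spaced seeds and consecutive k-mers in the abstract above can be illustrated with a minimal sketch. The seed pattern "1101" (1 = position must match, 0 = position is ignored) and the function names are hypothetical choices for illustration; this is not MANGO's optimized seed set or its alignment algorithm.

```python
from collections import defaultdict

def spaced_keys(seq, pattern):
    """Yield (start, key), where key keeps only the characters under '1' in the seed."""
    care = [i for i, c in enumerate(pattern) if c == "1"]
    for start in range(len(seq) - len(pattern) + 1):
        yield start, "".join(seq[start + i] for i in care)

def seed_hits(a, b, pattern):
    """Return sorted (i, j) pairs where a[i:] and b[j:] agree at every '1' position."""
    index = defaultdict(list)
    for i, key in spaced_keys(a, pattern):
        index[key].append(i)
    return sorted((i, j) for j, key in spaced_keys(b, pattern) for i in index.get(key, []))
```

With the toy inputs "ACGTACGT" and "ACTTACGT", the seed "1101" reports a hit at (0, 0) even though the two windows differ at the ignored third position, whereas the contiguous pattern "1111" of the same span does not.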
Multilevel, multicomponent microarchitectures of vertically-aligned carbon nanotubes for diverse applications.

PubMed

Qu, Liangti; Vaia, Rich A; Dai, Liming

2011-02-22

A simple multiple contact transfer technique has been developed for controllable fabrication of multilevel, multicomponent microarchitectures of vertically aligned carbon nanotubes (VA-CNTs). Three dimensional (3-D) multicomponent micropatterns of aligned single-walled carbon nanotubes (SWNTs) and multiwalled carbon nanotubes (MWNTs) have been fabricated, which can be used to develop a newly designed touch sensor with reversible electrical responses for potential applications in electronic devices, as demonstrated in this study. The demonstrated dependence of light diffraction on structural transfiguration of the resultant CNT micropattern also indicates their potential for optical devices. Further introduction of various components with specific properties (e.g., ZnO nanorods) into the CNT micropatterns enabled us to tailor such surface characteristics as wettability and light response. Owing to the highly generic nature of the multiple contact transfer strategy, the methodology developed here could provide a general approach for interposing a large variety of multicomponent elements (e.g., nanotubes, nanorods/wires, photonic crystals, etc.) onto a single chip for multifunctional device applications.
DNA Multiple Sequence Alignment Guided by Protein Domains: The MSA-PAD 2.0 Method.

PubMed

Balech, Bachir; Monaco, Alfonso; Perniola, Michele; Santamaria, Monica; Donvito, Giacinto; Vicario, Saverio; Maggi, Giorgio; Pesole, Graziano

2018-01-01

Multiple sequence alignment (MSA) is a fundamental component in many DNA sequence analyses including metagenomics studies and phylogeny inference. When guided by protein profiles, DNA multiple alignments assume a higher precision and robustness. Here we present details of the use of the upgraded version of MSA-PAD (2.0), which is a DNA multiple sequence alignment framework able to align DNA sequences coding for single/multiple protein domains guided by PFAM or user-defined annotations. MSA-PAD has two alignment strategies, called "Gene" and "Genome," accounting for coding domains order and genomic rearrangements, respectively. Novel options were added to the present version, where the MSA can be guided by protein profiles provided by the user. This allows MSA-PAD 2.0 to run faster and to add custom protein profiles sometimes not present in PFAM database according to the user's interest. MSA-PAD 2.0 is currently freely available as a Web application at https://recasgateway.cloud.ba.infn.it/ .
Score distributions of gapped multiple sequence alignments down to the low-probability tail

NASA Astrophysics Data System (ADS)

Fieth, Pascal; Hartmann, Alexander K.

2016-08-01

Assessing the significance of alignment scores of optimally aligned DNA or amino acid sequences can be achieved via the knowledge of the score distribution of random sequences. But this requires obtaining the distribution in the biologically relevant high-scoring region, where the probabilities are exponentially small. For gapless local alignments of infinitely long sequences this distribution is known analytically to follow a Gumbel distribution. Distributions for gapped local alignments and global alignments of finite lengths can only be obtained numerically. To obtain results for the small-probability region, specific statistical mechanics-based rare-event algorithms can be applied. In previous studies, this was achieved for pairwise alignments. They showed that, contrary to results from previous simple sampling studies, strong deviations from the Gumbel distribution occur in case of finite sequence lengths. Here we extend the studies to multiple sequence alignments with gaps, which are much more relevant for practical applications in molecular biology. We study the distributions of scores over a large range of the support, reaching probabilities as small as 10^-160, for global and local (sum-of-pair scores) multiple alignments. We find that even after suitable rescaling, eliminating the sequence-length dependence, the distributions for multiple alignment differ from the pairwise alignment case. Furthermore, we also show that the previously discussed Gaussian correction to the Gumbel distribution needs to be refined, also for the case of pairwise alignments.
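For orientation, the two quantities named in the abstract above can be written in their standard textbook forms. The symbols are assumptions introduced here, not notation from the paper: λ is a scale parameter, s_0 a location parameter, a_i the i-th aligned sequence and S(·,·) a pairwise alignment score.

```latex
% Gumbel (extreme-value) density and tail probability of the score s
p(s) = \lambda \exp\!\left(-\lambda (s - s_0) - e^{-\lambda (s - s_0)}\right),
\qquad
P(S \ge s) = 1 - \exp\!\left(-e^{-\lambda (s - s_0)}\right)

% Sum-of-pairs score of a multiple alignment of N sequences
S_{\mathrm{SP}} = \sum_{1 \le i < j \le N} S(a_i, a_j)
```

For large s the tail behaves like e^{-λ(s - s_0)}, which is why direct sampling cannot reach the small-probability region the abstract refers to.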
A Novel Center Star Multiple Sequence Alignment Algorithm Based on Affine Gap Penalty and K-Band

NASA Astrophysics Data System (ADS)

Zou, Quan; Shan, Xiao; Jiang, Yi

Multiple sequence alignment is one of the most important topics in computational biology, but existing methods cannot yet handle large data sets. With the growth of copy-number variant (CNV) and single nucleotide polymorphism (SNP) research, many researchers want to align large numbers of similar sequences to detect CNVs and SNPs. In this paper, we propose a novel multiple sequence alignment algorithm based on affine gap penalty and k-band. It aligns more quickly and accurately, which will be helpful for mining CNVs and SNPs. Experiments demonstrate the performance of our algorithm.
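The two ingredients named in the title, a k-band restriction and center-star selection, can be sketched as follows. This is a simplified illustration under stated assumptions: it uses unit-cost edit distance rather than the affine gap penalty of the paper, it only picks the center sequence without building the final alignment, and the function names are hypothetical.

```python
def banded_edit_distance(a, b, k):
    """Edit distance restricted to DP cells with |i - j| <= k.

    Returns None when the length difference exceeds k, because the
    end cell then lies outside the band.
    """
    n, m = len(a), len(b)
    if abs(n - m) > k:
        return None
    inf = float("inf")
    prev = {j: j for j in range(min(m, k) + 1)}  # row 0: j insertions
    for i in range(1, n + 1):
        cur = {}
        for j in range(max(0, i - k), min(m, i + k) + 1):
            if j == 0:
                cur[j] = i  # column 0: i deletions
                continue
            best = prev.get(j - 1, inf) + (a[i - 1] != b[j - 1])  # match or mismatch
            best = min(best, prev.get(j, inf) + 1)                # a[i-1] against a gap
            best = min(best, cur.get(j - 1, inf) + 1)             # b[j-1] against a gap
            cur[j] = best
        prev = cur
    return prev[m]

def center_star_index(seqs, k):
    """Index of the sequence with the smallest total banded distance to all others."""
    totals = []
    for i, s in enumerate(seqs):
        total = 0
        for j, t in enumerate(seqs):
            if i != j:
                d = banded_edit_distance(s, t, k)
                # Assumption: fall back to a pessimistic upper bound when the band is too narrow.
                total += d if d is not None else max(len(s), len(t))
        totals.append(total)
    return totals.index(min(totals))
```

The band reduces the work per pair from O(nm) to roughly O(nk) cells, which is the reason a k-band pays off when the input sequences are highly similar.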
DIALIGN P: fast pair-wise and multiple sequence alignment using parallel processors.

PubMed

Schmollinger, Martin; Nieselt, Kay; Kaufmann, Michael; Morgenstern, Burkhard

2004-09-09

Parallel computing is frequently used to speed up computationally expensive tasks in Bioinformatics. Herein, a parallel version of the multi-alignment program DIALIGN is introduced. We propose two ways of dividing the program into independent sub-routines that can be run on different processors: (a) pair-wise sequence alignments that are used as a first step to multiple alignment account for most of the CPU time in DIALIGN. Since alignments of different sequence pairs are completely independent of each other, they can be distributed to multiple processors without any effect on the resulting output alignments. (b) For alignments of large genomic sequences, we use a heuristics by splitting up sequences into sub-sequences based on a previously introduced anchored alignment procedure. For our test sequences, this combined approach reduces the program running time of DIALIGN by up to 97%. By distributing sub-routines to multiple processors, the running time of DIALIGN can be crucially improved. With these improvements, it is possible to apply the program in large-scale genomics and proteomics projects that were previously beyond its scope.
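Strategy (a) above relies on the fact that pairwise alignments are independent of each other. A minimal sketch of that idea, assuming a toy scoring function in place of a real DIALIGN pairwise alignment and hypothetical function names:

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations

def pair_score(pair):
    """Toy stand-in for a pairwise alignment: count identical positions."""
    a, b = pair
    return sum(x == y for x, y in zip(a, b))

def all_pairs_parallel(seqs, executor):
    """Score every unordered pair of sequences on the given executor.

    The pairs are independent, so the order in which workers finish
    does not change the result; map() returns scores in input order.
    """
    pairs = list(combinations(range(len(seqs)), 2))
    scores = executor.map(pair_score, [(seqs[i], seqs[j]) for i, j in pairs])
    return dict(zip(pairs, scores))

if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=2) as pool:
        print(all_pairs_parallel(["ACGT", "ACGA", "TCGA"], pool))
```

A thread pool keeps the sketch portable; for CPU-bound alignment in Python one would more likely pass a `concurrent.futures.ProcessPoolExecutor`, which exposes the same `map` interface.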
Multiple sequence alignment using multi-objective based bacterial foraging optimization algorithm.

PubMed

Rani, R Ranjani; Ramyachitra, D

2016-12-01

Multiple sequence alignment (MSA) is a widespread approach in computational biology and bioinformatics. MSA deals with how sequences of nucleotides and amino acids are arranged in an alignment with a minimum number of gaps between them, which points to the functional, evolutionary and structural relationships among the sequences. Computing an MSA that is both accurate and statistically significant remains a challenging task. In this work, the Bacterial Foraging Optimization Algorithm was employed to align the biological sequences, which resulted in a non-dominated optimal solution. It employs multiple objectives: maximization of similarity, non-gap percentage and conserved blocks, and minimization of gap penalty. The BAliBASE 3.0 benchmark database was utilized to examine the proposed algorithm against other methods. In this paper, two algorithms have been proposed: Hybrid Genetic Algorithm with Artificial Bee Colony (GA-ABC) and Bacterial Foraging Optimization Algorithm. It was found that Hybrid Genetic Algorithm with Artificial Bee Colony performed better than the existing optimization algorithms. But still the conserved blocks were not obtained using GA-ABC. Then BFO was used for the alignment and the conserved blocks were obtained. The proposed Multi-Objective Bacterial Foraging Optimization Algorithm (MO-BFO) was compared with widely used MSA methods Clustal Omega, Kalign, MUSCLE, MAFFT, Genetic Algorithm (GA), Ant Colony Optimization (ACO), Artificial Bee Colony (ABC), Particle Swarm Optimization (PSO) and Hybrid Genetic Algorithm with Artificial Bee Colony (GA-ABC). The final results show that the proposed MO-BFO algorithm yields better alignment than most widely used methods. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
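To make the multi-objective terminology above concrete, the following sketch computes two simple alignment objectives and the dominance test that defines a non-dominated solution. The exact objective definitions used by MO-BFO are not given in the abstract, so the formulas and function names here are assumptions for illustration only.

```python
def non_gap_percentage(aln):
    """Percentage of alignment cells that are not gap characters ('-')."""
    cells = sum(len(row) for row in aln)
    gaps = sum(row.count("-") for row in aln)
    return 100.0 * (cells - gaps) / cells

def conserved_columns(aln):
    """Number of gap-free columns in which every sequence has the same residue."""
    return sum(1 for col in zip(*aln) if "-" not in col and len(set(col)) == 1)

def dominates(u, v):
    """True if objective vector u is no worse than v everywhere and better somewhere.

    All objectives are assumed to be maximized.
    """
    return all(x >= y for x, y in zip(u, v)) and any(x > y for x, y in zip(u, v))
```

A candidate alignment is non-dominated when no other candidate dominates its objective vector; a multi-objective optimizer returns such candidates instead of collapsing the objectives into a single score.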
Patterned growth of individual and multiple vertically aligned carbon nanofibers

NASA Astrophysics Data System (ADS)

Merkulov, V. I.; Lowndes, D. H.; Wei, Y. Y.; Eres, G.; Voelkl, E.

2000-06-01

The results of studies of patterned growth of vertically aligned carbon nanofibers (VACNFs) prepared by plasma-enhanced chemical vapor deposition are reported. Nickel (Ni) dots of various diameters and Ni lines with variable widths and shapes were fabricated using electron beam lithography and evaporation, and served for catalytic growth of VACNFs whose structure was determined by high resolution transmission electron microscopy. It is found that upon plasma pre-etching and heating up to 600-700 °C, thin films of Ni break into droplets which initiate the growth of VACNFs. Above a critical dot size multiple droplets are formed, and consequently multiple VACNFs grow from a single evaporated dot. For dot sizes smaller than the critical size only one droplet is formed, resulting in a single VACNF. In the case of a patterned line, the growth mechanism is similar to that from a dot. VACNFs grow along the line, and above a critical linewidth multiple VACNFs are produced across the line. The mechanism of the formation of single and multiple catalyst droplets and subsequently of VACNFs is discussed.
Automated protein structure modeling in CASP9 by I-TASSER pipeline combined with QUARK-based ab initio folding and FG-MD-based structure refinement

PubMed Central

Xu, Dong; Zhang, Jian; Roy, Ambrish; Zhang, Yang

2011-01-01

I-TASSER is an automated pipeline for protein tertiary structure prediction using multiple threading alignments and iterative structure assembly simulations. In CASP9 experiments, two new algorithms, QUARK and FG-MD, were added to the I-TASSER pipeline for improving the structural modeling accuracy. QUARK is a de novo structure prediction algorithm used for structure modeling of proteins that lack detectable template structures. For distantly homologous targets, QUARK models are found useful as a reference structure for selecting good threading alignments and guiding the I-TASSER structure assembly simulations. FG-MD is an atomic-level structural refinement program that uses structural fragments collected from the PDB structures to guide molecular dynamics simulation and improve the local structure of predicted model, including hydrogen-bonding networks, torsion angles and steric clashes. Despite considerable progress in both the template-based and template-free structure modeling, significant improvements on protein target classification, domain parsing, model selection, and ab initio folding of beta-proteins are still needed to further improve the I-TASSER pipeline. PMID:22069036
AlexSys: a knowledge-based expert system for multiple sequence alignment construction and analysis

PubMed Central

Aniba, Mohamed Radhouene; Poch, Olivier; Marchler-Bauer, Aron; Thompson, Julie Dawn

2010-01-01

Multiple sequence alignment (MSA) is a cornerstone of modern molecular biology and represents a unique means of investigating the patterns of conservation and diversity in complex biological systems. Many different algorithms have been developed to construct MSAs, but previous studies have shown that no single aligner consistently outperforms the rest. This has led to the development of a number of ‘meta-methods’ that systematically run several aligners and merge the output into one single solution. Although these methods generally produce more accurate alignments, they are inefficient because all the aligners need to be run first and the choice of the best solution is made a posteriori. Here, we describe the development of a new expert system, AlexSys, for the multiple alignment of protein sequences. AlexSys incorporates an intelligent inference engine to automatically select an appropriate aligner a priori, depending only on the nature of the input sequences. The inference engine was trained on a large set of reference multiple alignments, using a novel machine learning approach. Applying AlexSys to a test set of 178 alignments, we show that the expert system represents a good compromise between alignment quality and running time, making it suitable for high throughput projects. AlexSys is freely available from http://alnitak.u-strasbg.fr/∼aniba/alexsys. PMID:20530533
Image Alignment for Multiple Camera High Dynamic Range Microscopy.

PubMed

Eastwood, Brian S; Childs, Elisabeth C

2012-01-09

This paper investigates the problem of image alignment for multiple camera high dynamic range (HDR) imaging. HDR imaging combines information from images taken with different exposure settings. Combining information from multiple cameras requires an alignment process that is robust to the intensity differences in the images. HDR applications that use a limited number of component images require an alignment technique that is robust to large exposure differences. We evaluate the suitability for HDR alignment of three exposure-robust techniques. We conclude that image alignment based on matching feature descriptors extracted from radiant power images from calibrated cameras yields the most accurate and robust solution. We demonstrate the use of this alignment technique in a high dynamic range video microscope that enables live specimen imaging with a greater level of detail than can be captured with a single camera.
Image Alignment for Multiple Camera High Dynamic Range Microscopy

PubMed Central

Eastwood, Brian S.; Childs, Elisabeth C.

2012-01-01

This paper investigates the problem of image alignment for multiple camera high dynamic range (HDR) imaging. HDR imaging combines information from images taken with different exposure settings. Combining information from multiple cameras requires an alignment process that is robust to the intensity differences in the images. HDR applications that use a limited number of component images require an alignment technique that is robust to large exposure differences. We evaluate the suitability for HDR alignment of three exposure-robust techniques. We conclude that image alignment based on matching feature descriptors extracted from radiant power images from calibrated cameras yields the most accurate and robust solution. We demonstrate the use of this alignment technique in a high dynamic range video microscope that enables live specimen imaging with a greater level of detail than can be captured with a single camera. PMID:22545028
Embedding strategies for effective use of information from multiple sequence alignments.

PubMed Central

Henikoff, S.; Henikoff, J. G.

1997-01-01

We describe a new strategy for utilizing multiple sequence alignment information to detect distant relationships in searches of sequence databases. A single sequence representing a protein family is enriched by replacing conserved regions with position-specific scoring matrices (PSSMs) or consensus residues derived from multiple alignments of family members. In comprehensive tests of these and other family representations, PSSM-embedded queries produced the best results overall when used with a special version of the Smith-Waterman searching algorithm. Moreover, embedding consensus residues instead of PSSMs improved performance with readily available single sequence query searching programs, such as BLAST and FASTA. Embedding PSSMs or consensus residues into a representative sequence improves searching performance by extracting multiple alignment information from motif regions while retaining single sequence information where alignment is uncertain. PMID:9070452
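A position-specific scoring matrix of the kind embedded into the query can be sketched as follows. This is a generic log-odds construction with a uniform background and a flat pseudocount, both of which are simplifying assumptions; it is not the authors' weighting scheme, and the function names are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_pssm(aligned, alphabet=AMINO_ACIDS, pseudocount=1.0):
    """Return one dict per alignment column mapping residue -> log2 odds score.

    Gap characters and residues outside the alphabet are ignored.
    """
    background = 1.0 / len(alphabet)  # assumption: uniform background frequencies
    pssm = []
    for column in zip(*aligned):
        counts = {c: pseudocount for c in alphabet}
        for ch in column:
            if ch in counts:
                counts[ch] += 1
        total = sum(counts.values())
        pssm.append({c: math.log2(counts[c] / total / background) for c in alphabet})
    return pssm

def score_window(pssm, window):
    """Sum the column scores of a window with the same length as the PSSM."""
    return sum(col[ch] for col, ch in zip(pssm, window))
```

In the embedding strategy described above, columns like these would stand in for the conserved regions of a representative sequence, while the remaining positions keep their single-residue identity.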
SPHERE: SPherical Harmonic Elastic REgistration of HARDI Data

PubMed Central

Yap, Pew-Thian; Chen, Yasheng; An, Hongyu; Yang, Yang; Gilmore, John H.; Lin, Weili

2010-01-01

In contrast to the more common Diffusion Tensor Imaging (DTI), High Angular Resolution Diffusion Imaging (HARDI) allows superior delineation of angular microstructures of brain white matter, and makes possible multiple-fiber modeling of each voxel for better characterization of brain connectivity. However, the complex orientation information afforded by HARDI makes registration of HARDI images more complicated than scalar images. In particular, the question of how much orientation information is needed for satisfactory alignment has not been sufficiently addressed. Low order orientation representation is generally more robust than high order representation, although the latter provides more information for correct alignment of fiber pathways. However, high order representation, when naïvely utilized, might not necessarily be conducive to improving registration accuracy since similar structures with significant orientation differences prior to proper alignment might be mistakenly taken as non-matching structures. We present in this paper a HARDI registration algorithm, called SPherical Harmonic Elastic REgistration (SPHERE), which in a principled means hierarchically extracts orientation information from HARDI data for structural alignment. The image volumes are first registered using robust, relatively direction invariant features derived from the Orientation Distribution Function (ODF), and the alignment is then further refined using spherical harmonic (SH) representation with gradually increasing orders. This progression from non-directional, single-directional to multi-directional representation provides a systematic means of extracting directional information given by diffusion-weighted imaging. Coupled with a template-subject-consistent soft-correspondence-matching scheme, this approach allows robust and accurate alignment of HARDI data. Experimental results show marked increase in accuracy over a state-of-the-art DTI registration algorithm. PMID:21147231
Mechanoresponsive, omni-directional and local matrix-degrading actin protrusions in human mesenchymal stem cells microencapsulated in a 3D collagen matrix.

PubMed

Ho, Fu Chak; Zhang, Wei; Li, Yuk Yin; Chan, Barbara Pui

2015-01-01

Cells are known to respond to multiple niche signals including extracellular matrix and mechanical loading. In others' and our own studies, mechanical loading has been shown to induce the formation of cell alignment in 3D collagen matrix with random meshwork, challenging our traditional understanding of the necessity of having aligned substrates as the prerequisite of alignment formation. This motivated our effort to decipher the mechanism of loading-induced cell alignment and led to the discovery of a novel protrusive functional structure at the cell-matrix interface. Here we report the formation of mechanoresponsive, omni-directional and local matrix-degrading actin protrusions in human mesenchymal stem cells (hMSCs) microencapsulated in collagen following a shifted actin assembly/disassembly balance. These actin protrusive structures exhibit morphological and compositional similarity to filopodia and invadopodia but differ from them in stability, abundance, signaling and function. Without ruling out the possibility that these structures may comprise special subsets of filopodia and invadopodia, we propose to name them mechanopodia so as to reflect their mechano-inductive mechanism. We also suggest that more intensive investigations are needed to delineate the functional significance and physiological relevance of these structures. This work identifies a brand new target for cell-matrix interaction and mechanoregulation studies. Copyright © 2015 Elsevier Ltd. All rights reserved.
MultiSeq: unifying sequence and structure data for evolutionary analysis

PubMed Central

Roberts, Elijah; Eargle, John; Wright, Dan; Luthey-Schulten, Zaida

2006-01-01

Background Since the publication of the first draft of the human genome in 2000, bioinformatic data have been accumulating at an overwhelming pace. Currently, more than 3 million sequences and 35 thousand structures of proteins and nucleic acids are available in public databases. Finding correlations in and between these data to answer critical research questions is extremely challenging. This problem needs to be approached from several directions: information science to organize and search the data; information visualization to assist in recognizing correlations; mathematics to formulate statistical inferences; and biology to analyze chemical and physical properties in terms of sequence and structure changes. Results Here we present MultiSeq, a unified bioinformatics analysis environment that allows one to organize, display, align and analyze both sequence and structure data for proteins and nucleic acids. While special emphasis is placed on analyzing the data within the framework of evolutionary biology, the environment is also flexible enough to accommodate other usage patterns. The evolutionary approach is supported by the use of predefined metadata, adherence to standard ontological mappings, and the ability for the user to adjust these classifications using an electronic notebook. MultiSeq contains a new algorithm to generate complete evolutionary profiles that represent the topology of the molecular phylogenetic tree of a homologous group of distantly related proteins. The method, based on the multidimensional QR factorization of multiple sequence and structure alignments, removes redundancy from the alignments and orders the protein sequences by increasing linear dependence, resulting in the identification of a minimal basis set of sequences that spans the evolutionary space of the homologous group of proteins. 
Conclusion MultiSeq is a major extension of the Multiple Alignment tool that is provided as part of VMD, a structural visualization program for analyzing molecular dynamics simulations. Both are freely distributed by the NIH Resource for Macromolecular Modeling and Bioinformatics and MultiSeq is included with VMD starting with version 1.8.5. The MultiSeq website has details on how to download and use the software: PMID:16914055
Dynamo Catalogue: Geometrical tools and data management for particle picking in subtomogram averaging of cryo-electron tomograms.

PubMed

Castaño-Díez, Daniel; Kudryashev, Mikhail; Stahlberg, Henning

2017-02-01

Cryo electron tomography allows macromolecular complexes within vitrified, intact, thin cells or sections thereof to be visualized, and structural analysis to be performed in situ by averaging over multiple copies of the same molecules. Image processing for subtomogram averaging is specific and cumbersome, due to the large amount of data and its three-dimensional nature and anisotropic resolution. Here, we streamline data processing for subtomogram averaging by introducing an archiving system, Dynamo Catalogue. This system manages tomographic data from multiple tomograms and allows visual feedback during all processing steps, including particle picking, extraction, alignment and classification. The file structure of a processing project includes logfiles of performed operations, and can be backed up and shared between users. Command line commands, database queries and a set of GUIs give the user versatile control over the process. Here, we introduce a set of geometric tools that streamline particle picking from simple (filaments, spheres, tubes, vesicles) and complex geometries (arbitrary 2D surfaces, rare instances on proteins with geometric restrictions, and 2D and 3D crystals). Advanced functionality, such as manual alignment and subboxing, is useful when initial templates are generated for alignment and for project customization. Dynamo Catalogue is part of the open source package Dynamo and includes tools to ensure format compatibility with the subtomogram averaging functionalities of other packages, such as Jsubtomo, PyTom, PEET, EMAN2, XMIPP and Relion. Copyright © 2016. Published by Elsevier Inc.
Using reconfigurable hardware to accelerate multiple sequence alignment with ClustalW.

PubMed

Oliver, Tim; Schmidt, Bertil; Nathan, Darran; Clemens, Ralf; Maskell, Douglas

2005-08-15

Aligning hundreds of sequences using progressive alignment tools such as ClustalW requires several hours on state-of-the-art workstations. We present a new approach to compute multiple sequence alignments in far shorter time using reconfigurable hardware. This results in an implementation of ClustalW with significant runtime savings on a standard off-the-shelf FPGA.
Formation of bulk refractive index structures

DOEpatents

Potter, Jr., Barrett George; Potter, Kelly Simmons; Wheeler, David R.; Jamison, Gregory M.

2003-07-15

A method of making a stacked three-dimensional refractive index structure in photosensitive materials using photo-patterning. First, the wavelength at which a photosensitive material film exhibits a change in refractive index upon exposure to optical radiation is determined. A portion of the surface of the photosensitive material film is then optically irradiated, and the film is marked to produce a registry mark. Multiple films are produced and aligned using the registry marks to form a stacked three-dimensional refractive index structure.

Quantification of Cardiomyocyte Alignment from Three-Dimensional (3D) Confocal Microscopy of Engineered Tissue.

PubMed

Kowalski, William J; Yuan, Fangping; Nakane, Takeichiro; Masumoto, Hidetoshi; Dwenger, Marc; Ye, Fei; Tinney, Joseph P; Keller, Bradley B

2017-08-01

Biological tissues have complex, three-dimensional (3D) organizations of cells and matrix factors that provide the architecture necessary to meet morphogenic and functional demands. Disordered cell alignment is associated with congenital heart disease, cardiomyopathy, and neurodegenerative diseases and repairing or replacing these tissues using engineered constructs may improve regenerative capacity. However, optimizing cell alignment within engineered tissues requires quantitative 3D data on cell orientations and both efficient and validated processing algorithms. We developed an automated method to measure local 3D orientations based on structure tensor analysis and incorporated an adaptive subregion size to account for multiple scales. Our method calculates the statistical concentration parameter, κ, to quantify alignment, as well as the traditional orientational order parameter. We validated our method using synthetic images and accurately measured principal axis and concentration. We then applied our method to confocal stacks of cleared, whole-mount engineered cardiac tissues generated from human-induced pluripotent stem cells or embryonic chick cardiac cells and quantified cardiomyocyte alignment. We found significant differences in alignment based on cellular composition and tissue geometry. These results from our synthetic images and confocal data demonstrate the efficiency and accuracy of our method to measure alignment in 3D tissues.
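As a rough illustration of the orientational order parameter mentioned in this abstract, the sketch below computes one common 3D form, S = <(3 cos^2(theta) - 1) / 2>, for a set of orientation vectors measured against a given principal axis. The function name, the explicit axis argument and the sample vectors are assumptions made for the example; the structure tensor analysis that the authors use to obtain local orientations is not reproduced here.

```python
import math

def orientational_order(vectors, axis):
    """Mean of (3*cos^2(theta) - 1) / 2, with theta measured from `axis`."""
    axis_norm = math.sqrt(sum(a * a for a in axis))
    total = 0.0
    for v in vectors:
        v_norm = math.sqrt(sum(c * c for c in v))
        cos_t = sum(c * a for c, a in zip(v, axis)) / (v_norm * axis_norm)
        total += (3.0 * cos_t ** 2 - 1.0) / 2.0
    return total / len(vectors)

# Hypothetical orientation vectors (not data from the study).
aligned = [(0, 0, 1), (0, 0, -1), (0, 0, 2)]   # parallel or antiparallel to z
perpendicular = [(1, 0, 0), (0, 1, 0)]         # lying in the xy-plane

print(orientational_order(aligned, (0, 0, 1)))
print(orientational_order(perpendicular, (0, 0, 1)))
```

Because the cosine is squared, antiparallel vectors count as aligned, which suits fibre-like objects that have an axis but no head or tail.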
Aligning Metabolic Pathways Exploiting Binary Relation of Reactions.

PubMed

Huang, Yiran; Zhong, Cheng; Lin, Hai Xiang; Huang, Jing

2016-01-01

Metabolic pathway alignment has been widely used to find one-to-one and/or one-to-many reaction mappings to identify the alternative pathways that have similar functions through different sets of reactions, which has important applications in reconstructing phylogeny and understanding metabolic functions. The existing alignment methods exhaustively search reaction sets, which may become infeasible for large pathways. To address this problem, we present an effective alignment method for accurately extracting reaction mappings between two metabolic pathways. We show that connected relation between reactions can be formalized as binary relation of reactions in metabolic pathways, and the multiplications of zero-one matrices for binary relations of reactions can be accomplished in finite steps. By utilizing the multiplications of zero-one matrices for binary relation of reactions, we efficiently obtain reaction sets in a small number of steps without exhaustive search, and accurately uncover biologically relevant reaction mappings. Furthermore, we introduce a measure of topological similarity of nodes (reactions) by comparing the structural similarity of the k-neighborhood subgraphs of the nodes in aligning metabolic pathways. We employ this similarity metric to improve the accuracy of the alignments. The experimental results on the KEGG database show that when compared with other state-of-the-art methods, in most cases, our method obtains better performance in the node correctness and edge correctness, and the number of the edges of the largest common connected subgraph for one-to-one reaction mappings, and the number of correct one-to-many reaction mappings. Our method is scalable in finding more reaction mappings with better biological relevance in large metabolic pathways.
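The idea of composing a binary relation of reactions through zero-one matrix products can be sketched as follows. This is a minimal illustration rather than the authors' algorithm: the three-reaction pathway, the function names and the use of repeated Boolean products to collect reachable reactions are all assumptions made for the example.

```python
def bool_matmul(a, b):
    """Zero-one matrix product: entry (i, j) is 1 if some k has a[i][k] and b[k][j]."""
    n, m, p = len(a), len(b), len(b[0])
    return [[int(any(a[i][k] and b[k][j] for k in range(m))) for j in range(p)]
            for i in range(n)]

def reachability(rel):
    """Union of rel, rel^2, ..., rel^n; stops early once no new pairs appear."""
    n = len(rel)
    reach = [row[:] for row in rel]
    power = [row[:] for row in rel]
    for _ in range(n - 1):
        power = bool_matmul(power, rel)
        merged = [[reach[i][j] | power[i][j] for j in range(n)] for i in range(n)]
        if merged == reach:
            break
        reach = merged
    return reach

# Hypothetical pathway R0 -> R1 -> R2, encoded as a zero-one matrix.
rel = [[0, 1, 0],
       [0, 0, 1],
       [0, 0, 0]]
print(reachability(rel))
```

For n reactions the loop runs at most n - 1 products, which is one way to read the abstract's remark that the multiplications finish in finitely many steps.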
Measuring the distance between multiple sequence alignments.

PubMed

Blackburne, Benjamin P; Whelan, Simon

2012-02-15

Multiple sequence alignment (MSA) is a core method in bioinformatics. The accuracy of such alignments may influence the success of downstream analyses such as phylogenetic inference, protein structure prediction, and functional prediction. The importance of MSA has led to the proliferation of MSA methods, with different objective functions and heuristics to search for the optimal MSA. Different methods of inferring MSAs produce different results in all but the most trivial cases. By measuring the differences between inferred alignments, we may be able to develop an understanding of how these differences (i) relate to the objective functions and heuristics used in MSA methods, and (ii) affect downstream analyses. We introduce four metrics to compare MSAs, which include the position in a sequence where a gap occurs or the location on a phylogenetic tree where an insertion or deletion (indel) event occurs. We use both real and synthetic data to explore the information given by these metrics and demonstrate how the different metrics in combination can yield more information about MSA methods and the differences between them. MetAl is a free software implementation of these metrics in Haskell. Source and binaries for Windows, Linux and Mac OS X are available from http://kumiho.smith.man.ac.uk/whelan/software/metal/.
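To make the notion of a distance between two alignments concrete, the sketch below compares the sets of residue pairs that each alignment places in the same column. It is a simple pair-based measure written for illustration only; it is not claimed to be one of the four MetAl metrics, and the function names and toy alignments are assumptions.

```python
from itertools import combinations

def aligned_pairs(msa):
    """Residue pairs ((seq, pos), (seq, pos)) sharing a column; gaps ('-') are skipped."""
    counters = [0] * len(msa)          # next residue index for each sequence
    pairs = set()
    for col in range(len(msa[0])):
        residues = []
        for s, row in enumerate(msa):
            if row[col] != "-":
                residues.append((s, counters[s]))
                counters[s] += 1
        pairs.update(combinations(residues, 2))
    return pairs

def pair_distance(msa_a, msa_b):
    """Fraction of aligned residue pairs not shared by both alignments (0 = identical)."""
    a, b = aligned_pairs(msa_a), aligned_pairs(msa_b)
    union = a | b
    return len(a ^ b) / len(union) if union else 0.0

# Two hypothetical alignments of the same pair of sequences.
msa_1 = ["AC-GT", "ACAGT"]
msa_2 = ["ACG-T", "ACAGT"]
print(pair_distance(msa_1, msa_1))
print(pair_distance(msa_1, msa_2))
```

Residues are identified by their index within the ungapped sequence, so the measure depends only on which residues are aligned to each other and not on the column numbering.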
CoSMoS: Conserved Sequence Motif Search in the proteome

PubMed Central

Liu, Xiao I; Korde, Neeraj; Jakob, Ursula; Leichert, Lars I

2006-01-01

Background With the ever-increasing number of gene sequences in the public databases, generating and analyzing multiple sequence alignments becomes increasingly time consuming. Nevertheless it is a task performed on a regular basis by researchers in many labs. Results We have now created a database called CoSMoS to find the occurrences and at the same time evaluate the significance of sequence motifs and amino acids encoded in the whole genome of the model organism Escherichia coli K12. We provide a precomputed set of multiple sequence alignments for each individual E. coli protein with all of its homologues in the RefSeq database. The alignments themselves, information about the occurrence of sequence motifs together with information on the conservation of each of the more than 1.3 million amino acids encoded in the E. coli genome can be accessed via the web interface of CoSMoS. Conclusion CoSMoS is a valuable tool to identify highly conserved sequence motifs, to find regions suitable for mutational studies in functional analyses and to predict important structural features in E. coli proteins. PMID:16433915
Analysis of multiple internal reflections in a parallel aligned liquid crystal on silicon SLM.

PubMed

Martínez, José Luis; Moreno, Ignacio; del Mar Sánchez-López, María; Vargas, Asticio; García-Martínez, Pascuala

2014-10-20

Multiple internal reflection effects on the optical modulation of a commercial reflective parallel-aligned liquid-crystal on silicon (PAL-LCoS) spatial light modulator (SLM) are analyzed. The display is illuminated with different wavelengths and different angles of incidence. Non-negligible Fabry-Perot (FP) effect is observed due to the sandwiched LC layer structure. A simplified physical model that quantitatively accounts for the observed phenomena is proposed. It is shown how the expected pure phase modulation response is substantially modified in the following aspects: 1) a coupled amplitude modulation, 2) a non-linear behavior of the phase modulation, 3) some amount of unmodulated light, and 4) a reduction of the effective phase modulation as the angle of incidence increases. Finally, it is shown that multiple reflections can be useful since the effect of a displayed diffraction grating is doubled on a beam that is reflected twice through the LC layer, thus rendering gratings with doubled phase modulation depth.
Alignment and integration of complex networks by hypergraph-based spectral clustering

NASA Astrophysics Data System (ADS)

Michoel, Tom; Nachtergaele, Bruno

2012-11-01

Complex networks possess a rich, multiscale structure reflecting the dynamical and functional organization of the systems they model. Often there is a need to analyze multiple networks simultaneously, to model a system by more than one type of interaction, or to go beyond simple pairwise interactions, but currently there is a lack of theoretical and computational methods to address these problems. Here we introduce a framework for clustering and community detection in such systems using hypergraph representations. Our main result is a generalization of the Perron-Frobenius theorem from which we derive spectral clustering algorithms for directed and undirected hypergraphs. We illustrate our approach with applications for local and global alignment of protein-protein interaction networks between multiple species, for tripartite community detection in folksonomies, and for detecting clusters of overlapping regulatory pathways in directed networks.
Alignment and integration of complex networks by hypergraph-based spectral clustering.

PubMed

Michoel, Tom; Nachtergaele, Bruno

2012-11-01

Complex networks possess a rich, multiscale structure reflecting the dynamical and functional organization of the systems they model. Often there is a need to analyze multiple networks simultaneously, to model a system by more than one type of interaction, or to go beyond simple pairwise interactions, but currently there is a lack of theoretical and computational methods to address these problems. Here we introduce a framework for clustering and community detection in such systems using hypergraph representations. Our main result is a generalization of the Perron-Frobenius theorem from which we derive spectral clustering algorithms for directed and undirected hypergraphs. We illustrate our approach with applications for local and global alignment of protein-protein interaction networks between multiple species, for tripartite community detection in folksonomies, and for detecting clusters of overlapping regulatory pathways in directed networks.
Acquired midfoot deformity and function in individuals with diabetes and peripheral neuropathy.

PubMed

Hastings, Mary K; Mueller, Michael J; Woodburn, James; Strube, Michael J; Commean, Paul; Johnson, Jeffrey E; Cheuy, Victor; Sinacore, David R

2016-02-01

Diabetes mellitus related medial column foot deformity is a major contributor to ulceration and amputation. However, little is known about the relationship between medial column alignment and function and the integrity of the soft tissues that support and move the medial column. The purposes of this study were to determine the predictors of medial column alignment and function in people with diabetes and peripheral neuropathy. 23 participants with diabetes and neuropathy had radiographs, heel rise kinematics, magnetic resonance imaging and isokinetic muscle testing to measure: 1) medial column alignment (Meary's angle--the angle between the 1st metatarsal longitudinal axis and the talar head and neck), 2) medial column function (forefoot relative to hindfoot plantarflexion during heel rise), 3) intrinsic foot muscle and fat volume, ratio of posterior tibialis to flexor digitorum tendon volume, 4) plantar fascia function (Meary's angle change from toes flat to extended) and 5) plantarflexor peak torque. Predictors of medial column alignment and function were determined using simultaneous entry multiple regression. Posterior tibialis to flexor digitorum tendon volume ratio and intrinsic foot muscle volume were significant predictors of medial column alignment (P<.05), accounting for 44% of the variance. Intrinsic foot fat volume and plantarflexor peak torque were significant predictors of medial column function (P<.05), accounting for 37% of the variance. Deterioration of medial column supporting structures predicted alignment and function. Prospective research is required to monitor alignment, structure, and function over time to inform early intervention strategies to prevent deformity, ulceration, and amputation. Copyright © 2015 Elsevier Ltd. All rights reserved.
Accelerating large-scale protein structure alignments with graphics processing units

PubMed Central

2012-01-01

Background Large-scale protein structure alignment, an indispensable tool to structural bioinformatics, poses a tremendous challenge on computational resources. To ensure structure alignment accuracy and efficiency, efforts have been made to parallelize traditional alignment algorithms in grid environments. However, these solutions are costly and of limited accessibility. Others trade alignment quality for speedup by using high-level characteristics of structure fragments for structure comparisons. Findings We present ppsAlign, a parallel protein structure Alignment framework designed and optimized to exploit the parallelism of Graphics Processing Units (GPUs). As a general-purpose GPU platform, ppsAlign could take many concurrent methods, such as TM-align and Fr-TM-align, into the parallelized algorithm design. We evaluated ppsAlign on an NVIDIA Tesla C2050 GPU card, and compared it with existing software solutions running on an AMD dual-core CPU. We observed a 36-fold speedup over TM-align, a 65-fold speedup over Fr-TM-align, and a 40-fold speedup over MAMMOTH. Conclusions ppsAlign is a high-performance protein structure alignment tool designed to tackle the computational complexity issues from protein structural data. The solution presented in this paper allows large-scale structure comparisons to be performed using massive parallel computing power of GPU. PMID:22357132
Claustrum of the short-tailed fruit bat, Carollia perspicillata: Alignment of cellular orientation and functional connectivity.

PubMed

Orman, Rena; Kollmar, Richard; Stewart, Mark

2017-04-15

The claustrum is a gray-matter structure that underlies neocortex and reciprocates connections with cortical and subcortical targets. In lower mammals, the claustrum is directly adjacent to neocortex, making the definition of claustral boundaries challenging. Latexin, an endogenous inhibitor of metallocarboxypeptidases, localizes to claustral cells, enabling a clear delineation of claustrum. Given its proportionately large claustrum, we hypothesized that the short-tailed fruit bat, Carollia perspicillata, can be a useful model for claustral structure-function relations. We used latexin immunohistochemistry to identify claustral boundaries and intrinsic structure and multielectrode recordings from brain slices to explore intrinsic excitatory connectivity of the claustrum. Carollia's claustrum contains cells whose intrinsic connectivity and alignment permit the generation of spontaneous, synchronous population events and mirror their pattern of spread in disinhibited brain slices over millimeters. Carollia shows cellular alignment and spontaneous population-activity spread along both horizontal and dorsoventral axes. Carollia claustrum possesses intrinsic excitatory connectivity sufficient to: 1) generate single, spontaneous, synchronized burst discharges, 2) support activity spread along axes where claustral cells are aligned, and 3), because of multiple axes for cell alignment, support activity spread along both rostrocaudal and dorsoventral axes. The smaller event sizes in bat claustrum compared with rat claustrum are consistent with events occurring in population subsets rather than the full claustral cell population. The overall size of claustrum, its pronounced vascularity, and its more complex intrinsic connectivity than rat suggest that the bat is an animal model for claustral structure and function that will permit unique access to claustrum's processing capabilities. J. Comp. Neurol. 525:1459-1474, 2017. © 2016 Wiley Periodicals, Inc. 
Representation of Gravity-Aligned Scene Structure in Ventral Pathway Visual Cortex.

PubMed

Vaziri, Siavash; Connor, Charles E

2016-03-21

The ventral visual pathway in humans and non-human primates is known to represent object information, including shape and identity [1]. Here, we show the ventral pathway also represents scene structure aligned with the gravitational reference frame in which objects move and interact. We analyzed shape tuning of recently described macaque monkey ventral pathway neurons that prefer scene-like stimuli to objects [2]. Individual neurons did not respond to a single shape class, but to a variety of scene elements that are typically aligned with gravity: large planes in the orientation range of ground surfaces under natural viewing conditions, planes in the orientation range of ceilings, and extended convex and concave edges in the orientation range of wall/floor/ceiling junctions. For a given neuron, these elements tended to share a common alignment in eye-centered coordinates. Thus, each neuron integrated information about multiple gravity-aligned structures as they would be seen from a specific eye and head orientation. This eclectic coding strategy provides only ambiguous information about individual structures but explicit information about the environmental reference frame and the orientation of gravity in egocentric coordinates. In the ventral pathway, this could support perceiving and/or predicting physical events involving objects subject to gravity, recognizing object attributes like animacy based on movement not caused by gravity, and/or stabilizing perception of the world against changes in head orientation [3-5]. Our results, like the recent discovery of object weight representation [6], imply that the ventral pathway is involved not just in recognition, but also in physical understanding of objects and scenes. Copyright © 2016 Elsevier Ltd. All rights reserved.
Dry contact transfer printing of aligned carbon nanotube patterns and characterization of their optical properties for diameter distribution and alignment.

PubMed

Pint, Cary L; Xu, Ya-Qiong; Moghazy, Sharief; Cherukuri, Tonya; Alvarez, Noe T; Haroz, Erik H; Mahzooni, Salma; Doorn, Stephen K; Kono, Junichiro; Pasquali, Matteo; Hauge, Robert H

2010-02-23

A scalable and facile approach is demonstrated where as-grown patterns of well-aligned structures composed of single-walled carbon nanotubes (SWNT) synthesized via water-assisted chemical vapor deposition (CVD) can be transferred, or printed, to any host surface in a single dry, room-temperature step using the growth substrate as a stamp. We demonstrate compatibility of this process with multiple transfers for large-scale device and specifically tailored pattern fabrication. Utilizing this transfer approach, anisotropic optical properties of the SWNT films are probed via polarized absorption, Raman, and photoluminescence spectroscopies. Using a simple model to describe optical transitions in the large SWNT species present in the aligned samples, polarized absorption data are demonstrated as an effective tool for accurate assignment of the diameter distribution from broad absorption features located in the infrared. This can be performed on either well-aligned samples or unaligned doped samples, allowing simple and rapid feedback of the SWNT diameter distribution that can be challenging and time-consuming to obtain in other optical methods. Furthermore, we discuss challenges in accurately characterizing alignment in structures of long versus short carbon nanotubes through optical techniques, where SWNT length makes a difference in the information obtained in such measurements. This work provides new insight to the efficient transfer and optical properties of an emerging class of long, large diameter SWNT species typically produced in the CVD process.
StralSV: assessment of sequence variability within similar 3D structures and application to polio RNA-dependent RNA polymerase

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zemla, A; Lang, D; Kostova, T

2010-11-29

Most of the currently used methods for protein function prediction rely on sequence-based comparisons between a query protein and those for which a functional annotation is provided. A serious limitation of sequence similarity-based approaches for identifying residue conservation among proteins is the low confidence in assigning residue-residue correspondences among proteins when the level of sequence identity between the compared proteins is poor. Multiple sequence alignment methods are more satisfactory - still, they cannot provide reliable results at low levels of sequence identity. Our goal in the current work was to develop an algorithm that could overcome these difficulties and facilitate the identification of structurally (and possibly functionally) relevant residue-residue correspondences between compared protein structures. Here we present StralSV, a new algorithm for detecting closely related structure fragments and quantifying residue frequency from tight local structure alignments. We apply StralSV in a study of the RNA-dependent RNA polymerase of poliovirus and demonstrate that the algorithm can be used to determine regions of the protein that are relatively unique or that share structural similarity with structures that are distantly related. By quantifying residue frequencies among many residue-residue pairs extracted from local alignments, one can infer potential structural or functional importance of specific residues that are determined to be highly conserved or that deviate from a consensus. We further demonstrate that considerable detailed structural and phylogenetic information can be derived from StralSV analyses. StralSV is a new structure-based algorithm for identifying and aligning structure fragments that have similarity to a reference protein. 
StralSV analysis can be used to quantify residue-residue correspondences and identify residues that may be of particular structural or functional importance, as well as unusual or unexpected residues at a given sequence position.
PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences.

PubMed

Mirarab, Siavash; Nguyen, Nam; Guo, Sheng; Wang, Li-San; Kim, Junhyong; Warnow, Tandy

2015-05-01

We introduce PASTA, a new multiple sequence alignment algorithm. PASTA uses a new technique to produce an alignment given a guide tree that enables it to be both highly scalable and very accurate. We present a study on biological and simulated data with up to 200,000 sequences, showing that PASTA produces highly accurate alignments, improving on the accuracy and scalability of the leading alignment methods (including SATé). We also show that trees estimated on PASTA alignments are highly accurate--slightly better than SATé trees, but with substantial improvements relative to other methods. Finally, PASTA is faster than SATé, highly parallelizable, and requires relatively little memory.
Template based protein structure modeling by global optimization in CASP11.

PubMed

Joo, Keehyoung; Joung, InSuk; Lee, Sun Young; Kim, Jong Yun; Cheng, Qianyi; Manavalan, Balachandran; Joung, Jong Young; Heo, Seungryong; Lee, Juyong; Nam, Mikyung; Lee, In-Ho; Lee, Sung Jong; Lee, Jooyoung

2016-09-01

For the template-based modeling (TBM) of CASP11 targets, we have developed three new protein modeling protocols (nns for server prediction and LEE and LEER for human prediction) by improving upon our previous CASP protocols (CASP7 through CASP10). We applied the powerful global optimization method of conformational space annealing to three stages of optimization, including multiple sequence-structure alignment, three-dimensional (3D) chain building, and side-chain remodeling. For more successful fold recognition, a new alignment method called CRFalign was developed. It can incorporate sensitive positional and environmental dependence in alignment scores as well as strong nonlinear correlations among various features. Modifications and adjustments were made to the form of the energy function and weight parameters pertaining to the chain building procedure. For the side-chain remodeling step, residue-type dependence was introduced to the cutoff value that determines the entry of a rotamer to the side-chain modeling library. The improved performance of the nns server method is attributed to successful fold recognition achieved by combining several methods including CRFalign and to the current modeling formulation that can incorporate native-like structural aspects present in multiple templates. The LEE protocol is identical to the nns one except that CASP11-released server models are used as templates. The success of LEE in utilizing CASP11 server models indicates that proper template screening and template clustering assisted by appropriate cluster ranking promises a new direction to enhance protein 3D modeling. Proteins 2016; 84(Suppl 1):221-232. © 2015 Wiley Periodicals, Inc.
LAMBADA and InflateGRO2: efficient membrane alignment and insertion of membrane proteins for molecular dynamics simulations.

PubMed

Schmidt, Thomas H; Kandt, Christian

2012-10-22

At the beginning of each molecular dynamics membrane simulation stands the generation of a suitable starting structure which includes the working steps of aligning membrane and protein and seamlessly accommodating the protein in the membrane. Here we introduce two efficient and complementary methods based on pre-equilibrated membrane patches, automating these steps. Using a voxel-based cast of the coarse-grained protein, LAMBADA computes a hydrophilicity profile-derived scoring function based on which the optimal rotation and translation operations are determined to align protein and membrane. Employing an entirely geometrical approach, LAMBADA is independent from any precalculated data and aligns even large membrane proteins within minutes on a regular workstation. LAMBADA is the first tool performing the entire alignment process automatically while providing the user with the explicit 3D coordinates of the aligned protein and membrane. The second tool is an extension of the InflateGRO method addressing the shortcomings of its predecessor in a fully automated workflow. Determining the exact number of overlapping lipids based on the area occupied by the protein and restricting expansion, compression and energy minimization steps to a subset of relevant lipids through automatically calculated and system-optimized operation parameters, InflateGRO2 yields optimal lipid packing and reduces lipid vacuum exposure to a minimum preserving as much of the equilibrated membrane structure as possible. Applicable to atomistic and coarse grain structures in MARTINI format, InflateGRO2 offers high accuracy, fast performance, and increased application flexibility permitting the easy preparation of systems exhibiting heterogeneous lipid composition as well as embedding proteins into multiple membranes. 
Both tools can be used separately, in combination with other methods, or in tandem permitting a fully automated workflow while retaining a maximum level of usage control and flexibility. To assess the performance of both methods, we carried out test runs using 22 membrane proteins of different size and transmembrane structure.
Defining and predicting structurally conserved regions in protein superfamilies

PubMed Central

Huang, Ivan K.; Grishin, Nick V.

2013-01-01

Motivation: The structures of homologous proteins are generally better conserved than their sequences. This phenomenon is demonstrated by the prevalence of structurally conserved regions (SCRs) even in highly divergent protein families. Defining SCRs requires the comparison of two or more homologous structures and is affected by their availability and divergence, and our ability to deduce structurally equivalent positions among them. In the absence of multiple homologous structures, it is necessary to predict SCRs of a protein using information from only a set of homologous sequences and (if available) a single structure. Accurate SCR predictions can benefit homology modelling and sequence alignment. Results: Using pairwise DaliLite alignments among a set of homologous structures, we devised a simple measure of structural conservation, termed structural conservation index (SCI). SCI was used to distinguish SCRs from non-SCRs. A database of SCRs was compiled from 386 SCOP superfamilies containing 6489 protein domains. Artificial neural networks were then trained to predict SCRs with various features deduced from a single structure and homologous sequences. Assessment of the predictions via a 5-fold cross-validation method revealed that predictions based on features derived from a single structure perform similarly to ones based on homologous sequences, while combining sequence and structural features was optimal in terms of accuracy (0.755) and Matthews correlation coefficient (0.476). These results suggest that even without information from multiple structures, it is still possible to effectively predict SCRs for a protein. Finally, inspection of the structures with the worst predictions pinpoints difficulties in SCR definitions. Availability: The SCR database and the prediction server can be found at http://prodata.swmed.edu/SCR. 
Contact: 91huangi@gmail.com or grishin@chop.swmed.edu. Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23193223
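The accuracy and Matthews correlation coefficient quoted in this abstract are standard binary-classification measures. As a minimal sketch, assuming each residue is labelled SCR or non-SCR and the predictions are summarised as confusion-matrix counts, they could be computed as below. The function names and the counts are illustrative assumptions, not values or code from the paper.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of residues whose SCR / non-SCR label is predicted correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when any marginal total is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts, for illustration only (not from the paper).
tp, tn, fp, fn = 50, 30, 10, 10
print(round(accuracy(tp, tn, fp, fn), 3))
print(round(matthews_cc(tp, tn, fp, fn), 3))
```

Unlike accuracy, the Matthews coefficient stays informative when SCR and non-SCR residues are unevenly represented, which may be why the abstract reports both.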
Superposition and alignment of labeled point clouds.

PubMed

Fober, Thomas; Glinca, Serghei; Klebe, Gerhard; Hüllermeier, Eyke

2011-01-01

Geometric objects are often represented approximately in terms of a finite set of points in three-dimensional Euclidean space. In this paper, we extend this representation to what we call labeled point clouds. A labeled point cloud is a finite set of points, where each point is not only associated with a position in three-dimensional space, but also with a discrete class label that represents a specific property. This type of model is especially suitable for modeling biomolecules such as proteins and protein binding sites, where a label may represent an atom type or a physico-chemical property. Proceeding from this representation, we address the question of how to compare two labeled point clouds in terms of their similarity. Using fuzzy modeling techniques, we develop a suitable similarity measure as well as an efficient evolutionary algorithm to compute it. Moreover, we consider the problem of establishing an alignment of the structures in the sense of a one-to-one correspondence between their basic constituents. From a biological point of view, alignments of this kind are of great interest, since mutually corresponding molecular constituents offer important information about evolution and heredity, and can also serve as a means to explain a degree of similarity. In this paper, we therefore develop a method for computing pairwise or multiple alignments of labeled point clouds. To this end, we proceed from an optimal superposition of the corresponding point clouds and construct an alignment which is as much as possible in agreement with the neighborhood structure established by this superposition. We apply our methods to the structural analysis of protein binding sites.
Simultaneous gene finding in multiple genomes.

PubMed

König, Stefanie; Romoth, Lars W; Gerischer, Lizzy; Stanke, Mario

2016-11-15

As the tree of life is populated with sequenced genomes ever more densely, the new challenge is the accurate and consistent annotation of entire clades of genomes. We address this problem with a new approach to comparative gene finding that takes a multiple genome alignment of closely related species and simultaneously predicts the location and structure of protein-coding genes in all input genomes, thereby exploiting negative selection and sequence conservation. The model prefers potential gene structures in the different genomes that are in agreement with each other, or, if not, where the exon gains and losses are plausible given the species tree. We formulate the multi-species gene finding problem as a binary labeling problem on a graph. The resulting optimization problem is NP-hard, but can be efficiently approximated using a subgradient-based dual decomposition approach. The proposed method was tested on whole-genome alignments of 12 vertebrate and 12 Drosophila species. The accuracy was evaluated for human, mouse and Drosophila melanogaster and compared to competing methods. Results suggest that our method is well-suited for annotation of (a large number of) genomes of closely related species within a clade, in particular, when RNA-Seq data are available for many of the genomes. The transfer of existing annotations from one genome to another via the genome alignment is more accurate than previous approaches that are based on protein-spliced alignments, when the genomes are at close to medium distances. The method is implemented in C++ as part of Augustus and available open source at http://bioinf.uni-greifswald.de/augustus/. Contact: stefaniekoenig@ymail.com or mario.stanke@uni-greifswald.de. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
COACH: profile-profile alignment of protein families using hidden Markov models.

PubMed

Edgar, Robert C; Sjölander, Kimmen

2004-05-22

Alignments of two multiple-sequence alignments, or statistical models of such alignments (profiles), have important applications in computational biology. The increased amount of information in a profile versus a single sequence can lead to more accurate alignments and more sensitive homolog detection in database searches. Several profile-profile alignment methods have been proposed and have been shown to improve sensitivity and alignment quality compared with sequence-sequence methods (such as BLAST) and profile-sequence methods (e.g. PSI-BLAST). Here we present a new approach to profile-profile alignment we call Comparison of Alignments by Constructing Hidden Markov Models (HMMs) (COACH). COACH aligns two multiple sequence alignments by constructing a profile HMM from one alignment and aligning the other to that HMM. We compare the alignment accuracy of COACH with two recently published methods: Yona and Levitt's prof_sim and Sadreyev and Grishin's COMPASS. On two sets of reference alignments selected from the FSSP database, we find that COACH is able, on average, to produce alignments giving the best coverage or the fewest errors, depending on the chosen parameter settings. COACH is freely available from www.drive5.com/lobster

Using structure to explore the sequence alignment space of remote homologs.

PubMed

Kuziemko, Andrew; Honig, Barry; Petrey, Donald

2011-10-01

Protein structure modeling by homology requires an accurate sequence alignment between the query protein and its structural template. However, sequence alignment methods based on dynamic programming (DP) are typically unable to generate accurate alignments for remote sequence homologs, thus limiting the applicability of modeling methods. A central problem is that the alignment that is "optimal" in terms of the DP score does not necessarily correspond to the alignment that produces the most accurate structural model. That is, the correct alignment based on structural superposition will generally have a lower score than the optimal alignment obtained from sequence. Variations of the DP algorithm have been developed that generate alternative alignments that are "suboptimal" in terms of the DP score, but these still encounter difficulties in detecting the correct structural alignment. We present here a new alternative sequence alignment method that relies heavily on the structure of the template. By initially aligning the query sequence to individual fragments in secondary structure elements and combining high-scoring fragments that pass basic tests for "modelability", we can generate accurate alignments within a small ensemble. Our results suggest that the set of sequences that can currently be modeled by homology can be greatly extended.
High-speed multiple sequence alignment on a reconfigurable platform.

PubMed

Oliver, Tim; Schmidt, Bertil; Maskell, Douglas; Nathan, Darran; Clemens, Ralf

2006-01-01

Progressive alignment is a widely used approach to compute multiple sequence alignments (MSAs). However, aligning several hundred sequences by popular progressive alignment tools requires hours on sequential computers. Due to the rapid growth of sequence databases biologists have to compute MSAs in a far shorter time. In this paper we present a new approach to MSA on reconfigurable hardware platforms to gain high performance at low cost. We have constructed a linear systolic array to perform pairwise sequence distance computations using dynamic programming. This results in an implementation with significant runtime savings on a standard FPGA.
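The abstract does not give the scoring scheme or the hardware mapping, so the following is only a sketch under stated assumptions: a unit-cost edit distance stands in for the pairwise distance, and the function name is hypothetical. It fills the dynamic programming matrix one anti-diagonal at a time, which is the dependency pattern a linear systolic array can exploit, because every cell on the same anti-diagonal depends only on earlier anti-diagonals.

```python
def edit_distance_wavefront(a, b):
    """Unit-cost edit distance, computed anti-diagonal by anti-diagonal."""
    n, m = len(a), len(b)
    # d[i][j] holds the distance between the prefixes a[:i] and b[:j].
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    # Cells with the same i + j = k are mutually independent, so in hardware
    # each could be handled by its own processing element in the same step.
    for k in range(2, n + m + 1):
        for i in range(max(1, k - m), min(n, k - 1) + 1):
            j = k - i
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost)  # match or substitution
    return d[n][m]


print(edit_distance_wavefront("kitten", "sitting"))
```

In software the inner loop still runs sequentially; the point of the ordering is only to make the available parallelism visible.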
PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences

PubMed Central

Mirarab, Siavash; Nguyen, Nam; Guo, Sheng; Wang, Li-San; Kim, Junhyong

2015-01-01

We introduce PASTA, a new multiple sequence alignment algorithm. PASTA uses a new technique to produce an alignment given a guide tree that enables it to be both highly scalable and very accurate. We present a study on biological and simulated data with up to 200,000 sequences, showing that PASTA produces highly accurate alignments, improving on the accuracy and scalability of the leading alignment methods (including SATé). We also show that trees estimated on PASTA alignments are highly accurate: slightly better than SATé trees, and substantially better than those of other methods. Finally, PASTA is faster than SATé, highly parallelizable, and requires relatively little memory. PMID:25549288
Fast alignment-free sequence comparison using spaced-word frequencies.

PubMed

Leimeister, Chris-Andre; Boden, Marcus; Horwege, Sebastian; Lindner, Sebastian; Morgenstern, Burkhard

2014-07-15

Alignment-free methods for sequence comparison are increasingly used for genome analysis and phylogeny reconstruction; they circumvent various difficulties of traditional alignment-based approaches. In particular, alignment-free methods are much faster than pairwise or multiple alignments. They are, however, less accurate than methods based on sequence alignment. Most alignment-free approaches work by comparing the word composition of sequences. A well-known problem with these methods is that neighbouring word matches are far from independent. To reduce the statistical dependency between adjacent word matches, we propose to use 'spaced words', defined by patterns of 'match' and 'don't care' positions, for alignment-free sequence comparison. We describe a fast implementation of this approach using recursive hashing and bit operations, and we show that further improvements can be achieved by using multiple patterns instead of single patterns. To evaluate our approach, we use spaced-word frequencies as a basis for fast phylogeny reconstruction. Using real-world and simulated sequence data, we demonstrate that our multiple-pattern approach produces better phylogenies than approaches relying on contiguous words. Our program is freely available at http://spaced.gobics.de/. © The Author 2014. Published by Oxford University Press.
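To make the idea of spaced words concrete, here is a minimal sketch. It assumes a pattern is written as a string of '1' (match) and '0' (don't care) characters; the function names are hypothetical, and the code uses plain string slicing rather than the recursive hashing and bit operations mentioned in the abstract.

```python
from collections import Counter


def spaced_words(seq, pattern):
    """Yield the spaced word at each position of seq.

    Characters under '1' positions of the pattern are kept;
    characters under '0' (don't care) positions are skipped.
    """
    keep = [i for i, c in enumerate(pattern) if c == "1"]
    for start in range(len(seq) - len(pattern) + 1):
        yield "".join(seq[start + i] for i in keep)


def spaced_word_counts(seq, patterns):
    """Count spaced-word occurrences for one or several patterns."""
    counts = Counter()
    for pattern in patterns:
        for word in spaced_words(seq, pattern):
            counts[(pattern, word)] += 1
    return counts


print(list(spaced_words("ACGTAC", "101")))
print(spaced_word_counts("ACGTAC", ["101", "1001"]))
```

Two sequences could then be compared by any distance between their count vectors; which distance the authors use is not stated in the abstract.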
Prediction of protein secondary structure content for the twilight zone sequences.

PubMed

Homaeian, Leila; Kurgan, Lukasz A; Ruan, Jishou; Cios, Krzysztof J; Chen, Ke

2007-11-15

Secondary protein structure carries information about local structural arrangements, which include three major conformations: alpha-helices, beta-strands, and coils. A significant majority of successful methods for prediction of the secondary structure are based on multiple sequence alignment. However, multiple alignment fails to provide accurate results when a sequence comes from the twilight zone, that is, it is characterized by low (<30%) homology. To this end, we propose a novel method for prediction of secondary structure content through comprehensive sequence representation, called PSSC-core. The method uses a multiple linear regression model and introduces a comprehensive feature-based sequence representation to predict the amount of helices and strands for sequences from the twilight zone. The PSSC-core method was tested and compared with two other state-of-the-art prediction methods on a set of 2187 twilight zone sequences. The results indicate that our method provides better predictions for both helix and strand content. The PSSC-core is shown to provide statistically significantly better results when compared with the competing methods, reducing the prediction error by 5-7% for helix and 7-9% for strand content predictions. The proposed feature-based sequence representation uses a comprehensive set of physicochemical properties that are custom-designed for each of the helix and strand content predictions. It includes composition and composition moment vectors, frequency of tetra-peptides associated with helical and strand conformations, various property-based groups like exchange groups, chemical groups of the side chains and hydrophobic group, auto-correlations based on hydrophobicity, side-chain masses, hydropathy, and conformational patterns for beta-sheets. The PSSC-core method provides an alternative for predicting the secondary structure content that can be used to validate and constrain results of other structure prediction methods.
At the same time, it also provides useful insight into design of successful protein sequence representations that can be used in developing new methods related to prediction of different aspects of the secondary protein structure. (c) 2007 Wiley-Liss, Inc.
New nurse transition: success through aligning multiple identities.

PubMed

Leong, Yee Mun Jessica; Crossman, Joanna

2015-01-01

The purpose of this paper is to explore the perceptions of new nurses in Singapore of their experiences of role transition and to examine the implications for managers in terms of employee training, development and retention. This qualitative study was conducted using a constructivist grounded theory approach. In total 26 novice nurses and five preceptors (n=31) from five different hospitals participated in the study. Data were collected from semi-structured interviews and reflective journal entries and analysed using the constant comparative method. The findings revealed that novice nurses remained emotionally and physically challenged when experiencing role transition. Two major constructs appear to play an important part in the transition process; learning how to Fit in and aligning personal with professional and organisational identities. The findings highlight factors that facilitate or impede Fitting in and aligning these identities. Although the concept of Fitting in and its relation to the attrition of novice nurses has been explored in global studies, that relationship has not yet been theorised as the dynamic alignment of multiple identities. Also, whilst most research around Fitting in, identity and retention has been conducted in western countries, little is known about these issues and their interrelationship in the context of Singapore. The study should inform decision making by healthcare organisations, nurse managers and nursing training institutions with respect to improving the transition experience of novice nurses.
KinView: A visual comparative sequence analysis tool for integrated kinome research

PubMed Central

McSkimming, Daniel Ian; Dastgheib, Shima; Baffi, Timothy R.; Byrne, Dominic P.; Ferries, Samantha; Scott, Steven Thomas; Newton, Alexandra C.; Eyers, Claire E.; Kochut, Krzysztof J.; Eyers, Patrick A.

2017-01-01

Multiple sequence alignments (MSAs) are a fundamental analysis tool used throughout biology to investigate relationships between protein sequence, structure, function, evolutionary history, and patterns of disease-associated variants. However, their widespread application in systems biology research is currently hindered by the lack of user-friendly tools to simultaneously visualize, manipulate and query the information conceptualized in large sequence alignments, and the challenges in integrating MSAs with multiple orthogonal data such as cancer variants and post-translational modifications, which are often stored in heterogeneous data sources and formats. Here, we present the Multiple Sequence Alignment Ontology (MSAOnt), which represents a profile or consensus alignment in an ontological format. Subsets of the alignment are easily selected through the SPARQL Protocol and RDF Query Language for downstream statistical analysis or visualization. We have also created the Kinome Viewer (KinView), an interactive integrative visualization that places eukaryotic protein kinase cancer variants in the context of natural sequence variation and experimentally determined post-translational modifications, which play central roles in the regulation of cellular signaling pathways. Using KinView, we identified differential phosphorylation patterns between tyrosine and serine/threonine kinases in the activation segment, a major kinase regulatory region that is often mutated in proliferative diseases. We discuss cancer variants that disrupt phosphorylation sites in the activation segment, and show how KinView can be used as a comparative tool to identify differences and similarities in natural variation, cancer variants and post-translational modifications between kinase groups, families and subfamilies. Based on KinView comparisons, we identify and experimentally characterize a regulatory tyrosine (Y177 of PLK4) in the PLK4 C-terminal activation segment region termed the P+1 loop.
To further demonstrate the application of KinView in hypothesis generation and testing, we formulate and validate a hypothesis explaining a novel predicted loss-of-function variant (D523N in PKCβ) in the regulatory spine of PKCβ, a recently identified tumor suppressor kinase. KinView provides a novel, extensible interface for performing comparative analyses between subsets of kinases and for integrating multiple types of residue-specific annotations in user-friendly formats. PMID:27731453
A dynamic programming approach for the alignment of signal peaks in multiple gas chromatography-mass spectrometry experiments.

PubMed

Robinson, Mark D; De Souza, David P; Keen, Woon Wai; Saunders, Eleanor C; McConville, Malcolm J; Speed, Terence P; Likić, Vladimir A

2007-10-29

Gas chromatography-mass spectrometry (GC-MS) is a robust platform for the profiling of certain classes of small molecules in biological samples. When multiple samples are profiled, including replicates of the same sample and/or different sample states, one needs to account for retention time drifts between experiments. This can be achieved either by the alignment of chromatographic profiles prior to peak detection, or by matching signal peaks after they have been extracted from chromatogram data matrices. Automated retention time correction is particularly important in non-targeted profiling studies. A new approach for matching signal peaks based on dynamic programming is presented. The proposed approach relies on both peak retention times and mass spectra. The alignment of more than two peak lists involves three steps: (1) all possible pairs of peak lists are aligned, and similarity of each pair of peak lists is estimated; (2) the guide tree is built based on the similarity between the peak lists; (3) peak lists are progressively aligned starting with the two most similar peak lists, following the guide tree until all peak lists are exhausted. When two or more experiments are performed on different sample states and each consisting of multiple replicates, peak lists within each set of replicate experiments are aligned first (within-state alignment), and subsequently the resulting alignments are aligned themselves (between-state alignment). When more than two sets of replicate experiments are present, the between-state alignment also employs the guide tree. We demonstrate the usefulness of this approach on GC-MS metabolic profiling experiments acquired on wild-type and mutant Leishmania mexicana parasites. We propose a progressive method to match signal peaks across multiple GC-MS experiments based on dynamic programming. A sensitive peak similarity function is proposed to balance peak retention time and peak mass spectra similarities. 
This approach can produce the optimal alignment between an arbitrary number of peak lists, and models explicitly within-state and between-state peak alignment. The accuracy of the proposed method was close to the accuracy of manually-curated peak matching, which required tens of man-hours for the analyzed data sets. The proposed approach may offer significant advantages for processing of high-throughput metabolomics data, especially when large numbers of experimental replicates and multiple sample states are analyzed.
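The pairwise step that the progressive scheme builds on can be sketched as a global dynamic programming alignment of two peak lists. Everything specific below is an assumption made for illustration: peaks are `(retention_time, spectrum)` tuples, similarity is the cosine of the two spectra multiplied by a Gaussian weight on the retention-time difference, and `rt_sigma` and `gap` are hypothetical parameters. The paper's actual similarity function and gap handling are not given in the abstract.

```python
import math


def peak_similarity(p, q, rt_sigma=2.0):
    """Spectral cosine similarity, down-weighted by retention-time difference."""
    (rt_p, spec_p), (rt_q, spec_q) = p, q
    dot = sum(x * y for x, y in zip(spec_p, spec_q))
    norm = (math.sqrt(sum(x * x for x in spec_p))
            * math.sqrt(sum(y * y for y in spec_q)))
    cosine = dot / norm if norm else 0.0
    return cosine * math.exp(-((rt_p - rt_q) ** 2) / (2 * rt_sigma ** 2))


def align_peaks(a, b, gap=0.3):
    """Minimum-cost global alignment of two peak lists; returns matched index pairs."""
    n, m = len(a), len(b)
    # dist[i][j] is the cost of matching peak a[i] with peak b[j].
    dist = [[1.0 - peak_similarity(p, q) for q in b] for p in a]
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = cost[i - 1][0] + gap
    for j in range(1, m + 1):
        cost[0][j] = cost[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(cost[i - 1][j - 1] + dist[i - 1][j - 1],
                             cost[i - 1][j] + gap,
                             cost[i][j - 1] + gap)
    # Trace back from the last cell, recording matched peaks.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        if cost[i][j] == cost[i - 1][j - 1] + dist[i - 1][j - 1]:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif cost[i][j] == cost[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return pairs[::-1]


# Toy peak lists: the middle peak of run_a has no counterpart in run_b.
run_a = [(10.0, [1, 0, 0]), (20.0, [0, 1, 0]), (30.0, [0, 0, 1])]
run_b = [(10.5, [1, 0, 0]), (30.2, [0, 0, 1])]
print(align_peaks(run_a, run_b))
```

With a gap cost below 0.5, leaving two dissimilar peaks unmatched is cheaper than pairing them. In a progressive scheme, the pairwise costs would also feed the guide tree that determines the order in which peak lists are merged.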
Amino acid sequence analysis of the annexin super-gene family of proteins.

PubMed

Barton, G J; Newman, R H; Freemont, P S; Crumpton, M J

1991-06-15

The annexins are a widespread family of calcium-dependent membrane-binding proteins. No common function has been identified for the family and, until recently, no crystallographic data existed for an annexin. In this paper we draw together 22 available annexin sequences consisting of 88 similar repeat units, and apply the techniques of multiple sequence alignment, pattern matching, secondary structure prediction and conservation analysis to the characterisation of the molecules. The analysis clearly shows that the repeats cluster into four distinct families and that greatest variation occurs within the repeat 3 units. Multiple alignment of the 88 repeats shows amino acids with conserved physicochemical properties at 22 positions, with only Gly at position 23 being absolutely conserved in all repeats. Secondary structure prediction techniques identify five conserved helices in each repeat unit and patterns of conserved hydrophobic amino acids are consistent with one face of a helix packing against the protein core in predicted helices a, c, d, e. Helix b is generally hydrophobic in all repeats, but contains a striking pattern of repeat-specific residue conservation at position 31, with Arg in repeats 4 and Glu in repeats 2, but unconserved amino acids in repeats 1 and 3. This suggests repeats 2 and 4 may interact via a buried saltbridge. The loop between predicted helices a and b of repeat 3 shows features distinct from the equivalent loop in repeats 1, 2 and 4, suggesting an important structural and/or functional role for this region. No compelling evidence emerges from this study for uteroglobin and the annexins sharing similar tertiary structures, or for uteroglobin representing a derivative of a primordial one-repeat structure that underwent duplication to give the present day annexins. The analyses performed in this paper are re-evaluated in the Appendix, in the light of the recently published X-ray structure for human annexin V. 
The structure confirms most of the predictions and shows the power of techniques for the determination of tertiary structural information from the amino acid sequences of an aligned protein family.
From Principal Component to Direct Coupling Analysis of Coevolution in Proteins: Low-Eigenvalue Modes are Needed for Structure Prediction

PubMed Central

Cocco, Simona; Monasson, Remi; Weigt, Martin

2013-01-01

Various approaches have explored the covariation of residues in multiple-sequence alignments of homologous proteins to extract functional and structural information. Among those are principal component analysis (PCA), which identifies the most correlated groups of residues, and direct coupling analysis (DCA), a global inference method based on the maximum entropy principle, which aims at predicting residue-residue contacts. In this paper, inspired by the statistical physics of disordered systems, we introduce the Hopfield-Potts model to naturally interpolate between these two approaches. The Hopfield-Potts model allows us to identify relevant ‘patterns’ of residues from the knowledge of the eigenmodes and eigenvalues of the residue-residue correlation matrix. We show how the computation of such statistical patterns makes it possible to accurately predict residue-residue contacts with a much smaller number of parameters than DCA. This dimensional reduction allows us to avoid overfitting and to extract contact information from multiple-sequence alignments of reduced size. In addition, we show that low-eigenvalue correlation modes, discarded by PCA, are important to recover structural information: the corresponding patterns are highly localized, that is, they are concentrated in few sites, which we find to be in close contact in the three-dimensional protein fold. PMID:23990764
Ion Acceleration by Multiple Reflections at Martian Bow Shock

NASA Astrophysics Data System (ADS)

Yamauchi, M.; Futaana, Y.; Fedorov, A.; Frahm, R. A.; Dubinin, E.; Lundin, R.; Sauvaud, J.-A.; Winningham, J. D.; Barabash, S.; Holmström, H.

2012-04-01

The ion mass analyzer (IMA) on board Mars Express revealed bundled structures of ions in the energy domain within a distance of a proton gyroradius from the Martian bow shock. Seven prominent traversals during 2005 were examined when the energy-bunched structure was observed together with pick-up ions of exospheric origin, the latter of which is used to determine the local magnetic field orientation from its circular trajectory in velocity space. These seven traversals include different bow shock configurations: (a) quasi-perpendicular shock with its specular direction of the solar wind more perpendicular to the magnetic field (QT), (b) quasi-perpendicular shock with its specular reflection direction of the solar wind more along the magnetic field (FS), and (c) quasi-parallel (QL) shock. In all seven cases, the velocity components of the energy-bunched structure are consistent with multiple specular reflections of the solar wind at the bow shock up to at least two reflections. The accelerated solar wind ions after two specular reflections have large parallel components with respect to the magnetic field for the QL shock whereas the field-aligned speed is much smaller than the perpendicular speed for the QT shock. The reflected ions escape into the solar wind when and only when the reflection is in the field-aligned direction.
Iterative refinement of structure-based sequence alignments by Seed Extension

PubMed Central

Kim, Changhoon; Tai, Chin-Hsien; Lee, Byungkook

2009-01-01

Background: Accurate sequence alignment is required in many bioinformatics applications but, when sequence similarity is low, it is difficult to obtain accurate alignments based on sequence similarity alone. The accuracy improves when the structures are available, but current structure-based sequence alignment procedures still mis-align substantial numbers of residues. In order to correct such errors, we previously explored the possibility of replacing the residue-based dynamic programming algorithm in structure alignment procedures with the Seed Extension algorithm, which does not use a gap penalty. Here, we describe a new procedure called RSE (Refinement with Seed Extension) that iteratively refines a structure-based sequence alignment. Results: RSE uses SE (Seed Extension) in its core, which is an algorithm that we reported recently for obtaining a sequence alignment from two superimposed structures. The RSE procedure was evaluated by comparing the correctly aligned fractions of residues before and after the refinement of the structure-based sequence alignments produced by popular programs. CE, DaliLite, FAST, LOCK2, MATRAS, MATT, TM-align, SHEBA and VAST were included in this analysis and the NCBI's CDD root node set was used as the reference alignments. RSE improved the average accuracy of sequence alignments for all programs tested when no shift error was allowed. The amount of improvement varied depending on the program. The average improvements were small for DaliLite and MATRAS but about 5% for CE and VAST. More substantial improvements have been seen in many individual cases. The additional computation times required for the refinements were negligible compared to the times taken by the structure alignment programs. Conclusion: RSE is a computationally inexpensive way of improving the accuracy of a structure-based sequence alignment.
It can be used as a standalone procedure following a regular structure-based sequence alignment or to replace the traditional iterative refinement procedures based on a residue-level dynamic programming algorithm in many structure alignment programs. PMID:19589133
Multiscale Currents Observed by MMS in the Flow Braking Region

NASA Astrophysics Data System (ADS)

Nakamura, Rumi; Varsani, Ali; Genestreti, Kevin J.; Le Contel, Olivier; Nakamura, Takuma; Baumjohann, Wolfgang; Nagai, Tsugunobu; Artemyev, Anton; Birn, Joachim; Sergeev, Victor A.; Apatenkov, Sergey; Ergun, Robert E.; Fuselier, Stephen A.; Gershman, Daniel J.; Giles, Barbara J.; Khotyaintsev, Yuri V.; Lindqvist, Per-Arne; Magnes, Werner; Mauk, Barry; Petrukovich, Anatoli; Russell, Christopher T.; Stawarz, Julia; Strangeway, Robert J.; Anderson, Brian; Burch, James L.; Bromund, Ken R.; Cohen, Ian; Fischer, David; Jaynes, Allison; Kepko, Laurence; Le, Guan; Plaschke, Ferdinand; Reeves, Geoff; Singer, Howard J.; Slavin, James A.; Torbert, Roy B.; Turner, Drew L.

2018-02-01

We present characteristics of current layers in the off-equatorial near-Earth plasma sheet boundary observed with high time-resolution measurements from the Magnetospheric Multiscale mission during an intense substorm associated with multiple dipolarizations. The four Magnetospheric Multiscale spacecraft, separated by distances of about 50 km, were located in the southern hemisphere in the dusk portion of a substorm current wedge. They observed fast flow disturbances (up to about 500 km/s), most intense in the dawn-dusk direction. Field-aligned currents were observed initially within the expanding plasma sheet, where the flow and field disturbances showed the distinct pattern expected in the braking region of localized flows. Subsequently, intense thin field-aligned current layers were detected at the inner boundary of equatorward moving flux tubes together with Earthward streaming hot ions. Intense Hall current layers were found adjacent to the field-aligned currents. In particular, we found a Hall current structure in the vicinity of the Earthward streaming ion jet that consisted of mixed ion components, that is, hot unmagnetized ions, cold E × B drifting ions, and magnetized electrons. Our observations show that both the near-Earth plasma jet diversion and the thin Hall current layers formed around the reconnection jet boundary are the sites where diversion of the perpendicular currents takes place, contributing to the observed field-aligned current pattern as predicted by simulations of reconnection jets. Hence, the multiscale structure of flow braking is preserved in the field-aligned currents in the off-equatorial plasma sheet and is also translated to the ionosphere to become a part of the substorm field-aligned current system.
Alignment between Protostellar Outflows and Filamentary Structure

NASA Astrophysics Data System (ADS)

Stephens, Ian W.; Dunham, Michael M.; Myers, Philip C.; Pokhrel, Riwaj; Sadavoy, Sarah I.; Vorobyov, Eduard I.; Tobin, John J.; Pineda, Jaime E.; Offner, Stella S. R.; Lee, Katherine I.; Kristensen, Lars E.; Jørgensen, Jes K.; Goodman, Alyssa A.; Bourke, Tyler L.; Arce, Héctor G.; Plunkett, Adele L.

2017-09-01

We present new Submillimeter Array (SMA) observations of CO(2-1) outflows toward young, embedded protostars in the Perseus molecular cloud as part of the Mass Assembly of Stellar Systems and their Evolution with the SMA (MASSES) survey. For 57 Perseus protostars, we characterize the orientation of the outflow angles and compare them with the orientation of the local filaments as derived from Herschel observations. We find that the relative angles between outflows and filaments are inconsistent with purely parallel or purely perpendicular distributions. Instead, the observed distribution of outflow-filament angles is more consistent with either randomly aligned angles or a mix of projected parallel and perpendicular angles. A mix of parallel and perpendicular angles requires perpendicular alignment to be more common by a factor of ∼3. Our results show that the observed distributions probably hold regardless of the protostar’s multiplicity, age, or the host core’s opacity. These observations indicate that the angular momentum axis of a protostar may be independent of the large-scale structure. We discuss the significance of independent protostellar rotation axes in the general picture of filament-based star formation.
QUASAR--scoring and ranking of sequence-structure alignments.

PubMed

Birzele, Fabian; Gewehr, Jan E; Zimmer, Ralf

2005-12-15

Sequence-structure alignments are a common means for protein structure prediction in the fields of fold recognition and homology modeling, and there is a broad variety of programs that provide such alignments based on sequence similarity, secondary structure or contact potentials. Nevertheless, finding the best sequence-structure alignment in a pool of alignments remains a difficult problem. QUASAR (quality of sequence-structure alignments ranking) provides a unifying framework for scoring sequence-structure alignments that aids finding well-performing combinations of well-known and custom-made scoring schemes. Those scoring functions can be benchmarked against widely accepted quality scores like MaxSub, TMScore, Touch and APDB, thus enabling users to test their own alignment scores against 'standard-of-truth' structure-based scores. Furthermore, individual score combinations can be optimized with respect to benchmark sets based on known structural relationships using QUASAR's in-built optimization routines.
SFESA: a web server for pairwise alignment refinement by secondary structure shifts.

PubMed

Tong, Jing; Pei, Jimin; Grishin, Nick V

2015-09-03

Protein sequence alignment is essential for a variety of tasks such as homology modeling and active site prediction. Alignment errors remain the main cause of low-quality structure models. A bioinformatics tool to refine alignments is needed to make protein alignments more accurate. We developed the SFESA web server to refine pairwise protein sequence alignments. Compared to the previous version of SFESA, which required a set of 3D coordinates for a protein, the new server will search a sequence database for the closest homolog with an available 3D structure to be used as a template. For each alignment block defined by secondary structure elements in the template, SFESA evaluates alignment variants generated by local shifts and selects the best-scoring alignment variant. A scoring function that combines the sequence score of profile-profile comparison and the structure score of template-derived contact energy is used for evaluation of alignments. PROMALS pairwise alignments refined by SFESA are more accurate than those produced by current advanced alignment methods such as HHpred and CNFpred. In addition, SFESA also improves alignments generated by other software. SFESA is a web-based tool for alignment refinement, designed for researchers to compute, refine, and evaluate pairwise alignments with a combined sequence and structure scoring of alignment blocks. To our knowledge, the SFESA web server is the only tool that refines alignments by evaluating local shifts of secondary structure elements. The SFESA web server is available at http://prodata.swmed.edu/sfesa.
Mechanical design of multiple zone plates precision alignment apparatus for hard X-ray focusing in twenty-nanometer scale

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shu, Deming; Liu, Jie; Gleber, Sophie C.

An enhanced mechanical design of multiple zone plates precision alignment apparatus for hard x-ray focusing in a twenty-nanometer scale is provided. The precision alignment apparatus includes a zone plate alignment base frame; a plurality of zone plates; and a plurality of zone plate holders, each said zone plate holder for mounting and aligning a respective zone plate for hard x-ray focusing. At least one respective positioning stage drives and positions each respective zone plate holder. Each respective positioning stage is mounted on the zone plate alignment base frame. A respective linkage component connects each respective positioning stage and the respective zone plate holder. The zone plate alignment base frame, each zone plate holder and each linkage component is formed of a selected material for providing thermal expansion stability and positioning stability for the precision alignment apparatus.
Implementation of a parallel protein structure alignment service on cloud.

PubMed

Hung, Che-Lun; Lin, Yaw-Ling

2013-01-01

Protein structure alignment has become an important strategy by which to identify evolutionary relationships between protein sequences. Several alignment tools are currently available for online comparison of protein structures. In this paper, we propose a parallel protein structure alignment service based on the Hadoop distribution framework. This service includes a protein structure alignment algorithm, a refinement algorithm, and a MapReduce programming model. The refinement algorithm refines the result of alignment. To process vast numbers of protein structures in parallel, the alignment and refinement algorithms are implemented using MapReduce. We analyzed and compared the structure alignments produced by different methods using a dataset randomly selected from the PDB database. The experimental results verify that the proposed algorithm refines the resulting alignments more accurately than existing algorithms. Meanwhile, the computational performance of the proposed service is proportional to the number of processors used in our cloud platform.
Implementation of a Parallel Protein Structure Alignment Service on Cloud

PubMed Central

Hung, Che-Lun; Lin, Yaw-Ling

2013-01-01

Protein structure alignment has become an important strategy by which to identify evolutionary relationships between protein sequences. Several alignment tools are currently available for online comparison of protein structures. In this paper, we propose a parallel protein structure alignment service based on the Hadoop distribution framework. This service includes a protein structure alignment algorithm, a refinement algorithm, and a MapReduce programming model. The refinement algorithm refines the result of alignment. To process vast numbers of protein structures in parallel, the alignment and refinement algorithms are implemented using MapReduce. We analyzed and compared the structure alignments produced by different methods using a dataset randomly selected from the PDB database. The experimental results verify that the proposed algorithm refines the resulting alignments more accurately than existing algorithms. Meanwhile, the computational performance of the proposed service is proportional to the number of processors used in our cloud platform. PMID:23671842
Phylogenetic study of Class Armophorea (Alveolata, Ciliophora) based on 18S-rDNA data.

PubMed

da Silva Paiva, Thiago; do Nascimento Borges, Bárbara; da Silva-Neto, Inácio Domingos

2013-12-01

The 18S rDNA phylogeny of Class Armophorea, a group of anaerobic ciliates, is proposed based on an analysis of 44 sequences (out of 195) retrieved from the NCBI/GenBank database. Emphasis was placed on the use of two nucleotide alignment criteria that involved variation in the gap-opening and gap-extension parameters and the use of rRNA secondary structure to orientate multiple-alignment. A sensitivity analysis of 76 data sets was run to assess the effect of variations in indel parameters on tree topologies. Bayesian inference, maximum likelihood and maximum parsimony phylogenetic analyses were used to explore how different analytic frameworks influenced the resulting hypotheses. The sensitivity analysis revealed that the relationships among higher taxa of the Intramacronucleata were dependent upon how indels were determined during multiple-alignment of nucleotides. The phylogenetic analyses rejected the monophyly of the Armophorea most of the time and consistently indicated that the Metopidae and Nyctotheridae were related to the Litostomatea. There was no consensus on the placement of the Caenomorphidae, which could be a sister group of the Metopidae + Nyctotheridae, or could have diverged at the base of the Spirotrichea branch or the Intramacronucleata tree.
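The gap-opening and gap-extension parameters mentioned above control how indels are penalized when an alignment is scored. A minimal sketch of affine gap scoring for one aligned pair of rows is given here. The function name and the numeric score values are illustrative assumptions, not the settings used in the study.

```python
def score_alignment(row_a, row_b, match=1, mismatch=-1, gap_open=-5, gap_extend=-1):
    """Score two aligned rows with an affine gap penalty.

    The first column of a gap run costs gap_open; each further column of the
    same run costs gap_extend. All values here are hypothetical defaults.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    score = 0
    gap_in = None  # which row ('a' or 'b') currently has an open gap run
    for x, y in zip(row_a, row_b):
        if x == "-" and y == "-":
            continue  # a column of gaps only carries no information
        if x == "-" or y == "-":
            side = "a" if x == "-" else "b"
            score += gap_extend if gap_in == side else gap_open
            gap_in = side
        else:
            score += match if x == y else mismatch
            gap_in = None
    return score
```

With these example values, one two-column gap scores better than two separate one-column gaps, so the aligner prefers fewer, longer indels. Changing the ratio of gap_open to gap_extend shifts that preference, which is the kind of sensitivity the abstract describes.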

Phylogenetic study of Class Armophorea (Alveolata, Ciliophora) based on 18S-rDNA data

PubMed Central

da Silva Paiva, Thiago; do Nascimento Borges, Bárbara; da Silva-Neto, Inácio Domingos

2013-01-01

The 18S rDNA phylogeny of Class Armophorea, a group of anaerobic ciliates, is proposed based on an analysis of 44 sequences (out of 195) retrieved from the NCBI/GenBank database. Emphasis was placed on the use of two nucleotide alignment criteria that involved variation in the gap-opening and gap-extension parameters and the use of rRNA secondary structure to orientate multiple-alignment. A sensitivity analysis of 76 data sets was run to assess the effect of variations in indel parameters on tree topologies. Bayesian inference, maximum likelihood and maximum parsimony phylogenetic analyses were used to explore how different analytic frameworks influenced the resulting hypotheses. The sensitivity analysis revealed that the relationships among higher taxa of the Intramacronucleata were dependent upon how indels were determined during multiple-alignment of nucleotides. The phylogenetic analyses rejected the monophyly of the Armophorea most of the time and consistently indicated that the Metopidae and Nyctotheridae were related to the Litostomatea. There was no consensus on the placement of the Caenomorphidae, which could be a sister group of the Metopidae + Nyctotheridae, or could have diverged at the base of the Spirotrichea branch or the Intramacronucleata tree. PMID:24385862
Mechanical properties of electrospun bilayer fibrous membranes as potential scaffolds for tissue engineering.

PubMed

Pu, Juan; Komvopoulos, Kyriakos

2014-06-01

Bilayer fibrous membranes of poly(l-lactic acid) (PLLA) were fabricated by electrospinning, using a parallel-disk mandrel configuration that resulted in the sequential deposition of a layer with fibers aligned across the two parallel disks and a layer with randomly oriented fibers, both layers deposited in a single process step. Membrane structure and fiber alignment were characterized by scanning electron microscopy and two-dimensional fast Fourier transform. Because of the intricacies of the generated electric field, bilayer membranes exhibited higher porosity than single-layer membranes consisting of randomly oriented fibers fabricated with a solid-drum collector. However, despite their higher porosity, bilayer membranes demonstrated generally higher elastic modulus, yield strength and toughness than single-layer membranes with random fibers. Bilayer membrane deformation at relatively high strain rates comprised multiple abrupt microfracture events characterized by discontinuous fiber breakage. Bilayer membrane elongation yielded excessive necking of the layer with random fibers and remarkable fiber stretching (on the order of 400%) in the layer with fibers aligned in the stress direction. In addition, fibers in both layers exhibited multiple localized necking, attributed to the nonuniform distribution of crystalline phases in the fibrillar structure. The high membrane porosity, good mechanical properties, and good biocompatibility and biodegradability of PLLA (demonstrated in previous studies) make the present bilayer membranes good scaffold candidates for a wide range of tissue engineering applications. Copyright © 2014 Acta Materialia Inc. Published by Elsevier Ltd. All rights reserved.
Phylo: A Citizen Science Approach for Improving Multiple Sequence Alignment

PubMed Central

Kam, Alfred; Kwak, Daniel; Leung, Clarence; Wu, Chu; Zarour, Eleyine; Sarmenta, Luis; Blanchette, Mathieu; Waldispühl, Jérôme

2012-01-01

Background Comparative genomics, or the study of the relationships of genome structure and function across different species, offers a powerful tool for studying evolution, annotating genomes, and understanding the causes of various genetic disorders. However, aligning multiple sequences of DNA, an essential intermediate step for most types of analyses, is a difficult computational task. In parallel, citizen science, an approach that takes advantage of the fact that the human brain is exquisitely tuned to solving specific types of problems, is becoming increasingly popular. There, instances of hard computational problems are dispatched to a crowd of non-expert human game players and solutions are sent back to a central server. Methodology/Principal Findings We introduce Phylo, a human-based computing framework applying “crowd sourcing” techniques to solve the Multiple Sequence Alignment (MSA) problem. The key idea of Phylo is to convert the MSA problem into a casual game that can be played by ordinary web users with minimal prior knowledge of the biological context. We applied this strategy to improve the alignment of the promoters of disease-related genes from up to 44 vertebrate species. Since the launch in November 2010, we received more than 350,000 solutions submitted from more than 12,000 registered users. Our results show that solutions submitted contributed to improving the accuracy of up to 70% of the alignment blocks considered. Conclusions/Significance We demonstrate that, combined with classical algorithms, crowd computing techniques can be successfully used to help improve the accuracy of MSA. More importantly, we show that an NP-hard computational problem can be embedded in a casual game that can be easily played by people without significant scientific training. This suggests that citizen science approaches can be used to exploit the billions of “human-brain peta-flops” of computation that are spent every day playing games. Phylo is available at: http://phylo.cs.mcgill.ca. PMID:22412834
Fourier transform power spectrum is a potential measure of tissue alignment in standard MRI: A multiple sclerosis study.

PubMed

Sharma, Shrushrita; Zhang, Yunyan

2017-01-01

Loss of tissue coherency in brain white matter is found in many neurological diseases such as multiple sclerosis (MS). While several approaches have been proposed to evaluate white matter coherency including fractional anisotropy and fiber tracking in diffusion-weighted imaging, few are available for standard magnetic resonance imaging (MRI). Here we present an image post-processing method for this purpose based on Fourier transform (FT) power spectrum. T2-weighted images were collected from 19 patients (10 relapsing-remitting and 9 secondary progressive MS) and 19 age- and gender-matched controls. Image processing steps included: computation, normalization, and thresholding of FT power spectrum; determination of tissue alignment profile and dominant alignment direction; and calculation of alignment complexity using a new measure named angular entropy. To test the validity of this method, we used a highly organized brain white matter structure, corpus callosum. Six regions of interest were examined from the left, central and right aspects of both genu and splenium. We found that the dominant orientation of each ROI derived from our method was significantly correlated with the predicted directions based on anatomy. There was greater angular entropy in patients than controls, and a trend to be greater in secondary progressive MS patients. These findings suggest that it is possible to detect tissue alignment and anisotropy using traditional MRI, which are routinely acquired in clinical practice. Analysis of FT power spectrum may become a new approach for advancing the evaluation and management of patients with MS and similar disorders. Further confirmation is warranted.
Hydra multiple head star sensor and its in-flight self-calibration of optical heads alignment

NASA Astrophysics Data System (ADS)

Majewski, L.; Blarre, L.; Perrimon, N.; Kocher, Y.; Martinez, P. E.; Dussy, S.

2017-11-01

HYDRA is EADS SODERN's new product line of APS-based autonomous star trackers. The baseline is a multiple-head sensor made of three separate optical heads and one electronic unit. The chosen concept offers more than three single-head star trackers working independently. Since HYDRA merges all fields of view, the result is a more accurate, more robust and completely autonomous multiple-head sensor, releasing the AOCS from the need to manage the outputs of independent single-head star trackers. Specific to the multiple-head architecture and the underlying data fusion is the calibration of the relative alignments between the sensor optical heads. The performance of the sensor is related to its estimation of such alignments. The HYDRA design is first reviewed in this paper, along with the simplification it can bring at system level (AOCS). Then self-calibration of optical head alignment is highlighted through descriptions and simulation results, thus demonstrating the performance of a key part of the HYDRA multiple-head concept.
R3D Align web server for global nucleotide to nucleotide alignments of RNA 3D structures.

PubMed

Rahrig, Ryan R; Petrov, Anton I; Leontis, Neocles B; Zirbel, Craig L

2013-07-01

The R3D Align web server provides online access to 'RNA 3D Align' (R3D Align), a method for producing accurate nucleotide-level structural alignments of RNA 3D structures. The web server provides a streamlined and intuitive interface, input data validation and output that is more extensive and easier to read and interpret than related servers. The R3D Align web server offers a unique Gallery of Featured Alignments, providing immediate access to pre-computed alignments of large RNA 3D structures, including all ribosomal RNAs, as well as guidance on effective use of the server and interpretation of the output. By accessing the non-redundant lists of RNA 3D structures provided by the Bowling Green State University RNA group, R3D Align connects users to structure files in the same equivalence class and the best-modeled representative structure from each group. The R3D Align web server is freely accessible at http://rna.bgsu.edu/r3dalign/.
Fast and accurate non-sequential protein structure alignment using a new asymmetric linear sum assignment heuristic.

PubMed

Brown, Peter; Pullan, Wayne; Yang, Yuedong; Zhou, Yaoqi

2016-02-01

The three-dimensional tertiary structure of a protein at near-atomic resolution provides insight into its function and evolution. As protein structure decides its functionality, similarity in structure usually implies similarity in function. As such, structure alignment techniques are often useful in the classification of protein function. Given the rapidly growing rate of new, experimentally determined structures being made available from repositories such as the Protein Data Bank, fast and accurate computational structure comparison tools are required. This paper presents SPalignNS, a non-sequential protein structure alignment tool using a novel asymmetrical greedy search technique. The performance of SPalignNS was evaluated against existing sequential and non-sequential structure alignment methods by performing trials with commonly used datasets. These benchmark datasets used to gauge alignment accuracy include (i) 9538 pairwise alignments implied by the HOMSTRAD database of homologous proteins; (ii) a subset of 64 difficult alignments from set (i) that have low structure similarity; (iii) 199 pairwise alignments of proteins with similar structure but different topology; and (iv) a subset of 20 pairwise alignments from the RIPC set. SPalignNS is shown to achieve greater alignment accuracy (lower or comparable root-mean squared distance with increased structure overlap coverage) for all datasets, and the highest agreement with reference alignments from the challenging dataset (iv) above, when compared with both sequentially constrained alignments and other non-sequential alignments. SPalignNS was implemented in C++. The source code, binary executable, and a web server version are freely available at: http://sparks-lab.org. Contact: yaoqi.zhou@griffith.edu.au. © The Author 2015. Published by Oxford University Press. All rights reserved.
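A non-sequential alignment, as described above, lets residues pair up regardless of their order along the chain. The sketch below illustrates that idea with a plain greedy nearest-pair matching on two already superposed coordinate sets. It is not the SPalignNS heuristic: the function names, the 4.0 Å cutoff, and the assumption that the structures are pre-superposed are all hypothetical.

```python
import math


def greedy_match(coords_a, coords_b, cutoff=4.0):
    """Pair points from two superposed structures, closest pairs first.

    Each point is used at most once and sequence order is ignored, so the
    resulting correspondence may be non-sequential.
    """
    candidates = []
    for i, p in enumerate(coords_a):
        for j, q in enumerate(coords_b):
            d = math.dist(p, q)
            if d <= cutoff:
                candidates.append((d, i, j))
    candidates.sort()
    used_a, used_b, matches = set(), set(), []
    for d, i, j in candidates:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            matches.append((i, j, d))
    return matches


def rmsd(matches):
    """Root-mean-square distance over the matched pairs."""
    return math.sqrt(sum(d * d for _, _, d in matches) / len(matches))
```

For example, matching [(0, 0, 0), (3.8, 0, 0), (7.6, 0, 0)] against [(7.5, 0, 0), (0.1, 0, 0), (3.9, 0, 0)] pairs index 0 with 1, 1 with 2, and 2 with 0, a permuted mapping that a sequential aligner could not report. The number of matched pairs stands in for coverage, and rmsd for the distance measure mentioned in the abstract.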
Structural re-alignment in an immunologic surface region of ricin A chain

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zemla, A T; Zhou, C E

2007-07-24

We compared structure alignments generated by several protein structure comparison programs to determine whether existing methods would satisfactorily align residues at a highly conserved position within an immunogenic loop in ribosome inactivating proteins (RIPs). Using default settings, structure alignments generated by several programs (CE, DaliLite, FATCAT, LGA, MAMMOTH, MATRAS, SHEBA, SSM) failed to align the respective conserved residues, although LGA reported correct residue-residue (R-R) correspondences when the beta-carbon (Cb) position was used as the point of reference in the alignment calculations. Further tests using variable points of reference indicated that points distal from the beta carbon along a vector connecting the alpha and beta carbons yielded rigid structural alignments in which residues known to be highly conserved in RIPs were reported as corresponding residues in structural comparisons between ricin A chain, abrin-A, and other RIPs. Results suggest that approaches to structure alignment employing alternate point representations corresponding to side chain position may yield structure alignments that are more consistent with observed conservation of functional surface residues than do standard alignment programs, which apply uniform criteria for alignment (i.e., alpha carbon (Ca) as point of reference) along the entirety of the peptide chain. We present the results of tests that suggest the utility of allowing user-specified points of reference in generating alternate structural alignments, and we present a web server for automatically generating such alignments.
GALAHAD: 1. Pharmacophore identification by hypermolecular alignment of ligands in 3D

NASA Astrophysics Data System (ADS)

Richmond, Nicola J.; Abrams, Charlene A.; Wolohan, Philippa R. N.; Abrahamian, Edmond; Willett, Peter; Clark, Robert D.

2006-09-01

Alignment of multiple ligands based on shared pharmacophoric and pharmacosteric features is a long-recognized challenge in drug discovery and development. This is particularly true when the spatial overlap between structures is incomplete, in which case no good template molecule is likely to exist. Pair-wise rigid ligand alignment based on linear assignment (the LAMDA algorithm) has the potential to address this problem (Richmond et al. in J Mol Graph Model 23:199-209, 2004). Here we present the version of LAMDA embodied in the GALAHAD program, which carries out multi-way alignments by iterative construction of hypermolecules that retain the aggregate as well as the individual attributes of the ligands. We have also generalized the cost function from being purely atom-based to being one that operates on ionic, hydrogen bonding, hydrophobic and steric features. Finally, we have added the ability to generate useful partial-match 3D search queries from the hypermolecules obtained. By running frozen conformations through the GALAHAD program, one can utilize the extended version of LAMDA to generate pharmacophores and pharmacosteres that agree well with crystal structure alignments for a range of literature datasets, with minor adjustments of the default parameters generating even better models. Allowing for inclusion of partial match constraints in the queries yields pharmacophores that are consistently a superset of full-match pharmacophores identified in previous analyses, with the additional features representing points of potentially beneficial interaction with the target.
DNAAlignEditor: DNA alignment editor tool

PubMed Central

Sanchez-Villeda, Hector; Schroeder, Steven; Flint-Garcia, Sherry; Guill, Katherine E; Yamasaki, Masanori; McMullen, Michael D

2008-01-01

Background With advances in DNA re-sequencing methods and Next-Generation parallel sequencing approaches, there has been a large increase in genomic efforts to define and analyze the sequence variability present among individuals within a species. For very polymorphic species such as maize, this has led to a need for intuitive, user-friendly software that aids the biologist, often with naïve programming capability, in tracking, editing, displaying, and exporting multiple individual sequence alignments. To fill this need we have developed a novel DNA alignment editor. Results We have generated a nucleotide sequence alignment editor (DNAAlignEditor) that provides an intuitive, user-friendly interface for manual editing of multiple sequence alignments with functions for input, editing, and output of sequence alignments. The color-coding of nucleotide identity and the display of associated quality scores aid in the manual alignment editing process. DNAAlignEditor works as a client/server tool having two main components: a relational database that collects the processed alignments and a user interface connected to the database through universal data access connectivity drivers. DNAAlignEditor can be used either as a stand-alone application or as a network application with multiple users concurrently connected. Conclusion We anticipate that this software will be of general interest to biologists and population geneticists in editing DNA sequence alignments and analyzing natural sequence variation regardless of species, and will be particularly useful for manual alignment editing of sequences in species with high levels of polymorphism. PMID:18366684
SeqFIRE: a web application for automated extraction of indel regions and conserved blocks from protein multiple sequence alignments.

PubMed

Ajawatanawong, Pravech; Atkinson, Gemma C; Watson-Haigh, Nathan S; Mackenzie, Bryony; Baldauf, Sandra L

2012-07-01

Analyses of multiple sequence alignments generally focus on well-defined conserved sequence blocks, while the rest of the alignment is largely ignored or discarded. This is especially true in phylogenomics, where large multigene datasets are produced through automated pipelines. However, some of the most powerful phylogenetic markers have been found in the variable length regions of multiple alignments, particularly insertions/deletions (indels) in protein sequences. We have developed Sequence Feature and Indel Region Extractor (SeqFIRE) to enable the automated identification and extraction of indels from protein sequence alignments. The program can also extract conserved blocks and identify fast evolving sites using a combination of conservation and entropy. All major variables can be adjusted by the user, allowing them to identify the sets of variables most suited to a particular analysis or dataset. Thus, all major tasks in preparing an alignment for further analysis are combined in a single flexible and user-friendly program. The output includes a numbered list of indels, alignments in NEXUS format with indels annotated or removed and indel-only matrices. SeqFIRE is a user-friendly web application, freely available online at www.seqfire.org/.
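The abstract above describes identifying indels and fast-evolving sites from conservation and entropy. A minimal per-column sketch of those two ingredients follows. It is not SeqFIRE's algorithm: the function names are hypothetical, and treating any column that contains a gap character as an indel column is a simplifying assumption.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of the residues in one alignment column, gaps ignored."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    n = len(residues)
    counts = Counter(residues)
    return -sum((k / n) * math.log2(k / n) for k in counts.values())


def indel_columns(alignment):
    """Indices of columns that contain at least one gap character."""
    return [i for i, col in enumerate(zip(*alignment)) if "-" in col]


def variable_columns(alignment, threshold=0.5):
    """Indices of gap-free columns whose entropy exceeds the threshold."""
    return [
        i
        for i, col in enumerate(zip(*alignment))
        if "-" not in col and column_entropy(col) > threshold
    ]
```

For the toy alignment ["MKT-LV", "MKTALV", "MRT-LI"], column 3 is flagged as an indel column and columns 1 and 5 as variable at the default threshold. A fully conserved column has entropy 0, so a lower threshold marks more sites as fast evolving.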
PARTS: Probabilistic Alignment for RNA joinT Secondary structure prediction

PubMed Central

Harmanci, Arif Ozgun; Sharma, Gaurav; Mathews, David H.

2008-01-01

A novel method is presented for joint prediction of alignment and common secondary structures of two RNA sequences. The joint consideration of common secondary structures and alignment is accomplished by structural alignment over a search space defined by the newly introduced motif called matched helical regions. The matched helical region formulation generalizes previously employed constraints for structural alignment and thereby better accommodates the structural variability within RNA families. A probabilistic model based on pseudo free energies obtained from precomputed base pairing and alignment probabilities is utilized for scoring structural alignments. Maximum a posteriori (MAP) common secondary structures, sequence alignment and joint posterior probabilities of base pairing are obtained from the model via a dynamic programming algorithm called PARTS. The advantage of the more general structural alignment of PARTS is seen in secondary structure predictions for the RNase P family. For this family, the PARTS MAP predictions of secondary structures and alignment perform significantly better than prior methods that utilize a more restrictive structural alignment model. For the tRNA and 5S rRNA families, the richer structural alignment model of PARTS does not offer a benefit and the method therefore performs comparably with existing alternatives. For all RNA families studied, the posterior probability estimates obtained from PARTS offer an improvement over posterior probability estimates from a single sequence prediction. When considering the base pairings predicted over a threshold value of confidence, the combination of sensitivity and positive predictive value is superior for PARTS than for the single sequence prediction. PARTS source code is available for download under the GNU public license at http://rna.urmc.rochester.edu. PMID:18304945
Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment

PubMed Central

2013-01-01

Background Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as progressive alignment, whose run time grows with the square of the number and the length of the aligned sequences and requires significant computational resources. In this work, we present a method to efficiently discover regions of high similarity across multiple sequences without performing expensive sequence alignment. The method is based on approximating edit distance between segments of sequences using p-mer frequency counts. Then, efficient high-throughput data stream clustering is used to group highly similar segments into so called quasi-alignments. Quasi-alignments have numerous applications such as identifying species and their taxonomic class from sequences, comparing sequences for similarities, and, as in this paper, discovering conserved regions across related sequences. Results In this paper, we show that quasi-alignments can be used to discover highly similar segments across multiple sequences from related or different genomes efficiently and accurately. Experiments on a large number of unaligned 16S rRNA sequences obtained from the Greengenes database show that the method is able to identify conserved regions which agree with known hypervariable regions in 16S rRNA. Furthermore, the experiments show that the proposed method scales well for large data sets with a run time that grows only linearly with the number and length of sequences, whereas for existing multiple sequence alignment heuristics the run time grows super-linearly. Conclusion Quasi-alignment-based algorithms can detect highly similar regions and conserved areas across multiple sequences. 
Since the run time is linear and the sequences are converted into a compact clustering model, we are able to identify conserved regions fast or even interactively using a standard PC. Our method has many potential applications such as finding characteristic signature sequences for families of organisms and studying conserved and variable regions in, for example, 16S rRNA. PMID:24564200
Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment.

PubMed

Nagar, Anurag; Hahsler, Michael

2013-01-01

Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as progressive alignment, whose run time grows with the square of the number and the length of the aligned sequences and requires significant computational resources. In this work, we present a method to efficiently discover regions of high similarity across multiple sequences without performing expensive sequence alignment. The method is based on approximating edit distance between segments of sequences using p-mer frequency counts. Then, efficient high-throughput data stream clustering is used to group highly similar segments into so-called quasi-alignments. Quasi-alignments have numerous applications such as identifying species and their taxonomic class from sequences, comparing sequences for similarities, and, as in this paper, discovering conserved regions across related sequences. In this paper, we show that quasi-alignments can be used to discover highly similar segments across multiple sequences from related or different genomes efficiently and accurately. Experiments on a large number of unaligned 16S rRNA sequences obtained from the Greengenes database show that the method is able to identify conserved regions which agree with known hypervariable regions in 16S rRNA. Furthermore, the experiments show that the proposed method scales well for large data sets with a run time that grows only linearly with the number and length of sequences, whereas for existing multiple sequence alignment heuristics the run time grows super-linearly. Quasi-alignment-based algorithms can detect highly similar regions and conserved areas across multiple sequences. Since the run time is linear and the sequences are converted into a compact clustering model, we are able to identify conserved regions fast or even interactively using a standard PC. Our method has many potential applications such as finding characteristic signature sequences for families of organisms and studying conserved and variable regions in, for example, 16S rRNA.
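The idea of comparing segments by p-mer frequency counts instead of aligning them can be illustrated with a minimal sketch. The abstract does not state which distance or which value of p the authors use, so the Manhattan distance, the default p = 3, and the function names `pmer_counts` and `pmer_distance` are assumptions made only for illustration; the stream clustering step is not shown.

```python
from collections import Counter


def pmer_counts(segment: str, p: int = 3) -> Counter:
    """Count every overlapping p-mer (substring of length p) in a segment."""
    return Counter(segment[i:i + p] for i in range(len(segment) - p + 1))


def pmer_distance(a: str, b: str, p: int = 3) -> int:
    """Manhattan distance between the p-mer count vectors of two segments.

    Assumption: this stands in for the edit-distance approximation mentioned
    in the abstract; the published method may use a different measure.
    """
    ca, cb = pmer_counts(a, p), pmer_counts(b, p)
    # A Counter returns 0 for p-mers it has not seen, so missing keys are safe.
    return sum(abs(ca[k] - cb[k]) for k in ca.keys() | cb.keys())


if __name__ == "__main__":
    print(pmer_distance("ACGTACGT", "ACGTACGT"))  # identical segments
    print(pmer_distance("AAAA", "AAAT"))          # one substitution at the end
```

Counting p-mers takes time linear in the segment length, which is consistent with the linear scaling the abstract reports, although the sketch says nothing about the clustering cost.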
Leveraging Community Colleges in the Workforce Innovation and Opportunity Act: A Blueprint for State Policymakers. State-Federal Partnerships in Postsecondary Education

ERIC Educational Resources Information Center

Campbell, Colleen; Love, Ivy

2016-01-01

The Workforce Innovation and Opportunity Act (WIOA) of 2014 offers multiple opportunities to align the workforce development efforts of these stakeholders through structural measures and targeted support. In this paper, the authors examine ways that WIOA can influence a state's job training environment and highlight the crucial role of community…
ZnO-based multiple channel and multiple gate FinMOSFETs

NASA Astrophysics Data System (ADS)

Lee, Ching-Ting; Huang, Hung-Lin; Tseng, Chun-Yen; Lee, Hsin-Ying

2016-02-01

In recent years, zinc oxide (ZnO)-based metal-oxide-semiconductor field-effect transistors (MOSFETs) have attracted much attention, because ZnO-based semiconductors possess several advantages, including large exciton binding energy, nontoxicity, biocompatibility, low material cost, and wide direct bandgap. Moreover, the ZnO-based MOSFET is one of the most promising devices, owing to its applications in microwave power amplifiers, logic circuits, large scale integrated circuits, and logic swing. In this study, to enhance the performances of the ZnO-based MOSFETs, the ZnO-based multiple channel and multiple gate structured FinMOSFETs were fabricated using the simple laser interference photolithography method and the self-aligned photolithography method. The multiple channel structure possessed the additional sidewall depletion width control ability to improve the channel controllability, because the multiple channel sidewall portions were surrounded by the gate electrode. Furthermore, the multiple gate structure had a shorter distance between source and gate and a shorter gate length between two gates to enhance the gate operating performances. Besides, the shorter distance between source and gate could enhance the electron velocity in the channel fin structure of the multiple gate structure. In this work, ninety-one channels and four gates were used in the FinMOSFETs. Consequently, the drain-source saturation current (IDSS) and maximum transconductance (gm) of the ZnO-based multiple channel and multiple gate structured FinFETs operated at a drain-source voltage (VDS) of 10 V and a gate-source voltage (VGS) of 0 V were respectively improved from 11.5 mA/mm to 13.7 mA/mm and from 4.1 mS/mm to 6.9 mS/mm in comparison with that of the conventional ZnO-based single channel and single gate MOSFETs.
Iterative pass optimization of sequence data

NASA Technical Reports Server (NTRS)

Wheeler, Ward C.

2003-01-01

The problem of determining the minimum-cost hypothetical ancestral sequences for a given cladogram is known to be NP-complete. This "tree alignment" problem has motivated the considerable effort placed in multiple sequence alignment procedures. Wheeler in 1996 proposed a heuristic method, direct optimization, to calculate cladogram costs without the intervention of multiple sequence alignment. This method, though more efficient in time and more effective in cladogram length than many alignment-based procedures, greedily optimizes nodes based on descendant information only. In their proposal of an exact multiple alignment solution, Sankoff et al. in 1976 described a heuristic procedure--the iterative improvement method--to create alignments at internal nodes by solving a series of median problems. The combination of a three-sequence direct optimization with iterative improvement and a branch-length-based cladogram cost procedure provides an algorithm that frequently results in superior (i.e., lower) cladogram costs. This iterative pass optimization is both computation and memory intensive, but economies can be made to reduce this burden. An example in arthropod systematics is discussed. © 2003 The Willi Hennig Society. Published by Elsevier Science (USA). All rights reserved.
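What it means to optimize internal nodes "based on descendant information only" can be seen in a much simpler setting than the one the abstract addresses. The sketch below is not direct optimization or iterative pass optimization; it is the classical Fitch down-pass for a single, already aligned character on a fixed binary cladogram, chosen here as an assumption-laden illustration. The tree encoding (nested 2-tuples of leaf names) and the function name `fitch_down_pass` are hypothetical.

```python
def fitch_down_pass(tree, states):
    """Return (candidate_state_set, change_count) for one character.

    tree   -- a leaf name (str) or a 2-tuple of subtrees (binary cladogram)
    states -- dict mapping each leaf name to its observed character state

    Each internal node is assigned using only its two descendants:
    the intersection of their state sets if non-empty, otherwise the
    union at the cost of one extra change.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_down_pass(left, states)
    right_set, right_cost = fitch_down_pass(right, states)
    common = left_set & right_set
    if common:
        return common, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1


if __name__ == "__main__":
    cladogram = (("a", "b"), ("c", "d"))
    print(fitch_down_pass(cladogram, {"a": "A", "b": "A", "c": "G", "d": "G"}))
```

Because each node is fixed from below, a later pass that also consults the ancestor can revise these assignments, which is the general motivation behind iterative refinement of internal nodes.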
A new statistical framework to assess structural alignment quality using information compression

PubMed Central

Collier, James H.; Allison, Lloyd; Lesk, Arthur M.; Garcia de la Banda, Maria; Konagurthu, Arun S.

2014-01-01

Motivation: Progress in protein biology depends on the reliability of results from a handful of computational techniques, structural alignments being one. Recent reviews have highlighted substantial inconsistencies and differences between alignment results generated by the ever-growing stock of structural alignment programs. The lack of consensus on how the quality of structural alignments must be assessed has been identified as the main cause for the observed differences. Current methods assess structural alignment quality by constructing a scoring function that attempts to balance conflicting criteria, mainly alignment coverage and fidelity of structures under superposition. This traditional approach to measuring alignment quality, the subject of considerable literature, has failed to solve the problem. Further development along the same lines is unlikely to rectify the current deficiencies in the field. Results: This paper proposes a new statistical framework to assess structural alignment quality and significance based on lossless information compression. This is a radical departure from the traditional approach of formulating scoring functions. It links the structural alignment problem to the general class of statistical inductive inference problems, solved using the information-theoretic criterion of minimum message length. Based on this, we developed an efficient and reliable measure of structural alignment quality, I-value. The performance of I-value is demonstrated in comparison with a number of popular scoring functions, on a large collection of competing alignments. Our analysis shows that I-value provides a rigorous and reliable quantification of structural alignment quality, addressing a major gap in the field. Availability: http://lcb.infotech.monash.edu.au/I-value Contact: arun.konagurthu@monash.edu Supplementary information: Online supplementary data are available at http://lcb.infotech.monash.edu.au/I-value/suppl.html PMID:25161241
ProfileGrids: a sequence alignment visualization paradigm that avoids the limitations of Sequence Logos.

PubMed

Roca, Alberto I

2014-01-01

The 2013 BioVis Contest provided an opportunity to evaluate different paradigms for visualizing protein multiple sequence alignments. Such data sets are becoming extremely large and thus taxing current visualization paradigms. Sequence Logos represent consensus sequences but have limitations for protein alignments. As an alternative, ProfileGrids are a new protein sequence alignment visualization paradigm that represents an alignment as a color-coded matrix of the residue frequency occurring at every homologous position in the aligned protein family. The JProfileGrid software program was used to analyze the BioVis contest data sets to generate figures for comparison with the Sequence Logo reference images. The ProfileGrid representation allows for the clear and effective analysis of protein multiple sequence alignments. This includes both a general overview of the conservation and diversity sequence patterns as well as the interactive ability to query the details of the protein residue distributions in the alignment. The JProfileGrid software is free and available from http://www.ProfileGrid.org.
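The matrix underlying a ProfileGrid, as described above, holds the frequency of each residue at every alignment position. A minimal sketch of that computation follows; it is not the JProfileGrid implementation, the function name `residue_frequencies` is hypothetical, and treating the gap symbol `-` as just another symbol is an assumption.

```python
from collections import Counter


def residue_frequencies(alignment: list[str]) -> list[dict[str, float]]:
    """For each alignment column, return the fraction of sequences per symbol."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    n = len(alignment)
    grid = []
    for col in range(length):
        counts = Counter(row[col] for row in alignment)
        grid.append({symbol: count / n for symbol, count in counts.items()})
    return grid


if __name__ == "__main__":
    for position, column in enumerate(residue_frequencies(["ACD", "ACE", "GCD", "AC-"]), 1):
        print(position, column)
```

A visualization layer would then map each frequency to a cell color; that step is omitted here.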
Multiple network alignment via multiMAGNA++.

PubMed

Vijayan, Vipin; Milenkovic, Tijana

2017-08-21

Network alignment (NA) aims to find a node mapping that identifies topologically or functionally similar network regions between molecular networks of different species. Analogous to genomic sequence alignment, NA can be used to transfer biological knowledge from well- to poorly-studied species between aligned network regions. Pairwise NA (PNA) finds similar regions between two networks while multiple NA (MNA) can align more than two networks. We focus on MNA. Existing MNA methods aim to maximize total similarity over all aligned nodes (node conservation). Then, they evaluate alignment quality by measuring the amount of conserved edges, but only after the alignment is constructed. Directly optimizing edge conservation during alignment construction in addition to node conservation may result in superior alignments. Thus, we present a novel MNA method called multiMAGNA++ that can achieve this. Indeed, multiMAGNA++ outperforms or is on par with existing MNA methods, while often completing faster than existing methods. That is, multiMAGNA++ scales well to larger network data and can be parallelized effectively. During method evaluation, we also introduce new MNA quality measures to allow for more fair MNA method comparison compared to the existing alignment quality measures. MultiMAGNA++ code is available on the method's web page at http://nd.edu/~cone/multiMAGNA++/.

SubVis: an interactive R package for exploring the effects of multiple substitution matrices on pairwise sequence alignment

PubMed Central

Coan, Heather B.; Youker, Robert T.

2017-01-01

Understanding how proteins mutate is critical to solving a host of biological problems. Mutations occur when an amino acid is substituted for another in a protein sequence. The set of likelihoods for amino acid substitutions is stored in a matrix and input to alignment algorithms. The quality of the resulting alignment is used to assess the similarity of two or more sequences and can vary according to assumptions modeled by the substitution matrix. Substitution strategies with minor parameter variations are often grouped together in families. For example, the BLOSUM and PAM matrix families are commonly used because they provide a standard, predefined way of modeling substitutions. However, researchers often do not know if a given matrix family or any individual matrix within a family is the most suitable. Furthermore, predefined matrix families may inaccurately reflect a particular hypothesis that a researcher wishes to model or otherwise result in unsatisfactory alignments. In these cases, the ability to compare the effects of one or more custom matrices may be needed. This laborious process is often performed manually because the ability to simultaneously load multiple matrices and then compare their effects on alignments is not readily available in current software tools. This paper presents SubVis, an interactive R package for loading and applying multiple substitution matrices to pairwise alignments. Users can simultaneously explore alignments resulting from multiple predefined and custom substitution matrices. SubVis utilizes several of the alignment functions found in R, a common language among protein scientists. Functions are tied together with the Shiny platform which allows the modification of input parameters. Information regarding alignment quality and individual amino acid substitutions is displayed with the JavaScript language which provides interactive visualizations for revealing both high-level and low-level alignment information. PMID:28674656
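The effect of swapping substitution matrices on a pairwise alignment can be sketched outside of R and SubVis. The example below is a plain global-alignment (Needleman-Wunsch style) scorer with a linear gap penalty; the two scoring functions are toy stand-ins, not BLOSUM or PAM values, and every name in it (`global_alignment_score`, `identity_matrix`, `grouped_matrix`, the `HYDROPHOBIC` set, the gap penalty of -4) is an assumption for illustration only.

```python
def global_alignment_score(a, b, score, gap=-4):
    """Optimal global alignment score of a and b under a linear gap penalty.

    score -- function (x, y) -> substitution score for residues x and y
    """
    prev = [j * gap for j in range(len(b) + 1)]  # row for the empty prefix of a
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            curr.append(max(
                prev[j - 1] + score(a[i - 1], b[j - 1]),  # align the two residues
                prev[j] + gap,                            # gap in b
                curr[j - 1] + gap,                        # gap in a
            ))
        prev = curr
    return prev[-1]


def identity_matrix(x, y):
    """Toy matrix: reward identities, penalize every mismatch equally."""
    return 2 if x == y else -1


HYDROPHOBIC = set("AVILMFWC")  # illustrative grouping, not a published matrix


def grouped_matrix(x, y):
    """Toy matrix: mismatches within the hydrophobic group score mildly positive."""
    if x == y:
        return 2
    if x in HYDROPHOBIC and y in HYDROPHOBIC:
        return 1
    return -1


if __name__ == "__main__":
    for name, matrix in (("identity", identity_matrix), ("grouped", grouped_matrix)):
        print(name, global_alignment_score("AVL", "AIL", matrix))
```

Running the same pair of sequences through both scoring functions shows how the choice of matrix alone changes the reported alignment quality, which is the comparison the package is described as automating.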
Multiscale Currents Observed by MMS in the Flow Braking Region.

PubMed

Nakamura, Rumi; Varsani, Ali; Genestreti, Kevin J; Le Contel, Olivier; Nakamura, Takuma; Baumjohann, Wolfgang; Nagai, Tsugunobu; Artemyev, Anton; Birn, Joachim; Sergeev, Victor A; Apatenkov, Sergey; Ergun, Robert E; Fuselier, Stephen A; Gershman, Daniel J; Giles, Barbara J; Khotyaintsev, Yuri V; Lindqvist, Per-Arne; Magnes, Werner; Mauk, Barry; Petrukovich, Anatoli; Russell, Christopher T; Stawarz, Julia; Strangeway, Robert J; Anderson, Brian; Burch, James L; Bromund, Ken R; Cohen, Ian; Fischer, David; Jaynes, Allison; Kepko, Laurence; Le, Guan; Plaschke, Ferdinand; Reeves, Geoff; Singer, Howard J; Slavin, James A; Torbert, Roy B; Turner, Drew L

2018-02-01

We present characteristics of current layers in the off-equatorial near-Earth plasma sheet boundary observed with high time-resolution measurements from the Magnetospheric Multiscale mission during an intense substorm associated with multiple dipolarizations. The four Magnetospheric Multiscale spacecraft, separated by distances of about 50 km, were located in the southern hemisphere in the dusk portion of a substorm current wedge. They observed fast flow disturbances (up to about 500 km/s), most intense in the dawn-dusk direction. Field-aligned currents were observed initially within the expanding plasma sheet, where the flow and field disturbances showed the distinct pattern expected in the braking region of localized flows. Subsequently, intense thin field-aligned current layers were detected at the inner boundary of equatorward moving flux tubes together with Earthward streaming hot ions. Intense Hall current layers were found adjacent to the field-aligned currents. In particular, we found a Hall current structure in the vicinity of the Earthward streaming ion jet that consisted of mixed ion components, that is, hot unmagnetized ions, cold E × B drifting ions, and magnetized electrons. Our observations show that both the near-Earth plasma jet diversion and the thin Hall current layers formed around the reconnection jet boundary are the sites where diversion of the perpendicular currents takes place, contributing to the observed field-aligned current pattern as predicted by simulations of reconnection jets. Hence, the multiscale structure of flow braking is preserved in the field-aligned currents in the off-equatorial plasma sheet and is also translated to the ionosphere to become a part of the substorm field-aligned current system.
Direct Observations of ULF and Whistler-Mode Chorus Modulation of 500eV EDI Electrons by MMS

NASA Astrophysics Data System (ADS)

Paulson, K. W.; Argall, M. R.; Ahmadi, N.; Torbert, R. B.; Le Contel, O.; Ergun, R.; Khotyaintsev, Y. V.; Strangeway, R. J.; Magnes, W.; Russell, C. T.

2016-12-01

We present here direct observations of chorus-wave modulated field-aligned 500 eV electrons using the Electron Drift Instrument (EDI) on board the Magnetospheric Multiscale mission. These periods of wave activity were additionally observed to be modulated by Pc5-frequency magnetic perturbations, some of which have been identified as drifting mirror-mode structures. The spacecraft encountered these mirror-mode structures just inside of the duskside magnetopause. Using the high sampling rate provided by EDI in burst sampling mode, we are able to observe the individual count fluctuations of field-aligned electrons in this region up to 512 Hz. We use the multiple look directions of EDI to generate both pitch angle and gyrophase plots of the fluctuating counts. Our observations often show unidirectional flow of these modulated electrons along the background field, and in some cases demonstrate gyrophase bunching in the wave region.
3D tissue formation by stacking detachable cell sheets formed on nanofiber mesh.

PubMed

Kim, Min Sung; Lee, Byungjun; Kim, Hong Nam; Bang, Seokyoung; Yang, Hee Seok; Kang, Seong Min; Suh, Kahp-Yang; Park, Suk-Hee; Jeon, Noo Li

2017-03-23

We present a novel approach for assembling 3D tissue by layer-by-layer stacking of cell sheets formed on aligned nanofiber mesh. A rigid frame was used to repeatedly collect aligned electrospun PCL (polycaprolactone) nanofiber to form a mesh structure with an average distance between fibers of 6.4 µm. When human umbilical vein endothelial cells (HUVECs), human foreskin dermal fibroblasts, and skeletal muscle cells (C2C12) were cultured on the nanofiber mesh, they formed confluent monolayers and could be handled as continuous cell sheets with areas of 3 × 3 cm² or larger. Thicker 3D tissues have been formed by stacking multiple cell sheets collected on frames that can be nested (like Matryoshka dolls) without any special tools. When cultured on the nanofiber mesh, skeletal muscle C2C12 cells oriented along the direction of the nanofibers and differentiated into uniaxially aligned multinucleated myotubes. Myotube cell sheets were stacked (up to 3 layers) in alternating or aligned directions to form thicker tissue with ∼50 µm thickness. Sandwiching HUVEC cell sheets with two dermal fibroblast cell sheets resulted in vascularized 3D tissue. HUVECs formed extensive networks and expressed CD31, a marker of endothelial cells. Cell sheets formed on nanofiber mesh have a number of advantages, including manipulation and stacking of multiple cell sheets for constructing 3D tissue, and may find use in a variety of tissue engineering applications.
A study of large, medium and small scale structures in the topside ionosphere

NASA Technical Reports Server (NTRS)

Gross, Stanley H.; Kuo, Spencer P.; Shmoys, Jerry

1986-01-01

Alouette and ISIS data were studied for large, medium, and small scale structures in the ionosphere. Correlation was also sought with measurements by other satellites, such as the Atmosphere Explorer C and E and the Dynamics Explorer 2 satellites, of both neutrals and ionization, and with measurements by ground facilities, such as the incoherent scatter radars. Large scale coherent wavelike structures were found from ISIS 2 electron density contours from above the F peak to nearly the satellite altitude. Such structures were also found to correlate with the observation by AE-C below the F peak during a conjunction of the two satellites. Vertical wavefronts found in the upper F region suggest the dominance of diffusion along field lines as well. Also discovered were multiple, evenly spaced field-aligned ducts in the F region that, at low latitudes, extended to the other hemisphere and were in the form of field-aligned sheets in the east-west direction. Low latitude heating events were discovered that could serve as sources for waves in the ionosphere.
LS-align: an atom-level, flexible ligand structural alignment algorithm for high-throughput virtual screening.

PubMed

Hu, Jun; Liu, Zi; Yu, Dong-Jun; Zhang, Yang

2018-02-15

Sequence-order independent structural comparison, also called structural alignment, of small ligand molecules is often needed for computer-aided virtual drug screening. Although many ligand structure alignment programs have been proposed, most of them build the alignments based on rigid-body shape comparison, which can neither provide atom-specific alignment information nor allow structural variation; both abilities are critical to efficient high-throughput virtual screening. We propose a novel ligand comparison algorithm, LS-align, to generate fast and accurate atom-level structural alignments of ligand molecules, through an iterative heuristic search of the target function that combines inter-atom distance with mass and chemical bond comparisons. LS-align contains two modules, Rigid-LS-align and Flexi-LS-align, designed for rigid-body and flexible alignments, respectively, where a ligand-size independent, statistics-based scoring function is developed to evaluate the similarity of ligand molecules relative to random ligand pairs. Large-scale benchmark tests are performed on prioritizing chemical ligands of 102 protein targets involving 1,415,871 candidate compounds from the DUD-E (Database of Useful Decoys: Enhanced) database, where LS-align achieves an average enrichment factor (EF) of 22.0 at the 1% cutoff and an AUC score of 0.75, which are significantly higher than other state-of-the-art methods. Detailed data analyses show that the advanced performance is mainly attributed to the design of the target function, which combines structural and chemical information to enhance the sensitivity of recognizing subtle differences between ligand molecules, and to the introduction of structural flexibility, which helps capture the conformational changes induced by the ligand-receptor binding interactions. These data demonstrate a new avenue to improve the virtual screening efficiency through the development of sensitive ligand structural alignments. Availability: http://zhanglab.ccmb.med.umich.edu/LS-align/. Contact: njyudj@njust.edu.cn or zhng@umich.edu. Supplementary data are available at Bioinformatics online.
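The enrichment factor quoted above can be made concrete with a small worked example. The sketch uses the common definition (hit rate among the top-ranked fraction divided by the hit rate in the whole library); the abstract does not spell out the exact formula used in the benchmark, so this definition, the function name `enrichment_factor`, and the made-up ranking in the usage lines are all assumptions.

```python
def enrichment_factor(ranked_labels, fraction=0.01):
    """Enrichment factor of a ranked compound list at a given cutoff.

    ranked_labels -- list of booleans ordered from best to worst score,
                     True for a known active, False for a decoy
    fraction      -- top fraction of the list to inspect (0.01 = 1% cutoff)
    """
    n = len(ranked_labels)
    total_actives = sum(ranked_labels)
    if n == 0 or total_actives == 0:
        raise ValueError("need a non-empty list with at least one active")
    top_n = max(1, int(n * fraction))          # always inspect at least one compound
    hit_rate_top = sum(ranked_labels[:top_n]) / top_n
    hit_rate_all = total_actives / n
    return hit_rate_top / hit_rate_all


if __name__ == "__main__":
    # Hypothetical library: 1000 compounds, 20 actives, 5 of them ranked in the top 10.
    ranking = [True] * 5 + [False] * 5 + [True] * 15 + [False] * 975
    print(enrichment_factor(ranking, 0.01))
```

An EF of 1 corresponds to random ranking, so larger values indicate that actives are concentrated near the top of the list.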
Multiple alignment analysis on phylogenetic tree of the spread of SARS epidemic using distance method

NASA Astrophysics Data System (ADS)

Amiroch, S.; Pradana, M. S.; Irawan, M. I.; Mukhlash, I.

2017-09-01

Multiple Alignment (MA) is a particularly important tool for studying the viral genome and determining the evolutionary process of a specific virus. Applying MA to the spread of the severe acute respiratory syndrome (SARS) epidemic is of interest because, a few years ago, this virus spread so quickly that it drew medical attention in many countries. Although much software exists for processing multiple sequences, the use of pairwise alignment to build the MA remains very important to consider. Previous research aligned the sequences for the MA with the Super Pairwise Alignment algorithm, whereas this study uses the Needleman-Wunsch dynamic programming algorithm, simulated in Matlab. The MA analysis yields the stable and unstable regions, which indicate the positions where mutations occur; the network topology given by the distance-method phylogenetic tree of the SARS epidemic; and the network of mutation areas.
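A distance-method tree starts from a matrix of pairwise distances computed on the aligned sequences. The abstract does not say which distance the study uses, so the sketch below assumes the simple p-distance (fraction of differing positions, skipping columns where either sequence has a gap). It is written in Python rather than Matlab, the function names and the toy sequences are hypothetical, and the tree-building step itself is not shown.

```python
def p_distance(a: str, b: str) -> float:
    """Fraction of mismatching positions between two aligned sequences.

    Columns where either sequence has a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = mismatches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            mismatches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return mismatches / compared


def distance_matrix(aligned: dict) -> dict:
    """Pairwise p-distances keyed by (name, name) for every pair of sequences."""
    names = list(aligned)
    return {(p, q): p_distance(aligned[p], aligned[q]) for p in names for q in names}


if __name__ == "__main__":
    toy = {"s1": "ACGT-A", "s2": "ACGTTA", "s3": "ATGTTC"}  # made-up sequences
    for pair, d in distance_matrix(toy).items():
        print(pair, round(d, 3))
```

A clustering procedure such as UPGMA or neighbor joining would then take this matrix as input; which one the study applies is not stated in the abstract.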
Analysis of Ribosome Inactivating Protein (RIP): A Bioinformatics Approach

NASA Astrophysics Data System (ADS)

Jothi, G. Edward Gnana; Majilla, G. Sahaya Jose; Subhashini, D.; Deivasigamani, B.

2012-10-01

In spite of the medical advances in recent years, the world is in need of different sources to counter certain health issues. Ribosome Inactivating Proteins (RIPs) were found to be one among them. To make information about RIPs easy to access, there is a need to analyse RIPs towards constructing a database on RIPs. Also, multiple sequence alignment was done towards screening for homologues of significant RIPs from rare sources against RIPs from easily available sources in terms of similarity. Protein sequences were retrieved from SWISS-PROT and further analysed using pairwise and multiple sequence alignment. Analysis shows that 151 RIPs have been characterized to date. Amongst them, there are 87 type I, 37 type II, 1 type III and 25 unknown RIPs. The sequence length information of various RIPs about the availability of full or partial sequence was also found. The multiple sequence alignment of 37 type I RIP using the online server Multalin indicates the presence of 20 conserved residues. Pairwise alignment and multiple sequence alignment of certain selected RIPs in two groups, namely Group I and Group II, were carried out and the consensus level was found to be 98%, 98% and 90% respectively.
Ontology Alignment Architecture for Semantic Sensor Web Integration

PubMed Central

Fernandez, Susel; Marsa-Maestre, Ivan; Velasco, Juan R.; Alarcos, Bernardo

2013-01-01

Sensor networks are a concept that has become very popular in data acquisition and processing for multiple applications in different fields such as industrial, medicine, home automation, environmental detection, etc. Today, with the proliferation of small communication devices with sensors that collect environmental data, semantic Web technologies are becoming closely related with sensor networks. The linking of elements from Semantic Web technologies with sensor networks has been called Semantic Sensor Web and has among its main features the use of ontologies. One of the key challenges of using ontologies in sensor networks is to provide mechanisms to integrate and exchange knowledge from heterogeneous sources (that is, dealing with semantic heterogeneity). Ontology alignment is the process of bringing ontologies into mutual agreement by the automatic discovery of mappings between related concepts. This paper presents a system for ontology alignment in the Semantic Sensor Web which uses fuzzy logic techniques to combine similarity measures between entities of different ontologies. The proposed approach focuses on two key elements: the terminological similarity, which takes into account the linguistic and semantic information of the context of the entity's names, and the structural similarity, based on both the internal and relational structure of the concepts. This work has been validated using sensor network ontologies and the Ontology Alignment Evaluation Initiative (OAEI) tests. The results show that the proposed techniques outperform previous approaches in terms of precision and recall. PMID:24051523
Ontology alignment architecture for semantic sensor Web integration.

PubMed

Fernandez, Susel; Marsa-Maestre, Ivan; Velasco, Juan R; Alarcos, Bernardo

2013-09-18

Sensor networks are a concept that has become very popular in data acquisition and processing for multiple applications in different fields such as industrial, medicine, home automation, environmental detection, etc. Today, with the proliferation of small communication devices with sensors that collect environmental data, semantic Web technologies are becoming closely related with sensor networks. The linking of elements from Semantic Web technologies with sensor networks has been called Semantic Sensor Web and has among its main features the use of ontologies. One of the key challenges of using ontologies in sensor networks is to provide mechanisms to integrate and exchange knowledge from heterogeneous sources (that is, dealing with semantic heterogeneity). Ontology alignment is the process of bringing ontologies into mutual agreement by the automatic discovery of mappings between related concepts. This paper presents a system for ontology alignment in the Semantic Sensor Web which uses fuzzy logic techniques to combine similarity measures between entities of different ontologies. The proposed approach focuses on two key elements: the terminological similarity, which takes into account the linguistic and semantic information of the context of the entity's names, and the structural similarity, based on both the internal and relational structure of the concepts. This work has been validated using sensor network ontologies and the Ontology Alignment Evaluation Initiative (OAEI) tests. The results show that the proposed techniques outperform previous approaches in terms of precision and recall.
SAbPred: a structure-based antibody prediction server

PubMed Central

Dunbar, James; Krawczyk, Konrad; Leem, Jinwoo; Marks, Claire; Nowak, Jaroslaw; Regep, Cristian; Georges, Guy; Kelm, Sebastian; Popovic, Bojana; Deane, Charlotte M.

2016-01-01

SAbPred is a server that makes predictions of the properties of antibodies focusing on their structures. Antibody informatics tools can help improve our understanding of immune responses to disease and aid in the design and engineering of therapeutic molecules. SAbPred is a single platform containing multiple applications which can: number and align sequences; automatically generate antibody variable fragment homology models; annotate such models with estimated accuracy alongside sequence and structural properties including potential developability issues; predict paratope residues; and predict epitope patches on protein antigens. The server is available at http://opig.stats.ox.ac.uk/webapps/sabpred. PMID:27131379
Evolutionary distances in the twilight zone--a rational kernel approach.

PubMed

Schwarz, Roland F; Fletcher, William; Förster, Frank; Merget, Benjamin; Wolf, Matthias; Schultz, Jörg; Markowetz, Florian

2010-12-31

Phylogenetic tree reconstruction is traditionally based on multiple sequence alignments (MSAs) and heavily depends on the validity of this information bottleneck. With increasing sequence divergence, the quality of MSAs decays quickly. Alignment-free methods, on the other hand, are based on abstract string comparisons and avoid potential alignment problems. However, in general they are not biologically motivated and ignore our knowledge about the evolution of sequences. Thus, it is still a major open question how to define an evolutionary distance metric between divergent sequences that makes use of indel information and known substitution models without the need for a multiple alignment. Here we propose a new evolutionary distance metric to close this gap. It uses finite-state transducers to create a biologically motivated similarity score which models substitutions and indels, and does not depend on a multiple sequence alignment. The sequence similarity score is defined in analogy to pairwise alignments and additionally has the positive semi-definite property. We describe its derivation and show in simulation studies and real-world examples that it is more accurate in reconstructing phylogenies than competing methods. The result is a new and accurate way of determining evolutionary distances in and beyond the twilight zone of sequence alignments that is suitable for large datasets.
GenomeVista

DOE Office of Scientific and Technical Information (OSTI.GOV)

Poliakov, Alexander; Couronne, Olivier

2002-11-04

Aligning large vertebrate genomes that are structurally complex poses a variety of problems not encountered on smaller scales. Such genomes are rich in repetitive elements and contain multiple segmental duplications, which increases the difficulty of identifying true orthologous DNA segments in alignments. The sizes of the sequences make many alignment algorithms designed for comparing single proteins extremely inefficient when processing large genomic intervals. We integrated both local and global alignment tools and developed a suite of programs for automatically aligning large vertebrate genomes and identifying conserved non-coding regions in the alignments. Our method uses the BLAT local alignment program to find anchors on the base genome to identify regions of possible homology for a query sequence. These regions are postprocessed to find the best candidates, which are then globally aligned using the AVID global alignment program. In the last step, conserved non-coding segments are identified using VISTA. Our methods are fast and the resulting alignments exhibit a high degree of sensitivity, covering more than 90% of known coding exons in the human genome. The GenomeVISTA software is a suite of Perl programs that is built on a MySQL database platform. The scheduler gets control data from the database, builds a queue of jobs, and dispatches them to a PC cluster for execution. The main program, running on each node of the cluster, processes individual sequences. A Perl library acts as an interface between the database and the above programs. The use of a separate library allows the programs to function independently of the database schema. The library also improves on the standard Perl MySQL database interface package by providing auto-reconnect functionality and improved error handling.
Independent Metrics for Protein Backbone and Side-Chain Flexibility: Time Scales and Effects of Ligand Binding.

PubMed

Fuchs, Julian E; Waldner, Birgit J; Huber, Roland G; von Grafenstein, Susanne; Kramer, Christian; Liedl, Klaus R

2015-03-10

Conformational dynamics are central for understanding biomolecular structure and function, since biological macromolecules are inherently flexible at room temperature and in solution. Computational methods are nowadays capable of providing valuable information on the conformational ensembles of biomolecules. However, analysis tools and intuitive metrics that capture dynamic information from in silico generated structural ensembles are limited. In standard work-flows, flexibility in a conformational ensemble is represented through residue-wise root-mean-square fluctuations or B-factors following a global alignment. Consequently, these approaches relying on global alignments discard valuable information on local dynamics. Results inherently depend on global flexibility, residue size, and connectivity. In this study we present a novel approach for capturing positional fluctuations based on multiple local alignments instead of one single global alignment. The method captures local dynamics within a structural ensemble independent of residue type by splitting individual local and global degrees of freedom of protein backbone and side-chains. Dependence on residue type and size in the side-chains is removed via normalization with the B-factors of the isolated residue. As a test case, we demonstrate its application to a molecular dynamics simulation of bovine pancreatic trypsin inhibitor (BPTI) on the millisecond time scale. This allows for illustrating different time scales of backbone and side-chain flexibility. Additionally, we demonstrate the effects of ligand binding on side-chain flexibility of three serine proteases. We expect our new methodology for quantifying local flexibility to be helpful in unraveling local changes in biomolecular dynamics.
QuickProbs—A Fast Multiple Sequence Alignment Algorithm Designed for Graphics Processors

PubMed Central

Gudyś, Adam; Deorowicz, Sebastian

2014-01-01

Multiple sequence alignment is a crucial task in a number of biological analyses like secondary structure prediction, domain searching, phylogeny, etc. MSAProbs is currently the most accurate alignment algorithm, but its effectiveness is obtained at the expense of computational time. In the paper we present QuickProbs, a variant of MSAProbs customised for graphics processors. We selected the two most time-consuming stages of MSAProbs to be redesigned for GPU execution: the posterior matrices calculation and the consistency transformation. Experiments on three popular benchmarks (BAliBASE, PREFAB, OXBench-X) on a quad-core PC equipped with a high-end graphics card show QuickProbs to be 5.7 to 9.7 times faster than the original CPU-parallel MSAProbs. Additional tests performed on several protein families from the Pfam database give an overall speed-up of 6.7. Compared to other algorithms like MAFFT, MUSCLE, or ClustalW, QuickProbs proved to be much more accurate at similar speed. Additionally we introduce a tuned variant of QuickProbs which is significantly more accurate on sets of distantly related sequences than MSAProbs without exceeding its computation time. The GPU part of QuickProbs was implemented in OpenCL, thus the package is suitable for graphics processors produced by all major vendors. PMID:24586435
New Challenges of the Computation of Multiple Sequence Alignments in the High-Throughput Era (2010 JGI/ANL HPC Workshop)

ScienceCinema

Notredame, Cedric

2018-05-02

Cedric Notredame from the Centre for Genomic Regulation gives a presentation on New Challenges of the Computation of Multiple Sequence Alignments in the High-Throughput Era at the JGI/Argonne HPC Workshop on January 26, 2010.
Alignment between Protostellar Outflows and Filamentary Structure

DOE Office of Scientific and Technical Information (OSTI.GOV)

Stephens, Ian W.; Dunham, Michael M.; Myers, Philip C.

2017-09-01

We present new Submillimeter Array (SMA) observations of CO(2–1) outflows toward young, embedded protostars in the Perseus molecular cloud as part of the Mass Assembly of Stellar Systems and their Evolution with the SMA (MASSES) survey. For 57 Perseus protostars, we characterize the orientation of the outflow angles and compare them with the orientation of the local filaments as derived from Herschel observations. We find that the relative angles between outflows and filaments are inconsistent with purely parallel or purely perpendicular distributions. Instead, the observed distribution of outflow-filament angles is more consistent with either randomly aligned angles or a mix of projected parallel and perpendicular angles. A mix of parallel and perpendicular angles requires perpendicular alignment to be more common by a factor of ∼3. Our results show that the observed distributions probably hold regardless of the protostar’s multiplicity, age, or the host core’s opacity. These observations indicate that the angular momentum axis of a protostar may be independent of the large-scale structure. We discuss the significance of independent protostellar rotation axes in the general picture of filament-based star formation.
Two Simple and Efficient Algorithms to Compute the SP-Score Objective Function of a Multiple Sequence Alignment.

PubMed

Ranwez, Vincent

2016-01-01

Multiple sequence alignment (MSA) is a crucial step in many molecular analyses and many MSA tools have been developed. Most of them use a greedy approach to construct a first alignment that is then refined by optimizing the sum-of-pairs score (SP-score). The SP-score estimation is thus a bottleneck for most MSA tools since it is repeatedly required and is time consuming. Given an alignment of n sequences and L sites, I introduce here optimized solutions, which are easy to implement, reaching O(nL) time complexity for affine gap cost instead of O(n²L).
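To make the complexity difference concrete, the sketch below contrasts the naive all-pairs SP-score with a column-count formulation. It is not the algorithm from the paper, which handles affine gap costs. The match, mismatch and gap values are illustrative assumptions, and each gap column is scored independently.

```python
from collections import Counter
from itertools import combinations

# Illustrative scoring scheme (assumption, not from the paper).
MATCH, MISMATCH, GAP = 1, -1, -2

def pair_score(a, b):
    """Score one aligned pair of symbols; '-' denotes a gap."""
    if a == "-" and b == "-":
        return 0
    if a == "-" or b == "-":
        return GAP
    return MATCH if a == b else MISMATCH

def sp_score_naive(msa):
    """O(n^2 L): visit every pair of sequences at every site."""
    return sum(
        pair_score(a, b)
        for s, t in combinations(msa, 2)
        for a, b in zip(s, t)
    )

def sp_score_by_columns(msa):
    """O(L * (n + k^2)) for alphabet size k: count symbols per column."""
    total = 0
    for column in zip(*msa):
        counts = Counter(column)
        symbols = list(counts)
        for i, a in enumerate(symbols):
            # Pairs formed by two copies of the same symbol.
            total += pair_score(a, a) * counts[a] * (counts[a] - 1) // 2
            # Pairs formed by two different symbols.
            for b in symbols[i + 1:]:
                total += pair_score(a, b) * counts[a] * counts[b]
    return total

msa = ["AC-T", "ACGT", "A-GT"]
naive = sp_score_naive(msa)
fast = sp_score_by_columns(msa)
```

Both functions return the same value. The second grows linearly with the number of sequences for a fixed alphabet, which is the property the abstract is after.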
Combining peak- and chromatogram-based retention time alignment algorithms for multiple chromatography-mass spectrometry datasets.

PubMed

Hoffmann, Nils; Keck, Matthias; Neuweger, Heiko; Wilhelm, Mathias; Högy, Petra; Niehaus, Karsten; Stoye, Jens

2012-08-27

Modern analytical methods in biology and chemistry use separation techniques coupled to sensitive detectors, such as gas chromatography-mass spectrometry (GC-MS) and liquid chromatography-mass spectrometry (LC-MS). These hyphenated methods provide high-dimensional data. Comparing such data manually to find corresponding signals is a laborious task, as each experiment usually consists of thousands of individual scans, each containing hundreds or even thousands of distinct signals. In order to allow for successful identification of metabolites or proteins within such data, especially in the context of metabolomics and proteomics, an accurate alignment and matching of corresponding features between two or more experiments is required. Such a matching algorithm should capture fluctuations in the chromatographic system which lead to non-linear distortions on the time axis, as well as systematic changes in recorded intensities. Many different algorithms for the retention time alignment of GC-MS and LC-MS data have been proposed and published, but all of them focus either on aligning previously extracted peak features or on aligning and comparing the complete raw data containing all available features. In this paper we introduce two algorithms for retention time alignment of multiple GC-MS datasets: multiple alignment by bidirectional best hits peak assignment and cluster extension (BIPACE) and center-star multiple alignment by pairwise partitioned dynamic time warping (CeMAPP-DTW). We show how the similarity-based peak group matching method BIPACE may be used for multiple alignment calculation individually and how it can be used as a preprocessing step for the pairwise alignments performed by CeMAPP-DTW. We evaluate the algorithms individually and in combination on a previously published small GC-MS dataset studying the Leishmania parasite and on a larger GC-MS dataset studying grains of wheat (Triticum aestivum). 
We have shown that BIPACE achieves very high precision and recall and a very low number of false positive peak assignments on both evaluation datasets. CeMAPP-DTW finds a high number of true positives when executed on its own, but achieves even better results when BIPACE is used to constrain its search space. The source code of both algorithms is included in the OpenSource software framework Maltcms, which is available from http://maltcms.sf.net. The evaluation scripts of the present study are available from the same source.
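Dynamic time warping, the building block named in CeMAPP-DTW, can be sketched in a few lines. This is the classic unconstrained pairwise recurrence on one-dimensional toy traces. It is not the partitioned, BIPACE-anchored variant described in the abstract, and the function name `dtw_distance` is an assumption.

```python
def dtw_distance(x, y):
    """Classic DTW cost between two numeric sequences.

    cost[i][j] is the cheapest cumulative cost of warping x[:i] onto y[:j].
    """
    inf = float("inf")
    n, m = len(x), len(y)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])
            # Extend the cheapest of: stretch x, stretch y, or step both.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]

# Two toy traces with the same shape but a non-linear time distortion.
d_same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
d_offset = dtw_distance([0, 0], [1, 1])
```

The full table makes the cost quadratic in the number of scans, which suggests why constraining the search space with previously matched peaks, as the abstract reports, is attractive.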
Combining peak- and chromatogram-based retention time alignment algorithms for multiple chromatography-mass spectrometry datasets

PubMed Central

2012-01-01

Background: Modern analytical methods in biology and chemistry use separation techniques coupled to sensitive detectors, such as gas chromatography-mass spectrometry (GC-MS) and liquid chromatography-mass spectrometry (LC-MS). These hyphenated methods provide high-dimensional data. Comparing such data manually to find corresponding signals is a laborious task, as each experiment usually consists of thousands of individual scans, each containing hundreds or even thousands of distinct signals. In order to allow for successful identification of metabolites or proteins within such data, especially in the context of metabolomics and proteomics, an accurate alignment and matching of corresponding features between two or more experiments is required. Such a matching algorithm should capture fluctuations in the chromatographic system which lead to non-linear distortions on the time axis, as well as systematic changes in recorded intensities. Many different algorithms for the retention time alignment of GC-MS and LC-MS data have been proposed and published, but all of them focus either on aligning previously extracted peak features or on aligning and comparing the complete raw data containing all available features. Results: In this paper we introduce two algorithms for retention time alignment of multiple GC-MS datasets: multiple alignment by bidirectional best hits peak assignment and cluster extension (BIPACE) and center-star multiple alignment by pairwise partitioned dynamic time warping (CeMAPP-DTW). We show how the similarity-based peak group matching method BIPACE may be used for multiple alignment calculation individually and how it can be used as a preprocessing step for the pairwise alignments performed by CeMAPP-DTW. We evaluate the algorithms individually and in combination on a previously published small GC-MS dataset studying the Leishmania parasite and on a larger GC-MS dataset studying grains of wheat (Triticum aestivum).
Conclusions: We have shown that BIPACE achieves very high precision and recall and a very low number of false positive peak assignments on both evaluation datasets. CeMAPP-DTW finds a high number of true positives when executed on its own, but achieves even better results when BIPACE is used to constrain its search space. The source code of both algorithms is included in the OpenSource software framework Maltcms, which is available from http://maltcms.sf.net. The evaluation scripts of the present study are available from the same source. PMID:22920415

Statistical inference of protein structural alignments using information and compression.

PubMed

Collier, James H; Allison, Lloyd; Lesk, Arthur M; Stuckey, Peter J; Garcia de la Banda, Maria; Konagurthu, Arun S

2017-04-01

Structural molecular biology depends crucially on computational techniques that compare protein three-dimensional structures and generate structural alignments (the assignment of one-to-one correspondences between subsets of amino acids based on atomic coordinates). Despite its importance, the structural alignment problem has not been formulated, much less solved, in a consistent and reliable way. To overcome these difficulties, we present here a statistical framework for the precise inference of structural alignments, built on the Bayesian and information-theoretic principle of Minimum Message Length (MML). The quality of any alignment is measured by its explanatory power, that is, the amount of lossless compression achieved to explain the protein coordinates using that alignment. We have implemented this approach in MMLigner, the first program able to infer statistically significant structural alignments. We also demonstrate the reliability of MMLigner's alignment results when compared with the state of the art. Importantly, MMLigner can also discover different structural alignments of comparable quality, a challenging problem for oligomers and protein complexes. Source code, binaries and an interactive web version are available at http://lcb.infotech.monash.edu.au/mmligner. Contact: arun.konagurthu@monash.edu. Supplementary data are available at Bioinformatics online.
Bellerophon: A program to detect chimeric sequences in multiple sequence alignments

DOE Office of Scientific and Technical Information (OSTI.GOV)

Huber, Thomas; Faulkner, Geoffrey; Hugenholtz, Philip

2003-12-23

Bellerophon is a program for detecting chimeric sequences in multiple sequence datasets by an adaptation of partial treeing analysis. Bellerophon was specifically developed to detect 16S rRNA gene chimeras in PCR-clone libraries of environmental samples but can be applied to other nucleotide sequence alignments.
Multiple sequence alignment in HTML: colored, possibly hyperlinked, compact representations.

PubMed

Campagne, F; Maigret, B

1998-02-01

Protein sequence alignments are widely used in protein structure prediction, protein engineering, modeling of proteins, etc. This type of representation is useful at different stages of scientific activity: looking at previous results, working on a research project, and presenting the results. There is a need to make it available through a network (intranet or WWW), in a way that allows biologists, chemists, and noncomputer specialists to look at the data and carry on research, possibly collaboratively. Previous methods (text-based, Java-based) are reported and their advantages are discussed. We have developed two novel approaches to represent the alignments as colored, hyper-linked HTML pages. The first method creates an HTML page that makes efficient use of the image cache mechanism of a WWW browser, so that the user waits for images to load over the network only for the first alignment viewed and can then browse other alignments without delay. The generated pages can be browsed with any HTML2.0-compliant browser. The second method that we propose uses W3C-CSS1-style sheets to render alignments. This new method generates pages that require recent browsers to be viewed. We implemented these methods in the Viseur program and made a WWW service available that allows a user to convert an MSF alignment file to HTML for WWW publishing. The latter service is available at http://www.lctn.u-nancy.fr/viseur/services.html.
A distributed system for fast alignment of next-generation sequencing data.

PubMed

Srimani, Jaydeep K; Wu, Po-Yen; Phan, John H; Wang, May D

2010-12-01

We developed a scalable distributed computing system using the Berkeley Open Infrastructure for Network Computing (BOINC) to align next-generation sequencing (NGS) data quickly and accurately. NGS technology is emerging as a promising platform for gene expression analysis due to its high sensitivity compared to traditional genomic microarray technology. However, despite the benefits, NGS datasets can be prohibitively large, requiring significant computing resources to obtain sequence alignment results. Moreover, as the data and alignment algorithms become more prevalent, it will become necessary to examine the effect of the multitude of alignment parameters on various NGS systems. We validate the distributed software system by (1) computing simple timing results to show the speed-up gained by using multiple computers, (2) optimizing alignment parameters using simulated NGS data, and (3) computing NGS expression levels for a single biological sample using optimal parameters and comparing these expression levels to those of a microarray sample. Results indicate that the distributed alignment system achieves approximately linear speed-up and correctly distributes sequence data to and gathers alignment results from multiple compute clients.
ProfileGrids: a sequence alignment visualization paradigm that avoids the limitations of Sequence Logos

PubMed Central

2014-01-01

Background: The 2013 BioVis Contest provided an opportunity to evaluate different paradigms for visualizing protein multiple sequence alignments. Such data sets are becoming extremely large and thus taxing current visualization paradigms. Sequence Logos represent consensus sequences but have limitations for protein alignments. As an alternative, ProfileGrids are a new protein sequence alignment visualization paradigm that represents an alignment as a color-coded matrix of the residue frequency occurring at every homologous position in the aligned protein family. Results: The JProfileGrid software program was used to analyze the BioVis contest data sets to generate figures for comparison with the Sequence Logo reference images. Conclusions: The ProfileGrid representation allows for the clear and effective analysis of protein multiple sequence alignments. This includes both a general overview of the conservation and diversity sequence patterns as well as the interactive ability to query the details of the protein residue distributions in the alignment. The JProfileGrid software is free and available from http://www.ProfileGrid.org. PMID:25237393
Protein alignment algorithms with an efficient backtracking routine on multiple GPUs.

PubMed

Blazewicz, Jacek; Frohmberg, Wojciech; Kierzynka, Michal; Pesch, Erwin; Wojciechowski, Pawel

2011-05-20

Pairwise sequence alignment methods are widely used in biological research. The increasing number of sequences is perceived as one of the upcoming challenges for sequence alignment methods in the near future. To overcome this challenge several GPU (Graphics Processing Unit) computing approaches have been proposed lately. These solutions show the great potential of the GPU platform but in most cases address the problem of sequence database scanning and compute only the alignment score, whereas the alignment itself is omitted. Thus, the need arose to implement the global and semiglobal Needleman-Wunsch and Smith-Waterman algorithms with a backtracking procedure, which is needed to construct the alignment. In this paper we present a solution that performs the alignment of every given sequence pair, which is a required step for progressive multiple sequence alignment methods, as well as for DNA recognition at the DNA assembly stage. Tests show that the implementation, with performance up to 6.3 GCUPS on a single GPU for affine gap penalties, is very efficient in comparison to other CPU- and GPU-based solutions. Moreover, multi-GPU support with load balancing makes the application very scalable. The article shows that the backtracking procedure of the sequence alignment algorithms may be designed to fit in with the GPU architecture. Therefore, our algorithm, apart from scores, is able to compute pairwise alignments. This opens a wide range of new possibilities, allowing other methods from the area of molecular biology to take advantage of the new computational architecture. Tests show that the efficiency of the implementation is excellent. Moreover, the speed of our GPU-based algorithms can be almost linearly increased when using more than one graphics card.
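The distinction drawn above between computing only a score and recovering the alignment itself comes down to the backtracking pass. The sketch below is a plain CPU version of global Needleman-Wunsch with traceback. It uses a linear gap penalty rather than the affine penalties reported in the abstract, the scoring values are illustrative assumptions, and it does not reflect the authors' GPU design.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment with a linear gap penalty (illustrative values).

    Returns (score, aligned_a, aligned_b).
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Forward pass: fill the score matrix.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Backtracking pass: walk from the corner back to the origin.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))

best, row_a, row_b = needleman_wunsch("ACGT", "AGT")
```

The backtracking pass needs the whole matrix, or equivalent direction flags, to be kept in memory. This is presumably part of why score-only database scanning is easier to fit on a GPU.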
Identification and cloning of four riboswitches from Burkholderia pseudomallei strain K96243

NASA Astrophysics Data System (ADS)

Munyati-Othman, Noor; Fatah, Ahmad Luqman Abdul; Piji, Mohd Al Akmarul Fizree Bin Md; Ramlan, Effirul Ikhwan; Raih, Mohd Firdaus

2015-09-01

Structured RNAs referred to as riboswitches have been predicted to be present in the genome sequence of Burkholderia pseudomallei strain K96243. Four of the riboswitches were identified and analyzed through BLASTN, Rfam search and multiple sequence alignment. The RNA aptamers belong to the following riboswitch classifications: glycine riboswitch, cobalamin riboswitch, S-adenosyl-(L)-homocysteine (SAH) riboswitch and flavin mononucleotide (FMN) riboswitch. The conserved nucleotides for each aptamer were identified and were marked on the secondary structure generated by RNAfold. These riboswitches were successfully amplified and cloned for further study.
Heuristics for multiobjective multiple sequence alignment.

PubMed

Abbasi, Maryam; Paquete, Luís; Pereira, Francisco B

2016-07-15

Aligning multiple sequences arises in many tasks in Bioinformatics. However, the alignments produced by the current software packages are highly dependent on the parameter settings, such as the relative importance of opening gaps with respect to the increase of similarity. Choosing only one parameter setting may introduce an undesirable bias in further steps of the analysis and give overly simplistic interpretations. In this work, we reformulate multiple sequence alignment from a multiobjective point of view. The goal is to generate several sequence alignments that represent a trade-off between maximizing the substitution score and minimizing the number of indels/gaps in the sum-of-pairs score function. This trade-off gives the practitioner further information about the similarity of the sequences, from which she could analyse and choose the most plausible alignment. We introduce several heuristic approaches, based on local search procedures, that compute a set of sequence alignments, which are representative of the trade-off between the two objectives (substitution score and indels). Several algorithm design options are discussed and analysed, with particular emphasis on the influence of the starting alignment and neighborhood search definitions on the overall performance. A perturbation technique is proposed to improve the local search, which provides a wide range of high-quality alignments. The proposed approach is tested experimentally on a wide range of instances. We performed several experiments with sequences obtained from the benchmark database BAliBASE 3.0. To evaluate the quality of the results, we calculate the hypervolume indicator of the set of score vectors returned by the algorithms. The results obtained allow us to identify reasonably good choices of parameters for our approach. Further, we compared our method in terms of correctly aligned pairs ratio and columns correctly aligned ratio with respect to reference alignments. 
Experimental results show that our approaches can obtain better results than TCoffee and Clustal Omega in terms of the first ratio.
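The trade-off described above, maximizing substitution score while minimizing indels, amounts to keeping only the non-dominated score vectors. The sketch below filters a hypothetical list of candidate alignments down to that set. The candidate names and numbers are made up for illustration, and the local-search heuristics from the paper are not reproduced.

```python
def pareto_front(candidates):
    """Keep candidates not dominated by any other candidate.

    Each candidate is (name, substitution_score, n_indels); a higher
    substitution score is better and fewer indels are better.
    """
    def dominates(p, q):
        no_worse = p[1] >= q[1] and p[2] <= q[2]
        strictly_better = p[1] > q[1] or p[2] < q[2]
        return no_worse and strictly_better

    return [
        p for p in candidates
        if not any(dominates(q, p) for q in candidates)
    ]

# Hypothetical score vectors for five alternative alignments.
candidates = [
    ("a", 50, 10),
    ("b", 45, 6),
    ("c", 44, 8),   # dominated by "b"
    ("d", 30, 2),
    ("e", 50, 12),  # dominated by "a"
]
front = pareto_front(candidates)
```

Alignments "a", "b" and "d" remain. None of them is best on both objectives, so a practitioner would choose among them, as the abstract suggests.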
Optimal simultaneous superpositioning of multiple structures with missing data.

PubMed

Theobald, Douglas L; Steindel, Phillip A

2012-08-01

Superpositioning is an essential technique in structural biology that facilitates the comparison and analysis of conformational differences among topologically similar structures. Performing a superposition requires a one-to-one correspondence, or alignment, of the point sets in the different structures. However, in practice, some points are usually 'missing' from several structures, for example, when the alignment contains gaps. Current superposition methods deal with missing data simply by superpositioning a subset of points that are shared among all the structures. This practice is inefficient, as it ignores important data, and it fails to satisfy the common least-squares criterion. In the extreme, disregarding missing positions prohibits the calculation of a superposition altogether. Here, we present a general solution for determining an optimal superposition when some of the data are missing. We use the expectation-maximization algorithm, a classic statistical technique for dealing with incomplete data, to find both maximum-likelihood solutions and the optimal least-squares solution as a special case. The methods presented here are implemented in THESEUS 2.0, a program for superpositioning macromolecular structures. ANSI C source code and selected compiled binaries for various computing platforms are freely available under the GNU open source license from http://www.theseus3d.org. Contact: dtheobald@brandeis.edu. Supplementary data are available at Bioinformatics online.
A method of alignment masking for refining the phylogenetic signal of multiple sequence alignments.

PubMed

Rajan, Vaibhav

2013-03-01

Inaccurate inference of positional homologies in multiple sequence alignments and systematic errors introduced by alignment heuristics obfuscate phylogenetic inference. Alignment masking, the elimination of phylogenetically uninformative or misleading sites from an alignment before phylogenetic analysis, is a common practice in phylogenetic analysis. Although masking is often done manually, automated methods are necessary to handle the much larger data sets being prepared today. In this study, we introduce the concept of subsplits and demonstrate their use in extracting phylogenetic signal from alignments. We design a clustering approach for alignment masking where each cluster contains similar columns, with similarity defined on the basis of compatible subsplits; our approach then identifies noisy clusters and eliminates them. Trees inferred from the columns in the retained clusters are found to be topologically closer to the reference trees. We test our method on numerous standard benchmarks (both synthetic and biological data sets) and compare its performance with other methods of alignment masking. We find that our method can eliminate sites more accurately than other methods, particularly on divergent data, and can improve the topologies of the inferred trees in likelihood-based analyses. Software available upon request from the author.
Phylo-VISTA: Interactive visualization of multiple DNA sequence alignments

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shah, Nameeta; Couronne, Olivier; Pennacchio, Len A.

The power of multi-sequence comparison for biological discovery is well established. The need for new capabilities to visualize and compare cross-species alignment data is intensified by the growing number of genomic sequence datasets being generated for an ever-increasing number of organisms. To be efficient these visualization algorithms must support the ability to accommodate consistently a wide range of evolutionary distances in a comparison framework based upon phylogenetic relationships. Results: We have developed Phylo-VISTA, an interactive tool for analyzing multiple alignments by visualizing a similarity measure for multiple DNA sequences. The complexity of visual presentation is effectively organized using a framework based upon interspecies phylogenetic relationships. The phylogenetic organization supports rapid, user-guided interspecies comparison. To aid in navigation through large sequence datasets, Phylo-VISTA leverages concepts from VISTA that provide a user with the ability to select and view data at varying resolutions. The combination of multiresolution data visualization and analysis, combined with the phylogenetic framework for interspecies comparison, produces a highly flexible and powerful tool for visual data analysis of multiple sequence alignments. Availability: Phylo-VISTA is available at http://www-gsd.lbl.gov/phylovista. It requires an Internet browser with Java Plugin 1.4.2 and it is integrated into the global alignment program LAGAN at http://lagan.stanford.edu.
Alignment and testing of critical interface fixtures for the James Webb Space Telescope

NASA Astrophysics Data System (ADS)

McLean, Kyle; Bagdanove, Paul; Berrier, Joshua; Cofie, Emmanuel; Glassman, Tiffany; Hadjimichael, Theodore; Johnson, Eric; Levi, Joshua; Lo, Amy; McMann, Joseph; Ohl, Raymond; Osgood, Dean; Parker, James; Redman, Kevin; Roberts, Vicki; Stephens, Matthew; Sutton, Adam; Wenzel, Greg; Young, Jerrod

2017-08-01

NASA's James Webb Space Telescope (JWST) is a 6.5m diameter, segmented, deployable telescope for cryogenic IR space astronomy. The JWST Observatory architecture includes the Primary Mirror Backplane Support Structure (PMBSS) and Integrated Science Instrument Module (ISIM) Electronics Compartment (IEC) which is designed to integrate to the spacecraft bus via six cup/cone interfaces. Prior to integration to the spacecraft bus, the JWST observatory must undergo environmental testing, handling, and transportation. Multiple fixtures were developed to support these tasks including the vibration fixture and handling and integration fixture (HIF). This work reports on the development of the nominal alignment of the six interfaces and metrology operations performed for the JWST observatory to safely integrate them for successful environmental testing.
Alignment and Testing of Critical Interface Fixtures for the James Webb Space Telescope

NASA Technical Reports Server (NTRS)

Mclean, Kyle; Bagdanove, Paul; Berrier, Joshua; Cofie, Emmanuel; Glassman, Tiffany; Hadjimichael, Theodore; Johnson, Eric; Levi, Joshua; Lo, Amy; McMann, Joseph;

2017-01-01

NASA's James Webb Space Telescope (JWST) is a 6.6m diameter, segmented, deployable telescope for cryogenic IR space astronomy. The JWST Observatory architecture includes the Primary Mirror Backplane Support Structure (PMBSS) and Integrated Science Instrument Module (ISIM) Electronics Compartment (IEC) which is designed to integrate to the spacecraft bus via six cup/cone interfaces. Prior to integration to the spacecraft bus the JWST observatory must undergo environmental testing, handling, and transportation. Multiple fixtures were developed to support these tasks including the vibration fixture and handling and integration fixture (HIF). This work reports on the development of the nominal alignment of the six interfaces and metrology operations performed for the JWST observatory to safely integrate them for successful environmental testing.

Alignment and Testing of Critical Interface Fixtures for the James Webb Space Telescope

NASA Technical Reports Server (NTRS)

Mclean, Kyle; Bagdanove, Paul; Berrier, Joshua; Cofie, Emmanuel; Glassman, Tiffany; Hadjimichael, Theodore; Johnson, Eric; Levi, Joshua; Lo, Amy; McMann, Joseph;

2017-01-01

NASA's James Webb Space Telescope (JWST) is a 6.6m diameter, segmented, deployable telescope for cryogenic IR space astronomy. The JWST Observatory architecture includes the Primary Mirror Backplane Support Structure (PMBSS) and Integrated Science Instrument Module (ISIM) Electronics Compartment (IEC) which is designed to integrate to the spacecraft bus via six cup/cone interfaces. Prior to integration to the spacecraft bus the JWST observatory must undergo environmental testing, handling, and transportation. Multiple fixtures were developed to support these tasks including the vibration fixture and handling and integration fixture (HIF). This work reports on the development of the nominal alignment of the six interfaces and metrology operations performed for the JWST observatory to safely integrate them for successful environmental testing.

Swarm observation of field-aligned current and electric field in multiple arc systems

NASA Astrophysics Data System (ADS)

Wu, J.; Knudsen, D. J.; Gillies, M.; Donovan, E.; Burchill, J. K.

2017-12-01

It is often thought that auroral arcs are a direct consequence of upward field-aligned currents. In fact, the relation between currents and brightness is more complicated. Multiple auroral arc systems provide an opportunity to study this relation in detail. In this study, we have identified two types of FAC configurations in multiple parallel arc systems using ground-based optical data from the THEMIS all-sky imagers (ASIs), magnetometers and electric field instruments onboard the Swarm satellites during the period from December 2013 to March 2015. In type 1 events, each arc is an intensification within a broad, unipolar current sheet and downward currents only exist outside the upward current sheet. These types of events are termed "unipolar FAC" events. In type 2 events, multiple arc systems represent a collection of multiple up/down current pairs, which are termed "multipolar FAC" events. Comparisons of these two types of FAC events are presented with 17 "unipolar FAC" events and 12 "multipolar FAC" events. The results show that "unipolar FAC" and "multipolar FAC" events have systematic differences in terms of MLT, arc width and separation, and dependence on substorm onset time. For "unipolar FAC" events, significant electric field enhancements are shown on the edges of the broad upward current sheet. Electric field fluctuations inside the multiple arc system can be large or small. For "multipolar FAC" events, a strong correlation between magnetic and electric fields indicates uniform conductance within each upward current sheet. The electrodynamical structures of multiple arc systems presented in this paper represent a step toward understanding arc generation.
Characterization and functional analyses of a novel chicken CD8a variant X1 (CD8a1)

USDA-ARS's Scientific Manuscript database

We provide the first description of cloning, as well as structural and functional analysis of a novel variant in the chicken CD8alpha family, termed the CD8-alpha X1 (CD8alpha1) gene. Multiple alignment of CD8alpha1 with known CD8alpha and beta sequences of other species revealed relatively low con...
BAYESIAN PROTEIN STRUCTURE ALIGNMENT.

PubMed

Rodriguez, Abel; Schmidler, Scott C

The analysis of the three-dimensional structure of proteins is an important topic in molecular biochemistry. Structure plays a critical role in defining the function of proteins and is more strongly conserved than amino acid sequence over evolutionary timescales. A key challenge is the identification and evaluation of structural similarity between proteins; such analysis can aid in understanding the role of newly discovered proteins and help elucidate evolutionary relationships between organisms. Computational biologists have developed many clever algorithmic techniques for comparing protein structures; however, all are based on heuristic optimization criteria, making statistical interpretation somewhat difficult. Here we present a fully probabilistic framework for pairwise structural alignment of proteins. Our approach has several advantages, including the ability to capture alignment uncertainty and to estimate key "gap" parameters which critically affect the quality of the alignment. We show that several existing alignment methods arise as maximum a posteriori estimates under specific choices of prior distributions and error models. Our probabilistic framework is also easily extended to incorporate additional information, which we demonstrate by including primary sequence information to generate simultaneous sequence-structure alignments that can resolve ambiguities obtained using structure alone. This combined model also provides a natural approach for the difficult task of estimating evolutionary distance based on structural alignments. The model is illustrated by comparison with well-established methods on several challenging protein alignment examples.
High-resolution imaging of basal cell carcinoma: a comparison between multiphoton microscopy with fluorescence lifetime imaging and reflectance confocal microscopy.

PubMed

Manfredini, Marco; Arginelli, Federica; Dunsby, Christopher; French, Paul; Talbot, Clifford; König, Karsten; Pellacani, Giovanni; Ponti, Giovanni; Seidenari, Stefania

2013-02-01

The aim of this study was to compare morphological aspects of basal cell carcinoma (BCC) as assessed by two different imaging methods: in vivo reflectance confocal microscopy (RCM) and multiphoton tomography with fluorescence lifetime imaging implementation (MPT-FLIM). The study comprised 16 BCCs for which a complete set of RCM and MPT-FLIM images was available. The presence of seven MPT-FLIM descriptors was evaluated. The presence of seven RCM equivalent parameters was scored according to their extent. Chi-squared test with Fisher's exact test and Spearman's rank correlation coefficient were determined between MPT-FLIM scores and adjusted-RCM scores. MPT-FLIM and RCM descriptors of BCC were coupled to match the descriptors that define the same pathological structures. The comparison included: Streaming and Aligned elongated cells, Streaming with multiple directions and Double alignment, Palisading (RCM) and Palisading (MPT-FLIM), Typical tumor islands, and Cell islands surrounded by fibers, Dark silhouettes and Phantom islands, Plump bright cells and Melanophages, Vessels (RCM), and Vessels (MPT-FLIM). The parameters that were significantly correlated were Melanophages/Plump Bright Cells, Aligned elongated cells/Streaming, Double alignment/Streaming with multiple directions, and Palisading (MPT-FLIM)/Palisading (RCM). According to our data, both methods are suitable to image BCC's features. The concordance between MPT-FLIM and RCM is high, with some limitations due to the technical differences between the two devices. The main difficulty when comparing the images generated by the two imaging modalities is their different field of view. © 2012 John Wiley & Sons A/S.
StructAlign, a Program for Alignment of Structures of DNA-Protein Complexes.

PubMed

Popov, Ya V; Galitsyna, A A; Alexeevski, A V; Karyagina, A S; Spirin, S A

2015-11-01

Comparative analysis of structures of complexes of homologous proteins with DNA is important in the analysis of DNA-protein recognition. Alignment is a necessary stage of the analysis. An alignment is a matching of amino acid residues and nucleotides of one complex to residues and nucleotides of the other. Currently, there are no programs available for aligning structures of DNA-protein complexes. We present the program StructAlign, which should fill this gap. The program inputs a pair of complexes of DNA double helix with proteins and outputs an alignment of DNA chains corresponding to the best spatial fit of the protein chains.
Is multiple-sequence alignment required for accurate inference of phylogeny?

PubMed

Höhl, Michael; Ragan, Mark A

2007-04-01

The process of inferring phylogenetic trees from molecular sequences almost always starts with a multiple alignment of these sequences but can also be based on methods that do not involve multiple sequence alignment. Very little is known about the accuracy with which such alignment-free methods recover the correct phylogeny or about the potential for increasing their accuracy. We conducted a large-scale comparison of ten alignment-free methods, among them one new approach that does not calculate distances and a faster variant of our pattern-based approach; all distance-based alignment-free methods are freely available from http://www.bioinformatics.org.au (as Python package decaf+py). We show that most methods exhibit a higher overall reconstruction accuracy in the presence of high among-site rate variation. Under all conditions that we considered, variants of the pattern-based approach were significantly better than the other alignment-free methods. The new pattern-based variant achieved a speed-up of an order of magnitude in the distance calculation step, accompanied by a small loss of tree reconstruction accuracy. A method of Bayesian inference from k-mers did not improve on classical alignment-free (and distance-based) methods but may still offer other advantages due to its Bayesian nature. We found the optimal word length k of word-based methods to be stable across various data sets, and we provide parameter ranges for two different alphabets. The influence of these alphabets was analyzed to reveal a trade-off in reconstruction accuracy between long and short branches. We have mapped the phylogenetic accuracy for many alignment-free methods, among them several recently introduced ones, and increased our understanding of their behavior in response to biologically important parameters. In all experiments, the pattern-based approach emerged as superior, at the expense of higher resource consumption. 
Nonetheless, no alignment-free method that we examined recovers the correct phylogeny as accurately as does an approach based on maximum-likelihood distance estimates of multiply aligned sequences.
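The word-based family of alignment-free methods mentioned above can be illustrated with a minimal sketch. It assumes a plain Euclidean distance between relative k-mer frequency vectors, which is only one simple member of that family and is not the pattern-based approach or the decaf+py code from the paper; the function names and the default k=3 are hypothetical choices for illustration.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k):
    """Count every overlapping word of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_distance(a, b, k=3):
    """Euclidean distance between the relative k-mer frequencies of a and b."""
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    na, nb = sum(ca.values()), sum(cb.values())
    if na == 0 or nb == 0:
        raise ValueError("both sequences must contain at least k characters")
    words = set(ca) | set(cb)
    # A Counter returns 0 for words that never occur in a sequence.
    return sqrt(sum((ca[w] / na - cb[w] / nb) ** 2 for w in words))


def distance_matrix(seqs, k=3):
    """Pairwise distance matrix, usable as input to a distance-based tree method."""
    n = len(seqs)
    return [[kmer_distance(seqs[i], seqs[j], k) for j in range(n)] for i in range(n)]
```

A matrix of this kind would typically be handed to a distance-based tree builder such as neighbor joining; no multiple alignment is computed at any point.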

SATe-II: very fast and accurate simultaneous estimation of multiple sequence alignments and phylogenetic trees.

PubMed

Liu, Kevin; Warnow, Tandy J; Holder, Mark T; Nelesen, Serita M; Yu, Jiaye; Stamatakis, Alexandros P; Linder, C Randal

2012-01-01

Highly accurate estimation of phylogenetic trees for large data sets is difficult, in part because multiple sequence alignments must be accurate for phylogeny estimation methods to be accurate. Coestimation of alignments and trees has been attempted but currently only SATé estimates reasonably accurate trees and alignments for large data sets in practical time frames (Liu K., Raghavan S., Nelesen S., Linder C.R., Warnow T. 2009b. Rapid and accurate large-scale coestimation of sequence alignments and phylogenetic trees. Science. 324:1561-1564). Here, we present a modification to the original SATé algorithm that improves upon SATé (which we now call SATé-I) in terms of speed and of phylogenetic and alignment accuracy. SATé-II uses a different divide-and-conquer strategy than SATé-I and so produces smaller more closely related subsets than SATé-I; as a result, SATé-II produces more accurate alignments and trees, can analyze larger data sets, and runs more efficiently than SATé-I. Generally, SATé is a metamethod that takes an existing multiple sequence alignment method as an input parameter and boosts the quality of that alignment method. SATé-II-boosted alignment methods are significantly more accurate than their unboosted versions, and trees based upon these improved alignments are more accurate than trees based upon the original alignments. Because SATé-I used maximum likelihood (ML) methods that treat gaps as missing data to estimate trees and because we found a correlation between the quality of tree/alignment pairs and ML scores, we explored the degree to which SATé's performance depends on using ML with gaps treated as missing data to determine the best tree/alignment pair. We present two lines of evidence that using ML with gaps treated as missing data to optimize the alignment and tree produces very poor results. 
First, we show that the optimization problem where a set of unaligned DNA sequences is given and the output is the tree and alignment of those sequences that maximize likelihood under the Jukes-Cantor model is uninformative in the worst possible sense. For all inputs, all trees optimize the likelihood score. Second, we show that a greedy heuristic that uses GTR+Gamma ML to optimize the alignment and the tree can produce very poor alignments and trees. Therefore, the excellent performance of SATé-II and SATé-I is not because ML is used as an optimization criterion for choosing the best tree/alignment pair but rather due to the particular divide-and-conquer realignment techniques employed.
Iterative non-sequential protein structural alignment.

PubMed

Salem, Saeed; Zaki, Mohammed J; Bystroff, Christopher

2009-06-01

Structural similarity between proteins gives us insights into their evolutionary relationships when there is low sequence similarity. In this paper, we present a novel approach called SNAP for non-sequential pair-wise structural alignment. Starting from an initial alignment, our approach iterates over a two-step process consisting of a superposition step and an alignment step, until convergence. We propose a novel greedy algorithm to construct both sequential and non-sequential alignments. The quality of SNAP alignments was assessed by comparing against the manually curated reference alignments in the challenging SISY and RIPC datasets. Moreover, when applied to a dataset of 4410 protein pairs selected from the CATH database, SNAP produced longer alignments with lower rmsd than several state-of-the-art alignment methods. Classification of folds using SNAP alignments was both highly sensitive and highly selective. The SNAP software, along with the datasets, is available online at http://www.cs.rpi.edu/~zaki/software/SNAP.
DNA Translator and Aligner: HyperCard utilities to aid phylogenetic analysis of molecules.

PubMed

Eernisse, D J

1992-04-01

DNA Translator and Aligner are molecular phylogenetics HyperCard stacks for Macintosh computers. They manipulate sequence data to provide graphical gene mapping, conversions, translations and manual multiple-sequence alignment editing. DNA Translator is able to convert documented GenBank or EMBL sequences into linearized, rescalable gene maps whose gene sequences are extractable by clicking on the corresponding map button or by selection from a scrolling list. Provided gene maps, complete with extractable sequences, consist of nine metazoan, one yeast, and one ciliate mitochondrial DNAs and three green plant chloroplast DNAs. Single or multiple sequences can be manipulated to aid in phylogenetic analysis. Sequences can be translated between nucleic acids and proteins in either direction with flexible support of alternate genetic codes and ambiguous nucleotide symbols. Multiple aligned sequence output from diverse sources can be converted to Nexus, Hennig86 or PHYLIP format for subsequent phylogenetic analysis. Input or output alignments can be examined with Aligner, a convenient accessory stack included in the DNA Translator package. Aligner is an editor for the manual alignment of up to 100 sequences that toggles between display of matched characters and normal unmatched sequences. DNA Translator also generates graphic displays of amino acid coding and codon usage frequency relative to all other, or only synonymous, codons for approximately 70 select organism-organelle combinations. Codon usage data is compatible with spreadsheet or UWGCG formats for incorporation of additional molecules of interest. The complete package is available via anonymous ftp and is free for non-commercial uses.
NetCoffee: a fast and accurate global alignment approach to identify functionally conserved proteins in multiple networks.

PubMed

Hu, Jialu; Kehr, Birte; Reinert, Knut

2014-02-15

Owing to recent advancements in high-throughput technologies, protein-protein interaction networks of more and more species are becoming available in public databases. The question of how to identify functionally conserved proteins across species attracts a lot of attention in computational biology. Network alignments provide a systematic way to solve this problem. However, most existing alignment tools encounter limitations in tackling this problem. Therefore, the demand for faster and more efficient alignment tools is growing. We present a fast and accurate algorithm, NetCoffee, which finds a global alignment of multiple protein-protein interaction networks. NetCoffee searches for a global alignment by maximizing a target function using simulated annealing on a set of weighted bipartite graphs that are constructed using a triplet approach similar to T-Coffee. To assess its performance, NetCoffee was applied to four real datasets. Our results suggest that NetCoffee remedies several limitations of previous algorithms, outperforms all existing alignment tools in terms of speed and nevertheless identifies biologically meaningful alignments. The source code and data are freely available for download under the GNU GPL v3 license at https://code.google.com/p/netcoffee/.
Experimental assessment and analysis of super-resolution in fluorescence microscopy based on multiple-point spread function fitting of spectrally demultiplexed images

NASA Astrophysics Data System (ADS)

Nishimura, Takahiro; Kimura, Hitoshi; Ogura, Yusuke; Tanida, Jun

2018-06-01

This paper presents an experimental assessment and analysis of super-resolution microscopy based on multiple-point spread function fitting of spectrally demultiplexed images using a designed DNA structure as a test target. For this purpose, a DNA structure was designed to have binding sites at a certain interval that is smaller than the diffraction limit. The structure was labeled with several types of quantum dots (QDs) to acquire their spatial information as spectrally encoded images. The obtained images are analyzed with a point spread function multifitting algorithm to determine the QD locations that indicate the binding site positions. The experimental results show that the labeled locations can be observed beyond the diffraction-limited resolution using three-colored fluorescence images that were obtained with a confocal fluorescence microscope. Numerical simulations show that labeling with eight types of QDs enables the positions aligned at 27.2-nm pitches on the DNA structure to be resolved with high accuracy.
Reinforcing Visual Grouping Cues to Communicate Complex Informational Structure.

PubMed

Bae, Juhee; Watson, Benjamin

2014-12-01

In his book Multimedia Learning [7], Richard Mayer asserts that viewers learn best from imagery that provides them with cues to help them organize new information into the correct knowledge structures. Designers have long been exploiting the Gestalt laws of visual grouping to deliver viewers those cues using visual hierarchy, often communicating structures much more complex than the simple organizations studied in psychological research. Unfortunately, designers are largely practical in their work, and have not paused to build a complex theory of structural communication. If we are to build a tool to help novices create effective and well structured visuals, we need a better understanding of how to create them. Our work takes a first step toward addressing this lack, studying how five of the many grouping cues (proximity, color similarity, common region, connectivity, and alignment) can be effectively combined to communicate structured text and imagery from real world examples. To measure the effectiveness of this structural communication, we applied a digital version of card sorting, a method widely used in anthropology and cognitive science to extract cognitive structures. We then used tree edit distance to measure the difference between perceived and communicated structures. Our most significant findings are: 1) with careful design, complex structure can be communicated clearly; 2) communicating complex structure is best done with multiple reinforcing grouping cues; 3) common region (use of containers such as boxes) is particularly effective at communicating structure; and 4) alignment is a weak structural communicator.
Using Variable-Length Aligned Fragment Pairs and an Improved Transition Function for Flexible Protein Structure Alignment.

PubMed

Cao, Hu; Lu, Yonggang

2017-01-01

With the rapid growth of known protein 3D structures in number, how to efficiently compare protein structures becomes an essential and challenging problem in computational structural biology. At present, many protein structure alignment methods have been developed. Among all these methods, flexible structure alignment methods are shown to be superior to rigid structure alignment methods in identifying structure similarities between proteins, which have gone through conformational changes. It is also found that the methods based on aligned fragment pairs (AFPs) have a special advantage over other approaches in balancing global structure similarities and local structure similarities. Accordingly, we propose a new flexible protein structure alignment method based on variable-length AFPs. Compared with other methods, the proposed method possesses three main advantages. First, it is based on variable-length AFPs. The length of each AFP is separately determined to maximally represent a local similar structure fragment, which reduces the number of AFPs. Second, it uses local coordinate systems, which simplify the computation at each step of the expansion of AFPs during the AFP identification. Third, it decreases the number of twists by rewarding the situation where nonconsecutive AFPs share the same transformation in the alignment, which is realized by dynamic programming with an improved transition function. The experimental data show that compared with FlexProt, FATCAT, and FlexSnap, the proposed method can achieve comparable results by introducing fewer twists. Meanwhile, it can generate results similar to those of the FATCAT method in much less running time due to the reduced number of AFPs.
Electrochemical tuning of vertically aligned MoS2 nanofilms and its application in improving hydrogen evolution reaction

PubMed Central

Wang, Haotian; Lu, Zhiyi; Xu, Shicheng; Kong, Desheng; Cha, Judy J.; Zheng, Guangyuan; Hsu, Po-Chun; Yan, Kai; Bradshaw, David; Prinz, Fritz B.; Cui, Yi

2013-01-01

The ability to intercalate guest species into the van der Waals gap of 2D layered materials affords the opportunity to engineer the electronic structures for a variety of applications. Here we demonstrate the continuous tuning of layer vertically aligned MoS2 nanofilms through electrochemical intercalation of Li+ ions. By scanning the Li intercalation potential from high to low, we have gained control of multiple important material properties in a continuous manner, including tuning the oxidation state of Mo, the transition of semiconducting 2H to metallic 1T phase, and expanding the van der Waals gap until exfoliation. Using such nanofilms after different degrees of Li intercalation, we show a significant improvement of the hydrogen evolution reaction activity. A strong correlation between such tunable material properties and hydrogen evolution reaction activity is established. This work provides an intriguing and effective approach to tuning electronic structures for optimizing the catalytic activity. PMID:24248362
MUSCLE: multiple sequence alignment with high accuracy and high throughput.

PubMed

Edgar, Robert C

2004-01-01

We describe MUSCLE, a new computer program for creating multiple alignments of protein sequences. Elements of the algorithm include fast distance estimation using kmer counting, progressive alignment using a new profile function we call the log-expectation score, and refinement using tree-dependent restricted partitioning. The speed and accuracy of MUSCLE are compared with T-Coffee, MAFFT and CLUSTALW on four test sets of reference alignments: BAliBASE, SABmark, SMART and a new benchmark, PREFAB. MUSCLE achieves the highest, or joint highest, rank in accuracy on each of these sets. Without refinement, MUSCLE achieves average accuracy statistically indistinguishable from T-Coffee and MAFFT, and is the fastest of the tested methods for large numbers of sequences, aligning 5000 sequences of average length 350 in 7 min on a current desktop computer. The MUSCLE program, source code and PREFAB test data are freely available at http://www.drive5.com/muscle.
PASS2: an automated database of protein alignments organised as structural superfamilies.

PubMed

Bhaduri, Anirban; Pugalenthi, Ganesan; Sowdhamini, Ramanathan

2004-04-02

The functional selection and three-dimensional structural constraints of proteins in nature often relate to the retention of significant sequence similarity between proteins of similar fold and function despite poor sequence identity. Organization of structure-based sequence alignments for distantly related proteins provides a map of the conserved and critical regions of the protein universe that is useful for the analysis of folding principles, for the evolutionary unification of protein families and for maximizing the information return from experimental structure determination. The Protein Alignment organised as Structural Superfamily (PASS2) database represents continuously updated, structural alignments for evolutionary related, sequentially distant proteins. An automated and updated version of PASS2 is in direct correspondence with SCOP 1.63 and consists of sequences having identity below 40% among themselves. Protein domains have been grouped into 628 multi-member superfamilies and 566 single member superfamilies. Structure-based sequence alignments for the superfamilies have been obtained using COMPARER, while initial equivalencies have been derived from a preliminary superposition using LSQMAN or STAMP 4.0. The final sequence alignments have been annotated for structural features using JOY4.0. The database is supplemented with sequence relatives belonging to different genomes, conserved spatially interacting and structural motifs, probabilistic hidden Markov models of superfamilies based on the alignments and useful links to other databases. Probabilistic models and sensitive position specific profiles obtained from reliable superfamily alignments aid annotation of remote homologues and are useful tools in structural and functional genomics. PASS2 presents the phylogeny of its members both based on sequence and structural dissimilarities. Clustering of members allows us to understand diversification of the family members.
The search engine has been improved for simpler browsing of the database. The database resolves alignments among the structural domains consisting of evolutionarily diverged set of sequences. Availability of reliable sequence alignments of distantly related proteins despite poor sequence identity and single-member superfamilies permit better sampling of structures in libraries for fold recognition of new sequences and for the understanding of protein structure-function relationships of individual superfamilies. PASS2 is accessible at http://www.ncbs.res.in/~faculty/mini/campass/pass2.html
Diabetes Alters Mechanical Properties and Collagen Fiber Re-Alignment in Multiple Mouse Tendons

PubMed Central

Connizzo, Brianne K.; Bhatt, Pankti R.; Liechty, Kenneth W.; Soslowsky, Louis J.

2014-01-01

Tendons function to transfer load from muscle to bone through their complex composition and hierarchical structure, consisting mainly of type I collagen. Recent evidence suggests that type II diabetes may cause alterations in collagen structure, such as irregular fibril morphology and density, which could play a role in the mechanical function of tendons. Using the db/db mouse model of type II diabetes, the diabetic skin was found to have impaired biomechanical properties when compared to the non-diabetic group. The purpose of this study was to assess the effect of diabetes on biomechanics, collagen fiber re-alignment, and biochemistry in three functionally different tendons (Achilles, supraspinatus, patellar) using the db/db mouse model. Results showed that cross-sectional area and stiffness, but not modulus, were significantly reduced in all three tendons. However, the tendon response to load (transition strain, collagen fiber re-alignment) occurred earlier in the mechanical test, contrary to expectations. In addition, the patellar tendon had an altered response to diabetes when compared to the other two tendons, with no changes in fiber realignment and decreased collagen content at the midsubstance of the tendon. Overall, type II diabetes alters tendon mechanical properties and the dynamic response to load. PMID:24833253
Overcoming Sequence Misalignments with Weighted Structural Superposition

PubMed Central

Khazanov, Nickolay A.; Damm-Ganamet, Kelly L.; Quang, Daniel X.; Carlson, Heather A.

2012-01-01

An appropriate structural superposition identifies similarities and differences between homologous proteins that are not evident from sequence alignments alone. We have coupled our Gaussian-weighted RMSD (wRMSD) tool with a sequence aligner and seed extension (SE) algorithm to create a robust technique for overlaying structures and aligning sequences of homologous proteins (HwRMSD). HwRMSD overcomes errors in the initial sequence alignment that would normally propagate into a standard RMSD overlay. SE can generate a corrected sequence alignment from the improved structural superposition obtained by wRMSD. HwRMSD’s robust performance and its superiority over standard RMSD are demonstrated over a range of homologous proteins. Its better overlay results in corrected sequence alignments with good agreement to HOMSTRAD. Finally, HwRMSD is compared to established structural alignment methods: FATCAT, SSM, CE, and DaliLite. Most methods are comparable at placing residue pairs within 2 Å, but HwRMSD places many more residue pairs within 1 Å, providing a clear advantage. Such high accuracy is essential in drug design, where small distances can have a large impact on computational predictions. This level of accuracy is also needed to correct sequence alignments in an automated fashion, especially for omics-scale analysis. HwRMSD can align homologs with low sequence identity and large conformational differences, cases where both sequence-based and structure-based methods may fail. The HwRMSD pipeline overcomes the dependency of structural overlays on initial sequence pairing and removes the need to determine the best sequence-alignment method, substitution matrix, and gap parameters for each unique pair of homologs. PMID:22733542
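The contrast between a standard and a Gaussian-weighted RMSD can be sketched for two coordinate sets that are already paired and already overlaid. The weight form exp(-d²/c) and the default scale c = 5.0 Å² are assumptions made for illustration, as are the function names; this is not the HwRMSD code, and it performs no superposition fitting, sequence alignment or seed extension.

```python
from math import exp, sqrt


def squared_distances(coords_a, coords_b):
    """Squared distance between each pair of corresponding 3D points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    return [sum((p - q) ** 2 for p, q in zip(a, b))
            for a, b in zip(coords_a, coords_b)]


def rmsd(coords_a, coords_b):
    """Standard RMSD: every residue pair contributes equally."""
    d2 = squared_distances(coords_a, coords_b)
    return sqrt(sum(d2) / len(d2))


def gaussian_weighted_rmsd(coords_a, coords_b, c=5.0):
    """RMSD in which each pair is weighted by exp(-d^2 / c) (assumed form)."""
    d2 = squared_distances(coords_a, coords_b)
    weights = [exp(-x / c) for x in d2]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all pairs are too far apart for the chosen scale c")
    return sqrt(sum(w * x for w, x in zip(weights, d2)) / total)


def pairs_within(coords_a, coords_b, cutoff):
    """Number of corresponding pairs closer than cutoff (same units as input)."""
    return sum(1 for x in squared_distances(coords_a, coords_b) if x <= cutoff ** 2)
```

Because distant pairs receive weights close to zero, one badly placed or mispaired residue barely affects the weighted value, whereas it dominates the standard RMSD.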
Biclustering as a method for RNA local multiple sequence alignment.

PubMed

Wang, Shu; Gutell, Robin R; Miranker, Daniel P

2007-12-15

Biclustering is a clustering method that simultaneously clusters both the domain and range of a relation. A challenge in multiple sequence alignment (MSA) is that the alignment of sequences is often intended to reveal groups of conserved functional subsequences. Simultaneously, the grouping of the sequences can impact the alignment; precisely the kind of dual situation biclustering is intended to address. We define a representation of the MSA problem enabling the application of biclustering algorithms. We develop a computer program for local MSA, BlockMSA, that combines biclustering with divide-and-conquer. BlockMSA simultaneously finds groups of similar sequences and locally aligns subsequences within them. Further alignment is accomplished by dividing both the set of sequences and their contents. The net result is both a multiple sequence alignment and a hierarchical clustering of the sequences. BlockMSA was tested on the subsets of the BRAliBase 2.1 benchmark suite that display high variability and on an extension to that suite to larger problem sizes. Also, alignments were evaluated of two large datasets of current biological interest, T box sequences and Group IC1 Introns. The results were compared with alignments computed by ClustalW, MAFFT, MUSCLE and PROBCONS alignment programs using Sum of Pairs (SPS) and Consensus Count. Results for the benchmark suite are sensitive to problem size. On problems of 15 or more sequences, BlockMSA is consistently the best. On none of the problems in the test suite are there appreciable differences in scores among BlockMSA, MAFFT and PROBCONS. On the T box sequences, BlockMSA does the most faithful job of reproducing known annotations. MAFFT and PROBCONS do not. On the Intron sequences, BlockMSA, MAFFT and MUSCLE are comparable at identifying conserved regions. BlockMSA is implemented in Java. Source code and supplementary datasets are available at http://aug.csres.utexas.edu/msa/
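The Sum of Pairs score (SPS) used for the comparison above can be sketched as the fraction of residue pairs aligned in a reference alignment that a test alignment reproduces. This is a minimal illustration of the usual definition, assuming alignments are given as equal-length gapped strings in the same sequence order with "-" as the gap character; the function names are hypothetical and this is not the evaluation code used in the study.

```python
from itertools import combinations


def aligned_pairs(alignment, gap="-"):
    """Set of residue pairs placed in the same column of an alignment.

    Each residue is identified as (row index, ungapped position in that row).
    """
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all rows of an alignment must have the same length")
    pairs = set()
    positions = [0] * len(alignment)  # next ungapped index for every row
    n_cols = len(alignment[0]) if alignment else 0
    for col in range(n_cols):
        present = []
        for row, seq in enumerate(alignment):
            if seq[col] != gap:
                present.append((row, positions[row]))
                positions[row] += 1
        # Rows are visited in order, so each pair is stored once as (lower, higher).
        pairs.update(combinations(present, 2))
    return pairs


def sum_of_pairs_score(test, reference, gap="-"):
    """Fraction of reference residue pairs that also occur in the test alignment."""
    ref = aligned_pairs(reference, gap)
    if not ref:
        raise ValueError("reference alignment contains no aligned residue pairs")
    return len(ref & aligned_pairs(test, gap)) / len(ref)
```

A score of 1.0 means every residue pairing in the reference was recovered; shifting a single gap, as in the toy case of "AC-GU" versus "ACG-U" against "ACAGU", already lowers the score.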
Evaluating Hierarchical Structure in Music Annotations

PubMed Central

McFee, Brian; Nieto, Oriol; Farbood, Morwaread M.; Bello, Juan Pablo

2017-01-01

Music exhibits structure at multiple scales, ranging from motifs to large-scale functional components. When inferring the structure of a piece, different listeners may attend to different temporal scales, which can result in disagreements when they describe the same piece. In the field of music informatics research (MIR), it is common to use corpora annotated with structural boundaries at different levels. By quantifying disagreements between multiple annotators, previous research has yielded several insights relevant to the study of music cognition. First, annotators tend to agree when structural boundaries are ambiguous. Second, this ambiguity seems to depend on musical features, time scale, and genre. Furthermore, it is possible to tune current annotation evaluation metrics to better align with these perceptual differences. However, previous work has not directly analyzed the effects of hierarchical structure because the existing methods for comparing structural annotations are designed for “flat” descriptions, and do not readily generalize to hierarchical annotations. In this paper, we extend and generalize previous work on the evaluation of hierarchical descriptions of musical structure. We derive an evaluation metric which can compare hierarchical annotations holistically across multiple levels. Using this metric, we investigate inter-annotator agreement on the multilevel annotations of two different music corpora, investigate the influence of acoustic properties on hierarchical annotations, and evaluate existing hierarchical segmentation algorithms against the distribution of inter-annotator agreement. PMID:28824514
Simple and robust generation of ultrafast laser pulse trains using polarization-independent parallel-aligned thin films

NASA Astrophysics Data System (ADS)

Wang, Andong; Jiang, Lan; Li, Xiaowei; Wang, Zhi; Du, Kun; Lu, Yongfeng

2018-05-01

Ultrafast laser pulse temporal shaping has been widely applied in various important applications such as laser materials processing, coherent control of chemical reactions, and ultrafast imaging. However, temporal pulse shaping has been limited to in-lab use due to its high cost, low damage threshold, and polarization dependence. Herein we propose a novel design for an ultrafast laser pulse train generation device, which consists of multiple polarization-independent parallel-aligned thin films. Various pulse trains with controllable temporal profiles can be generated flexibly by multi-reflections within the splitting films. Compared with other pulse train generation techniques, this method has the advantages of compact structure, low cost, high damage threshold and polarization independence. These advantages endow it with high potential for broad utilization in ultrafast applications.
Bioinformatic prediction and in vivo validation of residue-residue interactions in human proteins

NASA Astrophysics Data System (ADS)

Jordan, Daniel; Davis, Erica; Katsanis, Nicholas; Sunyaev, Shamil

2014-03-01

Identifying residue-residue interactions in protein molecules is important for understanding both protein structure and function in the context of evolutionary dynamics and medical genetics. Such interactions can be difficult to predict using existing empirical or physical potentials, especially when residues are far from each other in sequence space. Using a multiple sequence alignment of 46 diverse vertebrate species we explore the space of allowed sequences for orthologous protein families. Amino acid changes that are known to damage protein function allow us to identify specific changes that are likely to have interacting partners. We fit the parameters of the continuous-time Markov process used in the alignment to conclude that these interactions are primarily pairwise, rather than higher order. Candidates for sites under pairwise epistasis are predicted, which can then be tested by experiment. We report the results of an initial round of in vivo experiments in a zebrafish model that verify the presence of multiple pairwise interactions predicted by our model. These experimentally validated interactions are novel, distant in sequence, and are not readily explained by known biochemical or biophysical features.
Efficient pairwise RNA structure prediction using probabilistic alignment constraints in Dynalign

PubMed Central

2007-01-01

Background: Joint alignment and secondary structure prediction of two RNA sequences can significantly improve the accuracy of the structural predictions. Methods addressing this problem, however, are forced to employ constraints that reduce computation by restricting the alignments and/or structures (i.e. folds) that are permissible. In this paper, a new methodology is presented for the purpose of establishing alignment constraints based on nucleotide alignment and insertion posterior probabilities. Using a hidden Markov model, posterior probabilities of alignment and insertion are computed for all possible pairings of nucleotide positions from the two sequences. These alignment and insertion posterior probabilities are additively combined to obtain probabilities of co-incidence for nucleotide position pairs. A suitable alignment constraint is obtained by thresholding the co-incidence probabilities. The constraint is integrated with Dynalign, a free energy minimization algorithm for joint alignment and secondary structure prediction. The resulting method is benchmarked against the previous version of Dynalign and against other programs for pairwise RNA structure prediction. Results: The proposed technique eliminates manual parameter selection in Dynalign and provides significant computational time savings in comparison to prior constraints in Dynalign while simultaneously providing a small improvement in the structural prediction accuracy. Savings are also realized in memory. In experiments over a 5S RNA dataset with average sequence length of approximately 120 nucleotides, the method reduces computation by a factor of 2. The method performs favorably in comparison to other programs for pairwise RNA structure prediction, yielding better accuracy, on average, and requiring significantly fewer computational resources.
Conclusion: Probabilistic analysis can be utilized in order to automate the determination of alignment constraints for pairwise RNA structure prediction methods in a principled fashion. These constraints can reduce the computational and memory requirements of these methods while maintaining or improving their accuracy of structural prediction. This extends the practical reach of these methods to longer sequences. The revised Dynalign code is freely available for download. PMID:17445273
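The thresholding step described in this abstract can be illustrated with a minimal sketch. It assumes the posteriors are already available as three matrices indexed by position pair (alignment, insertion in the first sequence, insertion in the second sequence); that layout, the function name and the threshold value are hypothetical and are not taken from the Dynalign code.

```python
def coincidence_constraint(p_align, p_ins1, p_ins2, threshold):
    """Return the set of (i, j) position pairs kept by the constraint.

    p_align, p_ins1, p_ins2 are hypothetical posterior matrices (lists of
    rows) of equal shape; their sum is treated as the co-incidence
    probability of positions i and j.
    """
    allowed = set()
    for i, row in enumerate(p_align):
        for j, pa in enumerate(row):
            # Additive combination of alignment and insertion posteriors.
            p_co = pa + p_ins1[i][j] + p_ins2[i][j]
            if p_co >= threshold:
                allowed.add((i, j))
    return allowed


# Toy 2 x 2 example with made-up posteriors.
p_align = [[0.90, 0.01], [0.02, 0.80]]
p_ins1 = [[0.05, 0.00], [0.10, 0.05]]
p_ins2 = [[0.02, 0.00], [0.00, 0.10]]
allowed_pairs = coincidence_constraint(p_align, p_ins1, p_ins2, threshold=0.1)
```

A downstream dynamic program would then only visit cells in `allowed_pairs`, which is where the time and memory savings reported above would come from.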
eShadow: A tool for comparing closely related sequences

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ovcharenko, Ivan; Boffelli, Dario; Loots, Gabriela G.

2004-01-15

Primate sequence comparisons are difficult to interpret due to the high degree of sequence similarity shared between such closely related species. Recently, a novel method, phylogenetic shadowing, has been pioneered for predicting functional elements in the human genome through the analysis of multiple primate sequence alignments. We have expanded this theoretical approach to create a computational tool, eShadow, for the identification of elements under selective pressure in multiple sequence alignments of closely related genomes, such as in comparisons of human to primate or mouse to rat DNA. This tool integrates two different statistical methods and allows for the dynamic visualization of the resulting conservation profile. eShadow also includes a versatile optimization module capable of training the underlying Hidden Markov Model to differentially predict functional sequences. This module grants the tool high flexibility in the analysis of multiple sequence alignments and in comparing sequences with different divergence rates. Here, we describe the eShadow comparative tool and its potential uses for analyzing both multiple nucleotide and protein alignments to predict putative functional elements. The eShadow tool is publicly available at http://eshadow.dcode.org/
GIRAF: a method for fast search and flexible alignment of ligand binding interfaces in proteins at atomic resolution

PubMed Central

Kinjo, Akira R.; Nakamura, Haruki

2012-01-01

Comparison and classification of protein structures are fundamental means to understand protein functions. Due to the computational difficulty and the ever-increasing amount of structural data, however, it is in general not feasible to perform exhaustive all-against-all structure comparisons necessary for comprehensive classifications. To efficiently handle such situations, we have previously proposed a method, now called GIRAF. We herein describe further improvements in the GIRAF protein structure search and alignment method. The GIRAF method achieves extremely efficient search of similar structures of ligand binding sites of proteins by exploiting database indexing of structural features of local coordinate frames. In addition, it produces refined atom-wise alignments by iterative applications of the Hungarian method to the bipartite graph defined for a pair of superimposed structures. By combining the refined alignments based on different local coordinate frames, it is made possible to align structures involving domain movements. We provide detailed accounts for the database design, the search and alignment algorithms as well as some benchmark results. PMID:27493524
Web-Beagle: a web server for the alignment of RNA secondary structures.

PubMed

Mattei, Eugenio; Pietrosanto, Marco; Ferrè, Fabrizio; Helmer-Citterich, Manuela

2015-07-01

Web-Beagle (http://beagle.bio.uniroma2.it) is a web server for the pairwise global or local alignment of RNA secondary structures. The server exploits a new encoding for RNA secondary structure and a substitution matrix of RNA structural elements to perform RNA structural alignments. The web server allows the user to compute up to 10 000 alignments in a single run, taking as input sets of RNA sequences and structures or primary sequences alone. In the latter case, the server computes the secondary structure prediction for the RNAs on-the-fly using RNAfold (free energy minimization). The user can also compare a set of input RNAs to one of five pre-compiled RNA datasets including lncRNAs and 3' UTRs. All types of comparison produce in output the pairwise alignments along with structural similarity and statistical significance measures for each resulting alignment. A graphical color-coded representation of the alignments allows the user to easily identify structural similarities between RNAs. Web-Beagle can be used for finding structurally related regions in two or more RNAs, for the identification of homologous regions or for functional annotation. Benchmark tests show that Web-Beagle has lower computational complexity, running time and better performances than other available methods. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

Optimal simultaneous superpositioning of multiple structures with missing data

PubMed Central

Theobald, Douglas L.; Steindel, Phillip A.

2012-01-01

Motivation: Superpositioning is an essential technique in structural biology that facilitates the comparison and analysis of conformational differences among topologically similar structures. Performing a superposition requires a one-to-one correspondence, or alignment, of the point sets in the different structures. However, in practice, some points are usually ‘missing’ from several structures, for example, when the alignment contains gaps. Current superposition methods deal with missing data simply by superpositioning a subset of points that are shared among all the structures. This practice is inefficient, as it ignores important data, and it fails to satisfy the common least-squares criterion. In the extreme, disregarding missing positions prohibits the calculation of a superposition altogether. Results: Here, we present a general solution for determining an optimal superposition when some of the data are missing. We use the expectation–maximization algorithm, a classic statistical technique for dealing with incomplete data, to find both maximum-likelihood solutions and the optimal least-squares solution as a special case. Availability and implementation: The methods presented here are implemented in THESEUS 2.0, a program for superpositioning macromolecular structures. ANSI C source code and selected compiled binaries for various computing platforms are freely available under the GNU open source license from http://www.theseus3d.org. Contact: dtheobald@brandeis.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22543369
Transformation and Alignment in Similarity

ERIC Educational Resources Information Center

Hodgetts, Carl J.; Hahn, Ulrike; Chater, Nick

2009-01-01

This paper contrasts two structural accounts of psychological similarity: structural alignment (SA) and Representational Distortion (RD). SA proposes that similarity is determined by how readily the structures of two objects can be brought into alignment; RD measures similarity by the complexity of the transformation that "distorts" one…
Protein Structure and Function Prediction Using I-TASSER

PubMed Central

Yang, Jianyi; Zhang, Yang

2016-01-01

I-TASSER is a hierarchical protocol for automated protein structure prediction and structure-based function annotation. Starting from the amino acid sequence of target proteins, I-TASSER first generates full-length atomic structural models from multiple threading alignments and iterative structural assembly simulations followed by atomic-level structure refinement. The biological functions of the protein, including ligand-binding sites, enzyme commission number, and gene ontology terms, are then inferred from known protein function databases based on sequence and structure profile comparisons. I-TASSER is freely available as both an on-line server and a stand-alone package. This unit describes how to use the I-TASSER protocol to generate structure and function prediction and how to interpret the prediction results, as well as alternative approaches for further improving the I-TASSER modeling quality for distant-homologous and multi-domain protein targets. PMID:26678386
Vertically aligned nanostructure scanning probe microscope tips

DOEpatents

Guillorn, Michael A.; Ilic, Bojan; Melechko, Anatoli V.; Merkulov, Vladimir I.; Lowndes, Douglas H.; Simpson, Michael L.

2006-12-19

Methods and apparatus are described for cantilever structures that include a vertically aligned nanostructure, especially vertically aligned carbon nanofiber scanning probe microscope tips. An apparatus includes a cantilever structure including a substrate including a cantilever body, that optionally includes a doped layer, and a vertically aligned nanostructure coupled to the cantilever body.
Helical structures in vertically aligned dust particle chains in a complex plasma

NASA Astrophysics Data System (ADS)

Hyde, Truell W.; Kong, Jie; Matthews, Lorin S.

2013-05-01

Self-assembly of structures from vertically aligned, charged dust particle bundles within a glass box placed on the lower, powered electrode of a Gaseous Electronics Conference rf reference cell were produced and examined experimentally. Self-organized formation of one-dimensional vertical chains, two-dimensional zigzag structures, and three-dimensional helical structures of triangular, quadrangular, pentagonal, hexagonal, and heptagonal symmetries are shown to occur. System evolution is shown to progress from a one-dimensional chain structure, through a zigzag transition to a two-dimensional, spindlelike structure, and then to various three-dimensional, helical structures exhibiting multiple symmetries. Stable configurations are found to be dependent upon the system confinement, $\gamma^2 = (\omega_{0h}/\omega_{0v})^2$ (where $\omega_{0h}$ and $\omega_{0v}$ are the horizontal and vertical dust resonance frequencies), the total number of particles within a bundle, and the rf power. For clusters having fixed numbers of particles, the rf power at which structural phase transitions occur is repeatable and exhibits no observable hysteresis. The critical conditions for these structural phase transitions as well as the basic symmetry exhibited by the one-, two-, and three-dimensional structures that subsequently develop are in good agreement with the theoretically predicted configurations of minimum energy determined employing molecular dynamics simulations for charged dust particles confined in a prolate, spheroidal potential as presented theoretically by Kamimura and Ishihara [Kamimura and Ishihara, Phys. Rev. E 85, 016406 (2012), doi:10.1103/PhysRevE.85.016406].
Strong quantum-confined Stark effect in a lattice-matched GeSiSn/GeSn multi-quantum-well structure

NASA Astrophysics Data System (ADS)

Peng, Ruizhi; Chunfuzhang; Han, Genquan; Hao, Yue

2017-06-01

This paper presents modeling and simulation of a multiple quantum well structure formed with Ge0.95Sn0.05 quantum wells separated by Ge0.51Si0.35Sn0.14 barriers for the applications. These alloy compositions are chosen to satisfy two conditions simultaneously: type-I band alignment between Ge0.95Sn0.05/Ge0.51Si0.35Sn0.14 and a lattice match between wells and barriers. This lattice match ensures that the strain-free structure can be grown upon a relaxed Ge0.51Si0.35Sn0.14 buffer on a silicon substrate, a CMOS-compatible process. An electro-absorption modulator with the Ge0.95Sn0.05/Ge0.51Si0.35Sn0.14 multiple quantum well structure based on the quantum-confined Stark effect (QCSE) is demonstrated in theory. The energy band diagrams of the GeSiSn/GeSn multi-quantum-well structure at 0 and 0.5 V bias are calculated. The corresponding absorption coefficients as a function of cut-off energy for this multiple quantum well structure at 0 and 0.5 V bias are also obtained. A reduction of the cut-off energy is observed when the external electric field is applied, indicating a strong QCSE in the structure.
Recapitulating phylogenies using k-mers: from trees to networks.

PubMed

Bernard, Guillaume; Ragan, Mark A; Chan, Cheong Xin

2016-01-01

Ernst Haeckel based his landmark Tree of Life on the supposed ontogenic recapitulation of phylogeny, i.e. that successive embryonic stages during the development of an organism re-trace the morphological forms of its ancestors over the course of evolution. Much of this idea has since been discredited. Today, phylogenies are often based on families of molecular sequences. The standard approach starts with a multiple sequence alignment, in which the sequences are arranged relative to each other in a way that maximises a measure of similarity position-by-position along their entire length. A tree (or sometimes a network) is then inferred. Rigorous multiple sequence alignment is computationally demanding, and evolutionary processes that shape the genomes of many microbes (bacteria, archaea and some morphologically simple eukaryotes) can add further complications. In particular, recombination, genome rearrangement and lateral genetic transfer undermine the assumptions that underlie multiple sequence alignment, and imply that a tree-like structure may be too simplistic. Here, using genome sequences of 143 bacterial and archaeal genomes, we construct a network of phylogenetic relatedness based on the number of shared k-mers (subsequences at fixed length k). Our findings suggest that the network captures not only key aspects of microbial genome evolution as inferred from a tree, but also features that are not treelike. The method is highly scalable, allowing for investigation of genome evolution across a large number of genomes. Instead of using specific regions or sequences from genome sequences, or indeed Haeckel's idea of ontogeny, we argue that genome phylogenies can be inferred using k-mers from whole-genome sequences. Representing these networks dynamically allows biological questions of interest to be formulated and addressed quickly and in a visually intuitive manner.
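The idea of relating genomes by shared k-mers can be sketched in a few lines. The example below is only an illustration under simplifying assumptions: it counts distinct k-mers shared by each pair of toy sequences and keeps an edge when the count reaches a cutoff. The function names, the `min_shared` cutoff and the toy sequences are hypothetical; the authors' actual similarity measure and network construction may differ.

```python
from itertools import combinations


def kmers(seq, k):
    """Set of distinct substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_network(genomes, k, min_shared=1):
    """Map each genome pair to its number of shared distinct k-mers.

    genomes: dict of name -> sequence string. Pairs sharing fewer than
    min_shared k-mers get no edge.
    """
    sets = {name: kmers(seq, k) for name, seq in genomes.items()}
    edges = {}
    for a, b in combinations(sorted(sets), 2):
        n_shared = len(sets[a] & sets[b])
        if n_shared >= min_shared:
            edges[(a, b)] = n_shared
    return edges


toy_genomes = {"A": "ACGTACGT", "B": "ACGTTTTT", "C": "GGGGGGGG"}
network = shared_kmer_network(toy_genomes, k=4)
```

Because no alignment is computed, the cost is dominated by k-mer extraction and set intersection, which is consistent with the scalability claimed in the abstract.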
Accuracy Estimation and Parameter Advising for Protein Multiple Sequence Alignment

PubMed Central

DeBlasio, Dan

2013-01-01

We develop a novel and general approach to estimating the accuracy of multiple sequence alignments without knowledge of a reference alignment, and use our approach to address a new task that we call parameter advising: the problem of choosing values for alignment scoring function parameters from a given set of choices to maximize the accuracy of a computed alignment. For protein alignments, we consider twelve independent features that contribute to a quality alignment. An accuracy estimator is learned that is a polynomial function of these features; its coefficients are determined by minimizing its error with respect to true accuracy using mathematical optimization. Compared to prior approaches for estimating accuracy, our new approach (a) introduces novel feature functions that measure nonlocal properties of an alignment yet are fast to evaluate, (b) considers more general classes of estimators beyond linear combinations of features, and (c) develops new regression formulations for learning an estimator from examples; in addition, for parameter advising, we (d) determine the optimal parameter set of a given cardinality, which specifies the best parameter values from which to choose. Our estimator, which we call Facet (for “feature-based accuracy estimator”), yields a parameter advisor that on the hardest benchmarks provides more than a 27% improvement in accuracy over the best default parameter choice, and for parameter advising significantly outperforms the best prior approaches to assessing alignment quality. PMID:23489379
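Parameter advising as described here amounts to scoring each candidate alignment with the estimator and keeping the parameter choice whose alignment scores highest. The sketch below assumes the simplest case, a linear estimator over two made-up features; the feature values, coefficients and parameter labels are hypothetical and are not Facet's.

```python
def estimate_accuracy(features, coefficients):
    """Linear accuracy estimate: weighted sum of feature values."""
    return sum(c * f for c, f in zip(coefficients, features))


def advise(candidates, coefficients):
    """Pick the parameter choice whose alignment has the highest estimate.

    candidates: dict mapping a parameter label to the feature vector of
    the alignment computed with that parameter choice.
    """
    return max(
        candidates,
        key=lambda p: estimate_accuracy(candidates[p], coefficients),
    )


# Hypothetical feature vectors for alignments built with three settings.
candidates = {
    "gap_open=8": [0.5, 0.9],
    "gap_open=10": [0.6, 0.4],
    "gap_open=12": [0.8, 0.5],
}
coefficients = [0.7, 0.3]
best_choice = advise(candidates, coefficients)
```

With these toy numbers the estimates are 0.62, 0.54 and 0.71, so the advisor selects `gap_open=12`. A polynomial estimator of higher degree would only change `estimate_accuracy`.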
Identification and sequence analyses of novel lipase encoding novel thermophillic bacilli isolated from Armenian geothermal springs.

PubMed

Shahinyan, Grigor; Margaryan, Armine; Panosyan, Hovik; Trchounian, Armen

2017-05-02

Among the huge diversity of thermophilic bacteria, mainly bacilli have been reported as active thermostable lipase producers. Geothermal springs serve as the main source for isolation of thermostable lipase-producing bacilli. Thermostable lipolytic enzymes, functioning in harsh conditions, have promising applications in processing of organic chemicals, detergent formulation, synthesis of biosurfactants, pharmaceutical processing etc. In order to study the distribution of lipase-producing thermophilic bacilli and their specific lipase protein primary structures, three lipase producers from different genera were isolated from mesothermal (27.5-70 °C) springs distributed on the territory of Armenia and Nagorno Karabakh. Based on phenotypic characteristics and 16S rRNA gene sequencing, the isolates were identified as Geobacillus sp., Bacillus licheniformis and Anoxibacillus flavithermus strains. The lipase genes of the isolates were sequenced by using initially designed primer sets. Multiple alignments generated from primary structures of the lipase proteins and annotated lipase protein sequences, conserved regions analysis and amino acid composition have illustrated the similarity (98-99%) of the lipases with true lipases (family I) and the GDSL esterase family (family II). A conserved sequence block that determines the thermostability has been identified in the multiple alignments of the lipase proteins. The results shed light on the distribution of lipase-producing bacilli in geothermal springs in Armenia and Nagorno Karabakh. Newly isolated bacilli strains could be a prospective source of thermostable lipases and their genes.
Data Acquisition and Linguistic Resources

NASA Astrophysics Data System (ADS)

Strassel, Stephanie; Christianson, Caitlin; McCary, John; Staderman, William; Olive, Joseph

All human language technology demands substantial quantities of data for system training and development, plus stable benchmark data to measure ongoing progress. While creation of high quality linguistic resources is both costly and time consuming, such data has the potential to profoundly impact not just a single evaluation program but language technology research in general. GALE's challenging performance targets demand linguistic data on a scale and complexity never before encountered. Resources cover multiple languages (Arabic, Chinese, and English) and multiple genres -- both structured (newswire and broadcast news) and unstructured (web text, including blogs and newsgroups, and broadcast conversation). These resources include significant volumes of monolingual text and speech, parallel text, and transcribed audio combined with multiple layers of linguistic annotation, ranging from word aligned parallel text and Treebanks to rich semantic annotation.
Statistical potential-based amino acid similarity matrices for aligning distantly related protein sequences.

PubMed

Tan, Yen Hock; Huang, He; Kihara, Daisuke

2006-08-15

Aligning distantly related protein sequences is a long-standing problem in bioinformatics, and a key for successful protein structure prediction. Its importance is increasing recently in the context of structural genomics projects because more and more experimentally solved structures are available as templates for protein structure modeling. Toward this end, recent structure prediction methods employ profile-profile alignments, and various ways of aligning two profiles have been developed. More fundamentally, a better amino acid similarity matrix can improve a profile itself; thereby resulting in more accurate profile-profile alignments. Here we have developed novel amino acid similarity matrices from knowledge-based amino acid contact potentials. Contact potentials are used because the contact propensity to the other amino acids would be one of the most conserved features of each position of a protein structure. The derived amino acid similarity matrices are tested on benchmark alignments at three different levels, namely, the family, the superfamily, and the fold level. Compared to BLOSUM45 and the other existing matrices, the contact potential-based matrices perform comparably in the family level alignments, but clearly outperform in the fold level alignments. The contact potential-based matrices perform even better when suboptimal alignments are considered. Comparing the matrices themselves with each other revealed that the contact potential-based matrices are very different from BLOSUM45 and the other matrices, indicating that they are located in a different basin in the amino acid similarity matrix space.
Ionospheric Scintillation Explorer (ISX)

NASA Astrophysics Data System (ADS)

Iuliano, J.; Bahcivan, H.

2015-12-01

NSF has recently selected Ionospheric Scintillation Explorer (ISX), a 3U Cubesat mission to explore the three-dimensional structure of scintillation-scale ionospheric irregularities associated with Equatorial Spread F (ESF). ISX is a collaborative effort between SRI International and Cal Poly. This project addresses the science question: To what distance along a flux tube does an irregularity of certain transverse-scale extend? It has been difficult to measure the magnetic field-alignment of scintillation-scale turbulent structures because of the difficulty of sampling a flux tube at multiple locations within a short time. This measurement is now possible due to the worldwide transition to DTV, which presents unique signals of opportunity for remote sensing of ionospheric irregularities from numerous vantage points. DTV spectra, in various formats, contain phase-stable, narrowband pilot carrier components that are transmitted simultaneously. A 4-channel radar receiver will simultaneously record up to 4 spatially separated transmissions from the ground. Correlations of amplitude and phase scintillation patterns corresponding to multiple points on the same flux tube will be a measure of the spatial extent of the structures along the magnetic field. A subset of geometries where two or more transmitters are aligned with the orbital path will be used to infer the temporal development of the structures. ISX has the following broad impact. Scintillation of space-based radio signals is a space weather problem that is intensively studied. ISX is a step toward a CubeSat constellation to monitor worldwide TEC variations and radio wave distortions on thousands of ionospheric paths. 
Furthermore, the rapid sampling along spacecraft orbits provides a unique dataset to deterministically reconstruct ionospheric irregularities at scintillation-scale resolution using diffraction radio tomography, a technique that enables prediction of scintillations at other radio frequencies, and potentially, mitigation of phase distortions.
GalaxyTBM: template-based modeling by building a reliable core and refining unreliable local regions.

PubMed

Ko, Junsu; Park, Hahnbeom; Seok, Chaok

2012-08-10

Protein structures can be reliably predicted by template-based modeling (TBM) when experimental structures of homologous proteins are available. However, it is challenging to obtain structures more accurate than the single best templates by either combining information from multiple templates or by modeling regions that vary among templates or are not covered by any templates. We introduce GalaxyTBM, a new TBM method in which the more reliable core region is modeled first from multiple templates and less reliable, variable local regions, such as loops or termini, are then detected and re-modeled by an ab initio method. This TBM method is based on "Seok-server," which was tested in CASP9 and assessed to be amongst the top TBM servers. The accuracy of the initial core modeling is enhanced by focusing on more conserved regions in the multiple-template selection and multiple sequence alignment stages. Additional improvement is achieved by ab initio modeling of up to 3 unreliable local regions in the fixed framework of the core structure. Overall, GalaxyTBM reproduced the performance of Seok-server, with GalaxyTBM and Seok-server resulting in average GDT-TS of 68.1 and 68.4, respectively, when tested on 68 single-domain CASP9 TBM targets. For application to multi-domain proteins, GalaxyTBM must be combined with domain-splitting methods. Application of GalaxyTBM to CASP9 targets demonstrates that accurate protein structure prediction is possible by use of a multiple-template-based approach, and ab initio modeling of variable regions can further enhance the model quality.
What to do When Scalar Invariance Fails: The Extended Alignment Method for Multi-Group Factor Analysis Comparison of Latent Means Across Many Groups.

PubMed

Marsh, Herbert W; Guo, Jiesi; Parker, Philip D; Nagengast, Benjamin; Asparouhov, Tihomir; Muthén, Bengt; Dicke, Theresa

2017-01-12

Scalar invariance is an unachievable ideal that in practice can only be approximated; often using potentially questionable approaches such as partial invariance based on a stepwise selection of parameter estimates with large modification indices. Study 1 demonstrates an extension of the power and flexibility of the alignment approach for comparing latent factor means in large-scale studies (30 OECD countries, 8 factors, 44 items, N = 249,840), for which scalar invariance is typically not supported in the traditional confirmatory factor analysis approach to measurement invariance (CFA-MI). Importantly, we introduce an alignment-within-CFA (AwC) approach, transforming alignment from a largely exploratory tool into a confirmatory tool, and enabling analyses that previously have not been possible with alignment (testing the invariance of uniquenesses and factor variances/covariances; multiple-group MIMIC models; contrasts on latent means) and structural equation models more generally. Specifically, it also allowed a comparison of gender differences in a 30-country MIMIC AwC (i.e., a SEM with gender as a covariate) and a 60-group AwC CFA (i.e., 30 countries × 2 genders) analysis. Study 2, a simulation study following up issues raised in Study 1, showed that latent means were more accurately estimated with alignment than with the scalar CFA-MI, and particularly with partial invariance scalar models based on the heavily criticized stepwise selection strategy. In summary, alignment augmented by AwC provides applied researchers from diverse disciplines considerable flexibility to address substantively important issues when the traditional CFA-MI scalar model does not fit the data. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Image processing for cryogenic transmission electron microscopy of symmetry-mismatched complexes.

PubMed

Huiskonen, Juha T

2018-02-08

Cryogenic transmission electron microscopy (cryo-TEM) is a high-resolution biological imaging method, whereby biological samples, such as purified proteins, macromolecular complexes, viral particles, organelles and cells, are embedded in vitreous ice preserving their native structures. Due to the sensitivity of biological materials to the electron beam of the microscope, only relatively low electron doses can be applied during imaging. As a result, the signal arising from the structure of interest is overpowered by noise in the images. To increase the signal-to-noise ratio, different image processing-based strategies that aim at coherent averaging of signal have been devised. In such strategies, images are generally assumed to arise from multiple identical copies of the structure. Prior to averaging, the images must be grouped according to the view of the structure they represent and images representing the same view must be simultaneously aligned relative to each other. For computational reconstruction of the three-dimensional structure, images must contain different views of the original structure. Structures with multiple symmetry-related substructures are advantageous in averaging approaches because each image provides multiple views of the substructures. However, the symmetry assumption may be valid for only parts of the structure, leading to incoherent averaging of the other parts. Several image processing approaches have been adapted to tackle symmetry-mismatched substructures with increasing success. Such structures are ubiquitous in nature and further computational method development is needed to understand their biological functions. ©2018 The Author(s).
Generic accelerated sequence alignment in SeqAn using vectorization and multi-threading.

PubMed

Rahn, René; Budach, Stefan; Costanza, Pascal; Ehrhardt, Marcel; Hancox, Jonny; Reinert, Knut

2018-05-03

Pairwise sequence alignment is undoubtedly a central tool in many bioinformatics analyses. In this paper, we present a generically accelerated module for pairwise sequence alignments applicable for a broad range of applications. In our module, we unified the standard dynamic programming kernel used for pairwise sequence alignments and extended it with a generalized inter-sequence vectorization layout, such that many alignments can be computed simultaneously by exploiting SIMD (Single Instruction Multiple Data) instructions of modern processors. We then extended the module by adding two layers of thread-level parallelization, where we a) distribute many independent alignments on multiple threads and b) inherently parallelize a single alignment computation using a work stealing approach producing a dynamic wavefront progressing along the minor diagonal. We evaluated our alignment vectorization and parallelization on different processors, including the newest Intel® Xeon® (Skylake) and Intel® Xeon Phi™ (KNL) processors, and use cases. The instruction set AVX512-BW (Byte and Word), available on Skylake processors, can genuinely improve the performance of vectorized alignments. We could run single alignments 1600 times faster on the Xeon Phi™ and 1400 times faster on the Xeon® than executing them with our previous sequential alignment module. The module is programmed in C++ using the SeqAn (Reinert et al., 2017) library and distributed with version 2.4 under the BSD license. We support SSE4, AVX2, AVX512 instructions and included UME::SIMD, a SIMD-instruction wrapper library, to extend our module for further instruction sets. We thoroughly test all alignment components with all major C++ compilers on various platforms. Contact: rene.rahn@fu-berlin.de.
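The wavefront idea mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not SeqAn code and uses neither SIMD nor threads; the function name nw_score_wavefront and the linear gap scoring are assumptions made for illustration. It only shows why cells on one anti-diagonal of the dynamic programming matrix are independent of each other and could therefore be computed in parallel.

```python
def nw_score_wavefront(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score (Needleman-Wunsch, linear gaps),
    with the matrix filled anti-diagonal by anti-diagonal."""
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        H[i][0] = i * gap
    for j in range(1, m + 1):
        H[0][j] = j * gap
    # All cells with the same i + j = d depend only on anti-diagonals
    # d - 1 and d - 2, so the inner loop has no internal dependencies.
    for d in range(2, n + m + 1):
        for i in range(max(1, d - m), min(n, d - 1) + 1):
            j = d - i
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(H[i - 1][j - 1] + s,   # (mis)match
                          H[i - 1][j] + gap,     # gap in b
                          H[i][j - 1] + gap)     # gap in a
    return H[n][m]
```

In a vectorized or multi-threaded implementation the inner loop over i is the part that would be distributed; the sketch keeps it sequential for clarity.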
mRAISE: an alternative algorithmic approach to ligand-based virtual screening

NASA Astrophysics Data System (ADS)

von Behren, Mathias M.; Bietz, Stefan; Nittinger, Eva; Rarey, Matthias

2016-08-01

Ligand-based virtual screening is a well established method to find new lead molecules in today's drug discovery process. In order to be applicable in day-to-day practice, such methods have to face multiple challenges. The most important part is the reliability of the results, which can be shown and compared in retrospective studies. Furthermore, in the case of 3D methods, they need to provide biologically relevant molecular alignments of the ligands that can be further investigated by a medicinal chemist. Last but not least, they have to be able to screen large databases in reasonable time. Many algorithms for ligand-based virtual screening have been proposed in the past, most of them based on pairwise comparisons. Here, a new method is introduced called mRAISE. Based on structural alignments, it uses a descriptor-based bitmap search engine (RAISE) to achieve efficiency. Alignments created on the fly by the search engine get evaluated with an independent shape-based scoring function also used for ranking of compounds. The correct ranking as well as the alignment quality of the method are evaluated and compared to other state-of-the-art methods. On the commonly used Directory of Useful Decoys dataset mRAISE achieves an average area under the ROC curve of 0.76, an average enrichment factor at 1 % of 20.2 and an average hit rate at 1 % of 55.5. With these results, mRAISE is always among the top performing methods with available data for comparison. To assess the quality of the alignments calculated by ligand-based virtual screening methods, we introduce a new dataset containing 180 prealigned ligands for 11 diverse targets. Within the top ten ranked conformations, the alignment closest to X-ray structure calculated with mRAISE has a root-mean-square deviation of less than 2.0 Å for 80.8 % of alignment pairs and achieves a median of less than 2.0 Å for eight of the 11 cases.
The dataset used to rate the quality of the calculated alignments is freely available at http://www.zbh.uni-hamburg.de/mraise-dataset.html. The table of all PDB codes contained in the ensembles can be found in the supplementary material. The software tool mRAISE is freely available for evaluation purposes and academic use (see http://www.zbh.uni-hamburg.de/raise).
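For readers unfamiliar with the enrichment factor quoted in the abstract above, the following Python sketch shows one common definition: the fraction of actives in the top-ranked part of the list divided by the fraction of actives in the whole library. The function name and the rounding of the cutoff are assumptions; the exact definition used in the mRAISE study may differ.

```python
def enrichment_factor(scores, is_active, fraction=0.01):
    """Enrichment factor at a given top fraction of a ranked list.

    scores    -- one score per compound, higher means better ranked
    is_active -- one boolean per compound (True for known actives)
    """
    if len(scores) != len(is_active):
        raise ValueError("scores and is_active must have equal length")
    total_active = sum(1 for a in is_active if a)
    if total_active == 0:
        raise ValueError("at least one active compound is required")
    ranked = sorted(zip(scores, is_active), key=lambda p: p[0], reverse=True)
    n_top = max(1, round(len(ranked) * fraction))  # size of the top slice
    hits_top = sum(1 for _, active in ranked[:n_top] if active)
    # ratio of the active rate in the top slice to the overall active rate
    return (hits_top / n_top) / (total_active / len(ranked))
```

A value of 1 corresponds to random ranking; the upper bound depends on the share of actives in the library.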
Parametric and non-parametric masking of randomness in sequence alignments can be improved and leads to better resolved trees.

PubMed

Kück, Patrick; Meusemann, Karen; Dambach, Johannes; Thormann, Birthe; von Reumont, Björn M; Wägele, Johann W; Misof, Bernhard

2010-03-31

Methods of alignment masking, which refers to the technique of excluding alignment blocks prior to tree reconstructions, have been successful in improving the signal-to-noise ratio in sequence alignments. However, the lack of formally well defined methods to identify randomness in sequence alignments has prevented a routine application of alignment masking. In this study, we compared the effects on tree reconstructions of the most commonly used profiling method (GBLOCKS) which uses a predefined set of rules in combination with alignment masking, with a new profiling approach (ALISCORE) based on Monte Carlo resampling within a sliding window, using different data sets and alignment methods. While the GBLOCKS approach excludes variable sections above a certain threshold whose choice is left arbitrary, the ALISCORE algorithm is free of a priori rating of parameter space and therefore more objective. ALISCORE was successfully extended to amino acids using a proportional model and empirical substitution matrices to score randomness in multiple sequence alignments. A complex bootstrap resampling leads to an even distribution of scores of randomly similar sequences to assess randomness of the observed sequence similarity. Testing performance on real data, both masking methods, GBLOCKS and ALISCORE, helped to improve tree resolution. The sliding window approach was less sensitive to different alignments of identical data sets and performed equally well on all data sets. Concurrently, ALISCORE is capable of dealing with different substitution patterns and heterogeneous base composition. ALISCORE and the most relaxed GBLOCKS gap parameter setting performed best on all data sets. Correspondingly, Neighbor-Net analyses showed the greatest decrease in conflict. Alignment masking improves signal-to-noise ratio in multiple sequence alignments prior to phylogenetic reconstruction.
Given the robust performance of alignment profiling, alignment masking should routinely be used to improve tree reconstructions. Parametric methods of alignment profiling can be easily extended to more complex likelihood based models of sequence evolution which opens the possibility of further improvements.
IVisTMSA: Interactive Visual Tools for Multiple Sequence Alignments.

PubMed

Pervez, Muhammad Tariq; Babar, Masroor Ellahi; Nadeem, Asif; Aslam, Naeem; Naveed, Nasir; Ahmad, Sarfraz; Muhammad, Shah; Qadri, Salman; Shahid, Muhammad; Hussain, Tanveer; Javed, Maryam

2015-01-01

IVisTMSA is a software package of seven graphical tools for multiple sequence alignments. MSApad is an editing and analysis tool. It can load 409% more data than Jalview, STRAP, CINEMA, and Base-by-Base. MSA comparator allows the user to visualize consistent and inconsistent regions of reference and test alignments of more than 21-MB size in less than 12 seconds. MSA comparator is 5,200% efficient and more than 40% efficient as compared to BALiBASE c program and FastSP, respectively. MSA reconstruction tool provides graphical user interfaces for four popular aligners and allows the user to load several sequence files at a time. FASTA generator converts seven formats of alignments of unlimited size into FASTA format in a few seconds. MSA ID calculator calculates identity matrix of more than 11,000 sequences with a sequence length of 2,696 base pairs in less than 100 seconds. Tree and Distance Matrix calculation tools generate phylogenetic tree and distance matrix, respectively, using neighbor joining, % identity, and the BLOSUM 62 matrix.
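An identity matrix of the kind computed by the MSA ID calculator can be sketched in a few lines of Python. This is an illustration under stated assumptions, not IVisTMSA code: the function names are hypothetical, the sequences are assumed to be already aligned to equal length, and columns with a gap in either sequence are ignored. Other tools use different denominators, so the numbers are convention-dependent.

```python
def percent_identity(s1, s2, gap="-"):
    """Percent identity of two aligned sequences, skipping gapped columns."""
    if len(s1) != len(s2):
        raise ValueError("aligned sequences must have equal length")
    compared = same = 0
    for x, y in zip(s1, s2):
        if x == gap or y == gap:
            continue  # assumed convention: gapped columns do not count
        compared += 1
        same += (x == y)
    return 100.0 * same / compared if compared else 0.0


def identity_matrix(aligned_seqs):
    """Symmetric matrix of pairwise percent identities."""
    return [[percent_identity(a, b) for b in aligned_seqs]
            for a in aligned_seqs]
```

For thousands of sequences a production tool would compute only the upper triangle and use a vectorized representation; the nested list comprehension here favors readability.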
Scalable multi-sample single-cell data analysis by Partition-Assisted Clustering and Multiple Alignments of Networks

PubMed Central

Samusik, Nikolay; Wang, Xiaowei; Guan, Leying; Nolan, Garry P.

2017-01-01

Mass cytometry (CyTOF) has greatly expanded the capability of cytometry. It is now easy to generate multiple CyTOF samples in a single study, with each sample containing single-cell measurement on 50 markers for more than hundreds of thousands of cells. Current methods do not adequately address the issues concerning combining multiple samples for subpopulation discovery, and these issues can be quickly and dramatically amplified with increasing number of samples. To overcome this limitation, we developed Partition-Assisted Clustering and Multiple Alignments of Networks (PAC-MAN) for the fast automatic identification of cell populations in CyTOF data closely matching that of expert manual-discovery, and for alignments between subpopulations across samples to define dataset-level cellular states. PAC-MAN is computationally efficient, allowing the management of very large CyTOF datasets, which are increasingly common in clinical studies and cancer studies that monitor various tissue samples for each subject. PMID:29281633

Spatial and Alignment Analyses for a Field of Small Volcanic Vents South of Pavonis Mons and Implications for the Tharsis Province, Mars

NASA Technical Reports Server (NTRS)

Bleacher, Jacob E.; Glaze, Lori S.; Greeley, Ronald; Hauber, Ernst; Baloga, Stephen; Sakimoto, Susan E. H.; Williams, David A.; Glotch, Timothy D.

2009-01-01

A field of small volcanic vents south of Pavonis Mons was mapped with each vent assigned a two-dimensional data point. Nearest neighbor and two-point azimuth analyses were applied to the resulting location data. Nearest neighbor results show that vents within this field are spatially random in a Poisson sense, suggesting that the vents formed independently of each other without sharing a centralized magma source at shallow depth. Two-point azimuth results show that the vents display north-trending alignment relationships between one another. This trend corresponds to the trends of faults and fractures of the Noachian-aged Claritas Fossae, which might extend into our study area buried beneath more recently emplaced lava flows. However, individual elongate vent summit structures do not consistently display the same trend. The development of the volcanic field appears to display tectonic control from buried Noachian-aged structural patterns on small, ascending magma bodies while the surface orientations of the linear vents might reflect different, younger tectonic patterns. These results suggest a complex interaction between magma ascension through the crust, and multiple, older, buried Tharsis-related tectonic structures.
System and method for 2D workpiece alignment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Weaver, William T.; Carlson, Charles T.; Smith, Scott A.

2015-07-14

A carrier capable of holding one or more workpieces is disclosed. The carrier includes movable projections located along the sides of each cell in the carrier. This carrier, in conjunction with a separate alignment apparatus, aligns each workpiece within its respective cell against several alignment pins, using a multiple step alignment process to guarantee proper positioning of the workpiece in the cell. First, the workpieces are moved toward one side of the cell. Once the workpieces have been aligned against this side, the workpieces are then moved toward an adjacent orthogonal side such that the workpieces are aligned to two sides of the cell. Once aligned, the workpiece is held in place by the projections located along each side of each cell. In addition, the alignment pins are also used to align the associated mask, thereby guaranteeing that the mask is properly aligned to the workpiece.
Information Technology (IT) Strategic Alignment: A Correlational Study between the Impact of IT Governance Structures and IT Strategic Alignment

ERIC Educational Resources Information Center

Asante, Keith K.

2010-01-01

This dissertation explored the extent to which Information Technology (IT) strategic alignment is impacted by IT governance structures. The study discusses several works from the strategic alignment and IT governance literature that reveal a gap in the literature domain. Subsequent studies researched issues surrounding why organizations are not able to align…
ADOMA: A Command Line Tool to Modify ClustalW Multiple Alignment Output.

PubMed

Zaal, Dionne; Nota, Benjamin

2016-01-01

We present ADOMA, a command line tool that produces alternative outputs from ClustalW multiple alignments of nucleotide or protein sequences. ADOMA can simplify the output of alignments by showing only the different residues between sequences, which is often desirable when only small differences such as single nucleotide polymorphisms are present (e.g., between different alleles). Another feature of ADOMA is that it can enhance the ClustalW output by coloring the residues in the alignment. This tool is easily integrated into automated Linux pipelines for next-generation sequencing data analysis, and may be useful for researchers in a broad range of scientific disciplines including evolutionary biology and biomedical sciences. The source code is freely available at https://sourceforge.net/projects/adoma/. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
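The "show only the different residues" idea described above can be sketched as follows. This is not ADOMA's code and does not reproduce its output format or options; the function name show_differences and the use of a dot for identical positions are assumptions chosen for illustration.

```python
def show_differences(reference, others, same_char="."):
    """Return the reference unchanged, followed by each other aligned
    sequence with residues identical to the reference masked out."""
    lines = [reference]
    for seq in others:
        if len(seq) != len(reference):
            raise ValueError("aligned sequences must have equal length")
        # keep a residue only where it differs from the reference
        lines.append("".join(same_char if c == r else c
                             for c, r in zip(seq, reference)))
    return lines
```

With two alleles that differ by a single nucleotide, the second line of the result contains dots everywhere except at the polymorphic position, which is what makes small differences easy to spot.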
Bellerophon: a program to detect chimeric sequences in multiple sequence alignments.

PubMed

Huber, Thomas; Faulkner, Geoffrey; Hugenholtz, Philip

2004-09-22

Bellerophon is a program for detecting chimeric sequences in multiple sequence datasets by an adaptation of partial treeing analysis. Bellerophon was specifically developed to detect 16S rRNA gene chimeras in PCR-clone libraries of environmental samples but can be applied to other nucleotide sequence alignments. Bellerophon is available as an interactive web server at http://foo.maths.uq.edu.au/~huber/bellerophon.pl
Phylogenetic Characterization of Transport Protein Superfamilies: Superiority of SuperfamilyTree Programs over Those Based on Multiple Alignments

PubMed Central

Chen, Jonathan S.; Reddy, Vamsee; Chen, Joshua H.; Shlykov, Maksim A.; Zheng, Wei Hao; Cho, Jaehoon; Yen, Ming Ren; Saier, Milton H.

2012-01-01

Transport proteins function in the translocation of ions, solutes and macromolecules across cellular and organellar membranes. These integral membrane proteins fall into >600 families as tabulated in the Transporter Classification Database (www.tcdb.org). Recent studies, some of which are reported here, define distant phylogenetic relationships between families with the creation of superfamilies. Several of these are analyzed using a novel set of programs designed to allow reliable prediction of phylogenetic trees when sequence divergence is too great to allow the use of multiple alignments. These new programs, called SuperfamilyTree1 and 2 (SFT1 and 2), allow display of protein and family relationships, respectively, based on thousands of comparative BLAST scores rather than multiple alignments. Superfamilies analyzed include: (1) Aerolysins, (2) RTX Toxins, (3) Defensins, (4) Ion Transporters, (5) Bile/Arsenite/Riboflavin Transporters, (6) Cation:Proton Antiporters, and (7) the Glucose/Fructose/Lactose superfamily within the prokaryotic phosphoenolpyruvate-dependent Phosphotransferase System. In addition to defining the phylogenetic relationships of the proteins and families within these seven superfamilies, evidence is provided showing that the SFT programs outperform programs that are based on multiple alignments whenever sequence divergence of superfamily members is extensive. The SFT programs should be applicable to virtually any superfamily of proteins or nucleic acids. PMID:22286036
Tunable arbitrary unitary transformer based on multiple sections of multicore fibers with phase control.

PubMed

Zhou, Junhe; Wu, Jianjie; Hu, Qinsong

2018-02-05

In this paper, we propose a novel tunable unitary transformer, which can achieve arbitrary discrete unitary transforms. The unitary transformer is composed of multiple sections of multi-core fibers with closely aligned coupled cores. Phase shifters are inserted before and after the sections to control the phases of the waves in the cores. A simple algorithm is proposed to find the optimal phase setup for the phase shifters to realize the desired unitary transforms. The proposed device is fiber based and is particularly suitable for the mode division multiplexing systems. A tunable mode MUX/DEMUX for a three-mode fiber is designed based on the proposed structure.
Statistical modeling of SRAM yield performance and circuit variability

NASA Astrophysics Data System (ADS)

Cheng, Qi; Chen, Yijian

2015-03-01

In this paper, we develop statistical models to investigate SRAM yield performance and circuit variability in the presence of a self-aligned multiple patterning (SAMP) process. It is assumed that SRAM fins are fabricated by a positive-tone (spacer is line) self-aligned sextuple patterning (SASP) process which accommodates two types of spacers, while gates are fabricated by a more pitch-relaxed self-aligned quadruple patterning (SAQP) process which only allows one type of spacer. A number of possible inverter and SRAM structures are identified and the related circuit multi-modality is studied using the developed failure-probability and yield models. It is shown that SRAM circuit yield is significantly impacted by the multi-modality of fins' spatial variations in an SRAM cell. The sensitivity of 6-transistor SRAM read/write failure probability to SASP process variations is calculated and the specific circuit type with the highest probability to fail in the reading/writing operation is identified. Our study suggests that the 6-transistor SRAM configuration may not be scalable to 7-nm half pitch and more robust SRAM circuit design needs to be researched.
Simultaneous fluorescence and quantitative phase microscopy with single-pixel detectors

NASA Astrophysics Data System (ADS)

Liu, Yang; Suo, Jinli; Zhang, Yuanlong; Dai, Qionghai

2018-02-01

Multimodal microscopy offers high flexibility for biomedical observation and diagnosis. Conventional multimodal approaches either use multiple cameras or a single camera spatially multiplexing different modes. The former needs expertise-demanding alignment and the latter suffers from limited spatial resolution. Here, we report an alignment-free full-resolution simultaneous fluorescence and quantitative phase imaging approach using single-pixel detectors. By combining reference-free interferometry with single-pixel detection, we encode the phase and fluorescence of the sample in two detection arms at the same time. Then we employ structured illumination and the correlated measurements between the sample and the illuminations for reconstruction. The recovered fluorescence and phase images are inherently aligned thanks to single-pixel detection. To validate the proposed method, we built a proof-of-concept setup for first imaging the phase of etched glass with the depth of a few hundred nanometers and then imaging the fluorescence and phase of the quantum dot drop. This method holds great potential for multispectral fluorescence microscopy with additional single-pixel detectors or a spectrometer. Besides, this cost-efficient multimodal system might find broad applications in biomedical science and neuroscience.
Introduction to bioinformatics.

PubMed

Can, Tolga

2014-01-01

Bioinformatics is an interdisciplinary field mainly involving molecular biology and genetics, computer science, mathematics, and statistics. Data intensive, large-scale biological problems are addressed from a computational point of view. The most common problems are modeling biological processes at the molecular level and making inferences from collected data. A bioinformatics solution usually involves the following steps: collect statistics from biological data; build a computational model; solve a computational modeling problem; test and evaluate a computational algorithm. This chapter gives a brief introduction to bioinformatics by first providing an introduction to biological terminology and then discussing some classical bioinformatics problems organized by the types of data sources. Sequence analysis is the analysis of DNA and protein sequences for clues regarding function and includes subproblems such as identification of homologs, multiple sequence alignment, searching sequence patterns, and evolutionary analyses. Protein structures are three-dimensional data and the associated problems are structure prediction (secondary and tertiary), analysis of protein structures for clues regarding function, and structural alignment. Gene expression data is usually represented as matrices and analysis of microarray data mostly involves statistical analysis, classification, and clustering approaches. Biological networks such as gene regulatory networks, metabolic pathways, and protein-protein interaction networks are usually modeled as graphs and graph theoretic approaches are used to solve associated problems such as construction and analysis of large-scale networks.
Three-dimensional matrix fiber alignment modulates cell migration and MT1-MMP utility by spatially and temporally directing protrusions

NASA Astrophysics Data System (ADS)

Fraley, Stephanie I.; Wu, Pei-Hsun; He, Lijuan; Feng, Yunfeng; Krisnamurthy, Ranjini; Longmore, Gregory D.; Wirtz, Denis

2015-10-01

Multiple attributes of the three-dimensional (3D) extracellular matrix (ECM) have been independently implicated as regulators of cell motility, including pore size, crosslink density, structural organization, and stiffness. However, these parameters cannot be independently varied within a complex 3D ECM protein network. We present an integrated, quantitative study of these parameters across a broad range of complex matrix configurations using self-assembling 3D collagen and show how each parameter relates to the others and to cell motility. Increasing collagen density resulted in a decrease and then an increase in both pore size and fiber alignment, which both correlated significantly with cell motility but not bulk matrix stiffness within the range tested. However, using the crosslinking enzyme Transglutaminase II to alter microstructure independently of density revealed that motility is most significantly predicted by fiber alignment. Cellular protrusion rate, protrusion orientation, speed of migration, and invasion distance showed coupled biphasic responses to increasing collagen density not predicted by 2D models or by stiffness, but instead by fiber alignment. The requirement of matrix metalloproteinase (MMP) activity was also observed to depend on microstructure, and a threshold of MMP utility was identified. Our results suggest that fiber topography guides protrusions and thereby MMP activity and motility.
A series of PDB related databases for everyday needs.

PubMed

Joosten, Robbie P; te Beek, Tim A H; Krieger, Elmar; Hekkelman, Maarten L; Hooft, Rob W W; Schneider, Reinhard; Sander, Chris; Vriend, Gert

2011-01-01

The Protein Data Bank (PDB) is the world-wide repository of macromolecular structure information. We present a series of databases that run parallel to the PDB. Each database holds one entry, if possible, for each PDB entry. DSSP holds the secondary structure of the proteins. PDBREPORT holds reports on the structure quality and lists errors. HSSP holds a multiple sequence alignment for all proteins. The PDBFINDER holds easy to parse summaries of the PDB file content, augmented with essentials from the other systems. PDB_REDO holds re-refined, and often improved, copies of all structures solved by X-ray. WHY_NOT summarizes why certain files could not be produced. All these systems are updated weekly. The data sets can be used for the analysis of properties of protein structures in areas ranging from structural genomics, to cancer biology and protein design.
Effect of the sequence data deluge on the performance of methods for detecting protein functional residues.

PubMed

Garrido-Martín, Diego; Pazos, Florencio

2018-02-27

The exponential accumulation of new sequences in public databases is expected to improve the performance of all the approaches for predicting protein structural and functional features. Nevertheless, this was never assessed or quantified for some widely used methodologies, such as those aimed at detecting functional sites and functional subfamilies in protein multiple sequence alignments. Using raw protein sequences as the only input, these approaches can detect fully conserved positions, as well as those with a family-dependent conservation pattern. Both types of residues are routinely used as predictors of functional sites and, consequently, understanding how the sequence content of the databases affects them is relevant and timely. In this work we evaluate how the growth and change with time in the content of sequence databases affect five sequence-based approaches for detecting functional sites and subfamilies. We do that by recreating historical versions of the multiple sequence alignments that would have been obtained in the past based on the database contents at different time points, covering a period of 20 years. Applying the methods to these historical alignments allows quantifying the temporal variation in their performance. Our results show that the number of families to which these methods can be applied sharply increases with time, while their ability to detect potentially functional residues remains almost constant. These results are informative for the methods' developers and final users, and may have implications in the design of new sequencing initiatives.
Deblurring of Class-Averaged Images in Single-Particle Electron Microscopy.

PubMed

Park, Wooram; Madden, Dean R; Rockmore, Daniel N; Chirikjian, Gregory S

2010-03-01

This paper proposes a method for deblurring of class-averaged images in single-particle electron microscopy (EM). Since EM images of biological samples are very noisy, the images which are nominally identical projection images are often grouped, aligned and averaged in order to cancel or reduce the background noise. However, the noise in the individual EM images generates errors in the alignment process, which creates an inherent limit on the accuracy of the resulting class averages. This inaccurate class average due to the alignment errors can be viewed as the result of a convolution of an underlying clear image with a blurring function. In this work, we develop a deconvolution method that gives an estimate for the underlying clear image from a blurred class-averaged image using precomputed statistics of misalignment. Since this convolution is over the group of rigid body motions of the plane, SE(2), we use the Fourier transform for SE(2) in order to convert the convolution into a matrix multiplication in the corresponding Fourier space. For practical implementation we use a Hermite-function-based image modeling technique, because Hermite expansions enable lossless Cartesian-polar coordinate conversion using the Laguerre-Fourier expansions, and Hermite expansion and Laguerre-Fourier expansion retain their structures under the Fourier transform. Based on these mathematical properties, we can obtain the deconvolution of the blurred class average using simple matrix multiplication. Tests of the proposed deconvolution method using synthetic and experimental EM images confirm the performance of our method.
Mirror instability and origin of morningside auroral structure

NASA Technical Reports Server (NTRS)

Chiu, Y. T.; Schulz, M.; Fennell, J. F.; Kishi, A. M.

1983-01-01

Auroral optical imagery shows marked differences between auroral features of the evening and morning sectors: the separation between diffuse and discrete auroras in the evening sector is not distinct in the morning sector, which is dominated by auroral patches and multiple banded structures aligned along some direction. Plasma distribution function signatures also show marked differences: downward electron beams and inverted-V signatures prefer the evening sector, while the electron spectra on the morning sector are similar to the diffuse aurora. A theory of morningside auroras consistent with these features was constructed. The theory is based on modulation of the growth rates of electron cyclotron waves by the mirror instability, which is in turn driven by inward-convected ions that have become anisotropic. This modulation produces alternating bands of enhanced and reduced electron precipitation which approximate the observed multiple auroral bands and patches of the morning sector.
Special Focus

PubMed Central

Nawrocki, Eric P.; Burge, Sarah W.

2013-01-01

The development of RNA bioinformatic tools began more than 30 years ago with the description of the Nussinov and Zuker dynamic programming algorithms for single sequence RNA secondary structure prediction. Since then, many tools have been developed for various RNA sequence analysis problems such as homology search, multiple sequence alignment, de novo RNA discovery, read-mapping, and many more. In this issue, we have collected a sampling of reviews and original research that demonstrate some of the many ways bioinformatics is integrated with current RNA biology research. PMID:23948768
RPG: the Ribosomal Protein Gene database.

PubMed

Nakao, Akihiro; Yoshihama, Maki; Kenmochi, Naoya

2004-01-01

RPG (http://ribosome.miyazaki-med.ac.jp/) is a new database that provides detailed information about ribosomal protein (RP) genes. It contains data from humans and other organisms, including Drosophila melanogaster, Caenorhabditis elegans, Saccharomyces cerevisiae, Methanococcus jannaschii and Escherichia coli. Users can search the database by gene name and organism. Each record includes sequences (genomic, cDNA and amino acid sequences), intron/exon structures, genomic locations and information about orthologs. In addition, users can view and compare the gene structures of the above organisms and make multiple amino acid sequence alignments. RPG also provides information on small nucleolar RNAs (snoRNAs) that are encoded in the introns of RP genes.
RPG: the Ribosomal Protein Gene database

PubMed Central

Nakao, Akihiro; Yoshihama, Maki; Kenmochi, Naoya

2004-01-01

RPG (http://ribosome.miyazaki-med.ac.jp/) is a new database that provides detailed information about ribosomal protein (RP) genes. It contains data from humans and other organisms, including Drosophila melanogaster, Caenorhabditis elegans, Saccharomyces cerevisiae, Methanococcus jannaschii and Escherichia coli. Users can search the database by gene name and organism. Each record includes sequences (genomic, cDNA and amino acid sequences), intron/exon structures, genomic locations and information about orthologs. In addition, users can view and compare the gene structures of the above organisms and make multiple amino acid sequence alignments. RPG also provides information on small nucleolar RNAs (snoRNAs) that are encoded in the introns of RP genes. PMID:14681386
Evol and ProDy for bridging protein sequence evolution and structural dynamics

PubMed Central

Mao, Wenzhi; Liu, Ying; Chennubhotla, Chakra; Lezon, Timothy R.; Bahar, Ivet

2014-01-01

Correlations between sequence evolution and structural dynamics are of utmost importance in understanding the molecular mechanisms of function and their evolution. We have integrated Evol, a new package for fast and efficient comparative analysis of evolutionary patterns and conformational dynamics, into ProDy, a computational toolbox designed for inferring protein dynamics from experimental and theoretical data. Using information-theoretic approaches, Evol coanalyzes conservation and coevolution profiles extracted from multiple sequence alignments of protein families with their inferred dynamics. Availability and implementation: ProDy and Evol are open-source and freely available under MIT License from http://prody.csb.pitt.edu/. Contact: bahar@pitt.edu PMID:24849577
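As an illustration of the information-theoretic conservation profiles mentioned above, the following Python sketch computes the Shannon entropy of each column of a multiple sequence alignment. It is written in plain Python and does not use or reproduce the Evol or ProDy API; the function name column_entropy and the choice to ignore gap characters are assumptions.

```python
import math
from collections import Counter


def column_entropy(msa, gap="-"):
    """Shannon entropy (in bits) of each alignment column.

    msa -- list of equal-length aligned sequences
    Low entropy indicates a conserved column, high entropy a variable one.
    """
    if len({len(s) for s in msa}) > 1:
        raise ValueError("aligned sequences must have equal length")
    entropies = []
    for column in zip(*msa):
        residues = [c for c in column if c != gap]  # assumed: gaps ignored
        total = len(residues)
        if total == 0:
            entropies.append(0.0)
            continue
        counts = Counter(residues)
        h = sum(-(n / total) * math.log2(n / total) for n in counts.values())
        entropies.append(h)
    return entropies
```

A fully conserved column yields 0 bits, and a column with four equally frequent residues yields 2 bits. Coevolution measures such as mutual information extend the same counting to pairs of columns.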
GOSSIP: a method for fast and accurate global alignment of protein structures.

PubMed

Kifer, I; Nussinov, R; Wolfson, H J

2011-04-01

The database of known protein structures (PDB) is increasing rapidly. This results in a growing need for methods that can cope with the vast amount of structural data. To analyze the accumulating data, it is important to have a fast tool for identifying similar structures and clustering them by structural resemblance. Several excellent tools have been developed for the comparison of protein structures. These usually address the task of local structure alignment, an important yet computationally intensive problem due to its complexity. It is difficult to use such tools for comparing a large number of structures to each other at a reasonable time. Here we present GOSSIP, a novel method for a global all-against-all alignment of any set of protein structures. The method detects similarities between structures down to a certain cutoff (a parameter of the program), hence allowing it to detect similar structures at a much higher speed than local structure alignment methods. GOSSIP compares many structures in times which are several orders of magnitude faster than well-known available structure alignment servers, and it is also faster than a database scanning method. We evaluate GOSSIP both on a dataset of short structural fragments and on two large sequence-diverse structural benchmarks. Our conclusions are that for a threshold of 0.6 and above, the speed of GOSSIP is obtained with no compromise of the accuracy of the alignments or of the number of detected global similarities. A server, as well as an executable for download, are available at http://bioinfo3d.cs.tau.ac.il/gossip/.

Alignment method for solar collector arrays

DOEpatents

Driver, Jr., Richard B

2012-10-23

The present invention is directed to an improved method for establishing camera fixture location for aligning mirrors on a solar collector array (SCA) comprising multiple mirror modules. The method aligns the mirrors on a module by comparing the location of the receiver image in photographs with the predicted theoretical receiver image location. To accurately align an entire SCA, a common reference is used for all of the individual module images within the SCA. The improved method can use relative pixel location information in digital photographs along with alignment fixture inclinometer data to calculate relative locations of the fixture between modules. The absolute locations are determined by minimizing alignment asymmetry for the SCA. The method inherently aligns all of the mirrors in an SCA to the receiver, even with receiver position and module-to-module alignment errors.
The twilight zone of cis element alignments.

PubMed

Sebastian, Alvaro; Contreras-Moreira, Bruno

2013-02-01

Sequence alignment of proteins and nucleic acids is a routine task in bioinformatics. Although the comparison of complete peptides, genes or genomes can be undertaken with a great variety of tools, the alignment of short DNA sequences and motifs entails pitfalls that have not been fully addressed yet. Here we confront the structural superposition of transcription factors with the sequence alignment of their recognized cis elements. Our goals are (i) to test TFcompare (http://floresta.eead.csic.es/tfcompare), a structural alignment method for protein-DNA complexes; (ii) to benchmark the pairwise alignment of regulatory elements; (iii) to define the confidence limits and the twilight zone of such alignments and (iv) to evaluate the relevance of these thresholds with elements obtained experimentally. We find that the structure of cis elements and protein-DNA interfaces is significantly more conserved than their sequence and measures how this correlates with alignment errors when only sequence information is considered. Our results confirm that DNA motifs in the form of matrices produce better alignments than individual sequences. Finally, we report that empirical and theoretically derived twilight thresholds are useful for estimating the natural plasticity of regulatory sequences, and hence for filtering out unreliable alignments.
The twilight zone of cis element alignments

PubMed Central

Sebastian, Alvaro; Contreras-Moreira, Bruno

2013-01-01

Sequence alignment of proteins and nucleic acids is a routine task in bioinformatics. Although the comparison of complete peptides, genes or genomes can be undertaken with a great variety of tools, the alignment of short DNA sequences and motifs entails pitfalls that have not been fully addressed yet. Here we confront the structural superposition of transcription factors with the sequence alignment of their recognized cis elements. Our goals are (i) to test TFcompare (http://floresta.eead.csic.es/tfcompare), a structural alignment method for protein–DNA complexes; (ii) to benchmark the pairwise alignment of regulatory elements; (iii) to define the confidence limits and the twilight zone of such alignments and (iv) to evaluate the relevance of these thresholds with elements obtained experimentally. We find that the structure of cis elements and protein–DNA interfaces is significantly more conserved than their sequence and measures how this correlates with alignment errors when only sequence information is considered. Our results confirm that DNA motifs in the form of matrices produce better alignments than individual sequences. Finally, we report that empirical and theoretically derived twilight thresholds are useful for estimating the natural plasticity of regulatory sequences, and hence for filtering out unreliable alignments. PMID:23268451
Psychometric Evaluation of the Overexcitability Questionnaire-Two Applying Bayesian Structural Equation Modeling (BSEM) and Multiple-Group BSEM-Based Alignment with Approximate Measurement Invariance

PubMed Central

De Bondt, Niki; Van Petegem, Peter

2015-01-01

The Overexcitability Questionnaire-Two (OEQ-II) measures the degree and nature of overexcitability, which assists in determining the developmental potential of an individual according to Dabrowski's Theory of Positive Disintegration. Previous validation studies using frequentist confirmatory factor analysis, which postulates exact parameter constraints, led to model rejection and a long series of model modifications. Bayesian structural equation modeling (BSEM) allows the application of zero-mean, small-variance priors for cross-loadings, residual covariances, and differences in measurement parameters across groups, better reflecting substantive theory and leading to better model fit and less overestimation of factor correlations. Our BSEM analysis with a sample of 516 students in higher education yields positive results regarding the factorial validity of the OEQ-II. Likewise, applying BSEM-based alignment with approximate measurement invariance, the absence of non-invariant factor loadings and intercepts across gender is supportive of the psychometric quality of the OEQ-II. Compared to males, females scored significantly higher on emotional and sensual overexcitability, and significantly lower on psychomotor overexcitability. PMID:26733931
Psychometric Evaluation of the Overexcitability Questionnaire-Two Applying Bayesian Structural Equation Modeling (BSEM) and Multiple-Group BSEM-Based Alignment with Approximate Measurement Invariance.

PubMed

De Bondt, Niki; Van Petegem, Peter

2015-01-01

The Overexcitability Questionnaire-Two (OEQ-II) measures the degree and nature of overexcitability, which assists in determining the developmental potential of an individual according to Dabrowski's Theory of Positive Disintegration. Previous validation studies using frequentist confirmatory factor analysis, which postulates exact parameter constraints, led to model rejection and a long series of model modifications. Bayesian structural equation modeling (BSEM) allows the application of zero-mean, small-variance priors for cross-loadings, residual covariances, and differences in measurement parameters across groups, better reflecting substantive theory and leading to better model fit and less overestimation of factor correlations. Our BSEM analysis with a sample of 516 students in higher education yields positive results regarding the factorial validity of the OEQ-II. Likewise, applying BSEM-based alignment with approximate measurement invariance, the absence of non-invariant factor loadings and intercepts across gender is supportive of the psychometric quality of the OEQ-II. Compared to males, females scored significantly higher on emotional and sensual overexcitability, and significantly lower on psychomotor overexcitability.
Population-based structural variation discovery with Hydra-Multi.

PubMed

Lindberg, Michael R; Hall, Ira M; Quinlan, Aaron R

2015-04-15

Current strategies for SNP and INDEL discovery incorporate sequence alignments from multiple individuals to maximize sensitivity and specificity. It is widely accepted that this approach also improves structural variant (SV) detection. However, multisample SV analysis has been stymied by the fundamental difficulties of SV calling, e.g. library insert size variability, SV alignment signal integration and detecting long-range genomic rearrangements involving disjoint loci. Extant tools suffer from poor scalability, which limits the number of genomes that can be co-analyzed and complicates analysis workflows. We have developed an approach that enables multisample SV analysis in hundreds to thousands of human genomes using commodity hardware. Here, we describe Hydra-Multi and measure its accuracy, speed and scalability using publicly available datasets provided by The 1000 Genomes Project and by The Cancer Genome Atlas (TCGA). Availability: Hydra-Multi is written in C++ and is freely available at https://github.com/arq5x/Hydra. Contact: aaronquinlan@gmail.com or ihall@genome.wustl.edu. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
Identifying functionally informative evolutionary sequence profiles.

PubMed

Gil, Nelson; Fiser, Andras

2018-04-15

Multiple sequence alignments (MSAs) can provide essential input to many bioinformatics applications, including protein structure prediction and functional annotation. However, the optimal selection of sequences to obtain biologically informative MSAs for such purposes is poorly explored, and has traditionally been performed manually. We present Selection of Alignment by Maximal Mutual Information (SAMMI), an automated, sequence-based approach to objectively select an optimal MSA from a large set of alternatives sampled from a general sequence database search. The hypothesis of this approach is that the mutual information among MSA columns will be maximal for those MSAs that contain the most diverse set possible of the most structurally and functionally homogeneous protein sequences. SAMMI was tested to select MSAs for functional site residue prediction by analysis of conservation patterns on a set of 435 proteins obtained from protein-ligand (peptides, nucleic acids and small substrates) and protein-protein interaction databases. Availability and implementation: A freely accessible program, including source code, implementing SAMMI is available at https://github.com/nelsongil92/SAMMI.git. Contact: andras.fiser@einstein.yu.edu. Supplementary data are available at Bioinformatics online.
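The quantity at the heart of the SAMMI hypothesis, mutual information among MSA columns, can be illustrated with a minimal sketch. This is not the SAMMI program: the function names, the averaging over all column pairs, and the treatment of gaps as ordinary symbols are assumptions made here for illustration only.

```python
from collections import Counter
from math import log2


def mutual_information(msa, i, j):
    """Mutual information (in bits) between columns i and j of an MSA.

    `msa` is a list of equal-length aligned sequences (strings).
    Gap characters are counted like any other symbol (an assumption).
    """
    n = len(msa)
    col_i = [seq[i] for seq in msa]
    col_j = [seq[j] for seq in msa]
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), c in count_ij.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_i[a] * count_j[b]))
    return mi


def mean_pairwise_mi(msa):
    """Average MI over all column pairs; a hypothetical whole-alignment score."""
    length = len(msa[0])
    pairs = [(i, j) for i in range(length) for j in range(i + 1, length)]
    return sum(mutual_information(msa, i, j) for i, j in pairs) / len(pairs)


if __name__ == "__main__":
    coupled = ["AC", "AC", "GT", "GT"]      # columns vary together
    independent = ["AC", "AT", "GC", "GT"]  # columns vary independently
    print(mutual_information(coupled, 0, 1))
    print(mutual_information(independent, 0, 1))
```

In the toy input, the two columns of `coupled` determine each other, while those of `independent` carry no information about each other. A selection procedure in this spirit would compute such a score for each candidate MSA and keep the highest-scoring one.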
Reconstruction of interatomic vectors by principle component analysis of nuclear magnetic resonance data in multiple alignments

NASA Astrophysics Data System (ADS)

Hus, Jean-Christophe; Bruschweiler, Rafael

2002-07-01

A general method is presented for the reconstruction of interatomic vector orientations from nuclear magnetic resonance (NMR) spectroscopic data of tensor interactions of rank 2, such as dipolar coupling and chemical shielding anisotropy interactions, in solids and partially aligned liquid-state systems. The method, called PRIMA, is based on a principal component analysis of the covariance matrix of the NMR parameters collected for multiple alignments. The five nonzero eigenvalues and their eigenvectors efficiently allow the approximate reconstruction of the vector orientations of the underlying interactions. The method is demonstrated for an isotropic distribution of sample orientations as well as for finite sets of orientations and internuclear vectors encountered in protein systems.
Pre-calculated protein structure alignments at the RCSB PDB website.

PubMed

Prlic, Andreas; Bliven, Spencer; Rose, Peter W; Bluhm, Wolfgang F; Bizon, Chris; Godzik, Adam; Bourne, Philip E

2010-12-01

With the continuous growth of the RCSB Protein Data Bank (PDB), providing an up-to-date systematic structure comparison of all protein structures poses an ever growing challenge. Here, we present a comparison tool for calculating both 1D protein sequence and 3D protein structure alignments. This tool supports various applications at the RCSB PDB website. First, a structure alignment web service calculates pairwise alignments. Second, a stand-alone application runs alignments locally and visualizes the results. Third, pre-calculated 3D structure comparisons for the whole PDB are provided and updated on a weekly basis. These three applications allow users to discover novel relationships between proteins available either at the RCSB PDB or provided by the user. A web user interface is available at http://www.rcsb.org/pdb/workbench/workbench.do. The source code is available under the LGPL license from http://www.biojava.org. A source bundle, prepared for local execution, is available from http://source.rcsb.org. Contact: andreas@sdsc.edu; pbourne@ucsd.edu.
Alignment Pins for Assembling and Disassembling Structures

NASA Technical Reports Server (NTRS)

Campbell, Oliver C.

2008-01-01

Simple, easy-to-use, highly effective tooling has been devised for maintaining alignment of bolt holes in mating structures during assembly and disassembly of the structures. The tooling was originally used during removal of a body flap from the space shuttle Atlantis, in which misalignments during removal of the last few bolts could cause the bolts to bind in their holes. By suitably modifying the dimensions of the tooling components, the basic design of the tooling can readily be adapted to other structures that must be maintained in alignment. The tooling includes tapered, internally threaded alignment pins designed to fit in the bolt holes in one of the mating structures, plus a draw bolt and a cup that are used to install or remove each alignment pin. In preparation for disassembly of two mating structures, external supports are provided to prevent unintended movement of the structures. During disassembly of the structures, as each bolt that joins the structures is removed, an alignment pin is installed in its place. Once all the bolts have been removed and replaced with pins, the pins maintain alignment as the structures are gently pushed or pulled apart on the supports. In assembling the two structures, one reverses the procedure described above: pins are installed in the bolt holes, the structures are pulled or pushed together on the supports, then the pins are removed and replaced with bolts. The figure depicts the tooling and its use. To install an alignment pin in a bolt hole in a structural panel, the tapered end of the pin is inserted from one side of the panel, the cup is placed over the pin on the opposite side of the panel, the draw bolt is inserted through the cup and threaded into the pin, the draw bolt is tightened to pull the pin until the pin is seated firmly in the hole, then the draw bolt and cup are removed, leaving the pin in place. 
To remove an alignment pin, the cup is placed over the pin on the first-mentioned side of the panel, the draw bolt is inserted through the cup and threaded into the pin, then the draw bolt is tightened to pull the pin out of the hole.
Dynamic/Jitter Assessment of Multiple Potential HabEx Structural Designs

NASA Technical Reports Server (NTRS)

Knight, J. Brent; Stahl, H. Philip; Singleton, Andy; Hunt, Ron; Therrell, Melissa; Caldwell, Kate; Garcia, Jay; Baysinger, Mike

2017-01-01

One of the driving structural requirements of the Habitable Exo-Planet (HabEx) telescope is to maintain Line Of Sight (LOS) stability between the Primary Mirror (PM) and Secondary Mirror (SM) of <= 5 mas. Dynamic analyses of two configurations of a proposed HabEx 4 meter off-axis telescope structure were performed to predict effects of jitter on primary/secondary mirror alignment. The dynamic disturbance used as the forcing function was the James Webb Space Telescope reaction wheel assembly vibration emission specification level. The objective of these analyses was to predict "order-of-magnitude" performance for various structural configurations which will roll into efforts to define the HabEx structural design's global architecture. Two variations of the basic architectural design were analyzed. Relative motion between the PM and the SM for each design configuration is reported.
Dynamic/jitter assessment of multiple potential HabEx structural designs

NASA Astrophysics Data System (ADS)

Knight, J. Brent; Stahl, H. Philip; Singleton, Andy; Hunt, Ron; Therrell, Melissa; Caldwell, Kate; Garcia, Jay; Baysinger, Mike

2017-09-01

One of the driving structural requirements of the Habitable Exo-Planet (HabEx) telescope is to maintain Line Of Sight (LOS) stability between the Primary Mirror (PM) and Secondary Mirror (SM) of <= 5 milli-arc seconds (mas). Dynamic analyses of two configurations of a proposed HabEx 4 meter off-axis telescope structure were performed to predict effects of a vibration input on primary/secondary mirror alignment. The dynamic disturbance used as the forcing function was the James Webb Space Telescope reaction wheel assembly vibration emission specification level. The objective of these analyses was to predict "order-of-magnitude" performance for various structural configurations which contribute to efforts in defining the HabEx structural design's global architecture. Two variations of the basic architectural design were analyzed. Relative motion between the PM and the SM for each design configuration is reported.
Purification, developmental expression, and in silico characterization of α-amylase inhibitor from Echinochloa frumentacea.

PubMed

Panwar, Priyankar; Verma, A K; Dubey, Ashutosh

2018-05-01

Barnyard (Echinochloa frumentacea) and finger (Eleusine coracana) millet growing in the northwestern Himalaya were explored for the α-amylase inhibitor (α-AI). The mature seeds of barnyard millet variety PRJ1 had maximum α-AI activity, which increases across developmental stages. α-AI was purified up to 22.25-fold from barnyard millet variety PRJ1. Semi-quantitative PCR of different developmental stages of barnyard millet seeds showed increased levels of the transcript from 7 to 28 days. Sequence analysis revealed a 315 bp nucleotide sequence encoding a 104-amino-acid protein with a molecular weight of 10.72 kDa. The predicted 3D structure of α-AI was 86.73% similar to a bifunctional inhibitor of ragi. In silico analysis of 71 α-AI protein sequences was carried out for biochemical features, homology search, multiple sequence alignment, phylogenetic tree construction, motif, and superfamily distribution of protein sequences. Analysis of the multiple sequence alignment revealed the conserved region NPLP[S/G]CRWYVV[S/Q][Q/R]TCG[V/I] throughout the sequences. Superfam analysis revealed that α-AI protein sequences were distributed among seven different superfamilies.
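The conserved region reported above is written in a bracket notation where `[S/G]` means "S or G at this position". As a hedged illustration of how such a pattern could be searched for, the sketch below converts it to a regular expression. The helper names and the test sequence are hypothetical and do not come from the study.

```python
import re

# Conserved region as written in the abstract.
MOTIF = "NPLP[S/G]CRWYVV[S/Q][Q/R]TCG[V/I]"


def motif_to_regex(motif):
    """Turn '[S/G]'-style alternatives into regex character classes like '[SG]'."""
    body = re.sub(
        r"\[([^\]]+)\]",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        motif,
    )
    return re.compile(body)


def find_motif(sequence, motif=MOTIF):
    """Return the 0-based start positions of every motif occurrence."""
    return [m.start() for m in motif_to_regex(motif).finditer(sequence)]


if __name__ == "__main__":
    # Hypothetical toy sequence, made up for illustration only.
    toy = "MKNPLPSCRWYVVSQTCGVAA"
    print(motif_to_regex(MOTIF).pattern)
    print(find_motif(toy))
```

The same conversion works for any motif that uses this slash-separated bracket notation, provided the motif contains no other regex metacharacters.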
Ligand Binding Site Detection by Local Structure Alignment and Its Performance Complementarity

PubMed Central

Lee, Hui Sun; Im, Wonpil

2013-01-01

Accurate determination of potential ligand binding sites (BS) is a key step for protein function characterization and structure-based drug design. Despite promising results of template-based BS prediction methods using global structure alignment (GSA), there is room to improve the performance by properly incorporating local structure alignment (LSA) because BS are local structures and often similar for proteins with dissimilar global folds. We present a template-based ligand BS prediction method using G-LoSA, our LSA tool. A large benchmark set validation shows that G-LoSA predicts drug-like ligands’ positions in single-chain protein targets more precisely than TM-align, a GSA-based method, while the overall success rate of TM-align is better. G-LoSA is particularly efficient for accurate detection of local structures conserved across proteins with diverse global topologies. Recognizing the performance complementarity of G-LoSA to TM-align and a non-template geometry-based method, fpocket, a robust consensus scoring method, CMCS-BSP (Complementary Methods and Consensus Scoring for ligand Binding Site Prediction), is developed and shows improvement on prediction accuracy. The G-LoSA source code is freely available at http://im.bioinformatics.ku.edu/GLoSA. PMID:23957286
CMSA: a heterogeneous CPU/GPU computing system for multiple similar RNA/DNA sequence alignment.

PubMed

Chen, Xi; Wang, Chen; Tang, Shanjiang; Yu, Ce; Zou, Quan

2017-06-24

The multiple sequence alignment (MSA) is a classic and powerful technique for sequence analysis in bioinformatics. With the rapid growth of biological datasets, MSA parallelization becomes necessary to keep its running time at an acceptable level. Although there is a lot of work on MSA problems, existing approaches are either insufficient or contain implicit assumptions that limit their generality. First, the information about users' sequences, including the sizes of datasets and the lengths of sequences, can take arbitrary values and is generally unknown before submission, which previous work unfortunately ignores. Second, the center star strategy is suited for aligning similar sequences, but its first stage, center sequence selection, is highly time-consuming and requires further optimization. Moreover, given the heterogeneous CPU/GPU platform, prior studies consider MSA parallelization on GPU devices only, leaving the CPUs idle during the computation. Co-run computation, however, can maximize the utilization of the computing resources by running the workload on both CPU and GPU simultaneously. This paper presents CMSA, a robust and efficient MSA system for large-scale datasets on the heterogeneous CPU/GPU platform. It performs and optimizes multiple sequence alignment automatically for users' submitted sequences without any assumptions. CMSA adopts the co-run computation model so that both CPU and GPU devices are fully utilized. Moreover, CMSA proposes an improved center star strategy that reduces the time complexity of its center sequence selection process from O(mn^2) to O(mn). The experimental results show that CMSA achieves an up to 11× speedup and outperforms the state-of-the-art software. CMSA focuses on the multiple similar RNA/DNA sequence alignment and proposes a novel bitmap based algorithm to improve the center star strategy.
We can conclude that harvesting the high performance of modern GPU is a promising approach to accelerate multiple sequence alignment. Besides, adopting the co-run computation model can maximize the entire system utilization significantly. The source code is available at https://github.com/wangvsa/CMSA.
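For readers unfamiliar with the center star strategy, the following sketch shows the textbook form of its first stage, center sequence selection: the center is the sequence with the smallest total distance to all others. This is the plain all-pairs baseline, not CMSA's bitmap-based algorithm. The use of Levenshtein distance as the pairwise measure and the function names are assumptions made for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance via dynamic programming with two rolling rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def select_center(seqs):
    """Return (index of center sequence, total distance per sequence).

    Every pair of sequences is compared once, which is the costly step
    that optimized center selection schemes aim to avoid.
    """
    n = len(seqs)
    totals = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            d = edit_distance(seqs[i], seqs[j])
            totals[i] += d
            totals[j] += d
    center = min(range(n), key=totals.__getitem__)
    return center, totals


if __name__ == "__main__":
    seqs = ["ACGT", "ACGA", "ACCT", "TCGT"]
    print(select_center(seqs))
```

Once the center is chosen, the remaining stages of the strategy align every other sequence to it pairwise and merge those alignments into one MSA. The sketch stops at selection because that is the stage the abstract singles out.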
Resolving the multiple sequence alignment problem using biogeography-based optimization with multiple populations.

PubMed

Zemali, El-Amine; Boukra, Abdelmadjid

2015-08-01

The multiple sequence alignment (MSA) is one of the most challenging problems in bioinformatics; it involves discovering similarity between a set of protein or DNA sequences. This paper introduces a new method for the MSA problem called biogeography-based optimization with multiple populations (BBOMP). It is based on a recent metaheuristic inspired by the mathematics of biogeography, named biogeography-based optimization (BBO). To improve the exploration ability of BBO, we have introduced a new concept allowing better exploration of the search space. It consists of manipulating multiple populations, each with its own parameters. These parameters are used to build up progressive alignments, allowing more diversity. At each iteration, the best solution found is injected into each population. Moreover, to improve solution quality, six operators are defined. These operators are selected with a dynamic probability which changes according to each operator's efficiency. To test the performance of the proposed approach, we have considered a set of datasets from Balibase 2.0 and compared it with many recent algorithms such as GAPAM, MSA-GA, QEAMSA and RBT-GA. The results show that the proposed approach achieves a better average score than the previously cited methods.
Structure-Based Sequence Alignment of the Transmembrane Domains of All Human GPCRs: Phylogenetic, Structural and Functional Implications

PubMed Central

Cvicek, Vaclav; Goddard, William A.; Abrol, Ravinder

2016-01-01

The understanding of G-protein coupled receptors (GPCRs) is undergoing a revolution due to increased information about their signaling and the experimental determination of structures for more than 25 receptors. The availability of at least one receptor structure for each of the GPCR classes, well separated in sequence space, enables an integrated superfamily-wide analysis to identify signatures involving the role of conserved residues, conserved contacts, and downstream signaling in the context of receptor structures. In this study, we align the transmembrane (TM) domains of all experimental GPCR structures to maximize the conserved inter-helical contacts. The resulting superfamily-wide GpcR Sequence-Structure (GRoSS) alignment of the TM domains for all human GPCR sequences is sufficient to generate a phylogenetic tree that correctly distinguishes all different GPCR classes, suggesting that the class-level differences in the GPCR superfamily are encoded at least partly in the TM domains. The inter-helical contacts conserved across all GPCR classes describe the evolutionarily conserved GPCR structural fold. The corresponding structural alignment of the inactive and active conformations, available for a few GPCRs, identifies activation hot-spot residues in the TM domains that get rewired upon activation. Many GPCR mutations, known to alter receptor signaling and cause disease, are located at these conserved contact and activation hot-spot residue positions. The GRoSS alignment places the chemosensory receptor subfamilies for bitter taste (TAS2R) and pheromones (Vomeronasal, VN1R) in the rhodopsin family, known to contain the chemosensory olfactory receptor subfamily. The GRoSS alignment also enables the quantification of the structural variability in the TM regions of experimental structures, useful for homology modeling and structure prediction of receptors. Furthermore, this alignment identifies structurally and functionally important residues in all human GPCRs. 
These residues can be used to make testable hypotheses about the structural basis of receptor function and about the molecular basis of disease-associated single nucleotide polymorphisms. PMID:27028541
Evolution of physician-hospital alignment models: a case study of comanagement.

PubMed

Sowers, Kevin W; Newman, Paul R; Langdon, Jeffrey C

2013-06-01

Recently, quality, financial, and regulatory demands have driven physicians to seek alignment opportunities with hospitals. The motivation for alignment on the part of physicians and hospitals is now accelerating because the new paradigm under healthcare reform requires an increased focus on improving quality, cost, and efficiency. We (1) identify the key drivers for physician-hospital alignment models; (2) summarize comanagement as a physician-hospital alignment model; and (3) explore a detailed case study of comanagement as an option to better align physicians with hospital goals on quality, safety, and outcomes. A Medline abstract review was performed that identified 45 references that discuss options for physician-hospital alignment. None of the articles identified provide a detailed example of successful alignment structures. A detailed case study of a successful comanagement alignment program is reviewed. The key drivers for alignment are inpatient growth rates, declining reimbursements, and the opportunity to improve quality, decrease costs, and increase efficiency. Two general strategies of alignment involve noneconomic and/or economic integration. In our example, comanagement with economic integration was chosen as the preferred structure for physician-hospital alignment. The choice of structure will vary depending on the existing relationships and governance of the hospital and the physicians in the targeted area of focus. The measure of success in building physician-hospital alignment is measured in improvements in care for the patient, reduced cost of care delivery, and improved relations between physicians and hospital leadership.
Disruption of collagen/apatite alignment impairs bone mechanical function in osteoblastic metastasis induced by prostate cancer.

PubMed

Sekita, Aiko; Matsugaki, Aira; Nakano, Takayoshi

2017-04-01

Prostate cancer (PCa) frequently metastasizes to the bone, generally inducing osteoblastic alterations that increase bone brittleness. Although there is growing interest in the management of the physical capability of patients with bone metastasis, the mechanism underlying the impairment of bone mechanical function remains unclear. The alignment of both collagen fibrils and biological apatite (BAp) c-axis, together with bone mineral density, is one of the strongest contributors to bone mechanical function. In this study, we analyzed the bone microstructure of the mouse femurs with and without PCa cell inoculation. Histological assessment revealed that the bone-forming pattern in the PCa-bearing bone was non-directional, resulting in a spongious structure, whereas that in the control bone was unidirectional and layer-by-layer, resulting in a compact lamellar structure. The degree of preferential alignment of collagen fibrils and BAp, which was evaluated by quantitative polarized microscopy and microbeam X-ray diffraction, respectively, were significantly lower in the PCa-bearing bone than in the control bone. Material parameters including Young's modulus and toughness, measured by the three-point bending test, were simultaneously decreased in the PCa-bearing bone. Specifically, there was a significant positive correlation between the degree of BAp c-axis orientation and Young's modulus. In conclusion, the impairment of mechanical function in the PCa-bearing bone is attributable to disruption of the anisotropic microstructure of bone in multiple phases. This is the first report demonstrating that cancer bone metastasis induces disruption of the collagen/BAp alignment in long bones, thereby impairing their mechanical function. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Analyzing and synthesizing phylogenies using tree alignment graphs.

PubMed

Smith, Stephen A; Brown, Joseph W; Hinchliff, Cody E

2013-01-01

Phylogenetic trees are used to analyze and visualize evolution. However, trees can be imperfect datatypes when summarizing multiple trees. This is especially problematic when accommodating for biological phenomena such as horizontal gene transfer, incomplete lineage sorting, and hybridization, as well as topological conflict between datasets. Additionally, researchers may want to combine information from sets of trees that have partially overlapping taxon sets. To address the problem of analyzing sets of trees with conflicting relationships and partially overlapping taxon sets, we introduce methods for aligning, synthesizing and analyzing rooted phylogenetic trees within a graph, called a tree alignment graph (TAG). The TAG can be queried and analyzed to explore uncertainty and conflict. It can also be synthesized to construct trees, presenting an alternative to supertrees approaches. We demonstrate these methods with two empirical datasets. In order to explore uncertainty, we constructed a TAG of the bootstrap trees from the Angiosperm Tree of Life project. Analysis of the resulting graph demonstrates that areas of the dataset that are unresolved in majority-rule consensus tree analyses can be understood in more detail within the context of a graph structure, using measures incorporating node degree and adjacency support. As an exercise in synthesis (i.e., summarization of a TAG constructed from the alignment trees), we also construct a TAG consisting of the taxonomy and source trees from a recent comprehensive bird study. We synthesized this graph into a tree that can be reconstructed in a repeatable fashion and where the underlying source information can be updated. The methods presented here are tractable for large scale analyses and serve as a basis for an alternative to consensus tree and supertree methods. Furthermore, the exploration of these graphs can expose structures and patterns within the dataset that are otherwise difficult to observe.

Analyzing and Synthesizing Phylogenies Using Tree Alignment Graphs

PubMed Central

Smith, Stephen A.; Brown, Joseph W.; Hinchliff, Cody E.

2013-01-01

Phylogenetic trees are used to analyze and visualize evolution. However, trees can be imperfect datatypes when summarizing multiple trees. This is especially problematic when accommodating for biological phenomena such as horizontal gene transfer, incomplete lineage sorting, and hybridization, as well as topological conflict between datasets. Additionally, researchers may want to combine information from sets of trees that have partially overlapping taxon sets. To address the problem of analyzing sets of trees with conflicting relationships and partially overlapping taxon sets, we introduce methods for aligning, synthesizing and analyzing rooted phylogenetic trees within a graph, called a tree alignment graph (TAG). The TAG can be queried and analyzed to explore uncertainty and conflict. It can also be synthesized to construct trees, presenting an alternative to supertrees approaches. We demonstrate these methods with two empirical datasets. In order to explore uncertainty, we constructed a TAG of the bootstrap trees from the Angiosperm Tree of Life project. Analysis of the resulting graph demonstrates that areas of the dataset that are unresolved in majority-rule consensus tree analyses can be understood in more detail within the context of a graph structure, using measures incorporating node degree and adjacency support. As an exercise in synthesis (i.e., summarization of a TAG constructed from the alignment trees), we also construct a TAG consisting of the taxonomy and source trees from a recent comprehensive bird study. We synthesized this graph into a tree that can be reconstructed in a repeatable fashion and where the underlying source information can be updated. The methods presented here are tractable for large scale analyses and serve as a basis for an alternative to consensus tree and supertree methods. Furthermore, the exploration of these graphs can expose structures and patterns within the dataset that are otherwise difficult to observe. PMID:24086118
HAlign-II: efficient ultra-large multiple sequence alignment and phylogenetic tree reconstruction with distributed and parallel computing.

PubMed

Wan, Shixiang; Zou, Quan

2017-01-01

Multiple sequence alignment (MSA) plays a key role in biological sequence analyses, especially in phylogenetic tree construction. The extreme increase in next-generation sequencing data has resulted in a shortage of efficient ultra-large biological sequence alignment approaches that can cope with different sequence types. Distributed and parallel computing represents a crucial technique for accelerating ultra-large (e.g. files of more than 1 GB) sequence analyses. Based on HAlign and the Spark distributed computing system, we implement a highly cost-efficient and time-efficient tool, HAlign-II, to address ultra-large multiple biological sequence alignment and phylogenetic tree construction. Experiments on large-scale DNA and protein data sets, with files of more than 1 GB, showed that HAlign-II could save time and space. It outperformed the current software tools. HAlign-II can efficiently carry out MSA and construct phylogenetic trees with ultra-large numbers of biological sequences. HAlign-II shows extremely high memory efficiency and scales well with increases in computing resources. HAlign-II provides a user-friendly web server based on our distributed computing infrastructure. HAlign-II with open-source codes and datasets was established at http://lab.malab.cn/soft/halign.
QuickProbs 2: Towards rapid construction of high-quality alignments of large protein families

PubMed Central

Gudyś, Adam; Deorowicz, Sebastian

2017-01-01

The ever-increasing size of sequence databases, caused by the development of high-throughput sequencing, poses one of the greatest challenges yet to multiple alignment algorithms. As we show, well-established techniques employed for increasing alignment quality, i.e., refinement and consistency, are ineffective when large protein families are investigated. We present QuickProbs 2, an algorithm for multiple sequence alignment. Based on probabilistic models, equipped with novel column-oriented refinement and selective consistency, it offers outstanding accuracy. When analysing hundreds of sequences, QuickProbs 2 is noticeably better than ClustalΩ and MAFFT, the previous leaders for processing numerous protein families. In the case of smaller sets, for which consistency-based methods are the best performing, QuickProbs 2 is also superior to the competitors. Due to the low computational requirements of selective consistency and the utilization of massively parallel architectures, the presented algorithm has similar execution times to ClustalΩ, and is orders of magnitude faster than full consistency approaches, like MSAProbs or PicXAA. All these make QuickProbs 2 an excellent tool for aligning families ranging from a few to hundreds of proteins. PMID:28139687
SINA: accurate high-throughput multiple sequence alignment of ribosomal RNA genes.

PubMed

Pruesse, Elmar; Peplies, Jörg; Glöckner, Frank Oliver

2012-07-15

In the analysis of homologous sequences, computation of multiple sequence alignments (MSAs) has become a bottleneck. This is especially troublesome for marker genes like the ribosomal RNA (rRNA) where already millions of sequences are publicly available and individual studies can easily produce hundreds of thousands of new sequences. Methods have been developed to cope with such numbers, but further improvements are needed to meet accuracy requirements. In this study, we present the SILVA Incremental Aligner (SINA) used to align the rRNA gene databases provided by the SILVA ribosomal RNA project. SINA uses a combination of k-mer searching and partial order alignment (POA) to maintain very high alignment accuracy while satisfying high throughput performance demands. SINA was evaluated in comparison with the commonly used high throughput MSA programs PyNAST and mothur. The three BRAliBase III benchmark MSAs could be reproduced with 99.3, 97.6 and 96.1% accuracy. A larger benchmark MSA comprising 38 772 sequences could be reproduced with 98.9 and 99.3% accuracy using reference MSAs comprising 1000 and 5000 sequences. SINA was able to achieve higher accuracy than PyNAST and mothur in all performed benchmarks. Alignment of up to 500 sequences using the latest SILVA SSU/LSU Ref datasets as reference MSA is offered at http://www.arb-silva.de/aligner. This page also links to Linux binaries, user manual and tutorial. SINA is made available under a personal use license.
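The k-mer searching step mentioned in this abstract can be illustrated with a minimal sketch. It is not SINA's implementation: the function names `kmers` and `rank_references`, the toy sequences, and the choice to score by the number of shared distinct k-mers are all assumptions made for illustration. The idea is only that reference sequences sharing many k-mers with a query are good candidates to align against.

```python
def kmers(seq, k):
    """Return the set of all distinct length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def rank_references(query, references, k=4, top=2):
    """Rank reference sequences by the number of distinct k-mers shared with the query.

    `references` maps a (hypothetical) reference name to its sequence.
    """
    query_kmers = kmers(query, k)
    scored = []
    for name, seq in references.items():
        shared = len(query_kmers & kmers(seq, k))
        scored.append((shared, name))
    # Highest shared count first; ties are broken by name so the result is deterministic.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored[:top]]


# Toy data, invented for this example.
references = {
    "ref_a": "ACGUACGUGGCCAUAG",
    "ref_b": "UUUUGGGGCCCCAAAA",
    "ref_c": "ACGUACGAGGCCAUAG",
}
query = "ACGUACGUGGCCAUAG"
print(rank_references(query, references))  # prints ['ref_a', 'ref_c']
```

In a real aligner the selected references would then be passed on to the alignment stage (partial order alignment in SINA's case), which this sketch does not attempt.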
Multi-profile Bayesian alignment model for LC-MS data analysis with integration of internal standards

PubMed Central

Tsai, Tsung-Heng; Tadesse, Mahlet G.; Di Poto, Cristina; Pannell, Lewis K.; Mechref, Yehia; Wang, Yue; Ressom, Habtom W.

2013-01-01

Motivation: Liquid chromatography-mass spectrometry (LC-MS) has been widely used for profiling expression levels of biomolecules in various ‘-omic’ studies including proteomics, metabolomics and glycomics. Appropriate LC-MS data preprocessing steps are needed to detect true differences between biological groups. Retention time (RT) alignment, which is required to ensure that ion intensity measurements among multiple LC-MS runs are comparable, is one of the most important yet challenging preprocessing steps. Current alignment approaches estimate RT variability using either single chromatograms or detected peaks, but do not simultaneously take into account the complementary information embedded in the entire LC-MS data. Results: We propose a Bayesian alignment model for LC-MS data analysis. The alignment model provides estimates of the RT variability along with uncertainty measures. The model enables integration of multiple sources of information including internal standards and clustered chromatograms in a mathematically rigorous framework. We apply the model to LC-MS metabolomic, proteomic and glycomic data. The performance of the model is evaluated based on ground-truth data, by measuring correlation of variation, RT difference across runs and peak-matching performance. We demonstrate that the Bayesian alignment model significantly improves RT alignment performance through appropriate integration of relevant information. Availability and implementation: MATLAB code, raw and preprocessed LC-MS data are available at http://omics.georgetown.edu/alignLCMS.html Contact: hwr@georgetown.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24013927
Optimal Alignment of Structures for Finite and Periodic Systems.

PubMed

Griffiths, Matthew; Niblett, Samuel P; Wales, David J

2017-10-10

Finding the optimal alignment between two structures is important for identifying the minimum root-mean-square distance (RMSD) between them and as a starting point for calculating pathways. Most current algorithms for aligning structures are stochastic, scale exponentially with the size of structure, and the performance can be unreliable. We present two complementary methods for aligning structures corresponding to isolated clusters of atoms and to condensed matter described by a periodic cubic supercell. The first method (Go-PERMDIST), a branch and bound algorithm, locates the global minimum RMSD deterministically in polynomial time. The run time increases for larger RMSDs. The second method (FASTOVERLAP) is a heuristic algorithm that aligns structures by finding the global maximum kernel correlation between them using fast Fourier transforms (FFTs) and fast SO(3) transforms (SOFTs). For periodic systems, FASTOVERLAP scales with the square of the number of identical atoms in the system, reliably finds the best alignment between structures that are not too distant, and shows significantly better performance than existing algorithms. The expected run time for Go-PERMDIST is longer than FASTOVERLAP for periodic systems. For finite clusters, the FASTOVERLAP algorithm is competitive with existing algorithms. The expected run time for Go-PERMDIST to find the global RMSD between two structures deterministically is generally longer than for existing stochastic algorithms. However, with an earlier exit condition, Go-PERMDIST exhibits similar or better performance.
Depth image super-resolution via semi self-taught learning framework

NASA Astrophysics Data System (ADS)

Zhao, Furong; Cao, Zhiguo; Xiao, Yang; Zhang, Xiaodi; Xian, Ke; Li, Ruibo

2017-06-01

Depth images have recently attracted much attention in computer vision and high-quality 3D content for 3DTV and 3D movies. In this paper, we present a new semi self-taught learning application framework for enhancing the resolution of depth maps without making use of ancillary color image data at the target resolution, or multiple aligned depth maps. Our framework consists of cascaded random forests that proceed from coarse to fine results. We learn the surface information and structure transformations both from a small set of high-quality depth exemplars and from the input depth map itself across different scales. Considering that edges play an important role in depth map quality, we optimize an effective regularized objective that is calculated on the output image space and the input edge space in the random forests. Experiments show the effectiveness and superiority of our method against other techniques with or without applying aligned RGB information.
Hierarchically Structured Electrospun Scaffolds with Chemically Conjugated Growth Factor for Ligament Tissue Engineering.

PubMed

Pauly, Hannah M; Sathy, Binulal N; Olvera, Dinorath; McCarthy, Helen O; Kelly, Daniel J; Popat, Ketul C; Dunne, Nicholas J; Haut Donahue, Tammy Lynn

2017-08-01

The anterior cruciate ligament (ACL) of the knee is vital for proper joint function and is commonly ruptured during sports injuries or car accidents. Due to a lack of intrinsic healing capacity and drawbacks with allografts and autografts, there is a need for a tissue-engineered ACL replacement. Our group has previously used aligned sheets of electrospun polycaprolactone nanofibers to develop solid cylindrical bundles of longitudinally aligned nanofibers. We have shown that these nanofiber bundles support cell proliferation and elongation and the hierarchical structure and material properties are similar to the native human ACL. It is possible to combine multiple nanofiber bundles to create a scaffold that attempts to mimic the macroscale structure of the ACL. The goal of this work was to develop a hierarchical bioactive scaffold for ligament tissue engineering using connective tissue growth factor (CTGF)-conjugated nanofiber bundles and evaluate the behavior of mesenchymal stem cells (MSCs) on these scaffolds in vitro and in vivo. CTGF was immobilized onto the surface of individual nanofiber bundles or scaffolds consisting of multiple nanofiber bundles. The conjugation efficiency and the release of conjugated CTGF were assessed using X-ray photoelectron spectroscopy, assays, and immunofluorescence staining. Scaffolds were seeded with MSCs and maintained in vitro for 7 days (individual nanofiber bundles), in vitro for 21 days (scaled-up scaffolds of 20 nanofiber bundles), or in vivo for 6 weeks (small scaffolds of 4 nanofiber bundles), and ligament-specific tissue formation was assessed in comparison to non-CTGF-conjugated control scaffolds. Results showed that CTGF conjugation encouraged cell proliferation and ligament-specific tissue formation in vitro and in vivo. The results suggest that hierarchical electrospun nanofiber bundles conjugated with CTGF are a scalable and bioactive scaffold for ACL tissue engineering.
ProBiS-database: precalculated binding site similarities and local pairwise alignments of PDB structures.

PubMed

Konc, Janez; Cesnik, Tomo; Konc, Joanna Trykowska; Penca, Matej; Janežič, Dušanka

2012-02-27

ProBiS-Database is a searchable repository of precalculated local structural alignments in proteins detected by the ProBiS algorithm in the Protein Data Bank. Identification of functionally important binding regions of the protein is facilitated by structural similarity scores mapped to the query protein structure. PDB structures that have been aligned with a query protein may be rapidly retrieved from the ProBiS-Database, which is thus able to generate hypotheses concerning the roles of uncharacterized proteins. Presented with an uncharacterized protein structure, ProBiS-Database can discern relationships between such a query protein and other better-known proteins in the PDB. Fast access and a user-friendly graphical interface promote easy exploration of this database of over 420 million local structural alignments. The ProBiS-Database is updated weekly and is freely available online at http://probis.cmm.ki.si/database.
MSAViewer: interactive JavaScript visualization of multiple sequence alignments.

PubMed

Yachdav, Guy; Wilzbach, Sebastian; Rauscher, Benedikt; Sheridan, Robert; Sillitoe, Ian; Procter, James; Lewis, Suzanna E; Rost, Burkhard; Goldberg, Tatyana

2016-11-15

The MSAViewer is a quick and easy visualization and analysis JavaScript component for Multiple Sequence Alignment data of any size. Core features include interactive navigation through the alignment, application of popular color schemes, sorting, selecting and filtering. The MSAViewer is 'web ready': written entirely in JavaScript, compatible with modern web browsers and does not require any specialized software. The MSAViewer is part of the BioJS collection of components. The MSAViewer is released as open source software under the Boost Software License 1.0. Documentation, source code and the viewer are available at http://msa.biojs.net/. Supplementary information: Supplementary data are available at Bioinformatics online. Contact: msa@bio.sh.
MSAViewer: interactive JavaScript visualization of multiple sequence alignments

PubMed Central

Yachdav, Guy; Wilzbach, Sebastian; Rauscher, Benedikt; Sheridan, Robert; Sillitoe, Ian; Procter, James; Lewis, Suzanna E.; Rost, Burkhard; Goldberg, Tatyana

2016-01-01

Summary: The MSAViewer is a quick and easy visualization and analysis JavaScript component for Multiple Sequence Alignment data of any size. Core features include interactive navigation through the alignment, application of popular color schemes, sorting, selecting and filtering. The MSAViewer is ‘web ready’: written entirely in JavaScript, compatible with modern web browsers and does not require any specialized software. The MSAViewer is part of the BioJS collection of components. Availability and Implementation: The MSAViewer is released as open source software under the Boost Software License 1.0. Documentation, source code and the viewer are available at http://msa.biojs.net/. Supplementary information: Supplementary data are available at Bioinformatics online. Contact: msa@bio.sh PMID:27412096
Photovoltaic module and interlocked stack of photovoltaic modules

DOEpatents

Wares, Brian S.

2014-09-02

One embodiment relates to an arrangement of photovoltaic modules configured for transportation. The arrangement includes a plurality of photovoltaic modules, each photovoltaic module including a frame. A plurality of individual male alignment features and a plurality of individual female alignment features are included on each frame. Adjacent photovoltaic modules are interlocked by multiple individual male alignment features on a first module of the adjacent photovoltaic modules fitting into and being surrounded by corresponding individual female alignment features on a second module of the adjacent photovoltaic modules. Other embodiments, features and aspects are also disclosed.
CHROMA: consensus-based colouring of multiple alignments for publication.

PubMed

Goodstadt, L; Ponting, C P

2001-09-01

CHROMA annotates multiple protein sequence alignments by consensus to produce formatted and coloured text suitable for incorporation into other documents for publication. The package is designed to be flexible and reliable, and has a simple-to-use graphical user interface running under Microsoft Windows. Both the executables and source code for CHROMA running under Windows and Linux (portable command-line only) are freely available at http://www.lg.ndirect.co.uk/chroma. Software enquiries should be directed to CHROMA@lg.ndirect.co.uk.
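As a rough illustration of what annotating an alignment "by consensus" can mean, the sketch below computes one consensus character per column with a simple frequency threshold. This is an assumption-laden toy, not CHROMA's actual rule set (the abstract does not describe it): the function name `consensus_line`, the 80% threshold, the `.` placeholder and the example alignment are all invented here.

```python
from collections import Counter


def consensus_line(alignment, threshold=0.8, gap="-"):
    """Return one consensus character per alignment column.

    A residue is reported if it occurs in at least `threshold` of the
    sequences in that column; otherwise the column is marked with '.'.
    """
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    out = []
    for column in zip(*alignment):
        # Count residues only; gap characters never form the consensus.
        counts = Counter(c for c in column if c != gap)
        if counts:
            residue, n = counts.most_common(1)[0]
            if n / len(column) >= threshold:
                out.append(residue)
                continue
        out.append(".")
    return "".join(out)


# Toy alignment, invented for this example.
alignment = [
    "MKV-LAG",
    "MKVALSG",
    "MRV-LAG",
    "MKVALAG",
    "MKI-LAG",
]
print(consensus_line(alignment))  # prints MKV.LAG
```

A publication-oriented tool would go further and map each consensus class to a colour or text style; the sketch stops at the consensus string itself.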
ChromA: signal-based retention time alignment for chromatography-mass spectrometry data.

PubMed

Hoffmann, Nils; Stoye, Jens

2009-08-15

We describe ChromA, a web-based alignment tool for chromatography-mass spectrometry data from the metabolomics and proteomics domains. Users can supply their data in open and standardized file formats for retention time alignment using dynamic time warping with different configurable local distance and similarity functions. Additionally, user-defined anchors can be used to constrain and speed up the alignment. A neighborhood around each anchor can be added to increase the flexibility of the constrained alignment. ChromA offers different visualizations of the alignment for easier qualitative interpretation and comparison of the data. For the multiple alignment of more than two data files, the center-star approximation is applied to select a reference among input files to align to. ChromA is available at http://bibiserv.techfak.uni-bielefeld.de/chroma. Executables and source code under the L-GPL v3 license are provided for download at the same location.
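Dynamic time warping, the alignment technique named in this abstract, can be sketched in a few lines. The version below is the textbook unconstrained algorithm on one-dimensional signals, with a pluggable local distance function; it is not ChromA's code, and it leaves out the anchor constraints and neighborhoods the abstract describes. The name `dtw_path` and the toy signals are assumptions for illustration.

```python
def dtw_path(a, b, dist=lambda x, y: abs(x - y)):
    """Align two 1-D signals with classic dynamic time warping.

    Returns (total_cost, path), where path is a list of (i, j) index pairs
    matching a[i] to b[j]. `dist` is the configurable local distance.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Trace back from the end of both signals to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),  # diagonal: advance both signals
            (cost[i - 1][j], i - 1, j),          # advance a only
            (cost[i][j - 1], i, j - 1),          # advance b only
        ]
        _, i, j = min(steps, key=lambda s: s[0])
    path.reverse()
    return cost[n][m], path


# Toy signals: `shifted` is `signal` delayed by one sample.
signal = [0, 1, 5, 1, 0]
shifted = [0, 0, 1, 5, 1, 0]
total, path = dtw_path(signal, shifted)
print(total, path)
```

The quadratic cost table is what anchors help with in practice: fixing known correspondences restricts the region of the table that has to be filled.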
Spatio-temporal alignment of multiple sensors

NASA Astrophysics Data System (ADS)

Zhang, Tinghua; Ni, Guoqiang; Fan, Guihua; Sun, Huayan; Yang, Biao

2018-01-01

To achieve spatio-temporal alignment of multiple sensors on the same platform for space target observation, a joint spatio-temporal alignment method is proposed. To calibrate the parameters and measure the attitude of the cameras, an astronomical calibration method is proposed based on star chart simulation and collinear invariant features of quadrilateral diagonals between the observed star charts. To satisfy temporal correspondence and spatial alignment similarity simultaneously, the method, which builds on the astronomical calibration and attitude measurement, formulates video alignment so as to fold the spatial and temporal alignment into a joint alignment framework. The advantage of this method is reinforced by exploiting the similarities and prior knowledge of the velocity vector field between adjacent frames, which is calculated by the SIFT Flow algorithm. The proposed method provides the highest spatio-temporal alignment accuracy compared to the state-of-the-art methods on sequences recorded from multiple sensors at different times.
A Stochastic Point Cloud Sampling Method for Multi-Template Protein Comparative Modeling.

PubMed

Li, Jilong; Cheng, Jianlin

2016-05-10

Generating tertiary structural models for a target protein from the known structure of its homologous template proteins and their pairwise sequence alignment is a key step in protein comparative modeling. Here, we developed a new stochastic point cloud sampling method, called MTMG, for multi-template protein model generation. The method first superposes the backbones of template structures, and the Cα atoms of the superposed templates form a point cloud for each position of a target protein, which are represented by a three-dimensional multivariate normal distribution. MTMG stochastically resamples the positions for Cα atoms of the residues whose positions are uncertain from the distribution, and accepts or rejects new position according to a simulated annealing protocol, which effectively removes atomic clashes commonly encountered in multi-template comparative modeling. We benchmarked MTMG on 1,033 sequence alignments generated for CASP9, CASP10 and CASP11 targets, respectively. Using multiple templates with MTMG improves the GDT-TS score and TM-score of structural models by 2.96-6.37% and 2.42-5.19% on the three datasets over using single templates. MTMG's performance was comparable to Modeller in terms of GDT-TS score, TM-score, and GDT-HA score, while the average RMSD was improved by a new sampling approach. The MTMG software is freely available at: http://sysbio.rnet.missouri.edu/multicom_toolbox/mtmg.html.
A Stochastic Point Cloud Sampling Method for Multi-Template Protein Comparative Modeling

PubMed Central

Li, Jilong; Cheng, Jianlin

2016-01-01

Generating tertiary structural models for a target protein from the known structure of its homologous template proteins and their pairwise sequence alignment is a key step in protein comparative modeling. Here, we developed a new stochastic point cloud sampling method, called MTMG, for multi-template protein model generation. The method first superposes the backbones of template structures, and the Cα atoms of the superposed templates form a point cloud for each position of a target protein, which are represented by a three-dimensional multivariate normal distribution. MTMG stochastically resamples the positions for Cα atoms of the residues whose positions are uncertain from the distribution, and accepts or rejects new position according to a simulated annealing protocol, which effectively removes atomic clashes commonly encountered in multi-template comparative modeling. We benchmarked MTMG on 1,033 sequence alignments generated for CASP9, CASP10 and CASP11 targets, respectively. Using multiple templates with MTMG improves the GDT-TS score and TM-score of structural models by 2.96–6.37% and 2.42–5.19% on the three datasets over using single templates. MTMG’s performance was comparable to Modeller in terms of GDT-TS score, TM-score, and GDT-HA score, while the average RMSD was improved by a new sampling approach. The MTMG software is freely available at: http://sysbio.rnet.missouri.edu/multicom_toolbox/mtmg.html. PMID:27161489
GeneBee-net: Internet-based server for analyzing biopolymers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brodsky, L.I.; Ivanov, V.V.; Nikolaev, V.K.

This work describes a network server for searching databanks of biopolymer structures and performing other biocomputing procedures; it is available via direct Internet connection. Basic server procedures are dedicated to homology (similarity) search of sequence and 3D structure of proteins. The homologies found could be used to build multiple alignments, predict protein and RNA secondary structure, and construct phylogenetic trees. In addition to traditional methods of sequence similarity search, the authors propose “non-matrix” (correlational) search. An analogous approach is used to identify regions of similar tertiary structure of proteins. Algorithm concepts and usage examples are presented for new methods. Service logic is based upon interaction of a client program and server procedures. The client program allows the compilation of queries and the processing of results of an analysis.
Bioinspired Design: Magnetic Freeze Casting

NASA Astrophysics Data System (ADS)

Porter, Michael Martin

Nature is the ultimate experimental scientist, having billions of years of evolution to design, test, and adapt a variety of multifunctional systems for a plethora of diverse applications. Next-generation materials that draw inspiration from the structure-property-function relationships of natural biological materials have led to many high-performance structural materials with hybrid, hierarchical architectures that fit form to function. In this dissertation, a novel materials processing method, magnetic freeze casting, is introduced to develop porous scaffolds and hybrid composites with micro-architectures that emulate bone, abalone nacre, and other hard biological materials. This method uses ice as a template to form ceramic-based materials with continuously, interconnected microstructures and magnetic fields to control the alignment of these structures in multiple directions. The resulting materials have anisotropic properties with enhanced mechanical performance that have potential applications as bone implants or lightweight structural composites, among others.
Organizational Structure and Strategy. Symposium 30. [Concurrent Symposium Session at AHRD Annual Conference, 2000.]

ERIC Educational Resources Information Center

2000

This packet contains four papers on organizational structure and strategy from a symposium on human resource development (HRD). The first paper, "Exploring Alignment: A Comparative Case Study of Alignment in Two Organizations" (Steven W. Semler), reports on a case study that compared the results of an alignment measurement instrument…

Micro-scale and meso-scale architectural cues cooperate and compete to direct aligned tissue formation

PubMed Central

Gilchrist, Christopher L.; Ruch, David S.; Little, Dianne; Guilak, Farshid

2014-01-01

Tissue and biomaterial microenvironments provide architectural cues that direct important cell behaviors including cell shape, alignment, migration, and resulting tissue formation. These architectural features may be presented to cells across multiple length scales, from nanometers to millimeters in size. In this study, we examined how architectural cues at two distinctly different length scales, “micro-scale” cues on the order of ~1–2 μm, and “meso-scale” cues several orders of magnitude larger (>100 μm), interact to direct aligned neo-tissue formation. Utilizing a micro-photopatterning (μPP) model system to precisely arrange cell-adhesive patterns, we examined the effects of substrate architecture at these length scales on human mesenchymal stem cell (hMSC) organization, gene expression, and fibrillar collagen deposition. Both micro- and meso-scale architectures directed cell alignment and resulting tissue organization, and when combined, meso cues could enhance or compete against micro-scale cues. As meso boundary aspect ratios were increased, meso-scale cues overrode micro-scale cues and controlled tissue alignment, with a characteristic critical width (~500 μm) similar to boundary dimensions that exist in vivo in highly aligned tissues. Meso-scale cues acted via both lateral confinement (in a cell-density-dependent manner) and by permitting end-to-end cell arrangements that yielded greater fibrillar collagen deposition. Despite large differences in fibrillar collagen content and organization between μPP architectural conditions, these changes did not correspond with changes in gene expression of key matrix or tendon-related genes. These findings highlight the complex interplay between geometric cues at multiple length scales and may have implications for tissue engineering strategies, where scaffold designs that incorporate cues at multiple length scales could improve neo-tissue organization and resulting functional outcomes. PMID:25263687
Biopython: freely available Python tools for computational molecular biology and bioinformatics.

PubMed

Cock, Peter J A; Antao, Tiago; Chang, Jeffrey T; Chapman, Brad A; Cox, Cymon J; Dalke, Andrew; Friedberg, Iddo; Hamelryck, Thomas; Kauff, Frank; Wilczynski, Bartek; de Hoon, Michiel J L

2009-06-01

The Biopython project is a mature open source international collaboration of volunteer developers, providing Python libraries for a wide range of bioinformatics problems. Biopython includes modules for reading and writing different sequence file formats and multiple sequence alignments, dealing with 3D macromolecular structures, interacting with common tools such as BLAST, ClustalW and EMBOSS, accessing key online databases, as well as providing numerical methods for statistical learning. Biopython is freely available, with documentation and source code at www.biopython.org under the Biopython license.
Stepwise Elastic Behavior in a Model Elastomer

NASA Astrophysics Data System (ADS)

Bhawe, Dhananjay M.; Cohen, Claude; Escobedo, Fernando A.

2004-12-01

Monte Carlo simulations of an entanglement-free cross-linked polymer network of semiflexible chains reveal a peculiar stepwise elastic response. For increasing stress, step jumps in strain are observed that do not correlate with changes in the number of aligned chains. We show that this unusual behavior stems from the ability of the system to form multiple ordered chain domains that exclude the cross-linking species. This novel elastomer shows a toughening behavior similar to that observed in biological structural materials, such as muscle proteins and abalone shell adhesive.
Cellular and Nuclear Alignment Analysis for Determining Epithelial Cell Chirality

PubMed Central

Raymond, Michael J.; Ray, Poulomi; Kaur, Gurleen; Singh, Ajay V.; Wan, Leo Q.

2015-01-01

Left-right (LR) asymmetry is a biologically conserved property in living organisms that can be observed in the asymmetrical arrangement of organs and tissues and in tissue morphogenesis, such as the directional looping of the gastrointestinal tract and heart. The expression of LR asymmetry in embryonic tissues can be appreciated in biased cell alignment. Previously, an in vitro chirality assay was reported that patterned multiple cells on microscale defined geometries and quantified the cell phenotype–dependent LR asymmetry, or cell chirality. However, the morphology and chirality of individual cells on micropatterned surfaces have not been well characterized. Here, a Python-based algorithm was developed to identify and quantify immunofluorescence-stained individual epithelial cells on multicellular patterns. This approach not only produces results similar to the image intensity gradient-based method reported previously, but also can capture properties of single cells such as area and aspect ratio. We also found that cell nuclei exhibited biased alignment. Around 35% of cells were misaligned and were typically smaller and less elongated. This new imaging analysis approach is an effective tool for measuring single cell chirality inside multicellular structures and can potentially help unveil biophysical mechanisms underlying cellular chiral bias both in vitro and in vivo. PMID:26294010
A scalable architecture for extracting, aligning, linking, and visualizing multi-Int data

NASA Astrophysics Data System (ADS)

Knoblock, Craig A.; Szekely, Pedro

2015-05-01

An analyst today has a tremendous amount of data available, but each of the various data sources typically exists in its own silo, so an analyst has limited ability to see an integrated view of the data and has little or no access to contextual information that could help in understanding the data. We have developed the Domain-Insight Graph (DIG) system, an innovative architecture for extracting, aligning, linking, and visualizing massive amounts of domain-specific content from unstructured sources. Under the DARPA Memex program we have already successfully applied this architecture to multiple application domains, including the enormous international problem of human trafficking, where we extracted, aligned and linked data from 50 million online Web pages. DIG builds on our Karma data integration toolkit, which makes it easy to rapidly integrate structured data from a variety of sources, including databases, spreadsheets, XML, JSON, and Web services. The ability to integrate Web services allows Karma to pull in live data from the various social media sites, such as Twitter, Instagram, and OpenStreetMaps. DIG then indexes the integrated data and provides an easy-to-use interface for query, visualization, and analysis.
Structural and optical properties of semi-polar (11-22) InGaN/GaN green light-emitting diode structure

NASA Astrophysics Data System (ADS)

Zhao, Guijuan; Wang, Lianshan; Li, Huijie; Meng, Yulin; Li, Fangzheng; Yang, Shaoyan; Wang, Zhanguo

2018-01-01

Semi-polar (11-22) InGaN multiple quantum well (MQW) green light-emitting diode (LED) structures have been realized by metal-organic chemical vapor deposition on an m-plane sapphire substrate. By introducing double GaN buffer layers, we improve the crystal quality of semi-polar (11-22) GaN significantly. The vertical alignment of the diffraction peaks in the (11-22) X-ray reciprocal space mapping indicates the fully strained MQW on the GaN layer. The photoluminescence spectra of the LED structure show stronger emission intensity along the [1-100] InGaN/GaN direction. The electroluminescence emission of the LED structure is very broad with peaks around 550 nm and 510 nm at the 100 mA current injection for samples A and B, respectively, and exhibits a significant blue-shift with increasing drive current.
Structural basis for the substrate specificity of PepA from Streptococcus pneumoniae, a dodecameric tetrahedral protease.

PubMed

Kim, Doyoun; San, Boi Hoa; Moh, Sang Hyun; Park, Hyejin; Kim, Dong Young; Lee, Sangho; Kim, Kyeong Kyu

2010-01-01

Regulated cytosolic proteolysis is one of the key cellular processes ensuring proper functioning of a cell. M42 family proteases show a broad spectrum of substrate specificities, but structural understanding of this diversity lags behind the biochemical data. Here we report the crystal structure of PepA from Streptococcus pneumoniae, a glutamyl aminopeptidase belonging to the M42 family (SpPepA). We found that Arg-257 in the substrate binding pocket is strategically positioned so that it can make electrostatic interactions with the acidic residue at the N-terminus of a substrate. Structural comparison of the substrate binding pocket of the M42 family proteases, along with the structure-based multiple sequence alignment, argues that the appropriate electrostatic interactions contribute to the selective substrate specificity of SpPepA. Copyright 2009 Elsevier Inc. All rights reserved.
Customisation of the exome data analysis pipeline using a combinatorial approach.

PubMed

Pattnaik, Swetansu; Vaidyanathan, Srividya; Pooja, Durgad G; Deepak, Sa; Panda, Binay

2012-01-01

The advent of next generation sequencing (NGS) technologies has revolutionised the way biologists produce, analyse and interpret data. Although NGS platforms provide a cost-effective way to discover genome-wide variants from a single experiment, variants discovered by NGS need follow up validation due to the high error rates associated with various sequencing chemistries. Recently, whole exome sequencing has been proposed as an affordable option compared to whole genome runs but it still requires follow up validation of all the novel exomic variants. Customarily, a consensus approach is used to overcome the systematic errors inherent to the sequencing technology, alignment and post alignment variant detection algorithms. However, this approach requires multiple sequencing chemistries, multiple alignment tools and multiple variant callers, which may not be viable in terms of time and money for individual investigators with limited informatics know-how. Biologists often lack the requisite training to deal with the huge amount of data produced by NGS runs and face difficulty in choosing from the list of freely available analytical tools for NGS data analysis. Hence, there is a need to customise the NGS data analysis pipeline to preferentially retain true variants by minimising the incidence of false positives and to make the choice of the right analytical tools easier. To this end, we have sampled different freely available tools used at the alignment and post alignment stages, suggesting the use of the most suitable combination determined by a simple framework of pre-existing metrics to create significant datasets.
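
The consensus approach mentioned in this abstract can be illustrated with a minimal sketch. It assumes each caller's output has already been reduced to a set of (chromosome, position, ref, alt) tuples; the caller names, the variants, and the `min_callers` threshold are invented for illustration and are not taken from the article.

```python
from collections import Counter

def consensus_variants(calls_by_caller, min_callers=2):
    """Keep variants reported by at least `min_callers` independent callers."""
    counts = Counter()
    for variants in calls_by_caller.values():
        counts.update(set(variants))  # each caller votes once per variant
    return {variant for variant, n in counts.items() if n >= min_callers}

# Hypothetical call sets from three callers (illustrative data only).
calls = {
    "caller_a": {("chr1", 1000, "A", "G"), ("chr2", 500, "C", "T")},
    "caller_b": {("chr1", 1000, "A", "G"), ("chr3", 42, "G", "A")},
    "caller_c": {("chr1", 1000, "A", "G"), ("chr2", 500, "C", "T")},
}

kept = consensus_variants(calls, min_callers=2)
print(sorted(kept))
```

Raising `min_callers` trades sensitivity for fewer false positives, which is the balance the abstract describes when it speaks of preferentially retaining true variants.
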
Bioinformatics prediction of siRNAs as potential antiviral agents against dengue viruses

PubMed Central

Villegas-Rosales, Paula M; Méndez-Tenorio, Alfonso; Ortega-Soto, Elizabeth; Barrón, Blanca L

2012-01-01

Dengue virus (DENV 1-4) represents the major emerging arthropod-borne viral infection in the world. Currently, there is neither an available vaccine nor a specific treatment. Hence, there is a need for antiviral drugs for these viral infections; we describe the prediction of short interfering RNAs (siRNAs) as potential therapeutic agents against the four DENV serotypes. Our strategy was to carry out a series of multiple alignments using the ClustalX program to find conserved sequences among the four DENV serotype genomes and obtain a consensus sequence for siRNA design. A highly conserved sequence among the four DENV serotypes, located in the coding sequence for the NS4B and NS5 proteins, was found. A total of 2,893 complete DENV genomes were downloaded from the NCBI, and after a depuration procedure to identify identical sequences, 220 complete DENV genomes were left. They were edited to select the NS4B and NS5 sequences, which were aligned to obtain a consensus sequence. Three different servers were used for siRNA design, and the resulting siRNAs were aligned to identify the most prevalent sequences. Three siRNAs were chosen: one targeted the genome region that encodes the NS4B protein, and the other two targeted the region encoding the NS5 protein. The predicted secondary structure of the DENV genomes was used to demonstrate that the siRNAs were able to target the viral genome, forming the double-stranded structures necessary to activate the RNA silencing machinery. PMID:22829722
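
Deriving a consensus sequence from a multiple alignment, as this abstract describes, can be sketched as a column-wise majority vote. This is a simplified illustration, not the ClustalX procedure used by the authors; the toy sequences, the `min_fraction` threshold, and the `N` placeholder for poorly conserved columns are assumptions.

```python
from collections import Counter

def consensus(aligned, min_fraction=0.5, ambiguous="N"):
    """Column-wise majority consensus of equal-length aligned sequences."""
    if not aligned:
        raise ValueError("no sequences given")
    if any(len(seq) != len(aligned[0]) for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    out = []
    for column in zip(*aligned):
        residues = [c for c in column if c != "-"]  # ignore gap characters
        if not residues:
            out.append("-")
            continue
        base, count = Counter(residues).most_common(1)[0]
        # Accept the majority base only if it covers enough of the column.
        out.append(base if count / len(column) >= min_fraction else ambiguous)
    return "".join(out)

# Toy alignment (invented); real input would be the aligned NS4B/NS5 regions.
alignment = ["ATGGC-A", "ATGGCTA", "ATCGCTA", "ATGGCTA"]
print(consensus(alignment))
```

Columns where no base reaches the threshold are marked ambiguous, which makes it easy to pick out the fully conserved stretches that would be candidate siRNA targets.
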
Automatic prediction of protein domains from sequence information using a hybrid learning system.

PubMed

Nagarajan, Niranjan; Yona, Golan

2004-06-12

We describe a novel method for detecting the domain structure of a protein from sequence information alone. The method is based on analyzing multiple sequence alignments that are derived from a database search. Multiple measures are defined to quantify the domain information content of each position along the sequence and are combined into a single predictor using a neural network. The output is further smoothed and post-processed using a probabilistic model to predict the most likely transition positions between domains. The method was assessed using the domain definitions in SCOP and CATH for proteins of known structure and was compared with several other existing methods. Our method performs well both in terms of accuracy and sensitivity. It improves significantly over the best methods available, even some of the semi-manual ones, while being fully automatic. Our method can also be used to suggest and verify domain partitions based on structural data. A few examples of predicted domain definitions and alternative partitions, as suggested by our method, are also discussed. An online domain-prediction server is available at http://biozon.org/tools/domains/
Automated batch fiducial-less tilt-series alignment in Appion using Protomo

PubMed Central

Noble, Alex J.; Stagg, Scott M.

2015-01-01

The field of electron tomography has benefited greatly from manual and semi-automated approaches to marker-based tilt-series alignment that have allowed for the structural determination of multitudes of in situ cellular structures as well as macromolecular structures of individual protein complexes. The emergence of complementary metal-oxide semiconductor detectors capable of detecting individual electrons has enabled the collection of low dose, high contrast images, opening the door for reliable correlation-based tilt-series alignment. Here we present a set of automated, correlation-based tilt-series alignment, contrast transfer function (CTF) correction, and reconstruction workflows for use in conjunction with the Appion/Leginon package that are primarily targeted at automating structure determination with cryogenic electron microscopy. PMID:26455557
Phylogenetic relationships within the cyst-forming nematodes (Nematoda, Heteroderidae) based on analysis of sequences from the ITS regions of ribosomal DNA.

PubMed

Subbotin, S A; Vierstraete, A; De Ley, P; Rowe, J; Waeyenberge, L; Moens, M; Vanfleteren, J R

2001-10-01

The ITS1, ITS2, and 5.8S gene sequences of nuclear ribosomal DNA from 40 taxa of the family Heteroderidae (including the genera Afenestrata, Cactodera, Heterodera, Globodera, Punctodera, Meloidodera, Cryphodera, and Thecavermiculatus) were sequenced and analyzed. The ITS regions displayed high levels of sequence divergence within Heteroderinae and compared to outgroup taxa. Unlike recent findings in root knot nematodes, ITS sequence polymorphism does not appear to complicate phylogenetic analysis of cyst nematodes. Phylogenetic analyses with maximum-parsimony, minimum-evolution, and maximum-likelihood methods were performed with a range of computer alignments, including elision and culled alignments. All multiple alignments and phylogenetic methods yielded similar basic structure for phylogenetic relationships of Heteroderidae. The cyst-forming nematodes are represented by six main clades corresponding to morphological characters and host specialization, with certain clades assuming different positions depending on alignment procedure and/or method of phylogenetic inference. Hypotheses of monophyly of Punctoderinae and Heteroderinae are, respectively, strongly and moderately supported by the ITS data across most alignments. Close relationships were revealed between the Avenae and the Sacchari groups and between the Humuli group and the species H. salixophila within Heteroderinae. The Goettingiana group occupies a basal position within this subfamily. The validity of the genera Afenestrata and Bidera was tested and is discussed based on molecular data. We conclude that ITS sequence data are appropriate for studies of relationships within the different species groups and less so for recovery of more ancient speciations within Heteroderidae. Copyright 2001 Academic Press.
Fuzzy adaptive iterative learning coordination control of second-order multi-agent systems with imprecise communication topology structure

NASA Astrophysics Data System (ADS)

Chen, Jiaxi; Li, Junmin

2018-02-01

In this paper, we investigate the perfect consensus problem for second-order linearly parameterised multi-agent systems (MAS) with imprecise communication topology structure. Takagi-Sugeno (T-S) fuzzy models are presented to describe the imprecise communication topology structure of leader-following MAS, and a distributed adaptive iterative learning control protocol is proposed with the dynamics of the leader unknown to any of the agents. The proposed protocol guarantees that the follower agents can track the leader perfectly on [0,T] for the consensus problem. Under an alignment condition, a sufficient condition for the consensus of the closed-loop MAS is given based on Lyapunov stability theory. Finally, a numerical example and a multiple pendulum system are given to illustrate the effectiveness of the proposed algorithm.
An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system.

PubMed

AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide

2015-11-19

Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. This database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
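
Position weight matrices, which this database uses to summarize binding affinities, can be sketched as follows. This is a generic log-odds construction under assumed settings (uniform background, a pseudocount of 1); the binding sites are invented, and the abstract does not say how the database's own matrices are derived.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from equal-length binding sites."""
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        pwm.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def score(pwm, seq):
    """Sum the per-position log-odds scores for a sequence of matching length."""
    return sum(col[base] for col, base in zip(pwm, seq))

def best_site(pwm, seq):
    """Return (score, offset) of the highest-scoring window in `seq`."""
    width = len(pwm)
    return max((score(pwm, seq[i:i + width]), i)
               for i in range(len(seq) - width + 1))

# Hypothetical binding sites (illustrative only).
sites = ["TTGACA", "TTGACT", "TTGATA", "TAGACA"]
pwm = build_pwm(sites)
print(best_site(pwm, "GGTTGACAGG"))
```

Because each column is scored independently, a matrix of this kind maps naturally onto a per-basepair coordinate system like the one the abstract describes.
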
In-plane, commensurate GaN/AlN junctions: single-layer composite structures, multiple quantum wells and quantum dots

NASA Astrophysics Data System (ADS)

Durgun, Engin; Onen, Abdullatif; Kecik, Deniz; Ciraci, Salim

In-plane composite structures constructed of the stripes or core/shells of single-layer GaN and AlN, which are joined commensurately, display a diversity of electronic properties that can be tuned by the size of their constituents. In heterostructures, the dimensionality of electrons changes from 2D to 1D upon their confinement in wide constituent stripes, leading to type-I band alignment and hence a multiple quantum well structure in the direct space. The δ-doping of one wide stripe by another narrow stripe results in local narrowing or widening of the band gap. The direct-indirect transition of the fundamental band gap of composite structures can be attained depending on the odd or even values of the formula unit in the armchair-edged heterojunction. In a patterned array of GaN/AlN core/shells, the dimensionality of the electronic states is reduced from 2D to 0D, forming multiple quantum dots in large GaN cores, while 2D electrons propagate in the multiply connected AlN shell as if they were in a supercrystal. These predictions are obtained from first-principles calculations based on density functional theory on single-layer GaN and AlN compound semiconductors, which were synthesized recently. This work was supported by the Scientific and Technological Research Council of Turkey (TUBITAK) under Project No 115F088.
BEAUTY: an enhanced BLAST-based search tool that integrates multiple biological information resources into sequence similarity search results.

PubMed

Worley, K C; Wiese, B A; Smith, R F

1995-09-01

BEAUTY (BLAST enhanced alignment utility) is an enhanced version of the NCBI's BLAST data base search tool that facilitates identification of the functions of matched sequences. We have created new data bases of conserved regions and functional domains for protein sequences in NCBI's Entrez data base, and BEAUTY allows this information to be incorporated directly into BLAST search results. A Conserved Regions Data Base, containing the locations of conserved regions within Entrez protein sequences, was constructed by (1) clustering the entire data base into families, (2) aligning each family using our PIMA multiple sequence alignment program, and (3) scanning the multiple alignments to locate the conserved regions within each aligned sequence. A separate Annotated Domains Data Base was constructed by extracting the locations of all annotated domains and sites from sequences represented in the Entrez, PROSITE, BLOCKS, and PRINTS data bases. BEAUTY performs a BLAST search of those Entrez sequences with conserved regions and/or annotated domains. BEAUTY then uses the information from the Conserved Regions and Annotated Domains data bases to generate, for each matched sequence, a schematic display that allows one to directly compare the relative locations of (1) the conserved regions, (2) annotated domains and sites, and (3) the locally aligned regions matched in the BLAST search. In addition, BEAUTY search results include World-Wide Web hypertext links to a number of external data bases that provide a variety of additional types of information on the function of matched sequences. This convenient integration of protein families, conserved regions, annotated domains, alignment displays, and World-Wide Web resources greatly enhances the biological informativeness of sequence similarity searches. BEAUTY searches can be performed remotely on our system using the "BCM Search Launcher" World-Wide Web pages (URL is http://gc.bcm.tmc.edu:8088/search-launcher/launcher.html).
Anisotropic piezoresistivity characteristics of aligned carbon nanotube-polymer nanocomposites

NASA Astrophysics Data System (ADS)

Sengezer, Engin C.; Seidel, Gary D.; Bodnar, Robert J.

2017-09-01

Dielectrophoresis under the application of AC electric fields is one of the primary fabrication techniques for obtaining aligned carbon nanotube (CNT)-polymer nanocomposites, and is used here to generate long range alignment of CNTs at the structural level. The degree of alignment of CNTs within this long range architecture is observed via polarized Raman spectroscopy so that its influence on the electrical conductivity and piezoresistive response in both the alignment and transverse to alignment directions can be assessed. Nanocomposite samples consisting of randomly oriented, well dispersed single-wall carbon nanotubes (SWCNTs) and of long range electric field aligned SWCNTs in a photopolymerizable monomer blend (urethane dimethacrylate and 1,6-hexanediol dimethacrylate) are quantitatively and qualitatively evaluated. Piezoresistive sensitivities in the form of gauge factors were measured for randomly oriented, well dispersed specimens with 0.03, 0.1 and 0.5 wt% SWCNTs and compared with gauge factors in both the axial and transverse to SWCNT alignment directions for electric field aligned 0.03 wt% specimens under both quasi-static monotonic and cyclic tensile loading. Gauge factors in the axial direction were observed to be on the order of 2, while gauge factors in the transverse direction demonstrated a 5-fold increase, with values on the order of 10 for aligned specimens. Based on Raman analysis, it is believed the higher sensitivity of the transverse direction is related to architectural evolution of misaligned bridging structures which connect alignment structures under load due to Poisson’s contraction.
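
The gauge factors quoted in this abstract can be made concrete with a short sketch. It assumes the standard definition, relative resistance change divided by applied strain; the resistance and strain values are invented to reproduce the reported orders of magnitude and are not measurements from the study.

```python
def gauge_factor(r0, r, strain):
    """Gauge factor GF = (delta R / R0) / strain, assuming the standard definition."""
    if r0 <= 0:
        raise ValueError("unstrained resistance must be positive")
    if strain == 0:
        raise ValueError("strain must be non-zero")
    return ((r - r0) / r0) / strain

# Illustrative numbers only: 0.5% applied strain on a 1000-ohm specimen.
axial = gauge_factor(r0=1000.0, r=1010.0, strain=0.005)       # about 2
transverse = gauge_factor(r0=1000.0, r=1050.0, strain=0.005)  # about 10

print(f"axial GF ~ {axial:.1f}, transverse GF ~ {transverse:.1f}")
print(f"ratio ~ {transverse / axial:.1f}")
```

With these numbers a 1% resistance change at 0.5% strain gives a gauge factor near 2, and a 5% change gives a value near 10, matching the 5-fold difference the abstract reports between the two directions.
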
DR-TAMAS: Diffeomorphic Registration for Tensor Accurate alignMent of Anatomical Structures

PubMed Central

Irfanoglu, M. Okan; Nayak, Amritha; Jenkins, Jeffrey; Hutchinson, Elizabeth B.; Sadeghi, Neda; Thomas, Cibu P.; Pierpaoli, Carlo

2016-01-01

In this work, we propose DR-TAMAS (Diffeomorphic Registration for Tensor Accurate alignMent of Anatomical Structures), a novel framework for intersubject registration of Diffusion Tensor Imaging (DTI) data sets. This framework is optimized for brain data and its main goal is to achieve an accurate alignment of all brain structures, including white matter (WM), gray matter (GM), and spaces containing cerebrospinal fluid (CSF). Currently most DTI-based spatial normalization algorithms emphasize alignment of anisotropic structures. While some diffusion-derived metrics, such as diffusion anisotropy and tensor eigenvector orientation, are highly informative for proper alignment of WM, other tensor metrics such as the trace or mean diffusivity (MD) are fundamental for a proper alignment of GM and CSF boundaries. Moreover, it is desirable to include information from structural MRI data, e.g., T1-weighted or T2-weighted images, which are usually available together with the diffusion data. The fundamental property of DR-TAMAS is to achieve global anatomical accuracy by incorporating in its cost function the most informative metrics locally. Another important feature of DR-TAMAS is a symmetric time-varying velocity-based transformation model, which enables it to account for potentially large anatomical variability in healthy subjects and patients. The performance of DR-TAMAS is evaluated with several data sets and compared with other widely-used diffeomorphic image registration techniques employing both full tensor information and/or DTI-derived scalar maps. Our results show that the proposed method has excellent overall performance in the entire brain, while being equivalent to the best existing methods in WM. PMID:26931817
CCD Camera Lens Interface for Real-Time Theodolite Alignment

NASA Technical Reports Server (NTRS)

Wake, Shane; Scott, V. Stanley, III

2012-01-01

Theodolites are a common instrument in the testing, alignment, and building of various systems ranging from a single optical component to an entire instrument. They provide a precise way to measure horizontal and vertical angles. They can be used to align multiple objects in a desired way at specific angles. They can also be used to reference a specific location or orientation of an object that has moved. Some systems may require a small margin of error in position of components. A theodolite can assist with accurately measuring and/or minimizing that error. The technology is an adapter for a CCD camera with lens to attach to a Leica Wild T3000 Theodolite eyepiece that enables viewing on a connected monitor, and thus can be utilized with multiple theodolites simultaneously. This technology removes a substantial part of human error by relying on the CCD camera and monitors. It also allows image recording of the alignment, and therefore provides a quantitative means to measure such error.
PSI/TM-Coffee: a web server for fast and accurate multiple sequence alignments of regular and transmembrane proteins using homology extension on reduced databases.

PubMed

Floden, Evan W; Tommaso, Paolo D; Chatzou, Maria; Magis, Cedrik; Notredame, Cedric; Chang, Jia-Ming

2016-07-08

The PSI/TM-Coffee web server performs multiple sequence alignment (MSA) of proteins by combining homology extension with a consistency based alignment approach. Homology extension is performed with Position Specific Iterative (PSI) BLAST searches against a choice of redundant and non-redundant databases. The main novelty of this server is to allow databases of reduced complexity to rapidly perform homology extension. This server also gives the possibility to use transmembrane proteins (TMPs) reference databases to allow even faster homology extension on this important category of proteins. Aside from an MSA, the server also outputs topological prediction of TMPs using the HMMTOP algorithm. Previous benchmarking of the method has shown this approach outperforms the most accurate alignment methods such as MSAProbs, Kalign, PROMALS, MAFFT, ProbCons and PRALINE™. The web server is available at http://tcoffee.crg.cat/tmcoffee. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

Precision alignment device

DOEpatents

Jones, N.E.

1988-03-10

Apparatus for providing automatic alignment of beam devices having an associated structure for directing, collimating, focusing, reflecting, or otherwise modifying the main beam. A reference laser is attached to the structure enclosing the main beam producing apparatus and produces a reference beam substantially parallel to the main beam. Detector modules containing optical switching devices and optical detectors are positioned in the path of the reference beam and are effective to produce an electrical output indicative of the alignment of the main beam. This electrical output drives servomotor operated adjustment screws to adjust the position of elements of the structure associated with the main beam to maintain alignment of the main beam. 5 figs.
Precision alignment device

DOEpatents

Jones, Nelson E.

1990-01-01

Apparatus for providing automatic alignment of beam devices having an associated structure for directing, collimating, focusing, reflecting, or otherwise modifying the main beam. A reference laser is attached to the structure enclosing the main beam producing apparatus and produces a reference beam substantially parallel to the main beam. Detector modules containing optical switching devices and optical detectors are positioned in the path of the reference beam and are effective to produce an electrical output indicative of the alignment of the main beam. This electrical output drives servomotor operated adjustment screws to adjust the position of elements of the structure associated with the main beam to maintain alignment of the main beam.
Desktop aligner for fabrication of multilayer microfluidic devices.

PubMed

Li, Xiang; Yu, Zeta Tak For; Geraldo, Dalton; Weng, Shinuo; Alve, Nitesh; Dun, Wu; Kini, Akshay; Patel, Karan; Shu, Roberto; Zhang, Feng; Li, Gang; Jin, Qinghui; Fu, Jianping

2015-07-01

Multilayer assembly is a commonly used technique to construct multilayer polydimethylsiloxane (PDMS)-based microfluidic devices with complex 3D architecture and connectivity for large-scale microfluidic integration. Accurate alignment of structure features on different PDMS layers before their permanent bonding is critical in determining the yield and quality of assembled multilayer microfluidic devices. Herein, we report a custom-built desktop aligner capable of both local and global alignments of PDMS layers covering a broad size range. Two digital microscopes were incorporated into the aligner design to allow accurate global alignment of PDMS structures up to 4 in. in diameter. Both local and global alignment accuracies of the desktop aligner were determined to be about 20 μm cm−1. To demonstrate its utility for fabrication of integrated multilayer PDMS microfluidic devices, we applied the desktop aligner to achieve accurate alignment of different functional PDMS layers in multilayer microfluidics including an organs-on-chips device as well as a microfluidic device integrated with vertical passages connecting channels located in different PDMS layers. Owing to its convenient operation, high accuracy, low cost, light weight, and portability, the desktop aligner is useful for microfluidic researchers to achieve rapid and accurate alignment for generating multilayer PDMS microfluidic devices.
Desktop aligner for fabrication of multilayer microfluidic devices

PubMed Central

Li, Xiang; Yu, Zeta Tak For; Geraldo, Dalton; Weng, Shinuo; Alve, Nitesh; Dun, Wu; Kini, Akshay; Patel, Karan; Shu, Roberto; Zhang, Feng; Li, Gang; Jin, Qinghui; Fu, Jianping

2015-01-01

Multilayer assembly is a commonly used technique to construct multilayer polydimethylsiloxane (PDMS)-based microfluidic devices with complex 3D architecture and connectivity for large-scale microfluidic integration. Accurate alignment of structure features on different PDMS layers before their permanent bonding is critical in determining the yield and quality of assembled multilayer microfluidic devices. Herein, we report a custom-built desktop aligner capable of both local and global alignments of PDMS layers covering a broad size range. Two digital microscopes were incorporated into the aligner design to allow accurate global alignment of PDMS structures up to 4 in. in diameter. Both local and global alignment accuracies of the desktop aligner were determined to be about 20 μm cm−1. To demonstrate its utility for fabrication of integrated multilayer PDMS microfluidic devices, we applied the desktop aligner to achieve accurate alignment of different functional PDMS layers in multilayer microfluidics including an organs-on-chips device as well as a microfluidic device integrated with vertical passages connecting channels located in different PDMS layers. Owing to its convenient operation, high accuracy, low cost, light weight, and portability, the desktop aligner is useful for microfluidic researchers to achieve rapid and accurate alignment for generating multilayer PDMS microfluidic devices. PMID:26233409
CORAL: aligning conserved core regions across domain families.

PubMed

Fong, Jessica H; Marchler-Bauer, Aron

2009-08-01

Homologous protein families share highly conserved sequence and structure regions that are frequent targets for comparative analysis of related proteins and families. Many protein families, such as the curated domain families in the Conserved Domain Database (CDD), exhibit similar structural cores. To improve accuracy in aligning such protein families, we propose a profile-profile method CORAL that aligns individual core regions as gap-free units. CORAL computes optimal local alignment of two profiles with heuristics to preserve continuity within core regions. We benchmarked its performance on curated domains in CDD, which have pre-defined core regions, against COMPASS, HHalign and PSI-BLAST, using structure superpositions and comprehensive curator-optimized alignments as standards of truth. CORAL improves alignment accuracy on core regions over general profile methods, returning a balanced score of 0.57 for over 80% of all domain families in CDD, compared with the highest balanced score of 0.45 from other methods. Further, CORAL provides E-values to aid in detecting homologous protein families and, by respecting block boundaries, produces alignments with improved 'readability' that facilitate manual refinement. CORAL will be included in future versions of the NCBI Cn3D/CDTree software, which can be downloaded at http://www.ncbi.nlm.nih.gov/Structure/cdtree/cdtree.shtml. Supplementary data are available at Bioinformatics online.
Unified Alignment of Protein-Protein Interaction Networks.

PubMed

Malod-Dognin, Noël; Ban, Kristina; Pržulj, Nataša

2017-04-19

Paralleling the increasing availability of protein-protein interaction (PPI) network data, several network alignment methods have been proposed. Network alignments have been used to uncover functionally conserved network parts and to transfer annotations. However, due to the computational intractability of the network alignment problem, aligners are heuristics providing divergent solutions and no consensus exists on a gold standard, or which scoring scheme should be used to evaluate them. We comprehensively evaluate the alignment scoring schemes and global network aligners on large scale PPI data and observe that three methods, HUBALIGN, L-GRAAL and NATALIE, regularly produce the most topologically and biologically coherent alignments. We study the collective behaviour of network aligners and observe that PPI networks are almost entirely aligned with a handful of aligners that we unify into a new tool, Ulign. Ulign enables complete alignment of two networks, which traditional global and local aligners fail to do. Also, multiple mappings of Ulign define biologically relevant soft clusterings of proteins in PPI networks, which may be used for refining the transfer of annotations across networks. Hence, PPI networks are already well investigated by current aligners, so to gain additional biological insights, a paradigm shift is needed. We propose such a shift come from aligning all available data types collectively rather than any particular data type in isolation from others.
Centroid stabilization in alignment of FOA corner cube: designing of a matched filter

NASA Astrophysics Data System (ADS)

Awwal, Abdul; Wilhelmsen, Karl; Roberts, Randy; Leach, Richard; Miller Kamm, Victoria; Ngo, Tony; Lowe-Webb, Roger

2015-02-01

The current automation of image-based alignment of NIF high energy laser beams is providing the capability of executing multiple target shots per day. An important aspect of performing multiple shots in a day is to reduce additional time spent aligning specific beams due to perturbations in those beam images. One such alignment is beam centration through the second and third harmonic generating crystals in the final optics assembly (FOA), which employs two retro-reflecting corner cubes to represent the beam center. The FOA houses the frequency conversion crystals for third harmonic generation as the beam enters the target chamber. Beam-to-beam variations and systematic beam changes over time in the FOA corner-cube images can lead to a reduction in accuracy as well as increased convergence durations for the template-based centroid detector. This work presents a systematic approach to maintaining FOA corner cube centroid templates so that stable position estimation is applied, thereby leading to fast convergence of alignment control loops. In the matched filtering approach, a template is designed based on the most recent images taken in the last 60 days. The results show that the new filter reduces the divergence of the position estimation of FOA images.
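As a rough illustration of the matched-filtering idea in this abstract, the following Python sketch averages recent image patches into a template and locates the correlation peak in a new image. It is a minimal sketch under stated assumptions, not the NIF implementation: the function names, the plain averaging of patches, and the brute-force zero-mean correlation are all hypothetical choices made for the example.

```python
def average_template(images):
    """Average equally sized 2D patches (lists of rows) into one template.

    Assumption: a plain mean of recent patches stands in for the
    template-design step; the abstract does not specify the actual rule.
    """
    n = len(images)
    rows, cols = len(images[0]), len(images[0][0])
    return [[sum(img[r][c] for img in images) / n for c in range(cols)]
            for r in range(rows)]


def matched_filter_peak(image, template):
    """Slide the zero-mean template over the image and return the (row, col)
    of the top-left corner where the correlation score is highest."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    mean_t = sum(map(sum, template)) / (th * tw)
    zero_mean = [[v - mean_t for v in row] for row in template]
    best_score, best_pos = float("-inf"), (0, 0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            score = sum(zero_mean[i][j] * image[r + i][c + j]
                        for i in range(th) for j in range(tw))
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos


# Hypothetical data: a 3x3 spot placed at row 2, column 3 of a 6x6 image.
spot = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]
image = [[0] * 6 for _ in range(6)]
for i in range(3):
    for j in range(3):
        image[2 + i][3 + j] = spot[i][j]

template = average_template([spot, spot])
peak = matched_filter_peak(image, template)
```

A production system would more likely use FFT-based correlation and sub-pixel peak estimation; the brute-force loop is kept here only for readability.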
GUIDANCE2: accurate detection of unreliable alignment regions accounting for the uncertainty of multiple parameters.

PubMed

Sela, Itamar; Ashkenazy, Haim; Katoh, Kazutaka; Pupko, Tal

2015-07-01

Inference of multiple sequence alignments (MSAs) is a critical part of phylogenetic and comparative genomics studies. However, from the same set of sequences different MSAs are often inferred, depending on the methodologies used and the assumed parameters. Much effort has recently been devoted to improving the ability to identify unreliable alignment regions. Detecting such unreliable regions was previously shown to be important for downstream analyses relying on MSAs, such as the detection of positive selection. Here we developed GUIDANCE2, a new integrative methodology that accounts for: (i) uncertainty in the process of indel formation, (ii) uncertainty in the assumed guide tree and (iii) co-optimal solutions in the pairwise alignments, used as building blocks in progressive alignment algorithms. We compared GUIDANCE2 with seven methodologies to detect unreliable MSA regions using extensive simulations and empirical benchmarks. We show that GUIDANCE2 outperforms all previously developed methodologies. Furthermore, GUIDANCE2 also provides a set of alternative MSAs which can be useful for downstream analyses. The novel algorithm is implemented as a web-server, available at: http://guidance.tau.ac.il. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Fast online and index-based algorithms for approximate search of RNA sequence-structure patterns

PubMed Central

2013-01-01

Background It is well known that the search for homologous RNAs is more effective if both sequence and structure information is incorporated into the search. However, current tools for searching with RNA sequence-structure patterns cannot fully handle mutations occurring on both these levels or are simply not fast enough for searching large sequence databases because of the high computational costs of the underlying sequence-structure alignment problem. Results We present new fast index-based and online algorithms for approximate matching of RNA sequence-structure patterns supporting a full set of edit operations on single bases and base pairs. Our methods efficiently compute semi-global alignments of structural RNA patterns and substrings of the target sequence whose costs satisfy a user-defined sequence-structure edit distance threshold. For this purpose, we introduce a new computing scheme to optimally reuse the entries of the required dynamic programming matrices for all substrings and combine it with a technique for avoiding the alignment computation of non-matching substrings. Our new index-based methods exploit suffix arrays preprocessed from the target database and achieve running times that are sublinear in the size of the searched sequences. To support the description of RNA molecules that fold into complex secondary structures with multiple ordered sequence-structure patterns, we use fast algorithms for the local or global chaining of approximate sequence-structure pattern matches. The chaining step removes spurious matches from the set of intermediate results, in particular of patterns with little specificity. In benchmark experiments on the Rfam database, our improved online algorithm is faster than the best previous method by up to factor 45. Our best new index-based algorithm achieves a speedup of factor 560. Conclusions The presented methods achieve considerable speedups compared to the best previous method. 
This, together with the expected sublinear running time of the presented index-based algorithms, allows for the first time approximate matching of RNA sequence-structure patterns in large sequence databases. Beyond the algorithmic contributions, we provide with RaligNAtor a robust and well documented open-source software package implementing the algorithms presented in this manuscript. The RaligNAtor software is available at http://www.zbh.uni-hamburg.de/ralignator. PMID:23865810
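To make the notion of semi-global alignment under an edit distance threshold more concrete, here is a minimal Python sketch restricted to plain sequences with unit-cost edits. It is a simplification and not the RaligNAtor algorithm: base-pair edit operations, suffix arrays, and chaining are all omitted, and the function name and the unit costs are assumptions made for illustration.

```python
def approximate_matches(pattern, text, max_cost):
    """Return (end_index, cost) for every 1-based text position where some
    substring ending there matches the pattern within max_cost edits.

    Sequence-only sketch: unit costs for substitution, insertion and
    deletion; no base-pair (structure) operations are modelled.
    """
    m = len(pattern)
    prev = list(range(m + 1))  # column for the empty text prefix
    hits = []
    for j, ch in enumerate(text, start=1):
        cur = [0] * (m + 1)  # entry 0 stays 0: a match may start anywhere
        for i in range(1, m + 1):
            cur[i] = min(prev[i - 1] + (pattern[i - 1] != ch),  # (mis)match
                         prev[i] + 1,      # extra character in the text
                         cur[i - 1] + 1)   # pattern character left out
        if cur[m] <= max_cost:
            hits.append((j, cur[m]))
        prev = cur
    return hits


exact_hits = approximate_matches("GAUC", "AAGAUCAA", 0)
fuzzy_hits = approximate_matches("GAUC", "AAGACCAA", 1)
```

Because only the previous column is kept, memory use is linear in the pattern length, which is the usual starting point before index-based filtering is added.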
Analysis of multiple scattering contributions in electron-impact ionization of molecular hydrogen

NASA Astrophysics Data System (ADS)

Ren, Xueguang; Hossen, Khokon; Wang, Enliang; Pindzola, M. S.; Dorn, Alexander; Colgan, James

2017-10-01

We report a combined experimental and theoretical study on the low-energy ($E_0 = 31.5$ eV) electron-impact ionization of molecular hydrogen (H2). Triple differential cross sections are measured for a range of fixed emission angles of one outgoing electron between $\theta_1 = -70^\circ$ and $-130^\circ$ covering the full 4π solid angle of the second electron. The energy sharing of the outgoing electrons varies from symmetric ($E_1 = E_2 = 8$ eV) to highly asymmetric ($E_1 = 1$ eV and $E_2 = 15$ eV). In addition to the binary and recoil lobes, a structure is observed perpendicular to the incoming beam direction which is due to multiple scattering of the projectile inside the molecular potential. The absolutely normalized experimental cross sections are compared with results from the time-dependent close-coupling (TDCC) calculations. Molecular alignment dependent TDCC results demonstrate that these structures are only present if the molecule axis is lying in the scattering plane.
Structural Inheritance of the Actin Cytoskeletal Organization Determines the Body Axis in Regenerating Hydra.

PubMed

Livshits, Anton; Shani-Zerbib, Lital; Maroudas-Sacks, Yonit; Braun, Erez; Keren, Kinneret

2017-02-07

Understanding how mechanics complement bio-signaling in defining patterns during morphogenesis is an outstanding challenge. Here, we utilize the multicellular polyp Hydra to investigate the role of the actomyosin cytoskeleton in morphogenesis. We find that the supra-cellular actin fiber organization is inherited from the parent Hydra and determines the body axis in regenerating tissue segments. This form of structural inheritance is non-trivial because of the tissue folding and dynamic actin reorganization involved. We further show that the emergence of multiple body axes can be traced to discrepancies in actin fiber alignment at early stages of the regeneration process. Mechanical constraints induced by anchoring regenerating Hydra on stiff wires suppressed the emergence of multiple body axes, highlighting the importance of mechanical feedbacks in defining and stabilizing the body axis. Together, these results constitute an important step toward the development of an integrated view of morphogenesis that incorporates mechanics. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.
Genetic Testing Registry

MedlinePlus

... Splign Vector Alignment Search Tool (VAST) All Data & Software Resources... Domains & Structures BioSystems Cn3D Conserved Domain Database (CDD) Conserved Domain Search Service (CD Search) Structure (Molecular Modeling Database) Vector Alignment ...
Automated batch fiducial-less tilt-series alignment in Appion using Protomo.

PubMed

Noble, Alex J; Stagg, Scott M

2015-11-01

The field of electron tomography has benefited greatly from manual and semi-automated approaches to marker-based tilt-series alignment that have allowed for the structural determination of multitudes of in situ cellular structures as well as macromolecular structures of individual protein complexes. The emergence of complementary metal-oxide semiconductor detectors capable of detecting individual electrons has enabled the collection of low dose, high contrast images, opening the door for reliable correlation-based tilt-series alignment. Here we present a set of automated, correlation-based tilt-series alignment, contrast transfer function (CTF) correction, and reconstruction workflows for use in conjunction with the Appion/Leginon package that are primarily targeted at automating structure determination with cryogenic electron microscopy. Copyright © 2015 Elsevier Inc. All rights reserved.
Alignment of Common Wheat and Other Grass Genomes Establishes a Comparative Genomics Research Platform

PubMed Central

Sun, Sangrong; Wang, Jinpeng; Yu, Jigao; Meng, Fanbo; Xia, Ruiyan; Wang, Li; Wang, Zhenyi; Ge, Weina; Liu, Xiaojian; Li, Yuxian; Liu, Yinzhe; Yang, Nanshan; Wang, Xiyin

2017-01-01

Grass genomes are complicated structures as they share a common tetraploidization, and particular genomes have been further affected by extra polyploidizations. These events and the following genomic re-patternings have resulted in a complex, interweaving gene homology both within a genome and between genomes. Accurately deciphering the structure of these complicated plant genomes would help us better understand their compositional and functional evolution at multiple scales. Here, we build on our previous research by performing a hierarchical alignment of the common wheat genome vis-à-vis eight other sequenced grass genomes with the most up-to-date assemblies and annotations. With this data, we constructed a list of the homologous genes and then, in a layer-by-layer process, separated their orthology and paralogy, which were established by speciations and recursive polyploidizations, respectively. Compared with the other grasses, the far fewer collinear outparalogous genes within each of the three subgenomes of common wheat suggest that homoeologous recombination and genomic fractionation should have occurred after its formation. In sum, this work contributes to the establishment of an important and timely comparative genomics platform for researchers in the grass community and possibly beyond. The homologous gene list can be found in the Supplemental material. PMID:28912789
Multiple nodes transfer alignment for airborne missiles based on inertial sensor network

NASA Astrophysics Data System (ADS)

Si, Fan; Zhao, Yan

2017-09-01

Transfer alignment is an important initialization method for airborne missiles because the alignment accuracy largely determines the performance of the missile. However, traditional alignment methods are limited by complicated and unknown flexure angles, and cannot meet the actual requirement when wing flexure deformation occurs. To address this problem, we propose a new method that uses the relative navigation parameters between the weapons and fighter to achieve transfer alignment. First, in the relative inertial navigation algorithm, the relative attitudes and positions are constantly computed in wing flexure deformation situations. Second, the alignment results of each weapon are processed using a data fusion algorithm to improve the overall performance. Finally, the feasibility and performance of the proposed method were evaluated under two typical types of deformation, and the simulation results demonstrated that the new transfer alignment method is practical and achieves high precision.
Cloud-Coffee: implementation of a parallel consistency-based multiple alignment algorithm in the T-Coffee package and its benchmarking on the Amazon Elastic-Cloud.

PubMed

Di Tommaso, Paolo; Orobitg, Miquel; Guirado, Fernando; Cores, Fernado; Espinosa, Toni; Notredame, Cedric

2010-08-01

We present the first parallel implementation of the T-Coffee consistency-based multiple aligner. We benchmark it on the Amazon Elastic Cloud (EC2) and show that the parallelization procedure is reasonably effective. We also conclude that for a web server with moderate usage (10K hits/month) the cloud provides a cost-effective alternative to in-house deployment. T-Coffee is a freeware open source package available from http://www.tcoffee.org/homepage.html
Dali server update.

PubMed

Holm, Liisa; Laakso, Laura M

2016-07-08

The Dali server (http://ekhidna2.biocenter.helsinki.fi/dali) is a network service for comparing protein structures in 3D. In favourable cases, comparing 3D structures may reveal biologically interesting similarities that are not detectable by comparing sequences. The Dali server has been running in various places for over 20 years and is used routinely by crystallographers on newly solved structures. The latest update of the server provides enhanced analytics for the study of sequence and structure conservation. The server performs three types of structure comparisons: (i) Protein Data Bank (PDB) search compares one query structure against those in the PDB and returns a list of similar structures; (ii) pairwise comparison compares one query structure against a list of structures specified by the user; and (iii) all against all structure comparison returns a structural similarity matrix, a dendrogram and a multidimensional scaling projection of a set of structures specified by the user. Structural superimpositions are visualized using the Java-free WebGL viewer PV. The structural alignment view is enhanced by sequence similarity searches against Uniprot. The combined structure-sequence alignment information is compressed to a stack of aligned sequence logos. In the stack, each structure is structurally aligned to the query protein and represented by a sequence logo. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Multiple alignment-free sequence comparison

PubMed Central

Ren, Jie; Song, Kai; Sun, Fengzhu; Deng, Minghua; Reinert, Gesine

2013-01-01

Motivation: Recently, a range of new statistics have become available for the alignment-free comparison of two sequences based on k-tuple word content. Here, we extend these statistics to the simultaneous comparison of more than two sequences. Our suite of five statistics contains, first, two extensions of statistics for pairwise comparison of the joint k-tuple content of all the sequences, and second, three averages of sums of pairwise comparison statistics. The two tasks we consider are, first, to identify sequences that are similar to a set of target sequences, and, second, to measure the similarity within a set of sequences. Results: Our investigation uses both simulated data as well as cis-regulatory module data where the task is to identify cis-regulatory modules with similar transcription factor binding sites. We find that although for real data, all of our statistics show a similar performance, on simulated data the Shepp-type statistics are in some instances outperformed by star-type statistics. The multiple alignment-free statistics are more sensitive to contamination in the data than the pairwise average statistics. Availability: Our implementation of the five statistics is available as an R package named ‘multiAlignFree’ at http://www-rcf.usc.edu/∼fsun/Programs/multiAlignFree/multiAlignFreemain.html. Contact: reinert@stats.ox.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23990418
Genetic Algorithm Phase Retrieval for the Systematic Image-Based Optical Alignment Testbed

NASA Technical Reports Server (NTRS)

Rakoczy, John; Steincamp, James; Taylor, Jaime

2003-01-01

A reduced surrogate, one point crossover genetic algorithm with random rank-based selection was used successfully to estimate the multiple phases of a segmented optical system modeled on the seven-mirror Systematic Image-Based Optical Alignment testbed located at NASA's Marshall Space Flight Center.
National Center for Biotechnology Information

MedlinePlus

... Splign Vector Alignment Search Tool (VAST) All Data & Software Resources... Domains & Structures BioSystems Cn3D Conserved Domain Database (CDD) Conserved Domain Search Service (CD Search) Structure (Molecular Modeling Database) Vector Alignment ...

R2R--software to speed the depiction of aesthetic consensus RNA secondary structures.

PubMed

Weinberg, Zasha; Breaker, Ronald R

2011-01-04

With continuing identification of novel structured noncoding RNAs, there is an increasing need to create schematic diagrams showing the consensus features of these molecules. RNA structural diagrams are typically made either with general-purpose drawing programs like Adobe Illustrator, or with automated or interactive programs specific to RNA. Unfortunately, the use of applications like Illustrator is extremely time consuming, while existing RNA-specific programs produce figures that are useful, but usually not of the same aesthetic quality as those produced at great cost in Illustrator. Additionally, most existing RNA-specific applications are designed for drawing single RNA molecules, not consensus diagrams. We created R2R, a computer program that facilitates the generation of aesthetic and readable drawings of RNA consensus diagrams in a fraction of the time required with general-purpose drawing programs. Since the inference of a consensus RNA structure typically requires a multiple-sequence alignment, the R2R user annotates the alignment with commands directing the layout and annotation of the RNA. R2R creates SVG or PDF output that can be imported into Adobe Illustrator, Inkscape or CorelDRAW. R2R can be used to create consensus sequence and secondary structure models for novel RNA structures or to revise models when new representatives for known RNA classes become available. Although R2R does not currently have a graphical user interface, it has proven useful in our efforts to create 100 schematic models of distinct noncoding RNA classes. R2R makes it possible to obtain high-quality drawings of the consensus sequence and structural models of many diverse RNA structures with a more practical amount of effort. R2R software is available at http://breaker.research.yale.edu/R2R and as an Additional file.
ChromA: signal-based retention time alignment for chromatography–mass spectrometry data

PubMed Central

Hoffmann, Nils; Stoye, Jens

2009-01-01

Summary: We describe ChromA, a web-based alignment tool for chromatography–mass spectrometry data from the metabolomics and proteomics domains. Users can supply their data in open and standardized file formats for retention time alignment using dynamic time warping with different configurable local distance and similarity functions. Additionally, user-defined anchors can be used to constrain and speed up the alignment. A neighborhood around each anchor can be added to increase the flexibility of the constrained alignment. ChromA offers different visualizations of the alignment for easier qualitative interpretation and comparison of the data. For the multiple alignment of more than two data files, the center-star approximation is applied to select a reference among input files to align to. Availability: ChromA is available at http://bibiserv.techfak.uni-bielefeld.de/chroma. Executables and source code under the L-GPL v3 license are provided for download at the same location. Contact: stoye@techfak.uni-bielefeld.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:19505941
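The dynamic time warping step mentioned in this abstract can be sketched in a few lines of Python. This is a textbook DTW over two one-dimensional signals with a configurable local distance, offered as an illustration only; it is not ChromA's code, it ignores anchors and mass spectra, and the function name and default distance are assumptions.

```python
def dtw(a, b, dist=lambda x, y: abs(x - y)):
    """Classic dynamic time warping between two 1D signals.

    Returns the total warping cost and the warping path as a list of
    (index_in_a, index_in_b) pairs. `dist` is the local distance function.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = dist(a[i - 1], b[j - 1]) + min(
                cost[i - 1][j - 1],  # advance in both signals
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1])      # advance in b only
    # Backtrack from the end to recover the warping path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path


# Hypothetical intensity traces: the second is the first with a delayed start.
trace_a = [0, 1, 2, 1, 0]
trace_b = [0, 0, 1, 2, 1, 0]
total_cost, warp_path = dtw(trace_a, trace_b)
```

Anchors of the kind the abstract describes would restrict which cells of the cost matrix are filled, which is where the speed-up comes from.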
A statistical physics perspective on alignment-independent protein sequence comparison.

PubMed

Chattopadhyay, Amit K; Nasiev, Diar; Flower, Darren R

2015-08-01

Within bioinformatics, the textual alignment of amino acid sequences has long dominated the determination of similarity between proteins, with all that implies for shared structure, function and evolutionary descent. Despite the relative success of modern-day sequence alignment algorithms, so-called alignment-free approaches offer a complementary means of determining and expressing similarity, with potential benefits in certain key applications, such as regression analysis of protein structure-function studies, where alignment-based similarity has performed poorly. Here, we offer a fresh, statistical physics-based perspective focusing on the question of alignment-free comparison, in the process adapting results from 'first passage probability distribution' to summarize statistics of ensemble averaged amino acid propensity values. In this article, we introduce and elaborate this approach. © The Author 2015. Published by Oxford University Press.
Photoelectron diffraction from single oriented molecules: Towards ultrafast structure determination of molecules using x-ray free-electron lasers

NASA Astrophysics Data System (ADS)

Kazama, Misato; Fujikawa, Takashi; Kishimoto, Naoki; Mizuno, Tomoya; Adachi, Jun-ichi; Yagishita, Akira

2013-06-01

We provide a molecular structure determination method based on multiple-scattering x-ray photoelectron diffraction (XPD) calculations. This method is applied to our XPD data on several molecules having different equilibrium geometries. We confirm that, with our method, bond lengths and bond angles can be determined with a resolution of less than 0.1 Å and 10°, respectively. Unlike other scenarios of ultrafast structure determination, we measure the two- or three-dimensional XPD of aligned or oriented molecules in the energy range from 100 to 200 eV with a 4π detection velocity map imaging spectrometer. Thanks to the intense and ultrashort pulse properties of x-ray free-electron lasers, our approach is the most promising method for obtaining ultrafast real-time structural information on small to medium-sized molecules consisting of light elements, i.e., a “molecular movie.”
Protein docking by the interface structure similarity: how much structure is needed?

PubMed

Sinha, Rohita; Kundrotas, Petras J; Vakser, Ilya A

2012-01-01

The increasing availability of co-crystallized protein-protein complexes provides an opportunity to use template-based modeling for protein-protein docking. Structure alignment techniques are useful in the detection of remote target-template similarities. The size of the structure involved in the alignment is important for success in modeling. This paper describes a systematic large-scale study to find the optimal definition/size of the interfaces for structure alignment-based docking applications. The results showed that structural areas corresponding to cutoff values <12 Å across the interface inadequately represent structural details of the interfaces. With the increase of the cutoff beyond 12 Å, the success rate for the benchmark set of 99 protein complexes did not increase significantly for higher-accuracy models, and decreased for lower-accuracy models. The 12 Å cutoff was optimal in our interface alignment-based docking, and is a likely best choice for large-scale (e.g., on the scale of the entire genome) applications to protein interaction networks. The results provide guidelines for docking approaches, including high-throughput applications to modeled structures.
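A distance cutoff of this kind can be illustrated with a short Python sketch that selects the residues of two chains lying within a given distance of each other. The abstract does not state which atoms define the distance, so the use of one representative coordinate per residue is an assumption, as are the function name and the data layout.

```python
import math


def interface_residues(chain_a, chain_b, cutoff=12.0):
    """Return the residue ids of each chain that lie within `cutoff` Å of
    any residue in the other chain.

    Assumption: each chain is a dict mapping a residue id to a single
    representative (x, y, z) coordinate, for example a Cα position.
    """
    near_a, near_b = set(), set()
    for res_a, pos_a in chain_a.items():
        for res_b, pos_b in chain_b.items():
            if math.dist(pos_a, pos_b) <= cutoff:
                near_a.add(res_a)
                near_b.add(res_b)
    return sorted(near_a), sorted(near_b)


# Hypothetical coordinates in Å.
receptor = {1: (0.0, 0.0, 0.0), 2: (20.0, 0.0, 0.0)}
ligand = {10: (5.0, 0.0, 0.0), 11: (40.0, 0.0, 0.0)}

at_12 = interface_residues(receptor, ligand)             # default 12 Å cutoff
at_16 = interface_residues(receptor, ligand, cutoff=16)  # wider interface
```

Varying `cutoff` in this way mirrors the kind of scan over interface sizes that the study describes, although the real analysis works on full atomic structures.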
A survey and evaluations of histogram-based statistics in alignment-free sequence comparison.

PubMed

Luczak, Brian B; James, Benjamin T; Girgis, Hani Z

2017-12-06

Since the dawn of the bioinformatics field, sequence alignment scores have been the main method for comparing sequences. However, alignment algorithms are quadratic, requiring long execution time. As alternatives, scientists have developed tens of alignment-free statistics for measuring the similarity between two sequences. We surveyed tens of alignment-free k-mer statistics. Additionally, we evaluated 33 statistics and multiplicative combinations between the statistics and/or their squares. These statistics are calculated on two k-mer histograms representing two sequences. Our evaluations using global alignment scores revealed that the majority of the statistics are sensitive and capable of finding similar sequences to a query sequence. Therefore, any of these statistics can filter out dissimilar sequences quickly. Further, we observed that multiplicative combinations of the statistics are highly correlated with the identity score. Furthermore, combinations involving sequence length difference or Earth Mover's distance, which takes the length difference into account, are always among the highest correlated paired statistics with identity scores. Similarly, paired statistics including length difference or Earth Mover's distance are among the best performers in finding the K-closest sequences. Interestingly, similar performance can be obtained using histograms of shorter words, resulting in reducing the memory requirement and increasing the speed remarkably. Moreover, we found that simple single statistics are sufficient for processing next-generation sequencing reads and for applications relying on local alignment. Finally, we measured the time requirement of each statistic. The survey and the evaluations will help scientists with identifying efficient alternatives to the costly alignment algorithm, saving thousands of computational hours. The source code of the benchmarking tool is available as Supplementary Materials. © The Author 2017. Published by Oxford University Press.
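For readers unfamiliar with histogram-based statistics, the following Python sketch builds k-mer histograms for two sequences and computes two simple statistics on them. The two statistics shown (Manhattan distance and cosine similarity) are generic examples chosen for illustration; the abstract does not say which 33 statistics were evaluated, and the function names are hypothetical.

```python
import math
from collections import Counter


def kmer_histogram(seq, k):
    """Count every overlapping word of length k in the sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def manhattan(h1, h2):
    """Sum of absolute count differences over all observed k-mers."""
    return sum(abs(h1[w] - h2[w]) for w in set(h1) | set(h2))


def cosine_similarity(h1, h2):
    """Cosine of the angle between the two count vectors."""
    dot = sum(h1[w] * h2[w] for w in set(h1) & set(h2))
    norm1 = math.sqrt(sum(v * v for v in h1.values()))
    norm2 = math.sqrt(sum(v * v for v in h2.values()))
    return dot / (norm1 * norm2)


# Hypothetical toy sequences.
hist_a = kmer_histogram("AAAA", 2)   # {"AA": 3}
hist_b = kmer_histogram("AAAT", 2)   # {"AA": 2, "AT": 1}
distance = manhattan(hist_a, hist_b)
similarity = cosine_similarity(hist_a, hist_a)
```

Both functions run in time linear in the number of distinct k-mers, which is the source of the speed advantage over quadratic alignment that the abstract emphasizes.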
Hyperfine-Structure-Induced Depolarization of Impulsively Aligned I2 Molecules

NASA Astrophysics Data System (ADS)

Thomas, Esben F.; Søndergaard, Anders A.; Shepperson, Benjamin; Henriksen, Niels E.; Stapelfeldt, Henrik

2018-04-01

A moderately intense 450 fs laser pulse is used to create rotational wave packets in gas phase I2 molecules. The ensuing time-dependent alignment, measured by Coulomb explosion imaging with a delayed probe pulse, exhibits the characteristic revival structures expected for rotational wave packets but also a complex nonperiodic substructure and decreasing mean alignment not observed before. A quantum mechanical model attributes the phenomena to coupling between the rotational angular momenta and the nuclear spins through the electric quadrupole interaction. The calculated alignment trace agrees very well with the experimental results.
Multi-subject Manifold Alignment of Functional Network Structures via Joint Diagonalization.

PubMed

Nenning, Karl-Heinz; Kollndorfer, Kathrin; Schöpf, Veronika; Prayer, Daniela; Langs, Georg

2015-01-01

Functional magnetic resonance imaging group studies rely on the ability to establish correspondence across individuals. This enables location specific comparison of functional brain characteristics. Registration is often based on morphology and does not take variability of functional localization into account. This can lead to a loss of specificity, or confounds when studying diseases. In this paper we propose multi-subject functional registration by manifold alignment via coupled joint diagonalization. The functional network structure of each subject is encoded in a diffusion map, where functional relationships are decoupled from spatial position. Two-step manifold alignment estimates initial correspondences between functionally equivalent regions. Then, coupled joint diagonalization establishes common eigenbases across all individuals, and refines the functional correspondences. We evaluate our approach on fMRI data acquired during a language paradigm. Experiments demonstrate the benefits in matching accuracy achieved by coupled joint diagonalization compared to previously proposed functional alignment approaches, or alignment based on structural correspondences.
Evol and ProDy for bridging protein sequence evolution and structural dynamics.

PubMed

Bakan, Ahmet; Dutta, Anindita; Mao, Wenzhi; Liu, Ying; Chennubhotla, Chakra; Lezon, Timothy R; Bahar, Ivet

2014-09-15

Correlations between sequence evolution and structural dynamics are of utmost importance in understanding the molecular mechanisms of function and their evolution. We have integrated Evol, a new package for fast and efficient comparative analysis of evolutionary patterns and conformational dynamics, into ProDy, a computational toolbox designed for inferring protein dynamics from experimental and theoretical data. Using information-theoretic approaches, Evol coanalyzes conservation and coevolution profiles extracted from multiple sequence alignments of protein families with their inferred dynamics. ProDy and Evol are open-source and freely available under MIT License from http://prody.csb.pitt.edu/. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
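One common information-theoretic conservation measure is the Shannon entropy of each alignment column, sketched below in plain Python. This is an illustrative stand-in and does not use or reproduce the Evol or ProDy API; the function name and the choice to ignore gap characters are assumptions.

```python
import math
from collections import Counter


def column_entropy(msa):
    """Shannon entropy (in bits) of each column of a multiple sequence
    alignment given as equal-length strings. Gaps ('-') are ignored, which
    is one possible convention among several."""
    entropies = []
    for col in range(len(msa[0])):
        counts = Counter(seq[col] for seq in msa if seq[col] != "-")
        total = sum(counts.values())
        if total == 0:
            entropies.append(0.0)  # column made only of gaps
            continue
        entropies.append(-sum((n / total) * math.log2(n / total)
                              for n in counts.values()))
    return entropies


# Hypothetical toy alignment: column 0 is fully conserved, column 2 is not.
alignment = ["ACD", "ACE", "AGF", "A-G"]
entropy = column_entropy(alignment)
```

Low entropy marks conserved positions; coevolution analysis would additionally compare pairs of columns, for example with mutual information.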
Mind Map Our Way into Effective Student Questioning: a Principle-Based Scenario

NASA Astrophysics Data System (ADS)

Stokhof, Harry; de Vries, Bregje; Bastiaens, Theo; Martens, Rob

2017-07-01

Student questioning is an important self-regulative strategy and has multiple benefits for teaching and learning science. Teachers, however, need support to align student questioning to curricular goals. This study tests a prototype of a principle-based scenario that supports teachers in guiding effective student questioning. In the scenario, mind mapping is used to provide both curricular structure as well as support for student questioning. The fidelity of structure and the process of implementation were verified by interviews, video data and a product collection. Results show that the scenario was relevant for teachers, practical in use and effective for guiding student questioning. Results also suggest that shared responsibility for classroom mind maps contributed to more intensive collective knowledge construction.
PrimerDesign-M: A multiple-alignment based multiple-primer design tool for walking across variable genomes

DOE PAGES

Yoon, Hyejin; Leitner, Thomas

2014-12-17

Analyses of entire viral genomes or mtDNA require comprehensive design of many primers across their genomes. In addition, simultaneous optimization of several DNA primer design criteria may improve overall experimental efficiency and downstream bioinformatic processing. To achieve these goals, we developed PrimerDesign-M. It includes several options for multiple-primer design, allowing researchers to efficiently design walking primers that cover long DNA targets, such as entire HIV-1 genomes, and it optimizes primers simultaneously informed by genetic diversity in multiple alignments and experimental design constraints given by the user. PrimerDesign-M can also design primers that include DNA barcodes and minimize primer dimerization. PrimerDesign-M finds optimal primers for highly variable DNA targets and facilitates design flexibility by suggesting alternative designs to adapt to experimental conditions.
A molecular-field-based similarity study of non-nucleoside HIV-1 reverse transcriptase inhibitors

NASA Astrophysics Data System (ADS)

Mestres, Jordi; Rohrer, Douglas C.; Maggiora, Gerald M.

1999-01-01

This article describes a molecular-field-based similarity method for aligning molecules by matching their steric and electrostatic fields and an application of the method to the alignment of three structurally diverse non-nucleoside HIV-1 reverse transcriptase inhibitors. A brief description of the method, as implemented in the program MIMIC, is presented, including a discussion of pairwise and multi-molecule similarity-based matching. The application provides an example that illustrates how relative binding orientations of molecules can be determined in the absence of detailed structural information on their target protein. In the particular system studied here, availability of the X-ray crystal structures of the respective ligand-protein complexes provides a means for constructing an 'experimental model' of the relative binding orientations of the three inhibitors. The experimental model is derived by using MIMIC to align the steric fields of the three protein P66 subunit main chains, producing an overlay with a 1.41 Å average rms distance between the corresponding Cα's in the three chains. The inter-chain residue similarities for the backbone structures show that the main-chain conformations are conserved in the region of the inhibitor-binding site, with the major deviations located primarily in the 'finger' and RNase H regions. The resulting inhibitor structure overlay provides an experimental-based model that can be used to evaluate the quality of the direct a priori inhibitor alignment obtained using MIMIC. It is found that the 'best' pairwise alignments do not always correspond to the experimental model alignments. Therefore, simply combining the best pairwise alignments will not necessarily produce the optimal multi-molecule alignment. However, the best simultaneous three-molecule alignment was found to reproduce the experimental inhibitor alignment model. 
A pairwise consistency index has been derived which gauges the quality of combining the pairwise alignments and aids in efficiently forming the optimal multi-molecule alignment analysis. Two post-alignment procedures are described that provide information on feature-based and field-based pharmacophoric patterns. The former corresponds to traditional pharmacophore models and is derived from the contribution of individual atoms to the total similarity. The latter is based on molecular regions rather than atoms and is constructed by computing the percent contribution to the similarity of individual points in a regular lattice surrounding the molecules, which when contoured and colored visually depict regions of highly conserved similarity. A discussion of how the information provided by each of the procedures is useful in drug design is also presented.
Copper-encapsulated vertically aligned carbon nanotube arrays.

PubMed

Stano, Kelly L; Chapla, Rachel; Carroll, Murphy; Nowak, Joshua; McCord, Marian; Bradford, Philip D

2013-11-13

A new procedure is described for the fabrication of vertically aligned carbon nanotubes (VACNTs) that are decorated, and even completely encapsulated, by a dense network of copper nanoparticles. The process involves the conformal deposition of pyrolytic carbon (Py-C) to stabilize the aligned carbon-nanotube structure during processing. The stabilized arrays are mildly functionalized using oxygen plasma treatment to improve wettability, and they are then infiltrated with an aqueous, supersaturated Cu salt solution. Once dried, the salt forms a stabilizing crystal network throughout the array. After calcination and H2 reduction, Cu nanoparticles are left decorating the CNT surfaces. Studies were carried out to determine the optimal processing parameters to maximize Cu content in the composite. These included the duration of Py-C deposition and system process pressure as well as the implementation of subsequent and multiple Cu salt solution infiltrations. The optimized procedure yielded a nanoscale hybrid material where the anisotropic alignment from the VACNT array was preserved, and the mass of the stabilized arrays was increased by over 24-fold because of the addition of Cu. The procedure has been adapted for other Cu salts and can also be used for other metal salts altogether, including Ni, Co, Fe, and Ag. The resulting composite is ideally suited for application in thermal management devices because of its low density, mechanical integrity, and potentially high thermal conductivity. Additionally, further processing of the material via pressing and sintering can yield consolidated, dense bulk composites.
Nonparametric Combinatorial Sequence Models

NASA Astrophysics Data System (ADS)

Wauthier, Fabian L.; Jordan, Michael I.; Jojic, Nebojsa

This work considers biological sequences that exhibit combinatorial structures in their composition: groups of positions of the aligned sequences are "linked" and covary as one unit across sequences. If multiple such groups exist, complex interactions can emerge between them. Sequences of this kind arise frequently in biology but methodologies for analyzing them are still being developed. This paper presents a nonparametric prior on sequences which allows combinatorial structures to emerge and which induces a posterior distribution over factorized sequence representations. We carry out experiments on three sequence datasets which indicate that combinatorial structures are indeed present and that combinatorial sequence models can more succinctly describe them than simpler mixture models. We conclude with an application to MHC binding prediction which highlights the utility of the posterior distribution induced by the prior. By integrating out the posterior our method compares favorably to leading binding predictors.
Comparative study of structural models of Leishmania donovani and human GDP-mannose pyrophosphorylases.

PubMed

Daligaux, Pierre; Bernadat, Guillaume; Tran, Linh; Cavé, Christian; Loiseau, Philippe M; Pomel, Sébastien; Ha-Duong, Tâp

2016-01-01

Leishmania is the parasite responsible for the neglected disease leishmaniasis. Its virulence and survival require biosynthesis of glycoconjugates, in which guanosine diphospho-d-mannose pyrophosphorylase (GDP-MP) is a key player. However, experimentally resolved structures of this enzyme are still lacking. We herein propose structural models of the GDP-MP from human and Leishmania donovani. Based on a multiple sequence alignment, the models were built with MODELLER and then carefully refined with all-atom molecular dynamics simulations in explicit solvent. Their quality was evaluated against several standard criteria, including their ability to bind GDP-mannose assessed by redocking calculations. Special attention was given in this study to interactions of the catalytic site residues with the enzyme substrate and competitive inhibitors, opening the perspective of medicinal chemistry developments. Copyright © 2015 Elsevier Masson SAS. All rights reserved.
BatMis: a fast algorithm for k-mismatch mapping.

PubMed

Tennakoon, Chandana; Purbojati, Rikky W; Sung, Wing-Kin

2012-08-15

Second-generation sequencing (SGS) generates millions of reads that need to be aligned to a reference genome allowing errors. Although current aligners can efficiently map reads allowing a small number of mismatches, they are not well suited for handling a large number of mismatches. The efficiency of aligners can be improved using various heuristics, but the sensitivity and accuracy of the alignments are sacrificed. In this article, we introduce Basic Alignment tool for Mismatches (BatMis)--an efficient method to align short reads to a reference allowing k mismatches. BatMis is a Burrows-Wheeler transformation based aligner that uses a seed and extend approach, and it is an exact method. Benchmark tests show that BatMis performs better than competing aligners in solving the k-mismatch problem. Furthermore, it can compete favorably even when compared with the heuristic modes of the other aligners. BatMis is a useful alternative for applications where fast k-mismatch mappings, unique mappings or multiple mappings of SGS data are required. BatMis is written in C/C++ and is freely available from http://code.google.com/p/batmis/
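To make the k-mismatch problem concrete, the following is a minimal brute-force sketch in Python. It is not BatMis's Burrows-Wheeler-based method; it simply reports every reference position where a read matches with at most k mismatches. The function name `k_mismatch_positions` and the toy sequences are illustrative assumptions.

```python
def k_mismatch_positions(reference, read, k):
    """Return (position, mismatches) for each window with at most k mismatches.

    Brute-force O(len(reference) * len(read)) scan for illustration only;
    this is NOT the BWT-based seed-and-extend search used by BatMis.
    """
    hits = []
    m = len(read)
    for start in range(len(reference) - m + 1):
        mismatches = 0
        for i in range(m):
            if reference[start + i] != read[i]:
                mismatches += 1
                if mismatches > k:  # stop once the mismatch budget is exceeded
                    break
        else:
            # Loop finished without break: window is within the budget.
            hits.append((start, mismatches))
    return hits


if __name__ == "__main__":
    # Hypothetical toy reference and read.
    print(k_mismatch_positions("ACGTACGTTACG", "ACGA", 1))
```

On this toy input the read matches at positions 0 and 4 with one mismatch each. An exact index-based method such as the one the abstract describes would aim to return the same hit set far more efficiently on genome-scale references.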
DR-TAMAS: Diffeomorphic Registration for Tensor Accurate Alignment of Anatomical Structures.

PubMed

Irfanoglu, M Okan; Nayak, Amritha; Jenkins, Jeffrey; Hutchinson, Elizabeth B; Sadeghi, Neda; Thomas, Cibu P; Pierpaoli, Carlo

2016-05-15

In this work, we propose DR-TAMAS (Diffeomorphic Registration for Tensor Accurate alignMent of Anatomical Structures), a novel framework for intersubject registration of Diffusion Tensor Imaging (DTI) data sets. This framework is optimized for brain data and its main goal is to achieve an accurate alignment of all brain structures, including white matter (WM), gray matter (GM), and spaces containing cerebrospinal fluid (CSF). Currently most DTI-based spatial normalization algorithms emphasize alignment of anisotropic structures. While some diffusion-derived metrics, such as diffusion anisotropy and tensor eigenvector orientation, are highly informative for proper alignment of WM, other tensor metrics such as the trace or mean diffusivity (MD) are fundamental for a proper alignment of GM and CSF boundaries. Moreover, it is desirable to include information from structural MRI data, e.g., T1-weighted or T2-weighted images, which are usually available together with the diffusion data. The fundamental property of DR-TAMAS is to achieve global anatomical accuracy by incorporating in its cost function the most informative metrics locally. Another important feature of DR-TAMAS is a symmetric time-varying velocity-based transformation model, which enables it to account for potentially large anatomical variability in healthy subjects and patients. The performance of DR-TAMAS is evaluated with several data sets and compared with other widely-used diffeomorphic image registration techniques employing both full tensor information and/or DTI-derived scalar maps. Our results show that the proposed method has excellent overall performance in the entire brain, while being equivalent to the best existing methods in WM. Copyright © 2016 Elsevier Inc. All rights reserved.
Reconstructing evolutionary trees in parallel for massive sequences.

PubMed

Zou, Quan; Wan, Shixiang; Zeng, Xiangxiang; Ma, Zhanshan Sam

2017-12-14

Building evolutionary trees for massive sets of unaligned DNA sequences is both crucial and challenging, and reconstruction is especially hard for ultra-large sequence sets. Massive multiple sequence alignment is also challenging and consumes considerable time and memory. Hadoop and Spark, which were developed recently, offer new prospects for classical computational biology problems. In this paper, we address multiple sequence alignment and evolutionary tree reconstruction in parallel. HPTree, which is developed in this paper, can process large DNA sequence files quickly. It works well on files larger than 1 GB and performs better than other evolutionary reconstruction tools. Users can use HPTree to reconstruct evolutionary trees on computer clusters or cloud platforms (e.g., Amazon Cloud). HPTree could help with population evolution research and metagenomics analysis. We employ the Hadoop and Spark platforms to design an evolutionary tree reconstruction software tool for massive unaligned DNA sequences. Clustering and multiple sequence alignment are done in parallel, and the neighbour-joining method is used for building the evolutionary tree. The software and its source code are available at http://lab.malab.cn/soft/HPtree/ .
Taxonaut: an application software for comparative display of multiple taxonomies with a use case of GBIF Species API.

PubMed

Ytow, Nozomi

2016-01-01

The Species API of the Global Biodiversity Information Facility (GBIF) provides public access to taxonomic data aggregated from multiple data sources. Each data source follows its own classification which can be inconsistent with classifications from other sources. Even with a reference classification e.g. the GBIF Backbone taxonomy, a comprehensive method to compare classifications in the data aggregation is essential, especially for non-expert users. A Java application was developed to compare multiple taxonomies graphically using classification data acquired from GBIF's ChecklistBank via the GBIF Species API. It uses a table to display taxonomies where each column represents a taxonomy under comparison, with an aligner column to organise taxa by name. Each cell contains the name of a taxon if the classification in that column contains the name. Each column also has a cell showing the hierarchy of the taxonomy by a folder metaphor where taxa are aligned and synchronised in the aligner column. A set of those comparative tables shows taxa categorised by relationship between taxonomies. The result set is also available as tables in an Excel format file.
Temporal Effects of Alignment in Text-Based, Task-Oriented Discourse

ERIC Educational Resources Information Center

Foltz, Anouschka; Gaspers, Judith; Meyer, Carolin; Thiele, Kristina; Cimiano, Philipp; Stenneken, Prisca

2015-01-01

Communicative alignment refers to adaptation to one's communication partner. Temporal aspects of such alignment have been little explored. This article examines temporal aspects of lexical and syntactic alignment (i.e., tendencies to use the interlocutor's lexical items and syntactic structures) in task-oriented discourse. In particular, we…

Parental alignments and rejection: an empirical study of alienation in children of divorce.

PubMed

Johnston, Janet R

2003-01-01

This study of family relationships after divorce examined the frequency and extent of child-parent alignments and correlates of children's rejection of a parent, these being basic components of the controversial idea of "parental alienation syndrome." The sample consisted of 215 children from the family courts and general community two to three years after parental separation. The findings indicate that children's attitudes toward their parents range from positive to negative, with relatively few being extremely aligned or rejecting. Rejection of a parent has multiple determinants, with both the aligned and rejected parents contributing to the problem, in addition to vulnerabilities within children themselves.
Parallel alignment of bacteria using near-field optical force array for cell sorting

NASA Astrophysics Data System (ADS)

Zhao, H. T.; Zhang, Y.; Chin, L. K.; Yap, P. H.; Wang, K.; Ser, W.; Liu, A. Q.

2017-08-01

This paper presents a near-field approach to align multiple rod-shaped bacteria based on the interference pattern in silicon nano-waveguide arrays. Bacteria in the optical field are first trapped by the gradient force and then rotated by the scattering force to the equilibrium position. In the experiment, a Shigella bacterium is rotated 90 deg and aligned to the horizontal direction in 9.4 s. Meanwhile, 150 Shigella are trapped on the surface in 5 min and 86% are aligned with an angle < 5 deg. This method is a promising tool for research on parallel single-cell biophysical characterization, cell-cell interaction, etc.
Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions.

PubMed

Krissinel, E; Henrick, K

2004-12-01

The present paper describes the SSM algorithm for protein structure comparison in three dimensions, which includes an original procedure of matching graphs built on the protein's secondary-structure elements, followed by an iterative three-dimensional alignment of protein backbone Cα atoms. The SSM results are compared with those obtained from other protein comparison servers, and the advantages and disadvantages of different scores that are used for structure recognition are discussed. A new score, balancing the r.m.s.d. and alignment length Nalign, is proposed. It is found that different servers agree reasonably well on the new score, while showing considerable differences in r.m.s.d. and Nalign.
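The abstract does not state the formula for the new score. As a minimal sketch, the Python below computes the r.m.s.d. over matched, already-superposed Cα pairs and one plausible balancing score, Q = Nalign² / ((1 + (r.m.s.d./R0)²) · N1 · N2). The functional form, the default R0 = 3 Å, and all function names are assumptions here and should be checked against the full paper.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square distance between matched, already-superposed 3D points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty, equal-length coordinate lists")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


def balanced_score(n_align, rmsd_value, n_res_1, n_res_2, r0=3.0):
    """Assumed score rewarding long alignments and penalizing large r.m.s.d.

    Equals 1.0 only when both chains are fully aligned with zero r.m.s.d.
    The form and the r0 default are assumptions, not taken from the abstract.
    """
    return n_align ** 2 / ((1.0 + (rmsd_value / r0) ** 2) * n_res_1 * n_res_2)
```

A score of this shape explains why servers can disagree on r.m.s.d. and Nalign individually yet agree on the combined value: a longer alignment with a larger r.m.s.d. and a shorter alignment with a smaller r.m.s.d. can yield similar scores.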
A Stochastic Evolutionary Model for Protein Structure Alignment and Phylogeny

PubMed Central

Challis, Christopher J.; Schmidler, Scott C.

2012-01-01

We present a stochastic process model for the joint evolution of protein primary and tertiary structure, suitable for use in alignment and estimation of phylogeny. Indels arise from a classic Links model, and mutations follow a standard substitution matrix, whereas backbone atoms diffuse in three-dimensional space according to an Ornstein–Uhlenbeck process. The model allows for simultaneous estimation of evolutionary distances, indel rates, structural drift rates, and alignments, while fully accounting for uncertainty. The inclusion of structural information enables phylogenetic inference on time scales not previously attainable with sequence evolution models. The model also provides a tool for testing evolutionary hypotheses and improving our understanding of protein structural evolution. PMID:22723302
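As an illustration of the diffusion component only, the sketch below simulates a one-dimensional Ornstein–Uhlenbeck process using its exact discrete-time transition. The parameter names (`theta`, `mu`, `sigma`) and the function name are illustrative assumptions, not the parameterization used by the authors, and the indel and substitution parts of the model are not represented.

```python
import math
import random


def simulate_ou(x0, theta, mu, sigma, dt, n_steps, seed=0):
    """Simulate dX = -theta * (X - mu) dt + sigma dW with the exact transition.

    Given X(t), the value X(t + dt) is Gaussian with
      mean = mu + (X(t) - mu) * exp(-theta * dt)
      var  = sigma^2 * (1 - exp(-2 * theta * dt)) / (2 * theta)
    """
    if theta <= 0:
        raise ValueError("theta must be positive")
    rng = random.Random(seed)
    decay = math.exp(-theta * dt)
    step_sd = sigma * math.sqrt((1.0 - decay ** 2) / (2.0 * theta))
    path = [x0]
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        path.append(mu + (path[-1] - mu) * decay + step_sd * noise)
    return path
```

The mean-reverting term is what distinguishes this process from plain Brownian motion: coordinates drift back toward `mu` rather than wandering without bound, so the variance of the displacement stays finite over long time scales.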
AlignMiner: a Web-based tool for detection of divergent regions in multiple sequence alignments of conserved sequences

PubMed Central

2010-01-01

Background Multiple sequence alignments are used to study gene or protein function, phylogenetic relations, genome evolution hypotheses and even gene polymorphisms. Virtually without exception, all available tools focus on conserved segments or residues. Small divergent regions, however, are biologically important for specific quantitative polymerase chain reaction, genotyping, molecular markers and preparation of specific antibodies, and yet have received little attention. As a consequence, they must be selected empirically by the researcher. AlignMiner has been developed to fill this gap in bioinformatic analyses. Results AlignMiner is a Web-based application for detection of conserved and divergent regions in alignments of conserved sequences, focusing particularly on divergence. It accepts alignments (protein or nucleic acid) obtained using any of a variety of algorithms, which does not appear to have a significant impact on the final results. AlignMiner uses different scoring methods for assessing conserved/divergent regions, Entropy being the method that provides the highest number of regions with the greatest length, and Weighted being the most restrictive. Conserved/divergent regions can be generated either with respect to the consensus sequence or to one master sequence. The resulting data are presented in a graphical interface developed in AJAX, which provides remarkable user interaction capabilities. Users do not need to wait until execution is complete and can even inspect their results on a different computer. Data can be downloaded onto a user disk, in standard formats. In silico and experimental proof-of-concept cases have shown that AlignMiner can be successfully used to design specific polymerase chain reaction primers as well as potential epitopes for antibodies. Primer design is assisted by a module that deploys several oligonucleotide parameters for designing primers "on the fly".
Conclusions AlignMiner can be used to reliably detect divergent regions via several scoring methods that provide different levels of selectivity. Its predictions have been verified by experimental means. Hence, it is expected that its usage will save researchers' time and ensure an objective selection of the best-possible divergent region when closely related sequences are analysed. AlignMiner is freely available at http://www.scbi.uma.es/alignminer. PMID:20525162
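The entropy-based scoring idea can be sketched as a per-column Shannon entropy, where low entropy marks conserved columns and high entropy marks divergent ones. This is a generic illustration, not AlignMiner's implementation; the function name, the choice to ignore gap characters, and the toy alignment in the usage note are assumptions.

```python
import math
from collections import Counter


def column_entropies(alignment):
    """Shannon entropy (bits) of each alignment column, ignoring '-' gaps."""
    if not alignment or len({len(row) for row in alignment}) != 1:
        raise ValueError("alignment must be non-empty with equal-length rows")
    entropies = []
    for column in zip(*alignment):
        counts = Counter(c for c in column if c != "-")
        total = sum(counts.values())
        if total == 0:
            entropies.append(0.0)  # all-gap column: treated as uninformative
            continue
        # Sum of p * log2(1 / p) over the residues observed in this column.
        entropies.append(
            sum((n / total) * math.log2(total / n) for n in counts.values())
        )
    return entropies
```

For the hypothetical alignment `["ACGT", "ACGA", "ACCA", "AC-T"]`, the first two columns score 0 bits (fully conserved) and the last column scores 1 bit (two residues in equal proportion). A divergent region could then be reported as a run of consecutive columns above a chosen threshold.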
Increased alignment sensitivity improves the usage of genome alignments for comparative gene annotation.

PubMed

Sharma, Virag; Hiller, Michael

2017-08-21

Genome alignments provide a powerful basis to transfer gene annotations from a well-annotated reference genome to many other aligned genomes. The completeness of these annotations crucially depends on the sensitivity of the underlying genome alignment. Here, we investigated the impact of the genome alignment parameters and found that parameters with a higher sensitivity allow the detection of thousands of novel alignments between orthologous exons that have been missed before. In particular, comparisons between species separated by an evolutionary distance of >0.75 substitutions per neutral site, like human and other non-placental vertebrates, benefit from increased sensitivity. To systematically test if increased sensitivity improves comparative gene annotations, we built a multiple alignment of 144 vertebrate genomes and used this alignment to map human genes to the other 143 vertebrates with CESAR. We found that higher alignment sensitivity substantially improves the completeness of comparative gene annotations by adding on average 2382 and 7440 novel exons and 117 and 317 novel genes for mammalian and non-mammalian species, respectively. Our results suggest a more sensitive alignment strategy that should generally be used for genome alignments between distantly-related species. Our 144-vertebrate genome alignment and the comparative gene annotations (https://bds.mpi-cbg.de/hillerlab/144VertebrateAlignment_CESAR/) are a valuable resource for comparative genomics. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
The Caterpillar Game: A SW-PBIS Aligned Classroom Management System

ERIC Educational Resources Information Center

Floress, Margaret T.; Jacoby, Amber L.

2017-01-01

The Caterpillar Game is a classroom management system that is aligned with School-wide Positive Behavioral Interventions and Supports standards. A single-case, multiple-baseline design was used to evaluate the effects of the Caterpillar Game on disruptive student behavior and teacher praise. Three classrooms were included in the study (preschool,…
Instructional Alignment as a Measure of Teaching Quality

ERIC Educational Resources Information Center

Polikoff, Morgan S.; Porter, Andrew C.

2014-01-01

Recent years have seen the convergence of two major policy streams in U.S. K-12 education: standards/accountability and teacher quality reforms. Work in these areas has led to the creation of multiple measures of teacher quality, including measures of their instructional alignment to standards/assessments, observational and student survey measures…
SEAN: SNP prediction and display program utilizing EST sequence clusters.

PubMed

Huntley, Derek; Baldo, Angela; Johri, Saurabh; Sergot, Marek

2006-02-15

SEAN is an application that predicts single nucleotide polymorphisms (SNPs) using multiple sequence alignments produced from expressed sequence tag (EST) clusters. The algorithm uses rules of sequence identity and SNP abundance to determine the quality of the prediction. A Java viewer is provided to display the EST alignments and predicted SNPs.
Drug Promiscuity in PDB: Protein Binding Site Similarity Is Key.

PubMed

Haupt, V Joachim; Daminelli, Simone; Schroeder, Michael

2013-01-01

Drug repositioning applies established drugs to new disease indications with increasing success. A pre-requisite for drug repurposing is drug promiscuity (polypharmacology) - a drug's ability to bind to several targets. There is a long standing debate on the reasons for drug promiscuity. Based on large compound screens, hydrophobicity and molecular weight have been suggested as key reasons. However, the results are sometimes contradictory and leave space for further analysis. Protein structures offer a structural dimension to explain promiscuity: Can a drug bind multiple targets because the drug is flexible or because the targets are structurally similar or even share similar binding sites? We present a systematic study of drug promiscuity based on structural data of PDB target proteins with a set of 164 promiscuous drugs. We show that there is no correlation between the degree of promiscuity and ligand properties such as hydrophobicity or molecular weight but a weak correlation to conformational flexibility. However, we do find a correlation between promiscuity and structural similarity as well as binding site similarity of protein targets. In particular, 71% of the drugs have at least two targets with similar binding sites. In order to overcome issues in detection of remotely similar binding sites, we employed a score for binding site similarity: LigandRMSD measures the similarity of the aligned ligands and uncovers remote local similarities in proteins. It can be applied to arbitrary structural binding site alignments. Three representative examples, namely the anti-cancer drug methotrexate, the natural product quercetin and the anti-diabetic drug acarbose are discussed in detail. Our findings suggest that global structural and binding site similarity play a more important role to explain the observed drug promiscuity in the PDB than physicochemical drug properties like hydrophobicity or molecular weight. 
Additionally, we find ligand flexibility to have a minor influence.
Sequence Diversity Diagram for comparative analysis of multiple sequence alignments.

PubMed

Sakai, Ryo; Aerts, Jan

2014-01-01

The sequence logo is a graphical representation of a set of aligned sequences, commonly used to depict conservation of amino acid or nucleotide sequences. Although it effectively communicates the amount of information present at every position, this visual representation falls short when the domain task is to compare between two or more sets of aligned sequences. We present a new visual presentation called a Sequence Diversity Diagram and validate our design choices with a case study. Our software was developed using the open-source program called Processing. It loads multiple sequence alignment FASTA files and a configuration file, which can be modified as needed to change the visualization. The redesigned figure improves on the visual comparison of two or more sets, and it additionally encodes information on sequential position conservation. In our case study of the adenylate kinase lid domain, the Sequence Diversity Diagram reveals unexpected patterns and new insights, for example the identification of subgroups within the protein subfamily. Our future work will integrate this visual encoding into interactive visualization tools to support higher level data exploration tasks.
Evidence for Syntactic Alignment in Children with Autism

ERIC Educational Resources Information Center

Allen, Melissa L.; Haywood, Sarah; Rajendran, Gnanathusharan; Branigan, Holly

2011-01-01

We report an experiment that examined whether children with Autistic Spectrum Disorder (ASD) spontaneously converge, or align, syntactic structure with a conversational partner. Children with ASD were more likely to produce a passive structure to describe a picture after hearing their interlocutor use a passive structure to describe an unrelated…
Finding similar nucleotide sequences using network BLAST searches.

PubMed

Ladunga, Istvan

2009-06-01

The Basic Local Alignment Search Tool (BLAST) is a keystone of bioinformatics due to its performance and user-friendliness. Beginner and intermediate users will learn how to design and submit blastn and Megablast searches on the Web pages at the National Center for Biotechnology Information. We map nucleic acid sequences to genomes, find identical or similar mRNA, expressed sequence tag, and noncoding RNA sequences, and run Megablast searches, which are much faster than blastn. Understanding results is assisted by taxonomy reports, genomic views, and multiple alignments. We interpret expected frequency thresholds, biological significance, and statistical significance. Weak hits provide no evidence, but hints for further analyses. We find genes that may code for homologous proteins by translated BLAST. We reduce false positives by filtering out low-complexity regions. Parsed BLAST results can be integrated into analysis pipelines. Links in the output connect to Entrez, PUBMED, structural, sequence, interaction, and expression databases. This facilitates integration with a wide spectrum of biological knowledge.
Learning of Alignment Rules between Concept Hierarchies

NASA Astrophysics Data System (ADS)

Ichise, Ryutaro; Takeda, Hideaki; Honiden, Shinichi

With the rapid advances of information technology, we are acquiring much more information than ever before. As a result, we need tools for organizing this data. Concept hierarchies such as ontologies and information categorizations are powerful and convenient methods for accomplishing this goal and have gained widespread acceptance. Although each concept hierarchy is useful, it is difficult to employ multiple concept hierarchies at the same time because it is hard to align their conceptual structures. This paper proposes a rule learning method that takes information instances from a source concept hierarchy and finds suitable locations for them in a target hierarchy. The key idea is to find the most similar categories in each hierarchy, where similarity is measured by the κ (kappa) statistic that counts instances belonging to both categories. In order to evaluate our method, we conducted experiments using two internet directories: Yahoo! and LYCOS. We map information instances from the source directory into the target directory, and show that our learned rules agree with a human-generated assignment 76% of the time.
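The similarity measure can be illustrated with Cohen's κ computed from the instances shared by two categories. The sketch below is a generic κ calculation over sets of instance identifiers; the function name, the set-based representation, and the handling of the degenerate case are assumptions, not the authors' implementation.

```python
def kappa_statistic(category_a, category_b, universe):
    """Cohen's kappa for the membership agreement of two categories.

    Each instance in `universe` either belongs or does not belong to each
    category; kappa compares observed agreement with chance agreement.
    """
    universe = set(universe)
    a = set(category_a) & universe
    b = set(category_b) & universe
    n = len(universe)
    if n == 0:
        raise ValueError("universe must not be empty")
    both = len(a & b)             # instances in both categories
    neither = n - len(a | b)      # instances in neither category
    p_observed = (both + neither) / n
    p_a, p_b = len(a) / n, len(b) / n
    p_expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if p_expected == 1.0:
        return 0.0  # kappa is undefined here; 0.0 is an assumed convention
    return (p_observed - p_expected) / (1 - p_expected)
```

With ten instances, a source category {0, 1, 2, 3, 4} and a target category {0, 1, 2, 3, 5} agree on eight of them, giving κ = 0.6. Unlike a raw overlap count, κ discounts the agreement that two categories of those sizes would show by chance.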
Demonstration of a Monolithic Micro-Spectrometer System

NASA Technical Reports Server (NTRS)

Rajic, S.; Egert, C. M.

1995-01-01

The starting design of a spectrometer based on a modified Czerny-Turner configuration containing five precision surfaces encapsulated in a monolithic structure is described. Since the purpose at the early stages of the development was to demonstrate the feasibility of the technology and not an attempt to address a specific sensing problem, the first substrate material chosen was optical quality polymethyl methacrylate (PMMA). The final system design decision was narrowed down to two possible configurations containing five and six precision surfaces. The five surface design was chosen since it contained one less precision optical surface, yet included multiple off-axis spheres. In this particular design and material system, the mass was kept below 7 g. The wavelength range (bandpass) design goal was 1 micrometer (0.6 - 1.6 micrometers). The PMMA is particularly transparent in this wavelength region and there are interesting effects to monitor within this band. The optical system was designed and optimized using the ZEMAX optical design software program to be entirely alignment free (self aligning).
A parallel approach of COFFEE objective function to multiple sequence alignment

NASA Astrophysics Data System (ADS)

Zafalon, G. F. D.; Visotaky, J. M. V.; Amorim, A. R.; Valêncio, C. R.; Neves, L. A.; de Souza, R. C. G.; Machado, J. M.

2015-09-01

Computational tools that assist genomic analyses are increasingly necessary because the amount of available data is growing quickly. Given the high computational cost of deterministic algorithms for sequence alignment, many works concentrate on developing heuristic approaches to multiple sequence alignment. However, selecting an approach that offers solutions with good biological significance and a feasible execution time is a great challenge. This work therefore presents the parallelization of the processing steps of the MSA-GA tool, using a multithreaded paradigm in the execution of the COFFEE objective function. The standard objective function implemented in the tool is the Weighted Sum of Pairs (WSP), which produces some distortions in the final alignments when sets of sequences with low similarity are aligned. In previous studies, we implemented the COFFEE objective function in the tool to smooth these distortions. Although the nature of the COFFEE objective function increases execution time, the approach contains steps that can be executed in parallel. With the improvements implemented in this work, the new approach runs 24% faster than the sequential approach with COFFEE. Moreover, the multithreaded COFFEE approach is more efficient than WSP because, besides being slightly faster, it yields better biological results.
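For reference, a weighted sum-of-pairs objective can be sketched as follows. This is a generic illustration, not the MSA-GA code: the match, mismatch, and gap scores, the uniform default pair weights, and the function name are hypothetical placeholders.

```python
from itertools import combinations


def weighted_sum_of_pairs(alignment, weights=None,
                          match=1.0, mismatch=-1.0, gap=-2.0):
    """Score an alignment as the weighted sum of all pairwise row scores.

    `weights` maps a row-index pair (i, j) with i < j to a weight;
    when omitted, every pair has weight 1.0.
    """
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all rows must have the same length")
    total = 0.0
    for i, j in combinations(range(len(alignment)), 2):
        w = 1.0 if weights is None else weights[(i, j)]
        pair_score = 0.0
        for a, b in zip(alignment[i], alignment[j]):
            if a == "-" and b == "-":
                continue              # gap-gap columns contribute nothing
            if a == "-" or b == "-":
                pair_score += gap     # residue aligned against a gap
            elif a == b:
                pair_score += match
            else:
                pair_score += mismatch
        total += w * pair_score
    return total
```

Because each row pair is scored independently, the outer loop is the natural unit to distribute across threads or processes. This is one reason pairwise objective functions of this kind are candidates for the sort of parallelization the abstract describes.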
Multilayer Microfluidic Devices Created From A Single Photomask

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kelly, Ryan T.; Sheen, Allison M.; Jambovane, Sachin R.

2013-08-28

The time and expense associated with high quality photomask production can discourage the creation of multilayer microfluidic devices, as each layer currently requires a separate photomask. Here we describe an approach in which multilayer microfabricated devices can be created from a single photomask. The separate layers and their corresponding alignment marks are arranged in separate halves of the mask for two layer devices or quadrants for four layer devices. Selective exposure of the photomask features and rotation of the device substrate between exposures result in multiple copies of the devices on each wafer. Subsequent layers are aligned to patterned features on the substrate with the same alignment accuracy as when multiple photomasks are used. We demonstrate this approach for fabricating devices employing multilayer soft lithography (MSL) for pneumatic valving. MSL devices containing as many as 5 layers (4 aligned fluidic layers plus a manually aligned control layer) were successfully created using this approach. Device design is also modularized, enabling the presence or absence of features as well as channel heights to be selected independently from one another. The use of a single photomask to create multilayer devices results in a dramatic savings of time and/or money required to advance from device design to completed prototype.
Minerals and aligned collagen fibrils in tilapia fish scales: structural analysis using dark-field and energy-filtered transmission electron microscopy and electron tomography.

PubMed

Okuda, Mitsuhiro; Ogawa, Nobuhiro; Takeguchi, Masaki; Hashimoto, Ayako; Tagaya, Motohiro; Chen, Song; Hanagata, Nobutaka; Ikoma, Toshiyuki

2011-10-01

The mineralized structure of aligned collagen fibrils in a tilapia fish scale was investigated using transmission electron microscopy (TEM) techniques after a thin sample was prepared using aqueous techniques. Electron diffraction and electron energy loss spectroscopy data indicated that a mineralized internal layer consisting of aligned collagen fibrils contains hydroxyapatite crystals. Bright-field imaging, dark-field imaging, and energy-filtered TEM showed that the hydroxyapatite was mainly distributed in the hole zones of the aligned collagen fibrils structure, while needle-like materials composed of calcium compounds including hydroxyapatite existed in the mineralized internal layer. Dark-field imaging and three-dimensional observation using electron tomography revealed that hydroxyapatite and needle-like materials were mainly found in the matrix between the collagen fibrils. It was observed that hydroxyapatite and needle-like materials were preferentially distributed on the surface of the hole zones in the aligned collagen fibrils structure and in the matrix between the collagen fibrils in the mineralized internal layer of the scale.
Aligning for accountable care: Strategic practices for change in accountable care organizations.

PubMed

Hilligoss, Brian; Song, Paula H; McAlearney, Ann Scheck

Alignment within accountable care organizations (ACOs) is crucial if these new entities are to achieve their lofty goals. However, the concept of alignment remains underexamined, and we know little about the work entailed in creating alignment. The aim of this study was to develop the concept of aligning by identifying and describing the strategic practices administrators use to align the structures, processes, and behaviors of their organizations and individual providers in pursuit of accountable care. We conducted 2-year qualitative case studies of four ACOs that have assumed full risk for the costs and quality of care for defined populations. Five strategic aligning practices were used by all four ACOs. Informing both aligns providers' understandings with the goals and value proposition of the ACO and aligns the providers' attention with the drivers of performance. Involving both aligns ACO leaders' understandings with the realities facing providers and aligns the policies of the ACO with the needs of providers. Enhancing both aligns the operations of individual provider practices with the operations of the ACO and aligns the trust of providers with the ACO. Motivating aligns what providers value with the goals of the ACO. Finally, evolving is a metapractice of learning and adapting that guides the execution of the other four practices. Our findings suggest that there are second-order cognitive (e.g., understandings and attention) and cultural (e.g., trust and values) levels of alignment, as well as a first-order operational level (organizational structures, processes, and incentives). A well-aligned organization may require ongoing repositioning at each of these levels, as well as attention to both cooperative and coordinative dimensions of alignment. Implications for research and practice are discussed.
Ground and satellite observations of multiple sun-aligned auroral arcs on the duskside

NASA Astrophysics Data System (ADS)

Hosokawa, K.; Maggiolo, R.; Zhang, Y.; Fear, R. C.; Fontaine, D.; Cumnock, J. A.; Kullen, A.; Milan, S. E.; Kozlovsky, A.; Echim, M.; Shiokawa, K.

2014-12-01

Sun-aligned auroral arcs (SAAs) are one of the outstanding phenomena in the high-latitude region during periods of northward interplanetary magnetic field (IMF). Smaller scale SAAs tend to occur either in the duskside or dawnside of the polar cap and are known to drift in the dawn-dusk direction depending on the sign of the IMF By. Studies of SAAs are of particular importance because they represent dynamical characteristics of their source plasma in the magnetosphere, for example in the interaction region between the solar wind and magnetosphere or in the boundary between the plasma sheet and tail lobe. To date, however, very little has been known about the spatial structure and/or temporal evolution of the magnetospheric counterpart of SAAs. In order to gain more comprehensive understanding of the field-aligned plasma transport in the vicinity of SAAs, we have investigated an event of SAAs on November 10, 2005, during which multiple SAAs were detected by a ground-based all-sky camera at Resolute Bay, Canada. During this interval, several SAAs were detached from the duskside oval and moved poleward. The large-scale structure of these arcs was visualized by space-based imagers of TIMED/GUVI and DMSP/SSUSI. In addition to these optical observations, we employ the Cluster satellites to reveal the high-altitude particle signature corresponding to the small-scale SAAs. The ionospheric footprints of the 4 Cluster satellites encountered the SAAs sequentially and observed well correlated enhancements of electron fluxes at weak energies (< 1 keV). The Cluster satellites also detected signatures of upflowing beams of ions and electrons in the vicinity of the SAAs. This implies that these ions and electrons were accelerated upward by a quasi-stationary electric field existing in the vicinity of the SAAs and constitute a current system in the magnetosphere-ionosphere coupling system. 
Ionospheric convection measurement from one of the SuperDARN radars shows an indication that the SAAs are embedded in the lobe cell during northward IMF conditions. In the presentation, we will show the results of detailed comparison between the ground-based radio and optical signatures of the SAAs and those obtained by the Cluster spacecraft at magnetospheric altitudes.

1H and 15N NMR resonance assignments and secondary structure of titin type I domains.

PubMed

Muhle-Goll, C; Nilges, M; Pastore, A

1997-01-01

Titin/connectin is a giant muscle protein with a highly modular architecture consisting of multiple repeats of two sequence motifs, named type I and type II. Type I modules have been suggested to be intracellular members of the fibronectin type III (Fn3) domain family. Along the titin sequence they are exclusively present in the region of the molecule located in the sarcomere A-band. This region has been shown to interact with myosin and C-protein. One of the most noticeable features of type I modules is that they are particularly rich in semiconserved prolines, since these residues account for about 8% of their sequence. We have determined the secondary structure of a representative type I domain (A71) by 15N and 1H NMR. We show that the type I domains of titin have the Fn3 fold as proposed, consisting of a three- and a four-stranded beta-sheet. When the two sheets are placed on top of each other to form the beta-sandwich characteristic of the Fn3 fold, 8 out of 10 prolines are found on the same side of the molecule and form an exposed hydrophobic patch. This suggests that the semiconserved prolines might be relevant for the function of type I modules, providing a surface for binding to other A-band proteins. The secondary structure of A71 was structurally aligned to other extracellular Fn3 modules of known 3D structure. The alignment shows that titin type I modules have closest similarity to the first Fn3 domain of Drosophila neuroglian.
Multiple genome alignment for identifying the core structure among moderately related microbial genomes.

PubMed

Uchiyama, Ikuo

2008-10-31

Identifying the set of intrinsically conserved genes, or the genomic core, among related genomes is crucial for understanding prokaryotic genomes where horizontal gene transfers are common. Although core genome identification appears to be obvious among very closely related genomes, it becomes more difficult when more distantly related genomes are compared. Here, we consider the core structure as a set of sufficiently long segments in which gene orders are conserved so that they are likely to have been inherited mainly through vertical transfer, and developed a method for identifying the core structure by finding the order of pre-identified orthologous groups (OGs) that maximally retains the conserved gene orders. The method was applied to genome comparisons of two well-characterized families, Bacillaceae and Enterobacteriaceae, and identified their core structures comprising 1438 and 2125 OGs, respectively. The core sets contained most of the essential genes and their related genes, which were primarily included in the intersection of the two core sets comprising around 700 OGs. The definition of the genomic core based on gene order conservation was demonstrated to be more robust than the simpler approach based only on gene conservation. We also investigated the core structures in terms of G+C content homogeneity and phylogenetic congruence, and found that the core genes primarily exhibited the expected characteristic, i.e., being indigenous and sharing the same history, more than the non-core genes. The results demonstrate that our strategy of genome alignment based on gene order conservation can provide an effective approach to identify the genomic core among moderately related microbial genomes.
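The idea of conserved gene order can be illustrated with a much simpler pairwise case than the multiple-genome method described in this abstract. The sketch below finds runs of orthologous groups (OGs) that appear consecutively and in the same order in two genomes. The function name, the `min_len` threshold and the assumption that each OG occurs at most once per genome are all hypothetical simplifications; strand and inversions are ignored.

```python
def conserved_runs(order_a, order_b, min_len=3):
    """Return runs of OGs that are consecutive, in the same order, in both genomes.

    order_a, order_b: lists of OG identifiers in chromosomal order
    (assumed unique within each list). Runs shorter than min_len are dropped.
    """
    pos_b = {og: i for i, og in enumerate(order_b)}
    runs, current = [], []
    for og in order_a:
        # The run continues only if og directly follows the previous OG in genome B.
        extends = (
            bool(current)
            and og in pos_b
            and pos_b[og] == pos_b[current[-1]] + 1
        )
        if extends:
            current.append(og)
        else:
            if len(current) >= min_len:
                runs.append(current)
            # Start a new run only if the OG exists in genome B at all.
            current = [og] if og in pos_b else []
    if len(current) >= min_len:
        runs.append(current)
    return runs
```

For example, with `order_a = [1, 2, 3, 4, 9, 5, 6]` and `order_b = [1, 2, 3, 4, 5, 6, 7]`, the OG `9` interrupts the collinear block, so only `[1, 2, 3, 4]` survives a minimum length of 3.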
Layered magnetic structures: Antiferromagnetic-type interlayer coupling and magnetoresistance due to antiparallel alignment

NASA Astrophysics Data System (ADS)

Grünberg, P.; Demokritov, S.; Fuss, A.; Vohl, M.; Wolf, J. A.

1991-04-01

Layered Fe/Cr structures are known to display antiferromagnetic-type interlayer coupling and a new magnetoresistance (MR) effect due to antiparallel magnetization alignment. The strength of the coupling is found to be similar in multilayered structures and in double layers. The oscillatory behavior of the coupling, previously found by Parkin, More, and Roche [Phys. Rev. Lett. 64, 2304 (1990)] on sputtered polycrystalline samples, is here confirmed for epitaxial samples, obtained by thermal evaporation. The new MR effect is interpreted as due to a spin-dependent scattering of the electrons at the Fe-Cr interfaces. The investigations have been extended to Fe/V, Fe/Mn, Fe/Cu, Co/Au, Co/Cr, and Co/Cu structures where the antiparallel alignment of the ferromagnetic layers is obtained via hysteresis effects. A MR effect due to antiparallel alignment, which is strong for Co/Au and Co/Cu but weak in the other cases, has been found.
Fusion bonding and alignment fixture

DOEpatents

Ackler, Harold D.; Swierkowski, Stefan P.; Tarte, Lisa A.; Hicks, Randall K.

2000-01-01

An improved vacuum fusion bonding structure and process for aligned bonding of large area glass plates, patterned with microchannels and access holes and slots, for elevated glass fusion temperatures. Vacuum pumpout of all the components is through the bottom platform which yields an untouched, defect free top surface which greatly improves optical access through this smooth surface. Also, a completely non-adherent interlayer, such as graphite, with alignment and location features is located between the main steel platform and the glass plate pair, which makes large improvements in quality, yield, and ease of use, and enables aligned bonding of very large glass structures.
Tissue-engineered spiral nerve guidance conduit for peripheral nerve regeneration.

PubMed

Chang, Wei; Shah, Munish B; Lee, Paul; Yu, Xiaojun

2018-06-01

Recently in peripheral nerve regeneration, preclinical studies have shown that the use of nerve guidance conduits (NGCs) with multiple longitudinal channels and intra-luminal topography enhances functional outcomes when bridging a nerve gap caused by traumatic injury. These features not only provide guidance cues for the regenerating nerve, but have also become essential approaches for developing novel NGCs. In this study, a novel spiral NGC with aligned nanofibers and wrapped with an outer nanofibrous tube was first developed and investigated. Using the common rat sciatic 10-mm nerve defect model, the in vivo study showed that a novel spiral NGC (with and without inner nanofibers) increased the success rate of nerve regeneration after 6 weeks of recovery. Substantial improvements in nerve regeneration were achieved by combining the spiral NGC with inner nanofibers and an outer nanofibrous tube, based on the results of walking track analysis, electrophysiology, nerve histological assessment, and gastrocnemius muscle measurement. This demonstrated that the novel spiral NGC with inner aligned nanofibers and wrapped with an outer nanofibrous tube provided a better environment for peripheral nerve regeneration than standard tubular NGCs. Results from this study will benefit future NGC design and help optimize tissue-engineering strategies for peripheral nerve regeneration. We developed a novel spiral nerve guidance conduit (NGC) with coated aligned nanofibers. The spiral structure increases surface area by 4.5 fold relative to a tubular NGC. Furthermore, the aligned nanofibers were coated on the spiral walls, providing cues for guiding neurite extension. Finally, the outside of the spiral NGC was wrapped with random nanofibers to enhance mechanical strength and stabilize the spiral NGC. Our nerve histological data have shown that the spiral NGC had 50% more myelinated axons than a tubular structure for nerve regeneration across a 10 mm gap in a rat sciatic nerve.
Results from this study can help further optimize tissue engineering strategies for peripheral nerve repair. Copyright © 2018 Acta Materialia Inc. Published by Elsevier Ltd. All rights reserved.
Quantifying the Hierarchical Order in Self-Aligned Carbon Nanotubes from Atomic to Micrometer Scale.

PubMed

Meshot, Eric R; Zwissler, Darwin W; Bui, Ngoc; Kuykendall, Tevye R; Wang, Cheng; Hexemer, Alexander; Wu, Kuang Jen J; Fornasiero, Francesco

2017-06-27

Fundamental understanding of structure-property relationships in hierarchically organized nanostructures is crucial for the development of new functionality, yet quantifying structure across multiple length scales is challenging. In this work, we used nondestructive X-ray scattering to quantitatively map the multiscale structure of hierarchically self-organized carbon nanotube (CNT) "forests" across 4 orders of magnitude in length scale, from 2.0 Å to 1.5 μm. Fully resolved structural features include the graphitic honeycomb lattice and interlayer walls (atomic), CNT diameter (nano), as well as the greater CNT ensemble (meso) and large corrugations (micro). Correlating orientational order across hierarchical levels revealed a cascading decrease as we probed finer structural feature sizes with enhanced sensitivity to small-scale disorder. Furthermore, we established qualitative relationships for single-, few-, and multiwall CNT forest characteristics, showing that multiscale orientational order is directly correlated with number density spanning 10^9-10^12 cm^-2, yet order is inversely proportional to CNT diameter, number of walls, and atomic defects. Lastly, we captured and quantified ultralow-q meridional scattering features and built a phenomenological model of the large-scale CNT forest morphology, which predicted and confirmed that these features arise due to microscale corrugations along the vertical forest direction. Providing detailed structural information at multiple length scales is important for design and synthesis of CNT materials as well as other hierarchically organized nanostructures.
Structure-Property Relations in Carbon Nanotube Fibers by Downscaling Solution Processing.

PubMed

Headrick, Robert J; Tsentalovich, Dmitri E; Berdegué, Julián; Bengio, Elie Amram; Liberman, Lucy; Kleinerman, Olga; Lucas, Matthew S; Talmon, Yeshayahu; Pasquali, Matteo

2018-03-01

At the microscopic scale, carbon nanotubes (CNTs) combine impressive tensile strength and electrical conductivity; however, their macroscopic counterparts have not met expectations. The reasons are variously attributed to inherent CNT sample properties (diameter and helicity polydispersity, high defect density, insufficient length) and manufacturing shortcomings (inadequate ordering and packing), which can lead to poor transmission of stress and current. To efficiently investigate the disparity between microscopic and macroscopic properties, a new method is introduced for processing microgram quantities of CNTs into highly oriented and well-packed fibers. CNTs are dissolved into chlorosulfonic acid and processed into aligned films; each film can be peeled and twisted into multiple discrete fibers. Fibers fabricated by this method and solution-spinning are directly compared to determine the impact of alignment, twist, packing density, and length. Surprisingly, these discrete fibers can be twice as strong as their solution-spun counterparts despite a lower degree of alignment. Strength appears to be more sensitive to internal twist and packing density, while fiber conductivity is essentially equivalent among the two sets of samples. Importantly, this rapid fiber manufacturing method uses three orders of magnitude less material than solution spinning, expanding the experimental parameter space and enabling the exploration of unique CNT sources. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Electrospun fibrinogen-PLA nanofibres for vascular tissue engineering.

PubMed

Gugutkov, D; Gustavsson, J; Cantini, M; Salmeron-Sánchez, M; Altankov, G

2017-10-01

Here we report on the development of a new type of hybrid fibrinogen-polylactic acid (FBG-PLA) nanofibres (NFs) with improved stiffness, combining the good mechanical properties of PLA with the excellent cell recognition properties of native FBG. We were particularly interested in the dorsal and ventral cell response to the nanofibres' organization (random or aligned), using human umbilical endothelial cells (HUVECs) as a model system. Upon ventral contact with random NFs, the cells developed a stellate-like morphology with multiple projections. The well-developed focal adhesion complexes suggested a successful cellular interaction. However, time-lapse analysis shows significantly lowered cell movements, resulting in the cells traversing a relatively short distance in multiple directions. Conversely, an elongated cell shape and significantly increased cell mobility were observed in aligned NFs. To follow the dorsal cell response, artificial wounds were created on confluent cell layers previously grown on glass slides and covered with either random or aligned NFs. Time-lapse analysis showed significantly faster wound coverage (within 12 h) of HUVECs on aligned samples vs. almost absent directional migration on random ones. However, nitric oxide (NO) release shows that endothelial cells possess lowered functionality on aligned NFs compared to random ones, where significantly higher NO production was found. Collectively, our studies show that randomly organized NFs could support the endothelization of implants while aligned NFs would rather direct cell locomotion for guided neovascularization. Copyright © 2016 John Wiley & Sons, Ltd.
BiPACE 2D--graph-based multiple alignment for comprehensive 2D gas chromatography-mass spectrometry.

PubMed

Hoffmann, Nils; Wilhelm, Mathias; Doebbe, Anja; Niehaus, Karsten; Stoye, Jens

2014-04-01

Comprehensive 2D gas chromatography-mass spectrometry is an established method for the analysis of complex mixtures in analytical chemistry and metabolomics. It produces large amounts of data that require semiautomatic, but preferably automatic handling. This involves the location of significant signals (peaks) and their matching and alignment across different measurements. To date, there exist only a few openly available algorithms for the retention time alignment of peaks originating from such experiments that scale well with increasing sample and peak numbers, while providing reliable alignment results. We describe BiPACE 2D, an automated algorithm for retention time alignment of peaks from 2D gas chromatography-mass spectrometry experiments and evaluate it on three previously published datasets against the mSPA, SWPA and Guineu algorithms. We also provide a fourth dataset from an experiment studying the H2 production of two different strains of Chlamydomonas reinhardtii that is available from the MetaboLights database together with the experimental protocol, peak-detection results and manually curated multiple peak alignment for future comparability with newly developed algorithms. BiPACE 2D is contained in the freely available Maltcms framework, version 1.3, hosted at http://maltcms.sf.net, under the terms of the L-GPL v3 or Eclipse Open Source licenses. The software used for the evaluation along with the underlying datasets is available at the same location. The C.reinhardtii dataset is freely available at http://www.ebi.ac.uk/metabolights/MTBLS37.
An Accurate Scalable Template-based Alignment Algorithm

PubMed Central

Gardner, David P.; Xu, Weijia; Miranker, Daniel P.; Ozer, Stuart; Cannone, Jamie J.; Gutell, Robin R.

2013-01-01

The rapid determination of nucleic acid sequences is increasing the number of sequences that are available. Inherent in a template or seed alignment is the culmination of structural and functional constraints that are selecting those mutations that are viable during the evolution of the RNA. While we might not understand these structural and functional constraints, template-based alignment programs utilize the patterns of sequence conservation to encapsulate the characteristics of viable RNA sequences that are aligned properly. We have developed a program that utilizes the different dimensions of information in rCAD, a large RNA informatics resource, to establish a profile for each position in an alignment. The most significant include sequence identity and column composition in different phylogenetic taxa. We have compared our methods with a maximum of eight alternative alignment methods on different sets of 16S and 23S rRNA sequences with sequence percent identities ranging from 50% to 100%. The results showed that CRWAlign outperformed the other alignment methods in both speed and accuracy. A web-based alignment server is available at http://www.rna.ccbb.utexas.edu/SAE/2F/CRWAlign. PMID:24772376
The Harvest suite for rapid core-genome alignment and visualization of thousands of intraspecific microbial genomes.

PubMed

Treangen, Todd J; Ondov, Brian D; Koren, Sergey; Phillippy, Adam M

2014-01-01

Whole-genome sequences are now available for many microbial species and clades; however, existing whole-genome alignment methods are limited in their ability to perform sequence comparisons of multiple sequences simultaneously. Here we present the Harvest suite of core-genome alignment and visualization tools for the rapid and simultaneous analysis of thousands of intraspecific microbial strains. Harvest includes Parsnp, a fast core-genome multi-aligner, and Gingr, a dynamic visual platform. Together they provide interactive core-genome alignments, variant calls, recombination detection, and phylogenetic trees. Using simulated and real data we demonstrate that our approach exhibits unrivaled speed while maintaining the accuracy of existing methods. The Harvest suite is open-source and freely available from: http://github.com/marbl/harvest.
The protein structure prediction problem could be solved using the current PDB library

PubMed Central

Zhang, Yang; Skolnick, Jeffrey

2005-01-01

For single-domain proteins, we examine the completeness of the structures in the current Protein Data Bank (PDB) library for use in full-length model construction of unknown sequences. To address this issue, we employ a comprehensive benchmark set of 1,489 medium-size proteins that cover the PDB at the level of 35% sequence identity and identify templates by structure alignment. With homologous proteins excluded, we can always find similar folds to native with an average rms deviation (RMSD) from native of 2.5 Å with ≈82% alignment coverage. These template structures often contain a significant number of insertions/deletions. The tasser algorithm was applied to build full-length models, where continuous fragments are excised from the top-scoring templates and reassembled under the guide of an optimized force field, which includes consensus restraints taken from the templates and knowledge-based statistical potentials. For almost all targets (except for 2/1,489), the resultant full-length models have an RMSD to native below 6 Å (97% of them below 4 Å). On average, the RMSD of full-length models is 2.25 Å, with aligned regions improved from 2.5 Å to 1.88 Å, comparable with the accuracy of low-resolution experimental structures. Furthermore, starting from state-of-the-art structural alignments, we demonstrate a methodology that can consistently bring template-based alignments closer to native. These results are highly suggestive that the protein-folding problem can in principle be solved based on the current PDB library by developing efficient fold recognition algorithms that can recover such initial alignments. PMID:15653774
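The RMSD figures quoted in this abstract measure the average distance between equivalent atoms of a model and the native structure. A minimal sketch of that calculation follows, assuming the two coordinate sets are already paired residue by residue and already superposed; the optimal superposition step that structure alignment programs perform is not shown, and the function name is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired lists of (x, y, z) points.

    Assumes the structures are already superposed and that coords_a[i]
    corresponds to coords_b[i].
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```

Alignment coverage, as used in the abstract, would then be the number of paired residues divided by the target length, so a low RMSD is only meaningful alongside the fraction of the chain it was computed over.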
Using droplet-on-demand based printing to guide self-assembly in a peptide-protein based bioink

NASA Astrophysics Data System (ADS)

Hedegaard, Clara; Collin, Estelle; Redondo-Gomez, Carlos; Nguyen, Luong T. H.; Ng, Kee Woei; Castrejon-Pita, Alfonso A.; Castrejon-Pita, J. Rafael; Mata, Alvaro

2017-11-01

Tissue engineering aims to capture details of the extracellular matrix (ECM) that stimulate tissue regeneration. Advanced biofabrication techniques have enabled structural complexity, however they are restricted by the choice of material due to stringent printing requirements, leading to a lack of nanoscale control and molecular versatility. In this project, we exploit the dynamics of droplet fluid interactions combined with the co-assembly of peptide amphiphiles (PAs) with biomolecules/proteins to develop a new approach to droplet-based biofabrication. A custom-made droplet generator was developed and used to controllably dispense droplets of PA into a protein solution resulting in gel formation within milliseconds. Taking advantage of the interfacial and inertial forces during the droplet/liquid interaction, it is possible to control the co-assembly kinetics, to give rise to aligned or disordered nanofibers, hydrogel structures of different geometries and sizes, surface topographies, and higher-ordered structures made from multiple hydrogels. The process allows multiple cell types to be spatially distributed on the outside or embedded within the ECM mimetic scaffolds, whilst exhibiting high cell viability (>88%). ERC Starting Grant (STROFUNSCAFF), FP7-PEOPLE-2013-CIG Biomorph and the Royal Society.
Ion acceleration by multiple reflections at Martian bow shock

NASA Astrophysics Data System (ADS)

Yamauchi, M.; Futaana, Y.; Fedorov, A.; Frahm, R. A.; Dubinin, E.; Lundin, R.; Sauvaud, J.-A.; Winningham, J. D.; Barabash, S.; Holmström, M.

2012-02-01

The ion mass analyzer (IMA) on board Mars Express revealed bundled structures of ions in the energy domain within a distance of a proton gyroradius from the Martian bow shock. Seven prominent traversals during 2005 were examined when the energy-bunched structure was observed together with pick-up ions of exospheric origin, the latter of which is used to determine the local magnetic field orientation from its circular trajectory in velocity space. These seven traversals include different bow shock configurations: (a) quasi-perpendicular shock with its specular direction of the solar wind more perpendicular to the magnetic field (QT), (b) quasi-perpendicular shock with its specular reflection direction of the solar wind more along the magnetic field (FS), and (c) quasi-parallel (QL) shock. In all seven cases, the velocity components of the energy-bunched structure are consistent with multiple specular reflections of the solar wind at the bow shock up to at least two reflections. The accelerated solar wind ions after two specular reflections have large parallel components with respect to the magnetic field for both QL cases whereas the field-aligned speed is much smaller than the perpendicular speed for all QT cases.
Improvements on a privacy-protection algorithm for DNA sequences with generalization lattices.

PubMed

Li, Guang; Wang, Yadong; Su, Xiaohong

2012-10-01

When developing personal DNA databases, there must be an appropriate guarantee of anonymity, which means that the data cannot be related back to individuals. DNA lattice anonymization (DNALA) is a successful method for making personal DNA sequences anonymous. However, it uses time-consuming multiple sequence alignment and a low-accuracy greedy clustering algorithm. Furthermore, DNALA is not an online algorithm, and so it cannot quickly return results when the database is updated. This study improves the DNALA method. Specifically, we replaced the multiple sequence alignment in DNALA with global pairwise sequence alignment to save time, and we designed a hybrid clustering algorithm comprised of a maximum weight matching (MWM)-based algorithm and an online algorithm. The MWM-based algorithm is more accurate than the greedy algorithm in DNALA and has the same time complexity. The online algorithm can process data quickly when the database is updated. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
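Global pairwise sequence alignment, which this abstract substitutes for multiple sequence alignment, is classically computed by dynamic programming (the Needleman-Wunsch recurrence). The sketch below returns only the optimal score using a linear gap penalty; the scoring parameters are assumptions for illustration and are not taken from the DNALA work.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score of strings a and b (linear gap penalty).

    Keeps only two rows of the dynamic-programming table, so memory use is
    proportional to len(b).
    """
    # Row 0: aligning an empty prefix of a against prefixes of b costs only gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of a against an empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap inserted in b
            left = curr[j - 1] + gap  # gap inserted in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]
```

Each pairwise comparison costs time proportional to the product of the two sequence lengths, which is why replacing a full multiple alignment with pairwise comparisons can save time when sequences are processed as they arrive.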
Cryo-EM image alignment based on nonuniform fast Fourier transform.

PubMed

Yang, Zhengfan; Penczek, Pawel A

2008-08-01

In single particle analysis, two-dimensional (2-D) alignment is a fundamental step intended to put into register various particle projections of biological macromolecules collected at the electron microscope. The efficiency and quality of three-dimensional (3-D) structure reconstruction largely depends on the computational speed and alignment accuracy of this crucial step. In order to improve the performance of alignment, we introduce a new method that takes advantage of the highly accurate interpolation scheme based on the gridding method, a version of the nonuniform fast Fourier transform, and utilizes a multi-dimensional optimization algorithm for the refinement of the orientation parameters. Using simulated data, we demonstrate that by using less than half of the sample points and taking twice the runtime, our new 2-D alignment method achieves dramatically better alignment accuracy than that based on quadratic interpolation. We also apply our method to image to volume registration, the key step in the single particle EM structure refinement protocol. We find that in this case the accuracy of the method not only surpasses the accuracy of the commonly used real-space implementation, but results are achieved in much shorter time, making gridding-based alignment a perfect candidate for efficient structure determination in single particle analysis.
Cryo-EM Image Alignment Based on Nonuniform Fast Fourier Transform

PubMed Central

Yang, Zhengfan; Penczek, Pawel A.

2008-01-01

In single particle analysis, two-dimensional (2-D) alignment is a fundamental step intended to put into register various particle projections of biological macromolecules collected at the electron microscope. The efficiency and quality of three-dimensional (3-D) structure reconstruction largely depends on the computational speed and alignment accuracy of this crucial step. In order to improve the performance of alignment, we introduce a new method that takes advantage of the highly accurate interpolation scheme based on the gridding method, a version of the nonuniform Fast Fourier Transform, and utilizes a multi-dimensional optimization algorithm for the refinement of the orientation parameters. Using simulated data, we demonstrate that by using less than half of the sample points and taking twice the runtime, our new 2-D alignment method achieves dramatically better alignment accuracy than that based on quadratic interpolation. We also apply our method to image to volume registration, the key step in the single particle EM structure refinement protocol. We find that in this case the accuracy of the method not only surpasses the accuracy of the commonly used real-space implementation, but results are achieved in much shorter time, making gridding-based alignment a perfect candidate for efficient structure determination in single particle analysis. PMID:18499351
'Compromise position' image alignment to accommodate independent motion of multiple clinical target volumes during radiotherapy: A high risk prostate cancer example.

PubMed

Rosewall, Tara; Yan, Jing; Alasti, Hamideh; Cerase, Carla; Bayley, Andrew

2017-04-01

Inclusion of multiple independently moving clinical target volumes (CTVs) in the irradiated volume causes an image guidance conundrum. The purpose of this research was to use high risk prostate cancer as a clinical example to evaluate a 'compromise' image alignment strategy. The daily pre-treatment orthogonal EPI for 14 consecutive patients were included in this analysis. Image matching was performed by aligning to the prostate only, the bony pelvis only and using the 'compromise' strategy. Residual CTV surrogate displacements were quantified for each of the alignment strategies. Analysis of the 388 daily fractions indicated surrogate displacements were well-correlated in all directions (r² = 0.95 (LR), 0.67 (AP) and 0.59 (SI)). Differences between the surrogate displacements (95% range) were -0.4 to 1.8 mm (LR), -1.2 to 5.2 mm (SI) and -1.2 to 5.2 mm (AP). The distribution of the residual displacements was significantly smaller using the 'compromise' strategy, compared to the other strategies (p 0.005). The 'compromise' strategy ensured the CTV was encompassed by the PTV in all fractions, compared to 47 PTV violations when aligned to prostate only. This study demonstrated the feasibility of a compromise position image guidance strategy to accommodate simultaneous displacements of two independently moving CTVs. Application of this strategy was facilitated by correlation between the CTV displacements and resulted in no geometric excursions of the CTVs beyond standard sized PTVs. This simple image guidance strategy may also be applicable to other disease sites that concurrently irradiate multiple CTVs, such as head and neck, lung and cervix cancer. © 2016 The Royal Australian and New Zealand College of Radiologists.
GASP: Gapped Ancestral Sequence Prediction for proteins

PubMed Central

Edwards, Richard J; Shields, Denis C

2004-01-01

Background The prediction of ancestral protein sequences from multiple sequence alignments is useful for many bioinformatics analyses. Predicting ancestral sequences is not a simple procedure and relies on accurate alignments and phylogenies. Several algorithms exist based on Maximum Parsimony or Maximum Likelihood methods but many current implementations are unable to process residues with gaps, which may represent insertion/deletion (indel) events or sequence fragments. Results Here we present a new algorithm, GASP (Gapped Ancestral Sequence Prediction), for predicting ancestral sequences from phylogenetic trees and the corresponding multiple sequence alignments. Alignments may be of any size and contain gaps. GASP first assigns the positions of gaps in the phylogeny before using a likelihood-based approach centred on amino acid substitution matrices to assign ancestral amino acids. Important outgroup information is used by first working down from the tips of the tree to the root, using descendant data only to assign probabilities, and then working back up from the root to the tips using descendant and outgroup data to make predictions. GASP was tested on a number of simulated datasets based on real phylogenies. Prediction accuracy for ungapped data was similar to three alternative algorithms tested, with GASP performing better in some cases and worse in others. Adding simple insertions and deletions to the simulated data did not have a detrimental effect on GASP accuracy. Conclusions GASP (Gapped Ancestral Sequence Prediction) will predict ancestral sequences from multiple protein alignments of any size. Although not as accurate in all cases as some of the more sophisticated maximum likelihood approaches, it can process a wide range of input phylogenies and will predict ancestral sequences for gapped and ungapped residues alike. PMID:15350199
A new star (sensor) is born

NASA Astrophysics Data System (ADS)

Leijtens, Johan; Vliegenthart, Willem; Lampridis, Dimitris; Vacanti, Giuseppe; Monna, Bert; Bechthum, Elbert; Hagenaars, Koen; van der Heide, Erik; Kruijff, Michiel; van Breukelen, Eddie; LeMair, Anita

2017-11-01

In the frame of the Dutch Prequalification for ESA Programs (PEP), as part of the efforts to design an integrated optical attitude control subsystem (IOPACS), a consortium of TNO and several SMEs in the Netherlands has been working on a novel type of startracker called MABS (Multiple Aperture Baffled Startracker). The system comprises a single cast metal housing with four reflective optical telescopes which use only structural internal baffling. Inherent to the design are a very high stability and excellent co-alignment between the apertures, a significant decrease in system size and low recurring production cost. The concept is a radical change from more common multiple startracker setups. The presentation will concentrate on the validity of the concept, the predicted performance and benefits for space applications, the produced breadboard and measured performances as well as the costing aspects.

Alignments of Dark Matter Halos with Large-scale Tidal Fields: Mass and Redshift Dependence

NASA Astrophysics Data System (ADS)

Chen, Sijie; Wang, Huiyuan; Mo, H. J.; Shi, Jingjing

2016-07-01

Large-scale tidal fields estimated directly from the distribution of dark matter halos are used to investigate how halo shapes and spin vectors are aligned with the cosmic web. The major, intermediate, and minor axes of halos are aligned with the corresponding tidal axes, and halo spin axes tend to be parallel with the intermediate axes and perpendicular to the major axes of the tidal field. The strengths of these alignments generally increase with halo mass and redshift, but the dependence is only on the peak height, $\nu \equiv \delta_{\rm c}/\sigma(M_{\rm h}, z)$. The scaling relations of the alignment strengths with the value of ν indicate that the alignment strengths remain roughly constant when the structures within which the halos reside are still in a quasi-linear regime, but decrease as nonlinear evolution becomes more important. We also calculate the alignments in projection so that our results can be compared directly with observations. Finally, we investigate the alignments of tidal tensors on large scales, and use the results to understand alignments of halo pairs separated at various distances. Our results suggest that the coherent structure of the tidal field is the underlying reason for the alignments of halos and galaxies seen in numerical simulations and in observations.
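The peak height in the abstract above can be written out explicitly. The abstract does not define the symbols, so the block below assumes the convention common in the halo literature: δ_c is the critical linear overdensity for spherical collapse (about 1.686), σ(M_h, z) is the rms linear density fluctuation on the mass scale M_h, and D(z) is the linear growth factor normalised to D(0) = 1.

```latex
\nu(M_{\rm h}, z) \equiv \frac{\delta_{\rm c}}{\sigma(M_{\rm h}, z)},
\qquad
\sigma(M_{\rm h}, z) = D(z)\,\sigma(M_{\rm h}, 0),
\qquad
\delta_{\rm c} \simeq 1.686 .
```

Under these assumptions σ decreases with increasing mass and with increasing redshift, so ν grows with both. This is consistent with the statement that the mass and redshift dependence of the alignment strengths reduces to a dependence on ν alone.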
Quantum size and electric field modulations on electronic structures of SnS2/BN hetero-multilayers

NASA Astrophysics Data System (ADS)

Xia, Congxin; Zhang, Qian; Xiao, Wenbo; Du, Juan; Li, Xueping; Li, Jingbo

2018-05-01

Through first-principles calculations, we study the stability, band structures, band alignment, and interlayer charge transfer of SnS2/BN hetero-multilayers, considering quantum size and electric field effects. We find that SnS2/BN hetero-multilayers possess the characteristics of direct band structures and type-II band alignment. Moreover, increasing the BN layer number can decrease the band gap value and work function. Additionally, type-II can be tuned to type-I band alignment in the presence of an electric field. These results indicate that the SnS2/BN system is different from that of other BN-based hybrid materials, such as MoS2/BN with type-I band alignment, which is promising for optoelectronic device applications.
Active alignment/contact verification system

DOEpatents

Greenbaum, William M.

2000-01-01

A system involving an active (i.e. electrical) technique for the verification of: 1) close tolerance mechanical alignment between two components, and 2) electrical contact between mating parts through an elastomeric interface. For example, the two components may be an alumina carrier and a printed circuit board, two mating parts that are extremely small, high density parts and require alignment within a fraction of a mil, as well as a specified interface point of engagement between the parts. The system comprises pairs of conductive structures defined in the surface layers of the alumina carrier and the printed circuit board, for example. The first pair of conductive structures relates to item (1) above and permits alignment verification between mating parts. The second pair of conductive structures relates to item (2) above and permits verification of electrical contact between mating parts.
On crystal versus fiber formation in dipeptide hydrogelator systems.

PubMed

Houton, Kelly A; Morris, Kyle L; Chen, Lin; Schmidtmann, Marc; Jones, James T A; Serpell, Louise C; Lloyd, Gareth O; Adams, Dave J

2012-06-26

Naphthalene dipeptides have been shown to be useful low-molecular-weight gelators. Here we have used a library to explore the relationship between the dipeptide sequence and the hydrogelation efficiency. A number of the naphthalene dipeptides are crystallizable from water, enabling us to investigate the comparison between the gel/fiber phase and the crystal phase. We succeeded in crystallizing one example directly from the gel phase. Using X-ray crystallography, molecular modeling, and X-ray fiber diffraction, we show that the molecular packing of this crystal structure differs from the structure of the gel/fiber phase. Although the crystal structures may provide important insights into stabilizing interactions, our analysis indicates a rearrangement of structural packing within the fibers. These observations are consistent with the fibrillar interactions and interatomic separations promoting 1D assembly whereas in the crystals the peptides are aligned along multiple axes, allowing 3D growth. This observation has an impact on the use of crystal structures to determine supramolecular synthons for gelators.
WEBnm@ v2.0: Web server and services for comparing protein flexibility.

PubMed

Tiwari, Sandhya P; Fuglebakk, Edvin; Hollup, Siv M; Skjærven, Lars; Cragnolini, Tristan; Grindhaug, Svenn H; Tekle, Kidane M; Reuter, Nathalie

2014-12-30

Normal mode analysis (NMA) using elastic network models is a reliable and cost-effective computational method to characterise protein flexibility and by extension, their dynamics. Further insight into the dynamics-function relationship can be gained by comparing protein motions between protein homologs and functional classifications. This can be achieved by comparing normal modes obtained from sets of evolutionarily related proteins. We have developed an automated tool for comparative NMA of a set of pre-aligned protein structures. The user can submit a sequence alignment in the FASTA format and the corresponding coordinate files in the Protein Data Bank (PDB) format. The computed normalised squared atomic fluctuations and atomic deformation energies of the submitted structures can be easily compared on graphs provided by the web user interface. The web server provides pairwise comparison of the dynamics of all proteins included in the submitted set using two measures: the Root Mean Squared Inner Product and the Bhattacharyya Coefficient. The Comparative Analysis has been implemented on our web server for NMA, WEBnm@, which also provides recently upgraded functionality for NMA of single protein structures. This includes new visualisations of protein motion, visualisation of inter-residue correlations and the analysis of conformational change using the overlap analysis. In addition, programmatic access to WEBnm@ is now available through a SOAP-based web service. WEBnm@ is available at http://apps.cbu.uib.no/webnma. WEBnm@ v2.0 is an online tool offering unique capability for comparative NMA on multiple protein structures. Along with a convenient web interface, powerful computing resources, and several methods for mode analyses, WEBnm@ facilitates the assessment of protein flexibility within protein families and superfamilies. These analyses can give a good view of how the structures move and how the flexibility is conserved over the different structures.
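The Root Mean Squared Inner Product named in the abstract above can be illustrated with a minimal sketch. It assumes the usual definition, the square root of the summed squared dot products between two sets of n orthonormal mode vectors divided by n; the function name `rmsip` and the plain-list representation of modes are illustrative choices, not part of the WEBnm@ interface.

```python
import math


def rmsip(modes_a, modes_b):
    """Root mean squared inner product of two equally sized mode sets.

    Each mode is a sequence of floats and is assumed to be normalised,
    with the modes inside one set mutually orthogonal.
    """
    if len(modes_a) != len(modes_b) or not modes_a:
        raise ValueError("need two non-empty mode sets of equal size")
    n = len(modes_a)
    total = 0.0
    for u in modes_a:
        for v in modes_b:
            dot = sum(x * y for x, y in zip(u, v))
            total += dot * dot
    return math.sqrt(total / n)


# Two toy subspaces in a 4-dimensional space (hypothetical data).
same = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
other = [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
print(rmsip(same, same), rmsip(same, other))
```

A value near 1 means the two mode sets span nearly the same subspace, and a value near 0 means they are nearly orthogonal. For real proteins the modes would first have to be restricted to the aligned residues, which this sketch leaves out.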
Clinical Phenotype Classifications Based on Static Varus Alignment and Varus Thrust in Japanese Patients With Medial Knee Osteoarthritis

PubMed Central

Iijima, Hirotaka; Fukutani, Naoto; Fukumoto, Takahiko; Uritani, Daisuke; Kaneda, Eishi; Ota, Kazuo; Kuroki, Hiroshi; Matsuda, Shuichi

2015-01-01

Objective To investigate the association between knee pain during gait and 4 clinical phenotypes based on static varus alignment and varus thrust in patients with medial knee osteoarthritis (OA). Methods Patients in an orthopedic clinic (n = 266) diagnosed as having knee OA (Kellgren/Lawrence [K/L] grade ≥1) were divided into 4 phenotype groups according to the presence or absence of static varus alignment and varus thrust (dynamic varus): no varus (n = 173), dynamic varus (n = 17), static varus (n = 50), and static varus + dynamic varus (n = 26). The knee range of motion, spatiotemporal gait parameters, visual analog scale scores for knee pain, and scores on the Japanese Knee Osteoarthritis Measure were used to assess clinical outcomes. Multiple logistic regression analyses identified the relationship between knee pain during gait and the 4 phenotypes, adjusted for possible risk factors, including age, sex, body mass index, K/L grade, and gait velocity. Results Multiple logistic regression analysis showed that varus thrust without varus alignment was associated with knee pain during gait (odds ratio [OR] 3.30, 95% confidence interval [95% CI] 1.08–12.4), and that varus thrust combined with varus alignment was strongly associated with knee pain during gait (OR 17.1, 95% CI 3.19–320.0). Sensitivity analyses applying alternative cutoff values for defining static varus alignment showed comparable results. Conclusion Varus thrust with or without static varus alignment was associated with the occurrence of knee pain during gait. Tailored interventions based on individual malalignment phenotypes may improve clinical outcomes in patients with knee OA. PMID:26017348
DNA Nanotubes for NMR Structure Determination of Membrane Proteins

PubMed Central

Bellot, Gaëtan; McClintock, Mark A.; Chou, James J; Shih, William M.

2013-01-01

Structure determination of integral membrane proteins by solution NMR represents one of the most important challenges of structural biology. A Residual-Dipolar-Coupling-based refinement approach can be used to solve the structure of membrane proteins up to 40 kDa in size, however, a weak-alignment medium that is detergent-resistant is required. Previously, availability of media suitable for weak alignment of membrane proteins was severely limited. We describe here a protocol for robust, large-scale synthesis of detergent-resistant DNA nanotubes that can be assembled into dilute liquid crystals for application as weak-alignment media in solution NMR structure determination of membrane proteins in detergent micelles. The DNA nanotubes are heterodimers of 400nm-long six-helix bundles each self-assembled from a M13-based p7308 scaffold strand and >170 short oligonucleotide staple strands. Compatibility with proteins bearing considerable positive charge as well as modulation of molecular alignment, towards collection of linearly independent restraints, can be introduced by reducing the negative charge of DNA nanotubes via counter ions and small DNA binding molecules. This detergent-resistant liquid-crystal media offers a number of properties conducive for membrane protein alignment, including high-yield production, thermal stability, buffer compatibility, and structural programmability. Production of sufficient nanotubes for 4–5 NMR experiments can be completed in one week by a single individual. PMID:23518667
Local-global alignment for finding 3D similarities in protein structures

DOEpatents

Zemla, Adam T [Brentwood, CA]

2011-09-20

A method of finding 3D similarities in protein structures of a first molecule and a second molecule. The method comprises: providing preselected information regarding the first molecule and the second molecule; comparing the first molecule and the second molecule using Longest Continuous Segments (LCS) analysis; comparing them using Global Distance Test (GDT) analysis; comparing them using Local Global Alignment Scoring function (LGA_S) analysis; and verifying the constructed alignment and repeating the steps to find the regions of 3D similarities in protein structures.
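To make the Global Distance Test step above more concrete, here is a simplified sketch of a GDT-style score. It assumes the two structures are already superimposed and that residue i of one corresponds to residue i of the other. A full GDT calculation searches over many superpositions to maximise the count at each cutoff, which is not attempted here. The function name `gdt_ts` and the cutoffs of 1, 2, 4 and 8 Å follow common usage and are not taken from the patent text.

```python
import math


def gdt_ts(coords_a, coords_b, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Average percentage of residue pairs within each distance cutoff.

    coords_a and coords_b are equally long lists of (x, y, z) tuples for
    equivalent residues in a fixed, pre-computed superposition.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    distances = [math.dist(a, b) for a, b in zip(coords_a, coords_b)]
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(cutoffs)


# Hypothetical four-residue example with deviations of 0.5, 1.5, 3 and 9 Å.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
target = [(0.0, 0.5, 0.0), (1.0, 1.5, 0.0), (2.0, 3.0, 0.0), (3.0, 9.0, 0.0)]
print(gdt_ts(model, target))
```

Because several cutoffs are averaged, one badly placed residue lowers the score only slightly, unlike a single RMSD value, which large outliers dominate.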
T-RMSD: a web server for automated fine-grained protein structural classification.

PubMed

Magis, Cedrik; Di Tommaso, Paolo; Notredame, Cedric

2013-07-01

This article introduces the T-RMSD web server (tree-based on root-mean-square deviation), a service allowing the online computation of structure-based protein classification. It has been developed to address the relation between structural and functional similarity in proteins, and it allows a fine-grained structural clustering of a given protein family or group of structurally related proteins using distance RMSD (dRMSD) variations. These distances are computed between all pairs of equivalent residues, as defined by the ungapped columns within a given multiple sequence alignment. Using these generated distance matrices (one per equivalent position), T-RMSD produces a structural tree with support values for each cluster node, reminiscent of bootstrap values. These values, associated with the tree topology, allow a quantitative estimate of structural distances between proteins or group of proteins defined by the tree topology. The clusters thus defined have been shown to be structurally and functionally informative. The T-RMSD web server is a free website open to all users and available at http://tcoffee.crg.cat/apps/tcoffee/do:trmsd.
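The distance RMSD on which the server above is based compares intramolecular distances rather than superimposed coordinates. The sketch below assumes a generic textbook form of dRMSD over all pairs of equivalent residues. The per-position distance matrices and tree building of T-RMSD itself are not reproduced, and the function name `drmsd` is illustrative.

```python
import math
from itertools import combinations


def drmsd(coords_a, coords_b):
    """Distance RMSD between two structures with equivalent residues.

    For every residue pair (i, j) the intramolecular distance in structure A
    is compared with the same distance in structure B. No superposition is
    needed.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) < 2:
        raise ValueError("need two coordinate lists of equal length >= 2")
    squared = [
        (math.dist(coords_a[i], coords_a[j]) - math.dist(coords_b[i], coords_b[j])) ** 2
        for i, j in combinations(range(len(coords_a)), 2)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical three-residue chains: B is A stretched by a factor of two.
chain_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
chain_b = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
shifted_a = [(x + 5.0, y - 3.0, z + 1.0) for x, y, z in chain_a]
print(drmsd(chain_a, chain_b), drmsd(chain_a, shifted_a))
```

Since only internal distances are used, a rigid translation or rotation of either structure leaves the value unchanged, which is why no optimal superposition has to be computed first.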
T-RMSD: a web server for automated fine-grained protein structural classification

PubMed Central

Magis, Cedrik; Di Tommaso, Paolo; Notredame, Cedric

2013-01-01

This article introduces the T-RMSD web server (tree-based on root-mean-square deviation), a service allowing the online computation of structure-based protein classification. It has been developed to address the relation between structural and functional similarity in proteins, and it allows a fine-grained structural clustering of a given protein family or group of structurally related proteins using distance RMSD (dRMSD) variations. These distances are computed between all pairs of equivalent residues, as defined by the ungapped columns within a given multiple sequence alignment. Using these generated distance matrices (one per equivalent position), T-RMSD produces a structural tree with support values for each cluster node, reminiscent of bootstrap values. These values, associated with the tree topology, allow a quantitative estimate of structural distances between proteins or group of proteins defined by the tree topology. The clusters thus defined have been shown to be structurally and functionally informative. The T-RMSD web server is a free website open to all users and available at http://tcoffee.crg.cat/apps/tcoffee/do:trmsd. PMID:23716642
Accurate Simulation and Detection of Coevolution Signals in Multiple Sequence Alignments

PubMed Central

Ackerman, Sharon H.; Tillier, Elisabeth R.; Gatti, Domenico L.

2012-01-01

Background While the conserved positions of a multiple sequence alignment (MSA) are clearly of interest, non-conserved positions can also be important because, for example, destabilizing effects at one position can be compensated by stabilizing effects at another position. Different methods have been developed to recognize the evolutionary relationship between amino acid sites, and to disentangle functional/structural dependencies from historical/phylogenetic ones. Methodology/Principal Findings We have used two complementary approaches to test the efficacy of these methods. In the first approach, we have used a new program, MSAvolve, for the in silico evolution of MSAs, which records a detailed history of all covarying positions, and builds a global coevolution matrix as the accumulated sum of individual matrices for the positions forced to co-vary, the recombinant coevolution, and the stochastic coevolution. We have simulated over 1600 MSAs for 8 protein families, which reflect sequences of different sizes and proteins with widely different functions. The calculated coevolution matrices were compared with the coevolution matrices obtained for the same evolved MSAs with different coevolution detection methods. In a second approach we have evaluated the capacity of the different methods to predict close contacts in the representative X-ray structures of an additional 150 protein families using only experimental MSAs. Conclusions/Significance Methods based on the identification of global correlations between pairs were found to be generally superior to methods based only on local correlations in their capacity to identify coevolving residues using either simulated or experimental MSAs. 
However, the significant variability in the performance of different methods with different proteins suggests that the simulation of MSAs that replicate the statistical properties of the experimental MSA can be a valuable tool to identify the coevolution detection method that is most effective in each case. PMID:23091608
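As a concrete example of a pairwise covariation measure of the kind discussed above, the sketch below computes the mutual information between two alignment columns. Mutual information is used here only as a familiar illustration of a score computed from one column pair at a time. It is an assumption that it resembles any of the specific methods the authors compared, and the function name `mutual_information` is hypothetical.

```python
import math
from collections import Counter


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two MSA columns.

    col_i and col_j are equally long strings; position k holds the residue
    of sequence k in the respective column.
    """
    if len(col_i) != len(col_j) or not col_i:
        raise ValueError("need two non-empty columns of equal length")
    n = len(col_i)
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), c in count_ij.items():
        # p(a,b) / (p(a) * p(b)) simplifies to c * n / (count_a * count_b)
        mi += (c / n) * math.log2(c * n / (count_i[a] * count_j[b]))
    return mi


# Toy columns from four sequences (hypothetical data).
print(mutual_information("AAKK", "DDEE"))  # perfectly covarying pair
print(mutual_information("AAKK", "DEDE"))  # statistically independent pair
```

Raw mutual information is inflated by shared phylogeny and small sample sizes. That is the kind of historical signal the abstract says must be disentangled from functional and structural dependencies.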
Automated hierarchical classification of protein domain subfamilies based on functionally-divergent residue signatures

PubMed Central

2012-01-01

Background The NCBI Conserved Domain Database (CDD) consists of a collection of multiple sequence alignments of protein domains that are at various stages of being manually curated into evolutionary hierarchies based on conserved and divergent sequence and structural features. These domain models are annotated to provide insights into the relationships between sequence, structure and function via web-based BLAST searches. Results Here we automate the generation of conserved domain (CD) hierarchies using a combination of heuristic and Markov chain Monte Carlo (MCMC) sampling procedures and starting from a (typically very large) multiple sequence alignment. This procedure relies on statistical criteria to define each hierarchy based on the conserved and divergent sequence patterns associated with protein functional-specialization. At the same time this facilitates the sequence and structural annotation of residues that are functionally important. These statistical criteria also provide a means to objectively assess the quality of CD hierarchies, a non-trivial task considering that the protein subgroups are often very distantly related—a situation in which standard phylogenetic methods can be unreliable. Our aim here is to automatically generate (typically sub-optimal) hierarchies that, based on statistical criteria and visual comparisons, are comparable to manually curated hierarchies; this serves as the first step toward the ultimate goal of obtaining optimal hierarchical classifications. A plot of runtimes for the most time-intensive (non-parallelizable) part of the algorithm indicates a nearly linear time complexity so that, even for the extremely large Rossmann fold protein class, results were obtained in about a day. Conclusions This approach automates the rapid creation of protein domain hierarchies and thus will eliminate one of the most time consuming aspects of conserved domain database curation. 
At the same time, it also facilitates protein domain annotation by identifying those pattern residues that most distinguish each protein domain subgroup from other related subgroups. PMID:22726767
Visual exploration of parameter influence on phylogenetic trees.

PubMed

Hess, Martin; Bremm, Sebastian; Weissgraeber, Stephanie; Hamacher, Kay; Goesele, Michael; Wiemeyer, Josef; von Landesberger, Tatiana

2014-01-01

Evolutionary relationships between organisms are frequently derived as phylogenetic trees inferred from multiple sequence alignments (MSAs). The MSA parameter space is exponentially large, so tens of thousands of potential trees can emerge for each dataset. A proposed visual-analytics approach can reveal the parameters' impact on the trees. Given input trees created with different parameter settings, it hierarchically clusters the trees according to their structural similarity. The most important clusters of similar trees are shown together with their parameters. This view offers interactive parameter exploration and automatic identification of relevant parameters. Biologists applied this approach to real data of 16S ribosomal RNA and protein sequences of ion channels. It revealed which parameters affected the tree structures. This led to a more reliable selection of the best trees.
Sequence and structural characterization of Trx-Grx type of monothiol glutaredoxins from Ashbya gossypii.

PubMed

Yadav, Saurabh; Kumari, Pragati; Kushwaha, Hemant Ritturaj

2013-01-01

Glutaredoxins are enzymatic antioxidants which are small, ubiquitous, glutathione dependent and essentially classified under thioredoxin-fold superfamily. Glutaredoxins are classified into two types: dithiol and monothiol. Monothiol glutaredoxins which carry the signature "CGFS" as a redox active motif is known for its role in oxidative stress, inside the cell. In the present analysis, the 138 amino acid long monothiol glutaredoxin, AgGRX1 from Ashbya gossypii was identified and has been used for the analysis. The multiple sequence alignment of the AgGRX1 protein sequence revealed the characteristic motif of typical monothiol glutaredoxin as observed in various other organisms. The proposed structure of the AgGRX1 protein was used to analyze signature folds related to the thioredoxin superfamily. Further, the study highlighted the structural features pertaining to the complex mechanism of glutathione docking and interacting residues.
Toxin structures as evolutionary tools: Using conserved 3D folds to study the evolution of rapidly evolving peptides.

PubMed

Undheim, Eivind A B; Mobli, Mehdi; King, Glenn F

2016-06-01

Three-dimensional (3D) structures have been used to explore the evolution of proteins for decades, yet they have rarely been utilized to study the molecular evolution of peptides. Here, we highlight areas in which 3D structures can be particularly useful for studying the molecular evolution of peptide toxins. Although we focus our discussion on animal toxins, including one of the most widespread disulfide-rich peptide folds known, the inhibitor cystine knot, our conclusions should be widely applicable to studies of the evolution of disulfide-constrained peptides. We show that conserved 3D folds can be used to identify evolutionary links and test hypotheses regarding the evolutionary origin of peptides with extremely low sequence identity; construct accurate multiple sequence alignments; and better understand the evolutionary forces that drive the molecular evolution of peptides. © 2016 WILEY Periodicals, Inc.
Identification of a Herbal Powder by Deoxyribonucleic Acid Barcoding and Structural Analyses.

PubMed

Sheth, Bhavisha P; Thaker, Vrinda S

2015-10-01

Authentic identification of plants is essential for exploiting their medicinal properties as well as to stop the adulteration and malpractices with the trade of the same. The objective was to identify a herbal powder obtained from a herbalist in the local vicinity of Rajkot, Gujarat, using deoxyribonucleic acid (DNA) barcoding and molecular tools. The DNA was extracted from a herbal powder and selected Cassia species, followed by the polymerase chain reaction (PCR) and sequencing of the rbcL barcode locus. Thereafter the sequences were subjected to National Center for Biotechnology Information (NCBI) basic local alignment search tool (BLAST) analysis, followed by the protein three-dimensional structure determination of the rbcL protein from the herbal powder and Cassia species namely Cassia fistula, Cassia tora and Cassia javanica (sequences obtained in the present study), Cassia roxburghii, and Cassia abbreviata (sequences retrieved from Genbank). Further, the multiple and pairwise structural alignments were carried out in order to identify the herbal powder. The nucleotide sequences obtained from the selected species of Cassia were submitted to Genbank (Accession No. JX141397, JX141405, JX141420). The NCBI BLAST analysis of the rbcL protein from the herbal powder showed an equal sequence similarity (with reference to different parameters like E value, maximum identity, total score, query coverage) to C. javanica and C. roxburghii. In order to solve the ambiguities of the BLAST result, a protein structural approach was implemented. The protein homology models obtained in the present study were submitted to the protein model database (PM0079748-PM0079753). The pairwise structural alignment of the herbal powder (as template) and C. javanica and C. roxburghii (as targets individually) revealed a close similarity of the herbal powder with C. javanica. 
A strategy as used here, incorporating the integrated use of DNA barcoding and protein structural analyses could be adopted, as a novel rapid and economic procedure, especially in cases when protein coding loci are considered. Authentic identification of plants is essential for exploiting their medicinal properties as well as to stop the adulteration and malpractices with the trade of the same. A herbal powder was obtained from a herbalist in the local vicinity of Rajkot, Gujarat. An integrated approach using DNA barcoding and structural analyses was carried out to identify the herbal powder. The herbal powder was identified as Cassia javanica L.
Representing and comparing protein structures as paths in three-dimensional space

PubMed Central

Zhi, Degui; Krishna, S Sri; Cao, Haibo; Pevzner, Pavel; Godzik, Adam

2006-01-01

Background Most existing formulations of protein structure comparison are based on detailed atomic level descriptions of protein structures and bypass potential insights that arise from a higher-level abstraction. Results We propose a structure comparison approach based on a simplified representation of proteins that describes its three-dimensional path by local curvature along the generalized backbone of the polypeptide. We have implemented a dynamic programming procedure that aligns curvatures of proteins by optimizing a defined sum turning angle deviation measure. Conclusion Although our procedure does not directly optimize global structural similarity as measured by RMSD, our benchmarking results indicate that it can surprisingly well recover the structural similarity defined by structure classification databases and traditional structure alignment programs. In addition, our program can recognize similarities between structures with extensive conformation changes that are beyond the ability of traditional structure alignment programs. We demonstrate the applications of procedure to several contexts of structure comparison. An implementation of our procedure, CURVE, is available as a public webserver. PMID:17052359
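The idea of aligning proteins by backbone curvature can be sketched as follows. This is not the CURVE implementation. It assumes that curvature is approximated by the turning angle at each interior point of a 3D path, and that two angle profiles are aligned by a standard global dynamic program whose cost is the summed absolute angle deviation plus a constant gap penalty. The function names and the gap value are hypothetical.

```python
import math


def turning_angles(points):
    """Turning angle (radians) at every interior point of a 3D path."""
    angles = []
    for prev, cur, nxt in zip(points, points[1:], points[2:]):
        u = [c - p for p, c in zip(prev, cur)]
        v = [n - c for c, n in zip(cur, nxt)]
        cos_t = sum(a * b for a, b in zip(u, v)) / (math.hypot(*u) * math.hypot(*v))
        angles.append(math.acos(max(-1.0, min(1.0, cos_t))))
    return angles


def alignment_cost(angles_a, angles_b, gap=0.5):
    """Minimum summed angle deviation of a global alignment (lower is better)."""
    n, m = len(angles_a), len(angles_b)
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * gap
    for j in range(1, m + 1):
        cost[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + abs(angles_a[i - 1] - angles_b[j - 1]),  # match
                cost[i - 1][j] + gap,  # gap in profile b
                cost[i][j - 1] + gap,  # gap in profile a
            )
    return cost[n][m]


# Hypothetical path with one right-angle turn followed by a straight step.
path = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, 2.0, 0.0)]
print(turning_angles(path))
print(alignment_cost([0.1, 0.5, 0.9], [0.1, 0.9], gap=0.3))
```

Because turning angles are local, invariant to rigid motion, and largely unaffected by hinge motions elsewhere in the chain, a profile alignment of this kind can still match two conformations that differ by a large domain movement. That matches the strength claimed in the abstract.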
Seeing the Song: Left Auditory Structures May Track Auditory-Visual Dynamic Alignment

PubMed Central

Mossbridge, Julia A.; Grabowecky, Marcia; Suzuki, Satoru

2013-01-01

Auditory and visual signals generated by a single source tend to be temporally correlated, such as the synchronous sounds of footsteps and the limb movements of a walker. Continuous tracking and comparison of the dynamics of auditory-visual streams is thus useful for the perceptual binding of information arising from a common source. Although language-related mechanisms have been implicated in the tracking of speech-related auditory-visual signals (e.g., speech sounds and lip movements), it is not well known what sensory mechanisms generally track ongoing auditory-visual synchrony for non-speech signals in a complex auditory-visual environment. To begin to address this question, we used music and visual displays that varied in the dynamics of multiple features (e.g., auditory loudness and pitch; visual luminance, color, size, motion, and organization) across multiple time scales. Auditory activity (monitored using auditory steady-state responses, ASSR) was selectively reduced in the left hemisphere when the music and dynamic visual displays were temporally misaligned. Importantly, ASSR was not affected when attentional engagement with the music was reduced, or when visual displays presented dynamics clearly dissimilar to the music. These results appear to suggest that left-lateralized auditory mechanisms are sensitive to auditory-visual temporal alignment, but perhaps only when the dynamics of auditory and visual streams are similar. These mechanisms may contribute to correct auditory-visual binding in a busy sensory environment. PMID:24194873
Automated visual inspection of brake shoe wear

NASA Astrophysics Data System (ADS)

Lu, Shengfang; Liu, Zhen; Nan, Guo; Zhang, Guangjun

2015-10-01

With the rapid development of high-speed railways, automated fault inspection is necessary to ensure the operational safety of trains. Visual technology is receiving increasing attention in fault detection and maintenance. For a linear CCD camera, image alignment is the first step in fault detection. To increase the speed of image processing, an improved scale invariant feature transform (SIFT) method is presented. The image is divided into multiple levels of different resolution. Features are then extracted level by level, from the lowest resolution toward the highest, until sufficient SIFT key points are obtained. At that level, the image is registered and aligned quickly. In the inspection stage, we focus on detecting faults of the brake shoe, which is one of the key components of the brake system on electric multiple unit (EMU) trains. Early warning of its wear limit is very important in fault detection. In this paper, we propose an automatic inspection approach to detect brake shoe faults. First, we use multi-resolution pyramid template matching to quickly locate the brake shoe. Then, we employ the Hough transform to detect the circles of bolts in the brake region. Owing to the rigid structure, we can identify whether the brake shoe has a fault. The experiments demonstrate that the proposed approach performs well and can meet the needs of practical applications.
Rice Crop Monitoring Using Microwave and Optical Remotely Sensed Image Data

NASA Astrophysics Data System (ADS)

Suga, Y.; Konishi, T.; Takeuchi, S.; Kitano, Y.; Ito, S.

Hiroshima Institute of Technology (HIT) is operating the direct down-links of microwave and optical satellite data in Japan. This study focuses on the validation for rice crop monitoring using microwave and optical remotely sensed image data acquired by satellites, referring to ground truth data such as height of crop, ratio of crop vegetation cover, and leaf area index in the test sites of Japan. ENVISAT-1 ASAR data has a capability to capture regularly and to monitor during the rice growing cycle by alternating cross polarization mode images. However, ASAR data is influenced by several parameters such as landcover structure, direction, and alignment of rice crop fields in the test sites. In this study, the validation was carried out combined with microwave and optical satellite image data and ground truth data regarding rice crop fields to investigate the above parameters. Multi-temporal, multi-direction (descending and ascending), and multi-angle ASAR alternating cross polarization mode images were used to investigate the rice crop growing cycle. LANDSAT data were used to detect landcover structure, direction, and alignment of rice crop fields corresponding to the backscatter of ASAR. As the result of this study, it was indicated that rice crop growth can be precisely monitored using multiple remotely sensed data and ground truth data considering with spatial, spectral, temporal, and radiometric resolutions.

Interplay of I-TASSER and QUARK for template-based and ab initio protein structure prediction in CASP10

PubMed Central

Zhang, Yang

2014-01-01

We develop and test a new pipeline in CASP10 to predict protein structures based on an interplay of I-TASSER and QUARK for both free-modeling (FM) and template-based modeling (TBM) targets. The most noteworthy observation is that sorting through the threading template pool using the QUARK-based ab initio models as probes allows the detection of distant-homology templates which might be ignored by the traditional sequence profile-based threading alignment algorithms. Further template assembly refinement by I-TASSER resulted in successful folding of two medium-sized FM targets with >150 residues. For TBM, the multiple threading alignments from LOMETS are, for the first time, incorporated into the ab initio QUARK simulations, which were further refined by I-TASSER assembly refinement. Compared with the traditional threading assembly refinement procedures, the inclusion of the threading-constrained ab initio folding models can consistently improve the quality of the full-length models as assessed by the GDT-HA and hydrogen-bonding scores. Despite the success, significant challenges still exist in domain boundary prediction and consistent folding of medium-size proteins (especially beta-proteins) for nonhomologous targets. Further developments of sensitive fold-recognition and ab initio folding methods are critical for solving these problems. PMID:23760925
Interplay of I-TASSER and QUARK for template-based and ab initio protein structure prediction in CASP10.

PubMed

Zhang, Yang

2014-02-01

We develop and test a new pipeline in CASP10 to predict protein structures based on an interplay of I-TASSER and QUARK for both free-modeling (FM) and template-based modeling (TBM) targets. The most noteworthy observation is that sorting through the threading template pool using the QUARK-based ab initio models as probes allows the detection of distant-homology templates which might be ignored by the traditional sequence profile-based threading alignment algorithms. Further template assembly refinement by I-TASSER resulted in successful folding of two medium-sized FM targets with >150 residues. For TBM, the multiple threading alignments from LOMETS are, for the first time, incorporated into the ab initio QUARK simulations, which were further refined by I-TASSER assembly refinement. Compared with the traditional threading assembly refinement procedures, the inclusion of the threading-constrained ab initio folding models can consistently improve the quality of the full-length models as assessed by the GDT-HA and hydrogen-bonding scores. Despite the success, significant challenges still exist in domain boundary prediction and consistent folding of medium-size proteins (especially beta-proteins) for nonhomologous targets. Further developments of sensitive fold-recognition and ab initio folding methods are critical for solving these problems. Copyright © 2013 Wiley Periodicals, Inc.
Living nanofiber yarn-based woven biotextiles for tendon tissue engineering using cell tri-culture and mechanical stimulation.

PubMed

Wu, Shaohua; Wang, Ying; Streubel, Philipp N; Duan, Bin

2017-10-15

Non-woven nanofibrous scaffolds have been developed for tendon graft application by using electrospinning strategies. However, electrospun nanofibrous scaffolds face some obstacles and limitations, including suboptimal scaffold structure, weak tensile and suture-retention strengths, and compact structure for cell infiltration. In this work, a novel nanofibrous, woven biotextile, fabricated based on electrospun nanofiber yarns, was implemented as a tissue engineered tendon scaffold. Based on our modified electrospinning setup, polycaprolactone (PCL) nanofiber yarns were fabricated with reproducible quality, and were further processed into plain-weaving fabrics interlaced with polylactic acid (PLA) multifilaments. Nonwoven nanofibrous PCL meshes with random or aligned fiber structures were generated using typical electrospinning as comparative counterparts. The woven fabrics contained 3D aligned microstructures with significantly larger pore size and obviously enhanced tensile mechanical properties than their nonwoven counterparts. The biological results revealed that cell proliferation and infiltration, along with the expression of tendon-specific genes by human adipose derived mesenchymal stem cells (HADMSC) and human tenocytes (HT), were significantly enhanced on the woven fabrics compared with those on randomly-oriented or aligned nanofiber meshes. Co-cultures of HADMSC with HT or human umbilical vein endothelial cells (HUVEC) on woven fabrics significantly upregulated the functional expression of most tenogenic markers. HADMSC/HT/HUVEC tri-culture on woven fabrics showed the highest upregulation of most tendon-associated markers than all the other mono- and co-culture groups. Furthermore, we conditioned the tri-cultured constructs with dynamic conditioning and demonstrated that dynamic stretch promoted total collagen secretion and tenogenic differentiation. 
Our nanofiber yarn-based biotextiles have significant potential to be used as engineered scaffolds to synergize the multiple cell interaction and mechanical stimulation for promoting tendon regeneration. Tendon grafts are essential for the treatment of various tendon-related conditions due to the inherently poor healing capacity of native tendon tissues. In this study, we combined electrospun nanofiber yarns with textile manufacturing strategies to fabricate nanofibrous woven biotextiles with hierarchical features, aligned fibrous topography, and sufficient mechanical properties as tendon tissue engineered scaffolds. Compared with traditional electrospun random or aligned meshes, our novel nanofibrous woven fabrics possess strong tensile and suture-retention strengths and larger pore size. We also demonstrated that the incorporation of tendon cells and vascular cells promoted the tenogenic differentiation of the engineered tendon constructs, especially under dynamic stretch. This study not only presents a novel tissue engineered tendon scaffold fabrication technique but also provides a useful strategy to promote tendon differentiation and regeneration. Copyright © 2017 Acta Materialia Inc. Published by Elsevier Ltd. All rights reserved.
Formation of the Sun-aligned arc region and the void (polar slot) under the null-separator structure

NASA Astrophysics Data System (ADS)

Tanaka, T.; Obara, T.; Watanabe, M.; Fujita, S.; Ebihara, Y.; Kataoka, R.

2017-04-01

From the global magnetosphere-ionosphere coupling simulation, we examined the formation of the Sun-aligned arc region and the void (polar slot) under the northward interplanetary magnetic field (IMF) with negative By condition. In the magnetospheric null-separator structure, the separatrices generated from two null points and two separators divide the entire space into four types of magnetic region, i.e., the IMF, the northern open magnetic field, the southern open magnetic field, and the closed magnetic field. In the ionosphere, the Sun-aligned arc region and the void are reproduced in the distributions of simulated plasma pressure and field-aligned current. The outermost closed magnetic field lines on the boundary (separatrix) between the northern open magnetic field and the closed magnetic field are projected to the northern ionosphere at the boundary between the Sun-aligned arc region and the void, both on the morning and evening sides. The magnetic field lines at the plasma sheet inner edge are projected to the equatorward boundary of the oval. Therefore, the Sun-aligned arc region is on the closed magnetic field lines of the plasma sheet. In the plasma sheet, an inflated structure (bulge) is generated at the junction of the tilted plasma sheet in the far-to-middle tail and nontilted plasma sheet in the ring current region. In the Northern Hemisphere, the bulge is on the evening side wrapped by the outermost closed magnetic field lines that are connected to the northern evening ionosphere. This inflated structure (bulge) is associated with shear flows that cause the Sun-aligned arc.
An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system

DOE Office of Scientific and Technical Information (OSTI.GOV)

AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide

Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis.
Lastly, this database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system

DOE PAGES

AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide

2015-11-19

Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis.
Lastly, this database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
High-harmonic spectroscopy of aligned molecules

NASA Astrophysics Data System (ADS)

Yun, Hyeok; Yun, Sang Jae; Lee, Gae Hwang; Nam, Chang Hee

2017-01-01

High harmonics emitted from aligned molecules driven by intense femtosecond laser pulses provide the opportunity to explore the structural information of molecules. The field-free molecular alignment technique is an expedient tool for investigating the structural characteristics of linear molecules. The underlying physics of field-free alignment, showing the characteristic revival structure specific to molecular species, is clearly explained from the quantum-phase analysis of molecular rotational states. The anisotropic nature of molecules is shown from the harmonic polarization measurement performed with spatial interferometry. The multi-orbital characteristics of molecules are investigated using high-harmonic spectroscopy, applied to molecules of N2 and CO2. In the latter case the two-dimensional high-harmonic spectroscopy, implemented using a two-color laser field, is applied to distinguish harmonics from different orbitals. Molecular high-harmonic spectroscopy will open a new route to investigate ultrafast dynamics of molecules.
High accuracy prediction of beta-turns and their types using propensities and multiple alignments.

PubMed

Fuchs, Patrick F J; Alix, Alain J P

2005-06-01

We have developed a method that predicts both the presence and the type of beta-turns, using a straightforward approach based on propensities and multiple alignments. The propensities were calculated classically, but the way they are used for prediction is completely new: starting from a tetrapeptide sequence on which one wants to evaluate the presence of a beta-turn, the propensity for a given residue is modified by taking into account all the residues present in the multiple alignment at this position. A score is then evaluated by weighting these propensities with position-specific score matrices generated by PSI-BLAST. Introducing secondary structure information predicted by PSIPRED or SSPRO2, and taking into account the flanking residues around the tetrapeptide, improved the accuracy greatly. Evaluated on a database of 426 reference proteins (previously used in other studies) by sevenfold cross-validation, the method gave very good results, with a Matthews Correlation Coefficient (MCC) of 0.42 and an overall prediction accuracy of 74.8%; this places our method among the best ones. A jackknife test was also done, which gave results within the same range. This shows that it is possible to reach neural-network accuracy with considerably less computational cost and complexity. Furthermore, propensities remain excellent descriptors of amino acid tendencies to belong to beta-turns, which can be useful for peptide or protein engineering and design. For beta-turn type prediction, we reached the best accuracy ever published in terms of MCC (except for the irregular type IV) in the range of 0.25-0.30 for types I, II, and I' and 0.13-0.15 for types VIII, II', and IV. To our knowledge, our method is the only one available on the Web that predicts types I' and II'. The accuracy evaluated on two larger databases of 547 and 823 proteins was not improved significantly.
All of this was implemented into a Web server called COUDES (French acronym for: Chercher Ou Une Deviation Existe Surement), which is available at the following URL: http://bioserv.rpbs.jussieu.fr/Coudes/index.html within the new bioinformatics platform RPBS.
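The two figures of merit quoted in the abstract above, the Matthews Correlation Coefficient and the overall prediction accuracy, are both derived from a binary confusion matrix. The sketch below shows the standard definitions; the function names and the example counts are illustrative and are not taken from the COUDES paper.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined 0/0 case.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)
```

Turn residues are a minority class, so a predictor can score a high accuracy while having a modest MCC; this is why abstracts of this kind usually report both.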
How do we deal with multiple goals for care within an individual patient trajectory? A document content analysis of health service research papers on goals for care

PubMed Central

Berntsen, G K R; Gammon, D; Steinsbekk, A; Salamonsen, A; Foss, N; Ruland, C; Fønnebø, V

2015-01-01

Objectives Patients with complex long-term needs experience multiple parallel care processes, which may have conflicting or competing goals, within their individual patient trajectory (iPT). The alignment of multiple goals is often implicit or non-existent, and has received little attention in the literature. Research questions: (1) What goals for care relevant for the iPT can be identified from the literature? (2) What goal typology can be proposed based on goal characteristics? (3) How can professionals negotiate a consistent set of goals for the iPT? Design Document content analysis of health service research papers, on the topic of ‘goals for care’. Setting With the increasing prevalence of multimorbidity, guidance regarding the identification and alignment of goals for care across organisations and disciplines is urgently needed. Participants 70 papers that describe ‘goals for care’, ‘health’ or ‘the good healthcare process’ relevant to a general iPT, identified in a step-wise structured search of MEDLINE, Web of Science and Google Scholar. Results We developed a goal typology with four categories. Three categories are professionally defined: (1) Functional, (2) Biological/Disease and (3) Adaptive goals. The fourth category is the patient's personally defined goals. Professional and personal goals may conflict, in which case goal prioritisation by creation of a goal hierarchy can be useful. We argue that the patient has the moral and legal right to determine the goals at the top of such a goal hierarchy. Professionals can then translate personal goals into realistic professional goals such as standardised health outcomes linked to evidence-based guidelines. Thereby, when goals are aligned with one another, the iPT will be truly patient centred, while care follows professional guidelines. Conclusions Personal goals direct professional goals and define the success criteria of the iPT. 
However, making personal goals count requires brave and wide-sweeping attitudinal, organisational and regulatory transformation of care delivery. PMID:26656243
Hidden Markov models of biological primary sequence information.

PubMed Central

Baldi, P; Chauvin, Y; Hunkapiller, T; McClure, M A

1994-01-01

Hidden Markov model (HMM) techniques are used to model families of biological sequences. A smooth and convergent algorithm is introduced to iteratively adapt the transition and emission parameters of the models from the examples in a given family. The HMM approach is applied to three protein families: globins, immunoglobulins, and kinases. In all cases, the models derived capture the important statistical characteristics of the family and can be used for a number of tasks, including multiple alignments, motif detection, and classification. For K sequences of average length N, this approach yields an effective multiple-alignment algorithm which requires O(KN^2) operations, linear in the number of sequences. PMID:8302831
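As a rough illustration of the kind of dynamic programming that HMM methods rely on, the sketch below computes the likelihood of a sequence under a small discrete HMM with the forward algorithm. It is a generic textbook routine, not the training algorithm or the model architecture of the paper above; the function name and the two-state toy model are assumptions made for the example.

```python
from typing import Dict, List, Sequence

# Hypothetical two-state toy model over the alphabet {"A", "B"}.
TOY_START: List[float] = [0.5, 0.5]
TOY_TRANS: List[List[float]] = [[0.9, 0.1], [0.2, 0.8]]
TOY_EMIT: List[Dict[str, float]] = [{"A": 0.5, "B": 0.5}, {"A": 0.1, "B": 0.9}]


def forward_likelihood(
    seq: Sequence[str],
    start: List[float],
    trans: List[List[float]],
    emit: List[Dict[str, float]],
) -> float:
    """Return P(seq) under a discrete HMM, summing over all state paths."""
    if not seq:
        return 1.0
    n_states = len(start)
    # alpha[s] = P(symbols seen so far, current state = s)
    alpha = [start[s] * emit[s].get(seq[0], 0.0) for s in range(n_states)]
    for symbol in seq[1:]:
        alpha = [
            sum(alpha[r] * trans[r][s] for r in range(n_states))
            * emit[s].get(symbol, 0.0)
            for s in range(n_states)
        ]
    return sum(alpha)
```

Each sequence is scored against the model independently of the others, which is the reason the total cost of such approaches grows linearly with the number of sequences K.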
Gemi: PCR Primers Prediction from Multiple Alignments

PubMed Central

Sobhy, Haitham; Colson, Philippe

2012-01-01

Designing primers and probes for polymerase chain reaction (PCR) is a preliminary and critical step that requires the identification of highly conserved regions in a given set of sequences. This task can be challenging if the targeted sequences display a high level of diversity, as frequently encountered in microbiologic studies. We developed Gemi, an automated, fast, and easy-to-use bioinformatics tool with a user-friendly interface to design primers and probes based on multiple aligned sequences. This tool can be used for the purpose of real-time and conventional PCR and can deal efficiently with large sets of sequences of a large size. PMID:23316117
Open-Phylo: a customizable crowd-computing platform for multiple sequence alignment

PubMed Central

2013-01-01

Citizen science games such as Galaxy Zoo, Foldit, and Phylo aim to harness the intelligence and processing power generated by crowds of online gamers to solve scientific problems. However, the selection of the data to be analyzed through these games is under the exclusive control of the game designers, and so are the results produced by gamers. Here, we introduce Open-Phylo, a freely accessible crowd-computing platform that enables any scientist to enter our system and use crowds of gamers to assist computer programs in solving one of the most fundamental problems in genomics: the multiple sequence alignment problem. PMID:24148814
Taxonaut: an application software for comparative display of multiple taxonomies with a use case of GBIF Species API

PubMed Central

2016-01-01

Abstract Background The Species API of the Global Biodiversity Information Facility (GBIF) provides public access to taxonomic data aggregated from multiple data sources. Each data source follows its own classification which can be inconsistent with classifications from other sources. Even with a reference classification e.g. the GBIF Backbone taxonomy, a comprehensive method to compare classifications in the data aggregation is essential, especially for non-expert users. New information A Java application was developed to compare multiple taxonomies graphically using classification data acquired from GBIF’s ChecklistBank via the GBIF Species API. It uses a table to display taxonomies where each column represents a taxonomy under comparison, with an aligner column to organise taxa by name. Each cell contains the name of a taxon if the classification in that column contains the name. Each column also has a cell showing the hierarchy of the taxonomy by a folder metaphor where taxa are aligned and synchronised in the aligner column. A set of those comparative tables shows taxa categorised by relationship between taxonomies. The result set is also available as tables in an Excel format file. PMID:27932916
Combining multiple thresholding binarization values to improve OCR output

NASA Astrophysics Data System (ADS)

Lund, William B.; Kennard, Douglas J.; Ringger, Eric K.

2013-01-01

For noisy, historical documents, a high optical character recognition (OCR) word error rate (WER) can render the OCR text unusable. Since image binarization is often the method used to identify foreground pixels, a body of research seeks to improve image-wide binarization directly. Instead of relying on any one imperfect binarization technique, our method incorporates information from multiple simple thresholding binarizations of the same image to improve text output. Using a new corpus of 19th century newspaper grayscale images for which the text transcription is known, we observe WERs of 13.8% and higher using current binarization techniques and a state-of-the-art OCR engine. Our novel approach combines the OCR outputs from multiple thresholded images by aligning the text output and producing a lattice of word alternatives from which a lattice word error rate (LWER) is calculated. Our results show an LWER of 7.6% when aligning two threshold images and an LWER of 6.8% when aligning five. From the word lattice we commit to one hypothesis by applying the methods of Lund et al. (2011), achieving an improvement over the original OCR output and an 8.41% WER result on this data set.
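The word error rate used in the abstract above is conventionally the word-level edit distance divided by the number of reference words. The sketch below computes it and, as one plausible reading of a lattice word error rate, counts a position as correct when any alternative in that slot matches the reference word. The function names and the slot representation (a list of sets of alternatives) are assumptions for illustration, not the authors' definitions.

```python
from typing import List, Set


def lattice_word_errors(ref: List[str], slots: List[Set[str]]) -> int:
    """Word-level edit distance between a reference and a sequence of slots.

    Each slot holds one or more alternative words; a slot matches a
    reference word if any of its alternatives equals that word.
    """
    prev = list(range(len(slots) + 1))
    for i, word in enumerate(ref, 1):
        cur = [i]
        for j, alternatives in enumerate(slots, 1):
            substitution = prev[j - 1] + (0 if word in alternatives else 1)
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, substitution))
        prev = cur
    return prev[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: edit distance over words / number of reference words."""
    ref = reference.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    slots = [{w} for w in hypothesis.split()]
    return lattice_word_errors(ref, slots) / len(ref)
```

With this representation, adding alternatives to a slot can only keep the error count the same or lower it, which matches the abstract's observation that the lattice error rate falls as more thresholded images are aligned.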
Method for alignment of microwires

DOE Office of Scientific and Technical Information (OSTI.GOV)

Beardslee, Joseph A.; Lewis, Nathan S.; Sadtler, Bryce

2017-01-24

A method of aligning microwires includes modifying the microwires so they are more responsive to a magnetic field. The method also includes using a magnetic field so as to magnetically align the microwires. The method can further include capturing the microwires in a solid support structure that retains the longitudinal alignment of the microwires when the magnetic field is not applied to the microwires.
Combining Physicochemical and Evolutionary Information for Protein Contact Prediction

PubMed Central

Schneider, Michael; Brock, Oliver

2014-01-01

We introduce a novel contact prediction method that achieves high prediction accuracy by combining evolutionary and physicochemical information about native contacts. We obtain evolutionary information from multiple-sequence alignments and physicochemical information from predicted ab initio protein structures. These structures represent low-energy states in an energy landscape and thus capture the physicochemical information encoded in the energy function. Such low-energy structures are likely to contain native contacts, even if their overall fold is not native. To differentiate native from non-native contacts in those structures, we develop a graph-based representation of the structural context of contacts. We then use this representation to train a support vector machine classifier to identify the most likely native contacts in otherwise non-native structures. The resulting contact predictions are highly accurate. As a result of combining two sources of information (evolutionary and physicochemical), we maintain prediction accuracy even when only a few sequence homologs are present. We show that the predicted contacts help to improve ab initio structure prediction. A web service is available at http://compbio.robotics.tu-berlin.de/epc-map/. PMID:25338092
Line Up, Line Up: Using Technology to Align and Enhance Peer Learning and Assessment in a Student Centred Foundation Organic Chemistry Module

ERIC Educational Resources Information Center

Ryan, Barry J.

2013-01-01

This paper describes how three technologies were utilised in combination to align student learning and assessment as part of a case study. Multiple choice questions (MCQs) were central to all these technologies. The peer learning technologies; Personal Response Devices (a.k.a. "Clickers") and "PeerWise"…
System and method for detecting components of a mixture including tooth elements for alignment

DOEpatents

Sommer, Gregory Jon; Schaff, Ulrich Y.

2016-11-22

Examples are described including assay platforms having tooth elements. An impinging element may sequentially engage tooth elements on the assay platform to sequentially align corresponding detection regions with a detection unit. In this manner, multiple measurements may be made of detection regions on the assay platform without necessarily requiring the starting and stopping of a motor.
Exact calculation of distributions on integers, with application to sequence alignment.

PubMed

Newberg, Lee A; Lawrence, Charles E

2009-01-01

Computational biology is replete with high-dimensional discrete prediction and inference problems. Dynamic programming recursions can be applied to several of the most important of these, including sequence alignment, RNA secondary-structure prediction, phylogenetic inference, and motif finding. In these problems, attention is frequently focused on some scalar quantity of interest, a score, such as an alignment score or the free energy of an RNA secondary structure. In many cases, the score is naturally defined on integers, such as a count of the number of pairing differences between two sequence alignments, or else an integer score has been adopted for computational reasons, such as in the test of significance of motif scores. The probability distribution of the score under an appropriate probabilistic model is of interest, such as in tests of significance of motif scores, or in calculation of Bayesian confidence limits around an alignment. Here we present three algorithms for calculating the exact distribution of a score of this type; then, in the context of pairwise local sequence alignments, we apply the approach so as to find the alignment score distribution and Bayesian confidence limits.
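To make the idea of an exact distribution over integer scores concrete, the sketch below handles only the simplest case: a score that is a sum of independent integer-valued per-position contributions, as in an ungapped comparison. It is not one of the three algorithms of the paper above; the function name and the input format (one score-to-probability mapping per position) are assumptions for illustration. Exact rational arithmetic is used so that no rounding error enters.

```python
from collections import defaultdict
from fractions import Fraction
from typing import Dict, List


def score_distribution(
    position_scores: List[Dict[int, Fraction]]
) -> Dict[int, Fraction]:
    """Exact distribution of a sum of independent integer scores.

    position_scores[k] maps each possible score at position k to its
    probability. The distribution is built one position at a time,
    which is a dynamic programming recursion over (position, total).
    """
    dist: Dict[int, Fraction] = {0: Fraction(1)}
    for step in position_scores:
        nxt: Dict[int, Fraction] = defaultdict(Fraction)
        for total, p in dist.items():
            for delta, q in step.items():
                nxt[total + delta] += p * q
        dist = dict(nxt)
    return dist
```

For example, with a match scoring +1 with probability 1/4 and a mismatch scoring -1 with probability 3/4, `score_distribution([{1: Fraction(1, 4), -1: Fraction(3, 4)}] * 2)` assigns probability 1/16 to a total of 2, 3/8 to 0, and 9/16 to -2.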
Dimension towers of SICs. I. Aligned SICs and embedded tight frames

NASA Astrophysics Data System (ADS)

Appleby, Marcus; Bengtsson, Ingemar; Dumitru, Irina; Flammia, Steven

2017-11-01

Algebraic number theory relates SIC-POVMs in dimension d > 3 to those in dimension d(d - 2). We define a SIC in dimension d(d - 2) to be aligned to a SIC in dimension d if and only if the squares of the overlap phases in dimension d appear as a subset of the overlap phases in dimension d(d - 2) in a specified way. We give 19 (mostly numerical) examples of aligned SICs. We conjecture that given any SIC in dimension d, there exists an aligned SIC in dimension d(d - 2). In all our examples, the aligned SIC has lower dimensional equiangular tight frames embedded in it. If d is odd so that a natural tensor product structure exists, we prove that the individual vectors in the aligned SIC have a very special entanglement structure, and the existence of the embedded tight frames follows as a theorem. If d - 2 is an odd prime number, we prove that a complete set of mutually unbiased bases can be obtained by reducing an aligned SIC to this dimension.

Physically motivated global alignment method for electron tomography

DOE PAGES

Sanders, Toby; Prange, Micah; Akatay, Cem; ...

2015-04-08

Electron tomography is widely used for nanoscale determination of 3-D structures in many areas of science. Determining the 3-D structure of a sample from electron tomography involves three major steps: acquisition of sequence of 2-D projection images of the sample with the electron microscope, alignment of the images to a common coordinate system, and 3-D reconstruction and segmentation of the sample from the aligned image data. The resolution of the 3-D reconstruction is directly influenced by the accuracy of the alignment, and therefore, it is crucial to have a robust and dependable alignment method. In this paper, we develop a new alignment method which avoids the use of markers and instead traces the computed paths of many identifiable ‘local’ center-of-mass points as the sample is rotated. Compared with traditional correlation schemes, the alignment method presented here is resistant to cumulative error observed from correlation techniques, has very rigorous mathematical justification, and is very robust since many points and paths are used, all of which inevitably improves the quality of the reconstruction and confidence in the scientific results.
Technology Alignment and Portfolio Prioritization (TAPP)

NASA Technical Reports Server (NTRS)

Funaro, Gregory V.; Alexander, Reginald A.

2015-01-01

Technology Alignment and Portfolio Prioritization (TAPP) is a method being developed by the Advanced Concepts Office, at NASA Marshall Space Flight Center. The TAPP method expands on current technology assessment methods by incorporating the technological structure underlying technology development, e.g., organizational structures and resources, institutional policy and strategy, and the factors that motivate technological change. This paper discusses the methods ACO is currently developing to better perform technology assessments while taking into consideration Strategic Alignment, Technology Forecasting, and Long Term Planning.
fRMSDPred: Predicting Local RMSD Between Structural Fragments Using Sequence Information

DTIC Science & Technology

2007-04-04

machine learning approaches for estimating the RMSD value of a pair of protein fragments. These estimated fragment-level RMSD values can be used to construct the alignment, assess the quality of an alignment, and identify high-quality alignment segments. We present algorithms to solve this fragment-level RMSD prediction problem using a supervised learning framework based on support vector regression and classification that incorporates protein profiles, predicted secondary structure, effective information encoding schemes, and novel second-order pairwise exponential kernel
GenPlay Multi-Genome, a tool to compare and analyze multiple human genomes in a graphical interface.

PubMed

Lajugie, Julien; Fourel, Nicolas; Bouhassira, Eric E

2015-01-01

Parallel visualization of multiple individual human genomes is a complex endeavor that is rapidly gaining importance with the increasing number of personal, phased and cancer genomes that are being generated. It requires the display of variants such as SNPs, indels and structural variants that are unique to specific genomes and the introduction of multiple overlapping gaps in the reference sequence. Here, we describe GenPlay Multi-Genome, an application specifically written to visualize and analyze multiple human genomes in parallel. GenPlay Multi-Genome is ideally suited for the comparison of allele-specific expression and functional genomic data obtained from multiple phased genomes in a graphical interface with access to multiple-track operation. It also allows the analysis of data that have been aligned to custom genomes rather than to a standard reference and can be used as a variant calling format file browser and as a tool to compare different genome assemblies, such as hg19 and hg38. GenPlay is available under the GNU public license (GPL-3) from http://genplay.einstein.yu.edu. The source code is available at https://github.com/JulienLajugie/GenPlay. © The Author 2014. Published by Oxford University Press.
LenVarDB: database of length-variant protein domains.

PubMed

Mutt, Eshita; Mathew, Oommen K; Sowdhamini, Ramanathan

2014-01-01

Protein domains are functionally and structurally independent modules, which add to the functional variety of proteins. This array of functional diversity has been enabled by evolutionary changes, such as amino acid substitutions or insertions or deletions, occurring in these protein domains. Length variations (indels) can introduce changes at structural, functional and interaction levels. LenVarDB (freely available at http://caps.ncbs.res.in/lenvardb/) traces these length variations, starting from structure-based sequence alignments in our Protein Alignments organized as Structural Superfamilies (PASS2) database, across 731 structural classification of proteins (SCOP)-based protein domain superfamilies connected to 2 730 625 sequence homologues. Alignment of sequence homologues corresponding to a structural domain is available, starting from a structure-based sequence alignment of the superfamily. Orientation of the length-variant (indel) regions in protein domains can be visualized by mapping them on the structure and on the alignment. Knowledge about location of length variations within protein domains and their visual representation will be useful in predicting changes within structurally or functionally relevant sites, which may ultimately regulate protein function. Non-technical summary: Evolutionary changes bring about natural changes to proteins that may be found in many organisms. Such changes could be reflected as amino acid substitutions or insertions-deletions (indels) in protein sequences. LenVarDB is a database that provides an early overview of observed length variations that were set among 731 protein families and after examining >2 million sequences. Indels are followed up to observe if they are close to the active site such that they can affect the activity of proteins. Inclusion of such information can aid the design of bioengineering experiments.
An alternative view of protein fold space.

PubMed

Shindyalov, I N; Bourne, P E

2000-02-15

Comparing and subsequently classifying protein structure information has received significant attention concurrent with the increase in the number of experimentally derived 3-dimensional structures. Classification schemes have focused on biological function found within protein domains and on structure classification based on topology. Here an alternative view is presented that groups substructures. Substructures are long (50-150 residue) highly repetitive near-contiguous pieces of polypeptide chain that occur frequently in a set of proteins from the PDB defined as structurally non-redundant over the complete polypeptide chain. The substructure classification is based on a previously reported Combinatorial Extension (CE) algorithm that provides a significantly different set of structure alignments than those previously described, having, for example, only a 40% overlap with FSSP. Qualitatively the algorithm provides longer contiguous aligned segments at the price of a slightly higher root-mean-square deviation (rmsd). Clustering these alignments gives a discrete and highly repetitive set of substructures not detectable by sequence similarity alone. In some cases different substructures represent all or different parts of well known folds indicative of the Russian doll effect--the continuity of protein fold space. In other cases they fall into different structure and functional classifications. It is too early to determine whether these newly classified substructures represent new insights into the evolution of a structural framework important to many proteins. What is apparent from on-going work is that these substructures have the potential to be useful probes in finding remote sequence homology and in structure prediction studies. The characteristics of the complete all-by-all comparison of the polypeptide chains present in the PDB and details of the filtering procedure by pair-wise structure alignment that led to the emergent substructure gallery are discussed. Substructure classification, alignments, and tools to analyze them are available at http://cl.sdsc.edu/ce.html.
G protein-coupled odorant receptors: From sequence to structure.

PubMed

de March, Claire A; Kim, Soo-Kyung; Antonczak, Serge; Goddard, William A; Golebiowski, Jérôme

2015-09-01

Odorant receptors (ORs) are the largest subfamily within class A G protein-coupled receptors (GPCRs). No experimental structural data of any OR is available to date and atomic-level insights are likely to be obtained by means of molecular modeling. In this article, we critically align sequences of ORs with those GPCRs for which a structure is available. Here, an alignment consistent with available site-directed mutagenesis data on various ORs is proposed. Using this alignment, the choice of the template is deemed rather minor for identifying residues that constitute the wall of the binding cavity or those involved in G protein recognition. © 2015 The Protein Society.
Energy Level Alignment at the Interface between Linear-Structured Benzenediamine Molecules and Au(111) Surface

NASA Astrophysics Data System (ADS)

Li, Guo; Rangel, Tonatiuh; Liu, Zhenfei; Cooper, Valentino; Neaton, Jeffrey

Using density functional theory with model self-energy corrections, we calculate the adsorption energetics and geometry, and the energy level alignment of benzenediamine (BDA) molecules adsorbed on Au(111) surfaces. Our calculations show that linear structures of BDA, stabilized via hydrogen bonds between amine groups, are energetically more favorable than monomeric phases. Moreover, our self-energy-corrected calculations of energy level alignment show that the highest occupied molecular orbital energy of the BDA linear structure is deeper relative to the Fermi level than that of the isolated monomer and agrees well with the values measured with photoemission spectroscopy. This work was supported by DOE.
G protein-coupled odorant receptors: From sequence to structure

PubMed Central

de March, Claire A; Kim, Soo-Kyung; Antonczak, Serge; Goddard, William A; Golebiowski, Jérôme

2015-01-01

Odorant receptors (ORs) are the largest subfamily within class A G protein-coupled receptors (GPCRs). No experimental structural data of any OR is available to date and atomic-level insights are likely to be obtained by means of molecular modeling. In this article, we critically align sequences of ORs with those GPCRs for which a structure is available. Here, an alignment consistent with available site-directed mutagenesis data on various ORs is proposed. Using this alignment, the choice of the template is deemed rather minor for identifying residues that constitute the wall of the binding cavity or those involved in G protein recognition. PMID:26044705
FEAST: sensitive local alignment with multiple rates of evolution.

PubMed

Hudek, Alexander K; Brown, Daniel G

2011-01-01

We present a pairwise local aligner, FEAST, which uses two new techniques: a sensitive extension algorithm for identifying homologous subsequences, and a descriptive probabilistic alignment model. We also present a new procedure for training alignment parameters and apply it to the human and mouse genomes, producing a better parameter set for these sequences. Our extension algorithm identifies homologous subsequences by considering all evolutionary histories. It has higher maximum sensitivity than Viterbi extensions, and better balances specificity. We model alignments with several submodels, each with unique statistical properties, describing strongly similar and weakly similar regions of homologous DNA. Training parameters using two submodels produces superior alignments, even when we align with only the parameters from the weaker submodel. Our extension algorithm combined with our new parameter set achieves sensitivity 0.59 on synthetic tests. In contrast, LASTZ with default settings achieves sensitivity 0.35 with the same false positive rate. Using the weak submodel as parameters for LASTZ increases its sensitivity to 0.59 with high error. FEAST is available at http://monod.uwaterloo.ca/feast/.
Highly sensitive detection of individual HEAT and ARM repeats with HHpred and COACH.

PubMed

Kippert, Fred; Gerloff, Dietlind L

2009-09-24

HEAT and ARM repeats occur in a large number of eukaryotic proteins. As these repeats are often highly diverged, the prediction of HEAT or ARM domains can be challenging. Except for the most clear-cut cases, identification at the individual repeat level is indispensable, in particular for determining domain boundaries. However, methods using single sequence queries do not have the sensitivity required to deal with more divergent repeats and, when applied to proteins with known structures, in some cases failed to detect a single repeat. Testing algorithms which use multiple sequence alignments as queries, we found two of them, HHpred and COACH, to detect HEAT and ARM repeats with greatly enhanced sensitivity. Calibration against experimentally determined structures suggests the use of three score classes with increasing confidence in the prediction, and prediction thresholds for each method. When we applied a new protocol using both HHpred and COACH to these structures, it detected 82% of HEAT repeats and 90% of ARM repeats, with the minimum for a given protein of 57% for HEAT repeats and 60% for ARM repeats. Application to bona fide HEAT and ARM proteins or domains indicated that similar numbers can be expected for the full complement of HEAT/ARM proteins. A systematic screen of the Protein Data Bank for false positive hits revealed their number to be low, in particular for ARM repeats. Double false positive hits for a given protein were rare for HEAT and not at all observed for ARM repeats. In combination with fold prediction and consistency checking (multiple sequence alignments, secondary structure prediction, and position analysis), repeat prediction with the new HHpred/COACH protocol dramatically improves prediction in the twilight zone of fold prediction methods, as well as the delineation of HEAT/ARM domain boundaries. A protocol is presented for the identification of individual HEAT or ARM repeats which is straightforward to implement. 
It provides high sensitivity at a low false positive rate and will therefore greatly enhance the accuracy of predictions of HEAT and ARM domains.
Highly Sensitive Detection of Individual HEAT and ARM Repeats with HHpred and COACH

PubMed Central

Kippert, Fred; Gerloff, Dietlind L.

2009-01-01

Background: HEAT and ARM repeats occur in a large number of eukaryotic proteins. As these repeats are often highly diverged, the prediction of HEAT or ARM domains can be challenging. Except for the most clear-cut cases, identification at the individual repeat level is indispensable, in particular for determining domain boundaries. However, methods using single sequence queries do not have the sensitivity required to deal with more divergent repeats and, when applied to proteins with known structures, in some cases failed to detect a single repeat. Methodology and Principal Findings: Testing algorithms which use multiple sequence alignments as queries, we found two of them, HHpred and COACH, to detect HEAT and ARM repeats with greatly enhanced sensitivity. Calibration against experimentally determined structures suggests the use of three score classes with increasing confidence in the prediction, and prediction thresholds for each method. When we applied a new protocol using both HHpred and COACH to these structures, it detected 82% of HEAT repeats and 90% of ARM repeats, with the minimum for a given protein of 57% for HEAT repeats and 60% for ARM repeats. Application to bona fide HEAT and ARM proteins or domains indicated that similar numbers can be expected for the full complement of HEAT/ARM proteins. A systematic screen of the Protein Data Bank for false positive hits revealed their number to be low, in particular for ARM repeats. Double false positive hits for a given protein were rare for HEAT and not at all observed for ARM repeats. In combination with fold prediction and consistency checking (multiple sequence alignments, secondary structure prediction, and position analysis), repeat prediction with the new HHpred/COACH protocol dramatically improves prediction in the twilight zone of fold prediction methods, as well as the delineation of HEAT/ARM domain boundaries. Significance: A protocol is presented for the identification of individual HEAT or ARM repeats which is straightforward to implement. It provides high sensitivity at a low false positive rate and will therefore greatly enhance the accuracy of predictions of HEAT and ARM domains. PMID:19777061
Improving the quality of biomarker candidates in untargeted metabolomics via peak table-based alignment of comprehensive two-dimensional gas chromatography-mass spectrometry data

PubMed Central

Bean, Heather D.; Hill, Jane E.; Dimandja, Jean-Marie D.

2015-01-01

The potential of high-resolution analytical technologies like GC×GC/TOF MS in untargeted metabolomics and biomarker discovery has been limited by the development of fully automated software that can efficiently align and extract information from multiple chromatographic data sets. In this work we report the first investigation on a peak-by-peak basis of the chromatographic factors that impact GC×GC data alignment. A representative set of 16 compounds of different chromatographic characteristics were followed through the alignment of 63 GC×GC chromatograms. We found that varying the mass spectral match parameter had a significant influence on the alignment for poorly-resolved peaks, especially those at the extremes of the detector linear range, and no influence on well-chromatographed peaks. Therefore, optimized chromatography is required for proper GC×GC data alignment. Based on these observations, a workflow is presented for the conservative selection of biomarker candidates from untargeted metabolomics analyses. PMID:25857541
Hal: an automated pipeline for phylogenetic analyses of genomic data.

PubMed

Robbertse, Barbara; Yoder, Ryan J; Boyd, Alex; Reeves, John; Spatafora, Joseph W

2011-02-07

The rapid increase in genomic and genome-scale data is resulting in unprecedented levels of discrete sequence data available for phylogenetic analyses. Major analytical impasses exist, however, prior to analyzing these data with existing phylogenetic software. Obstacles include the management of large data sets without standardized naming conventions, identification and filtering of orthologous clusters of proteins or genes, and the assembly of alignments of orthologous sequence data into individual and concatenated super alignments. Here we report the production of an automated pipeline, Hal, which produces multiple alignments and trees from genomic data. These alignments can be produced by a choice of four alignment programs and analyzed by a variety of phylogenetic programs. In short, the Hal pipeline connects the programs BLASTP, MCL, user specified alignment programs, GBlocks, ProtTest and user specified phylogenetic programs to produce species trees. The script is available at sourceforge (http://sourceforge.net/projects/bio-hal/). The results from an example analysis of Kingdom Fungi are briefly discussed.
Flexible, fast and accurate sequence alignment profiling on GPGPU with PaSWAS.

PubMed

Warris, Sven; Yalcin, Feyruz; Jackson, Katherine J L; Nap, Jan Peter

2015-01-01

To obtain large-scale sequence alignments in a fast and flexible way is an important step in the analyses of next generation sequencing data. Applications based on the Smith-Waterman (SW) algorithm are often either not fast enough, limited to dedicated tasks or not sufficiently accurate due to statistical issues. Current SW implementations that run on graphics hardware do not report the alignment details necessary for further analysis. With the Parallel SW Alignment Software (PaSWAS) it is possible (a) to have easy access to the computational power of NVIDIA-based general purpose graphics processing units (GPGPUs) to perform high-speed sequence alignments, and (b) retrieve relevant information such as score, number of gaps and mismatches. The software reports multiple hits per alignment. The added value of the new SW implementation is demonstrated with two test cases: (1) tag recovery in next generation sequence data and (2) isotype assignment within an immunoglobulin 454 sequence data set. Both cases show the usability and versatility of the new parallel Smith-Waterman implementation.
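For readers unfamiliar with the Smith-Waterman recursion, the following is a minimal CPU-only sketch that returns the kind of alignment details the abstract mentions (score, number of gaps and mismatches). It is not PaSWAS code and does not run on a GPU; the function name, the linear gap penalty and the default scoring values are assumptions chosen for illustration.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Local alignment of strings a and b with a linear gap penalty (sketch)."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the score matrix; every cell is floored at 0 (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + sub,
                          H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero cell is reached.
    i, j = best_pos
    top, bottom = [], []
    gaps = mismatches = 0
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            mismatches += int(a[i - 1] != b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            gaps += 1
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            gaps += 1
            j -= 1

    return {"score": best, "gaps": gaps, "mismatches": mismatches,
            "a": "".join(reversed(top)), "b": "".join(reversed(bottom))}
```

The matrix fill is the part that parallelizes well, since cells on the same anti-diagonal do not depend on each other; the traceback is what recovers the per-alignment details.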
Unsupervised image matching based on manifold alignment.

PubMed

Pei, Yuru; Huang, Fengchun; Shi, Fuhao; Zha, Hongbin

2012-08-01

This paper addresses the problem of automatic matching between two image sets with similar intrinsic structures and different appearances, especially when there is no prior correspondence. An unsupervised manifold alignment framework is proposed to establish correspondence between data sets by a mapping function in the mutual embedding space. We introduce a local similarity metric based on parameterized distance curves to represent the connection of one point with the rest of the manifold. A small set of valid feature pairs can be found without manual interactions by matching the distance curve of one manifold with the curve cluster of the other manifold. To avoid potential confusions in image matching, we propose an extended affine transformation to solve the nonrigid alignment in the embedding space. The comparatively tight alignments and the structure preservation can be obtained simultaneously. The point pairs with the minimum distance after alignment are viewed as the matchings. We apply manifold alignment to image set matching problems. The correspondence between image sets of different poses, illuminations, and identities can be established effectively by our approach.
Rapid shear alignment of sub-10 nm cylinder-forming block copolymer films based on thermal expansion mismatch

NASA Astrophysics Data System (ADS)

Nicaise, Samuel M.; Gadelrab, Karim R.; G, Amir Tavakkoli K.; Ross, Caroline A.; Alexander-Katz, Alfredo; Berggren, Karl K.

2018-01-01

Directed self-assembly of block copolymers (BCPs) provided by shear-stress can produce aligned sub-10 nm structures over large areas for applications in integrated circuits, next-generation data storage, and plasmonic structures. In this work, we present a fast, versatile BCP shear-alignment process based on coefficient of thermal expansion mismatch of the BCP film, a rigid top coat and a substrate. Monolayer and bilayer cylindrical microdomains of poly(styrene-b-dimethylsiloxane) aligned preferentially in-plane and orthogonal to naturally-forming or engineered cracks in the top coat film, allowing for orientation control over 1 cm² substrates. Annealing temperatures, up to 275 °C, provided low-defect alignment up to 2 mm away from cracks for rapid (<1 min) annealing times. Finite-element simulations of the stress as a function of annealing time, annealing temperature, and distance from cracks showed that shear stress during the cooling phase of the thermal annealing was critical for the observed microdomain alignment.
Statistical Significance of Optical Map Alignments

PubMed Central

Sarkar, Deepayan; Goldstein, Steve; Schwartz, David C.

2012-01-01

The Optical Mapping System constructs ordered restriction maps spanning entire genomes through the assembly and analysis of large datasets comprising individually analyzed genomic DNA molecules. Such restriction maps uniquely reveal mammalian genome structure and variation, but also raise computational and statistical questions beyond those that have been solved in the analysis of smaller, microbial genomes. We address the problem of how to filter maps that align poorly to a reference genome. We obtain map-specific thresholds that control errors and improve iterative assembly. We also show how an optimal self-alignment score provides an accurate approximation to the probability of alignment, which is useful in applications seeking to identify structural genomic abnormalities. PMID:22506568
GCALIGNER 1.0: an alignment program to compute a multiple sample comparison data matrix from large eco-chemical datasets obtained by GC.

PubMed

Dellicour, Simon; Lecocq, Thomas

2013-10-01

GCALIGNER 1.0 is a computer program designed to perform a preliminary data comparison matrix of chemical data obtained by GC without MS information. The alignment algorithm is based on the comparison between the retention times of each detected compound in a sample. In this paper, we test the GCALIGNER efficiency on three datasets of the chemical secretions of bumble bees. The algorithm performs the alignment with a low error rate (<3%). GCALIGNER 1.0 is a useful, simple and free program based on an algorithm that enables the alignment of table-type data from GC. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
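To make the idea of retention-time-based alignment concrete, the following is a minimal sketch that groups peaks from several samples into rows of a comparison matrix. It is not the GCALIGNER algorithm, whose details are not given in the abstract; the greedy grouping rule, the `tolerance` parameter and the input format are assumptions made for the example.

```python
def align_by_retention_time(samples, tolerance=0.05):
    """Build a sample-comparison matrix from GC peak lists (illustrative sketch).

    samples:   dict sample_name -> list of (retention_time, peak_area)
               (hypothetical input format).
    tolerance: maximum retention-time difference, measured from the first peak
               of a group, for a peak to join that group.
    Returns (sample_names, rows), where each row is
    (mean_retention_time, [area per sample, 0.0 if the compound is absent]).
    """
    peaks = sorted((rt, name, area)
                   for name, peak_list in samples.items()
                   for rt, area in peak_list)
    groups = []
    for rt, name, area in peaks:
        current = groups[-1] if groups else None
        # Join the open group only if the peak is close enough and the sample
        # has not already contributed a peak to it.
        if (current is not None
                and rt - current["rts"][0] <= tolerance
                and name not in current["cells"]):
            current["rts"].append(rt)
            current["cells"][name] = area
        else:
            groups.append({"rts": [rt], "cells": {name: area}})

    names = sorted(samples)
    rows = [(sum(g["rts"]) / len(g["rts"]),
             [g["cells"].get(n, 0.0) for n in names])
            for g in groups]
    return names, rows
```

Without mass spectra, retention time is the only evidence that two peaks are the same compound, so the choice of tolerance directly controls the trade-off between merging distinct compounds and splitting one compound across rows.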
Biological intuition in alignment-free methods: response to Posada.

PubMed

Ragan, Mark A; Chan, Cheong Xin

2013-08-01

A recent editorial in Journal of Molecular Evolution highlights opportunities and challenges facing molecular evolution in the era of next-generation sequencing. Abundant sequence data should allow more-complex models to be fit at higher confidence, making phylogenetic inference more reliable and improving our understanding of evolution at the molecular level. However, concern that approaches based on multiple sequence alignment may be computationally infeasible for large datasets is driving the development of so-called alignment-free methods for sequence comparison and phylogenetic inference. The recent editorial characterized these approaches as model-free, not based on the concept of homology, and lacking in biological intuition. We argue here that alignment-free methods have not abandoned models or homology, and can be biologically intuitive.
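As a concrete instance of what an alignment-free comparison can look like, the following is a minimal sketch of a k-mer profile distance between two sequences. It is a generic illustration and not a method proposed in the article; the function names, the choice of cosine distance and the default k are assumptions.

```python
from collections import Counter
from math import sqrt

def kmer_counts(seq, k):
    """Count all overlapping substrings of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def kmer_cosine_distance(seq1, seq2, k=3):
    """Cosine distance between the k-mer count profiles of two sequences.

    0.0 means identical k-mer composition; 1.0 means no shared k-mers.
    """
    c1, c2 = kmer_counts(seq1, k), kmer_counts(seq2, k)
    norm1 = sqrt(sum(v * v for v in c1.values()))
    norm2 = sqrt(sum(v * v for v in c2.values()))
    if norm1 == 0 or norm2 == 0:
        raise ValueError("both sequences must be at least k letters long")
    dot = sum(c1[w] * c2[w] for w in c1.keys() & c2.keys())
    return 1.0 - dot / (norm1 * norm2)
```

A matrix of such pairwise distances can be passed to a distance-based tree-building method, which is how approaches of this kind avoid computing a multiple sequence alignment.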

Viewing multiple sequence alignments with the JavaScript Sequence Alignment Viewer (JSAV)

PubMed Central

Martin, Andrew C. R.

2014-01-01

The JavaScript Sequence Alignment Viewer (JSAV) is designed as a simple-to-use JavaScript component for displaying sequence alignments on web pages. The display of sequences is highly configurable with options to allow alternative coloring schemes, sorting of sequences and ’dotifying’ repeated amino acids. An option is also available to submit selected sequences to another web site, or to other JavaScript code. JSAV is implemented purely in JavaScript making use of the JQuery and JQuery-UI libraries. It does not use any HTML5-specific options to help with browser compatibility. The code is documented using JSDOC and is available from http://www.bioinf.org.uk/software/jsav/. PMID:25653836
Viewing multiple sequence alignments with the JavaScript Sequence Alignment Viewer (JSAV).

PubMed

Martin, Andrew C R

2014-01-01

The JavaScript Sequence Alignment Viewer (JSAV) is designed as a simple-to-use JavaScript component for displaying sequence alignments on web pages. The display of sequences is highly configurable with options to allow alternative coloring schemes, sorting of sequences and 'dotifying' repeated amino acids. An option is also available to submit selected sequences to another web site, or to other JavaScript code. JSAV is implemented purely in JavaScript making use of the JQuery and JQuery-UI libraries. It does not use any HTML5-specific options to help with browser compatibility. The code is documented using JSDOC and is available from http://www.bioinf.org.uk/software/jsav/.
DNA nanotubes for NMR structure determination of membrane proteins.

PubMed

Bellot, Gaëtan; McClintock, Mark A; Chou, James J; Shih, William M

2013-04-01

Finding a way to determine the structures of integral membrane proteins using solution nuclear magnetic resonance (NMR) spectroscopy has proved to be challenging. A residual-dipolar-coupling-based refinement approach can be used to resolve the structure of membrane proteins up to 40 kDa in size, but to do this you need a weak-alignment medium that is detergent-resistant and it has thus far been difficult to obtain such a medium suitable for weak alignment of membrane proteins. We describe here a protocol for robust, large-scale synthesis of detergent-resistant DNA nanotubes that can be assembled into dilute liquid crystals for application as weak-alignment media in solution NMR structure determination of membrane proteins in detergent micelles. The DNA nanotubes are heterodimers of 400-nm-long six-helix bundles, each self-assembled from a M13-based p7308 scaffold strand and >170 short oligonucleotide staple strands. Compatibility with proteins bearing considerable positive charge as well as modulation of molecular alignment, toward collection of linearly independent restraints, can be introduced by reducing the negative charge of DNA nanotubes using counter ions and small DNA-binding molecules. This detergent-resistant liquid-crystal medium offers a number of properties conducive for membrane protein alignment, including high-yield production, thermal stability, buffer compatibility and structural programmability. Production of sufficient nanotubes for four or five NMR experiments can be completed in 1 week by a single individual.
First observation of rotational structures in 168Re

DOE PAGES

Hartley, D. J.; Janssens, R. V. F.; Riedinger, L. L.; ...

2016-11-30

We assigned first rotational sequences to the odd-odd nucleus 168Re. Coincidence relationships of these structures with rhenium x rays confirm the isotopic assignment, while arguments based on the γ-ray multiplicity (K-fold) distributions observed with the new bands lead to the mass assignment. Configurations for the two bands were determined through analysis of the rotational alignments of the structures and a comparison of the experimental B(M1)/B(E2) ratios with theory. Tentative spin assignments are proposed for the πh11/2νi13/2 band, based on energy level systematics for other known sequences in neighboring odd-odd rhenium nuclei, as well as on systematics seen for the signature inversion feature that is well known in this region. Furthermore, the spin assignment for the πh11/2ν(h9/2/f7/2) structure provides additional validation of the proposed spins and configurations for isomers in the 176Au → 172Ir → 168Re α-decay chain.
PredictProtein—an open resource for online prediction of protein structural and functional features

PubMed Central

Yachdav, Guy; Kloppmann, Edda; Kajan, Laszlo; Hecht, Maximilian; Goldberg, Tatyana; Hamp, Tobias; Hönigschmid, Peter; Schafferhans, Andrea; Roos, Manfred; Bernhofer, Michael; Richter, Lothar; Ashkenazy, Haim; Punta, Marco; Schlessinger, Avner; Bromberg, Yana; Schneider, Reinhard; Vriend, Gerrit; Sander, Chris; Ben-Tal, Nir; Rost, Burkhard

2014-01-01

PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence it returns: multiple sequence alignments, predicted aspects of structure (secondary structure, solvent accessibility, transmembrane helices (TMSEG) and strands, coiled-coil regions, disulfide bonds and disordered regions) and function. The service incorporates analysis methods for the identification of functional regions (ConSurf), homology-based inference of Gene Ontology terms (metastudent), comprehensive subcellular localization prediction (LocTree3), protein–protein binding sites (ISIS2), protein–polynucleotide binding sites (SomeNA) and predictions of the effect of point mutations (non-synonymous SNPs) on protein function (SNAP2). Our goal has always been to develop a system optimized to meet the demands of experimentalists not highly experienced in bioinformatics. To this end, the PredictProtein results are presented as both text and a series of intuitive, interactive and visually appealing figures. The web server and sources are available at http://ppopen.rostlab.org. PMID:24799431
Permanent bending and alignment of ZnO nanowires.

PubMed

Borschel, Christian; Spindler, Susann; Lerose, Damiana; Bochmann, Arne; Christiansen, Silke H; Nietzsche, Sandor; Oertel, Michael; Ronning, Carsten

2011-05-06

Ion beams can be used to permanently bend and re-align nanowires after growth. We have irradiated ZnO nanowires with energetic ions, achieving bending and alignment in different directions. We study in detail not only the bending of single nanowires but also the simultaneous alignment of large ensembles of ZnO nanowires. Computer simulations reveal how the bending is initiated by ion-beam-induced damage. Detailed structural characterization shows that dislocations relax the stresses and make the bending and alignment permanent, even surviving annealing procedures.
The hierarchical nature of the spin alignment of dark matter haloes in filaments

NASA Astrophysics Data System (ADS)

Aragon-Calvo, M. A.; Yang, Lin Forrest

2014-05-01

Dark matter haloes in cosmological filaments and walls have (on average) their spin vector aligned with their host structure. While haloes in walls are aligned with the plane of the wall independently of their mass, haloes in filaments present a mass-dependent two-regime orientation. Here, we show that the transition mass determining the change in the alignment regime (from parallel to perpendicular) depends on the hierarchical level in which the halo is located, reflecting the hierarchical nature of the Cosmic Web. By explicitly exposing the hierarchical structure of the Cosmic Web, we are able to identify the contributions of different components of the filament network to the alignment signal. We propose a unifying picture of angular momentum acquisition that is based on the results presented here and previous results found by other authors. In order to do a hierarchical characterization of the Cosmic Web, we introduce a new implementation of the multiscale morphology filter, the MMF-2, that significantly improves the identification of structures and explicitly describes their hierarchy.
Functional annotation by sequence-weighted structure alignments: statistical analysis and case studies from the Protein 3000 structural genomics project in Japan.

PubMed

Standley, Daron M; Toh, Hiroyuki; Nakamura, Haruki

2008-09-01

A method to functionally annotate structural genomics targets, based on a novel structural alignment scoring function, is proposed. In the proposed score, position-specific scoring matrices are used to weight structurally aligned residue pairs to highlight evolutionarily conserved motifs. The functional form of the score is first optimized for discriminating domains belonging to the same Pfam family from domains belonging to different families but the same CATH or SCOP superfamily. In the optimization stage, we consider four standard weighting functions as well as our own, the "maximum substitution probability," and combinations of these functions. The optimized score achieves an area of 0.87 under the receiver-operating characteristic curve with respect to identifying Pfam families within a sequence-unique benchmark set of domain pairs. Confidence measures are then derived from the benchmark distribution of true-positive scores. The alignment method is next applied to the task of functionally annotating 230 query proteins released to the public as part of the Protein 3000 structural genomics project in Japan. Of these queries, 78 were found to align to templates with the same Pfam family as the query or had sequence identities ≥ 30%. Another 49 queries were found to match more distantly related templates. Within this group, the template predicted by our method to be the closest functional relative was often not the most structurally similar. Several nontrivial cases are discussed in detail. Finally, 103 queries matched templates at the fold level, but not the family or superfamily level, and remain functionally uncharacterized. 2008 Wiley-Liss, Inc.
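The area under the receiver-operating characteristic curve quoted in this abstract can be read as the probability that a randomly chosen true pair (same Pfam family) receives a higher score than a randomly chosen false pair. The following is a minimal sketch of that reading, not the authors' evaluation code; the function name `roc_auc` and the example scores are made up for illustration.

```python
def roc_auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive example scores higher; ties count as half a win.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical alignment scores, invented for this example only.
same_family = [0.9, 0.8, 0.4]
different_family = [0.5, 0.3]
auc = roc_auc(same_family, different_family)
```

The pairwise loop is quadratic in the number of scores, which is fine for a small benchmark; a rank-based formulation would be preferable for large sets.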
Characterising RNA secondary structure space using information entropy

PubMed Central

2013-01-01

Comparative methods for RNA secondary structure prediction use evolutionary information from RNA alignments to increase prediction accuracy. The model is often described in terms of stochastic context-free grammars (SCFGs), which generate a probability distribution over secondary structures. It is, however, unclear how this probability distribution changes as a function of the input alignment. As prediction programs typically only return a single secondary structure, better characterisation of the underlying probability space of RNA secondary structures is of great interest. In this work, we show how to efficiently compute the information entropy of the probability distribution over RNA secondary structures produced for RNA alignments by a phylo-SCFG, and implement it for the PPfold model. We also discuss interpretations and applications of this quantity, including how it can clarify reasons for low prediction reliability scores. PPfold and its source code are available from http://birc.au.dk/software/ppfold/. PMID:23368905
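As a small illustration of the quantity discussed in this abstract, the sketch below computes the Shannon entropy of an explicitly enumerated distribution over secondary structures. It is only a toy: the work described computes the entropy for a phylo-SCFG without listing every structure, whereas this example assumes the distribution is already available as a short list. The dot-bracket strings and their probabilities are invented for the example.

```python
import math


def shannon_entropy(probabilities):
    """Shannon entropy in bits of a discrete distribution.

    The input is normalised first, so unnormalised weights are accepted.
    Zero-probability outcomes contribute nothing to the sum.
    """
    probabilities = list(probabilities)
    total = sum(probabilities)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    entropy = 0.0
    for p in probabilities:
        if p > 0:
            q = p / total
            entropy -= q * math.log2(q)
    return entropy


# Hypothetical ensemble: structure (dot-bracket notation) -> probability.
ensemble = {"((..))": 0.70, "(....)": 0.20, "......": 0.10}
ensemble_entropy = shannon_entropy(ensemble.values())
```

A low entropy means the probability mass is concentrated on few structures; a value close to the logarithm of the number of structures means the distribution is nearly flat.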
Bayesian comparison of protein structures using partial Procrustes distance.

PubMed

Ejlali, Nasim; Faghihi, Mohammad Reza; Sadeghi, Mehdi

2017-09-26

An important topic in bioinformatics is protein structure alignment. Some statistical methods have been proposed for this problem, but most of them align two protein structures based on global geometric information without considering the effect of neighbourhood in the structures. In this paper, we provide a Bayesian model to align protein structures by considering the effect of both local and global geometric information of protein structures. Local geometric information is incorporated into the model through the partial Procrustes distance of small substructures. These substructures are composed of β-carbon atoms from the side chains. Parameters are estimated using a Markov chain Monte Carlo (MCMC) approach. We evaluate the performance of our model through some simulation studies. Furthermore, we apply our model to a real dataset and assess the accuracy and convergence rate. Results show that our model is much more efficient than previous approaches.
ITG modes in the presence of inhomogeneous field-aligned flow

NASA Astrophysics Data System (ADS)

Sen, S.; McCarthy, D. R.; Lontano, M.; Lazzaro, E.; Honary, F.

2010-02-01

In a recent paper, Varischetti et al. (Plasma Phys. Contr. F. 2008, 50, 105008-1-15) found that in a slab geometry the effect of the flow shear in the field-aligned parallel flow on the linear mode stability of the ion temperature gradient (ITG)-driven modes is not very prominent. They found that the flow shear also has a negligible effect on the mode characteristics. The work in this paper shows that the inclusion of flow curvature in the field-aligned flow can have a considerable effect on the mode stability; it can also change the mode structure so as to affect the mixing length transport in the core region of a fusion device. Flow shear, on the other hand, indeed has an insignificant role in the mode stability and mode structure. Inhomogeneous field-aligned flow should therefore still be considered a viable candidate for controlling the ITG mode stability and mode structure.
Automatic identification of mobile and rigid substructures in molecular dynamics simulations and fractional structural fluctuation analysis.

PubMed

Martínez, Leandro

2015-01-01

The analysis of structural mobility in molecular dynamics plays a key role in data interpretation, particularly in the simulation of biomolecules. The most common mobility measures computed from simulations are the Root Mean Square Deviation (RMSD) and Root Mean Square Fluctuations (RMSF) of the structures. These are computed after the alignment of atomic coordinates in each trajectory step to a reference structure. This rigid-body alignment is not robust, in the sense that if a small portion of the structure is highly mobile, the RMSD and RMSF increase for all atoms, possibly resulting in poor quantification of the structural fluctuations and, often, in overlooking important fluctuations associated with biological function. The motivation of this work is to provide a robust measure of structural mobility that is practical and easy to interpret. We propose a Low-Order-Value-Optimization (LOVO) strategy for the robust alignment of the least mobile substructures in a simulation. These substructures are automatically identified by the method. The algorithm consists of the iterative superposition of the fraction of structure displaying the smallest displacements. In this way, the least mobile substructures are identified, providing a clearer picture of the overall structural fluctuations. Examples are given to illustrate the interpretative advantages of this strategy. The software for performing the alignments is named MDLovoFit and is available as free software at: http://leandro.iqm.unicamp.br/mdlovofit.
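The iterative idea in this abstract (superpose only the fraction of atoms with the smallest displacements, then repeat) can be sketched in a deliberately simplified form. The toy below aligns by translation only and omits the rotational part of a rigid-body superposition, so it is not the MDLovoFit algorithm; the function names `centroid` and `lovo_translate` and the parameter `fraction` are assumptions made for this illustration.

```python
import math


def centroid(points):
    """Mean position of a list of 3D points."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))


def lovo_translate(mobile, reference, fraction=0.7, max_iter=50):
    """Translation-only toy version of a low-order-value alignment.

    Repeatedly shifts `mobile` so that the centroids of a subset of atoms
    coincide, where the subset is the `fraction` of atoms that currently
    show the smallest displacements from `reference`.
    Returns (shifted coordinates, per-atom displacements, subset indices).
    """
    n = len(mobile)
    keep = max(1, int(round(fraction * n)))
    subset = list(range(n))  # first pass uses every atom
    for _ in range(max_iter):
        cm = centroid([mobile[i] for i in subset])
        cr = centroid([reference[i] for i in subset])
        shift = tuple(cr[k] - cm[k] for k in range(3))
        moved = [tuple(p[k] + shift[k] for k in range(3)) for p in mobile]
        disp = [math.dist(moved[i], reference[i]) for i in range(n)]
        # Keep the atoms with the lowest displacements for the next pass.
        new_subset = sorted(sorted(range(n), key=lambda i: disp[i])[:keep])
        if new_subset == subset:
            break  # the selected subset no longer changes
        subset = new_subset
    return moved, disp, subset
```

With a conventional all-atom alignment, a few highly mobile atoms inflate the displacement of every atom; restricting the fit to the least mobile fraction keeps the rigid part close to zero displacement and leaves the mobile part clearly visible.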
Automatic Identification of Mobile and Rigid Substructures in Molecular Dynamics Simulations and Fractional Structural Fluctuation Analysis

PubMed Central

Martínez, Leandro

2015-01-01

The analysis of structural mobility in molecular dynamics plays a key role in data interpretation, particularly in the simulation of biomolecules. The most common mobility measures computed from simulations are the Root Mean Square Deviation (RMSD) and Root Mean Square Fluctuations (RMSF) of the structures. These are computed after the alignment of atomic coordinates in each trajectory step to a reference structure. This rigid-body alignment is not robust, in the sense that if a small portion of the structure is highly mobile, the RMSD and RMSF increase for all atoms, possibly resulting in poor quantification of the structural fluctuations and, often, in overlooking important fluctuations associated with biological function. The motivation of this work is to provide a robust measure of structural mobility that is practical and easy to interpret. We propose a Low-Order-Value-Optimization (LOVO) strategy for the robust alignment of the least mobile substructures in a simulation. These substructures are automatically identified by the method. The algorithm consists of the iterative superposition of the fraction of structure displaying the smallest displacements. In this way, the least mobile substructures are identified, providing a clearer picture of the overall structural fluctuations. Examples are given to illustrate the interpretative advantages of this strategy. The software for performing the alignments is named MDLovoFit and is available as free software at: http://leandro.iqm.unicamp.br/mdlovofit. PMID:25816325
Fast 3D shape screening of large chemical databases through alignment-recycling

PubMed Central

Fontaine, Fabien; Bolton, Evan; Borodina, Yulia; Bryant, Stephen H

2007-01-01

Background: Large chemical databases require fast, efficient, and simple ways of looking for similar structures. Although such tasks are now fairly well resolved for graph-based similarity queries, they remain an issue for 3D approaches, particularly for those based on 3D shape overlays. Inspired by a recent technique developed to compare molecular shapes, we designed a hybrid methodology, alignment-recycling, that enables efficient retrieval and alignment of structures with similar 3D shapes. Results: Using a dataset of more than one million PubChem compounds of limited size (< 28 heavy atoms) and flexibility (< 6 rotatable bonds), we obtained a set of a few thousand diverse structures covering entirely the 3D shape space of the conformers of the dataset. Transformation matrices gathered from the overlays between these diverse structures and the 3D conformer dataset allowed us to drastically (100-fold) reduce the CPU time required for shape overlay. The alignment-recycling heuristic produces results consistent with de novo alignment calculation, with better than 80% hit list overlap on average. Conclusion: Overlay-based 3D methods are computationally demanding when searching large databases. Alignment-recycling reduces the CPU time to perform shape similarity searches by breaking the alignment problem into three steps: selection of diverse shapes to describe the database shape-space; overlay of the database conformers to the diverse shapes; and non-optimized overlay of query and database conformers using common reference shapes. The precomputation, required by the first two steps, is a significant cost of the method; however, once performed, querying is two orders of magnitude faster. Extensions and variations of this methodology, for example, to handle more flexible and larger small molecules, are discussed. PMID:17880744
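The third step described in this abstract, a non-optimized overlay through a common reference shape, amounts to composing stored rigid transformations instead of searching for a new one. The sketch below shows that composition under simple assumptions: each transformation is a rotation matrix plus a translation vector, and both the query and the database conformer have already been overlaid onto the same reference. All function names here are hypothetical and the representation is chosen for illustration; the abstract does not specify how the transformation matrices are stored.

```python
import math


def rot_z(angle):
    """3x3 rotation matrix about the z axis (angle in radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def apply(transform, point):
    """Apply a rigid transform (R, t) to a 3D point: R * p + t."""
    R, t = transform
    return tuple(sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3))


def compose(A, B):
    """Return the transform equivalent to applying B first, then A."""
    RA, tA = A
    RB, tB = B
    R = [[sum(RA[i][k] * RB[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    t = tuple(sum(RA[i][k] * tB[k] for k in range(3)) + tA[i] for i in range(3))
    return (R, t)


def invert(transform):
    """Inverse of a rigid transform: (R^T, -R^T * t)."""
    R, t = transform
    Rt = [[R[j][i] for j in range(3)] for i in range(3)]
    t_inv = tuple(-sum(Rt[i][k] * t[k] for k in range(3)) for i in range(3))
    return (Rt, t_inv)


def recycled_overlay(query_to_ref, db_to_ref):
    """Overlay of the query onto a database conformer via the shared reference.

    No optimisation is performed: the query is moved into the reference
    frame and then back along the stored database-to-reference transform.
    """
    return compose(invert(db_to_ref), query_to_ref)
```

Because the database-to-reference transforms are computed once and stored, each query needs only its own overlays onto the reference shapes, followed by cheap matrix products like the one above.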
Electric-Field-Induced Alignment of Block Copolymer/Nanoparticle Blends

DOE Office of Scientific and Technical Information (OSTI.GOV)

Liedel, Clemens; Schindler, Kerstin; Pavan, Mariela J.

External electric fields readily align birefringent block-copolymer mesophases. In this study the effect of gold nanoparticles on the electric-field-induced alignment of a lamellae-forming polystyrene-block-poly(2-vinylpyridine) copolymer is assessed. Nanoparticles are homogeneously dispersed in the styrenic phase and promote the quantitative alignment of lamellar domains by substantially lowering the critical field strength above which alignment proceeds. The results suggest that the electric-field-assisted alignment of nanostructured block copolymer/nanoparticle composites may offer a simple way to greatly mitigate structural and orientational defects of such films under benign experimental conditions.
Sequence, structure and function relationships in flaviviruses as assessed by evolutive aspects of its conserved non-structural protein domains.

PubMed

da Fonseca, Néli José; Lima Afonso, Marcelo Querino; Pedersolli, Natan Gonçalves; de Oliveira, Lucas Carrijo; Andrade, Dhiego Souto; Bleicher, Lucas

2017-10-28

Flaviviruses are responsible for serious diseases such as dengue, yellow fever, and Zika fever. Their genomes encode a polyprotein which, after cleavage, results in three structural and seven non-structural proteins. Homologous proteins can be studied by conservation and coevolution analysis as detected in multiple sequence alignments, usually reporting positions which are strictly necessary for the structure and/or function of all members in a protein family or which are involved in a specific sub-class feature requiring the coevolution of residue sets. This study provides a complete conservation and coevolution analysis of all flavivirus non-structural proteins, with results mapped on all well-annotated available sequences. A literature review on the residues found in the analysis enabled us to compile available information on their roles and distribution among different flaviviruses. We also provide the mapping of conserved and coevolved residues for all sequences currently in SwissProt as supplementary material, so that particularities in different viruses can be easily analyzed. Copyright © 2017 Elsevier Inc. All rights reserved.
StrBioLib: a Java library for development of custom computational structural biology applications.

PubMed

Chandonia, John-Marc

2007-08-01

StrBioLib is a library of Java classes useful for developing software for computational structural biology research. StrBioLib contains classes to represent and manipulate protein structures, biopolymer sequences, sets of biopolymer sequences, and alignments between biopolymers based on either sequence or structure. Interfaces are provided to interact with commonly used bioinformatics applications, including (psi)-blast, modeller, muscle and Primer3, and tools are provided to read and write many file formats used to represent bioinformatic data. The library includes a general-purpose neural network object with multiple training algorithms, the Hooke and Jeeves non-linear optimization algorithm, and tools for efficient C-style string parsing and formatting. StrBioLib is the basis for the Pred2ary secondary structure prediction program, is used to build the astral compendium for sequence and structure analysis, and has been extensively tested through use in many smaller projects. Examples and documentation are available at the site below. StrBioLib may be obtained under the terms of the GNU LGPL license from http://strbio.sourceforge.net/
Impact of Information Technology Governance Structures on Strategic Alignment

ERIC Educational Resources Information Center

Gordon, Fitzroy R.

2013-01-01

This dissertation is a study of the relationship between Information Technology (IT) strategic alignment and IT governance structure within the organization. This dissertation replicates Asante (2010) among a different population where the prior results continue to hold. The non-experimental approach explored two research questions but include two…
Structural alignment sensor. [laser applications and interferometry]

NASA Technical Reports Server (NTRS)

Davis, L.; Buholz, N. E.; Gillard, C. W.; Huang, C. C.; Wells, W. M., III

1978-01-01

Comparative Michelson interferometers are discussed as well as the operating range potential of a structural alignment sensor (SAS) which requires only one laser mode. Schematics are presented for the distance measurement logic, the basic SAS system, the SAS optical layout, the coarse measurement signal processor, and the measured range resolution.
Rclick: a web server for comparison of RNA 3D structures.

PubMed

Nguyen, Minh N; Verma, Chandra

2015-03-15

RNA molecules play important roles in key biological processes in the cell and are becoming attractive for developing therapeutic applications. Since the function of RNA depends on its structure and dynamics, comparing and classifying the RNA 3D structures is of crucial importance to molecular biology. In this study, we have developed Rclick, a web server that is capable of superimposing RNA 3D structures by using clique matching and 3D least-squares fitting. Our server Rclick has been benchmarked and compared with other popular servers and methods for RNA structural alignments. In most cases, Rclick alignments were better in terms of structure overlap. Our server also recognizes conformational changes between structures. For this purpose, the server produces complementary alignments to maximize the extent of detectable similarity. Various examples showcase the utility of our web server for comparison of RNA, RNA-protein complexes and RNA-ligand structures. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
