Sample records for target prediction methods

  1. Drug-target interaction prediction via class imbalance-aware ensemble learning.

    PubMed

    Ezzat, Ali; Wu, Min; Li, Xiao-Li; Kwoh, Chee-Keong

    2016-12-22

    Multiple computational methods for predicting drug-target interactions have been developed to facilitate the drug discovery process. These methods use available data on known drug-target interactions to train classifiers with the purpose of predicting new undiscovered interactions. However, a key challenge regarding this data that has not yet been addressed by these methods, namely class imbalance, is potentially degrading the prediction performance. Class imbalance can be divided into two sub-problems. Firstly, the number of known interacting drug-target pairs is much smaller than that of non-interacting drug-target pairs. This imbalance ratio between interacting and non-interacting drug-target pairs is referred to as the between-class imbalance. Between-class imbalance degrades prediction performance due to the bias in prediction results towards the majority class (i.e. the non-interacting pairs), leading to more prediction errors in the minority class (i.e. the interacting pairs). Secondly, there are multiple types of drug-target interactions in the data with some types having relatively fewer members (or are less represented) than others. This variation in representation of the different interaction types leads to another kind of imbalance referred to as the within-class imbalance. In within-class imbalance, prediction results are biased towards the better represented interaction types, leading to more prediction errors in the less represented interaction types. We propose an ensemble learning method that incorporates techniques to address the issues of between-class imbalance and within-class imbalance. Experiments show that the proposed method improves results over 4 state-of-the-art methods. In addition, we simulated cases for new drugs and targets to see how our method would perform in predicting their interactions. New drugs and targets are those for which no prior interactions are known. 
    Our method displayed satisfactory prediction performance and was able to predict many of the interactions successfully. Our proposed method has improved the prediction performance over the existing work, which underscores the importance of addressing class imbalance in the data.
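    The abstract does not spell out how the ensemble counters between-class imbalance. One common approach, shown here only as a minimal sketch and not as the authors' method, is to train every ensemble member on all known interacting pairs plus an equal-sized random sample of the non-interacting pairs, then average the member scores. The one-dimensional features, the nearest-centroid base learner, and all function names are assumptions made for illustration; within-class imbalance is not handled in this sketch.

```python
import random

def balanced_subsets(positives, negatives, n_members, seed=0):
    """Build n_members training sets: all positives (assumed to be the
    minority class) plus an equal number of randomly sampled negatives."""
    rng = random.Random(seed)
    k = len(positives)  # requires len(negatives) >= len(positives)
    subsets = []
    for _ in range(n_members):
        sampled_neg = rng.sample(negatives, k)
        subsets.append([(x, 1) for x in positives] +
                       [(x, 0) for x in sampled_neg])
    return subsets

def centroid_score(train, x):
    """Toy base learner: larger score when x lies closer to the positive
    centroid than to the negative centroid."""
    pos = [f for f, y in train if y == 1]
    neg = [f for f, y in train if y == 0]
    mu_pos = sum(pos) / len(pos)
    mu_neg = sum(neg) / len(neg)
    return abs(x - mu_neg) - abs(x - mu_pos)

def ensemble_score(subsets, x):
    """Average the base-learner scores over all ensemble members."""
    return sum(centroid_score(s, x) for s in subsets) / len(subsets)

# Hypothetical data: 3 interacting pairs versus 60 non-interacting pairs.
positives = [0.9, 1.0, 1.1]
negatives = [i / 100 for i in range(60)]
subsets = balanced_subsets(positives, negatives, n_members=5)
print(ensemble_score(subsets, 1.0), ensemble_score(subsets, 0.1))
```

    Because each member sees a balanced set, no single learner is dominated by the majority class, while the different negative samples give the ensemble its diversity.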

  2. Literature-based condition-specific miRNA-mRNA target prediction.

    PubMed

    Oh, Minsik; Rhee, Sungmin; Moon, Ji Hwan; Chae, Heejoon; Lee, Sunwon; Kang, Jaewoo; Kim, Sun

    2017-01-01

    miRNAs are small non-coding RNAs that regulate gene expression by binding to the 3'-UTR of genes. Many recent studies have reported that miRNAs play important biological roles by regulating specific mRNAs or genes. Many sequence-based target prediction algorithms have been developed to predict miRNA targets. However, these methods are not designed for condition-specific target predictions and produce many false positives; thus, expression-based target prediction algorithms have been developed for condition-specific target predictions. A typical strategy to utilize expression data is to leverage the negative control roles of miRNAs on genes. To control false positives, a stringent cutoff value is typically set, but in this case, these methods tend to reject many true target relationships, i.e., false negatives. To overcome these limitations, additional information should be utilized. The literature is probably the best resource that we can utilize. Recent literature mining systems compile millions of articles with experiments designed for specific biological questions, and the systems provide a function to search for specific information. To utilize the literature information, we used a literature mining system, BEST, that automatically extracts information from the literature in PubMed and that allows the user to perform searches of the literature with any English words. By integrating omics data analysis methods and BEST, we developed Context-MMIA, a miRNA-mRNA target prediction method that combines expression data analysis results and the literature information extracted based on the user-specified context. In the pathway enrichment analysis using genes included in the top 200 miRNA-targets, Context-MMIA outperformed the four existing target prediction methods that we tested. In another test on whether prediction methods can re-produce experimentally validated target relationships, Context-MMIA outperformed the four existing target prediction methods. 
    In summary, Context-MMIA allows the user to specify a context of the experimental data to predict miRNA targets, and we believe that Context-MMIA is very useful for predicting condition-specific miRNA targets.

  3. Comprehensive modeling of microRNA targets predicts functional non-conserved and non-canonical sites.

    PubMed

    Betel, Doron; Koppal, Anjali; Agius, Phaedra; Sander, Chris; Leslie, Christina

    2010-01-01

    mirSVR is a new machine learning method for ranking microRNA target sites by a down-regulation score. The algorithm trains a regression model on sequence and contextual features extracted from miRanda-predicted target sites. In a large-scale evaluation, miRanda-mirSVR is competitive with other target prediction methods in identifying target genes and predicting the extent of their downregulation at the mRNA or protein levels. Importantly, the method identifies a significant number of experimentally determined non-canonical and non-conserved sites.

  4. Tools for in silico target fishing.

    PubMed

    Cereto-Massagué, Adrià; Ojeda, María José; Valls, Cristina; Mulero, Miquel; Pujadas, Gerard; Garcia-Vallve, Santiago

    2015-01-01

    Computational target fishing methods are designed to identify the most probable target of a query molecule. This process may allow the prediction of the bioactivity of a compound, the identification of the mode of action of known drugs, the detection of drug polypharmacology, drug repositioning or the prediction of the adverse effects of a compound. The large amount of information regarding the bioactivity of thousands of small molecules now allows the development of these types of methods. In recent years, we have witnessed the emergence of many methods for in silico target fishing. Most of these methods are based on the similarity principle, i.e., that similar molecules might bind to the same targets and have similar bioactivities. However, the difficult validation of target fishing methods hinders comparisons of the performance of each method. In this review, we describe the different methods developed for target prediction, the bioactivity databases most frequently used by these methods, and the publicly available programs and servers that enable non-specialist users to obtain these types of predictions. It is expected that target prediction will have a large impact on drug development and on the functional food industry.

  5. Identification of novel plant peroxisomal targeting signals by a combination of machine learning methods and in vivo subcellular targeting analyses.

    PubMed

    Lingner, Thomas; Kataya, Amr R; Antonicelli, Gerardo E; Benichou, Aline; Nilssen, Kjersti; Chen, Xiong-Yan; Siemsen, Tanja; Morgenstern, Burkhard; Meinicke, Peter; Reumann, Sigrun

    2011-04-01

    In the postgenomic era, accurate prediction tools are essential for identification of the proteomes of cell organelles. Prediction methods have been developed for peroxisome-targeted proteins in animals and fungi but are missing specifically for plants. For development of a predictor for plant proteins carrying peroxisome targeting signals type 1 (PTS1), we assembled more than 2500 homologous plant sequences, mainly from EST databases. We applied a discriminative machine learning approach to derive two different prediction methods, both of which showed high prediction accuracy and recognized specific targeting-enhancing patterns in the regions upstream of the PTS1 tripeptides. Upon application of these methods to the Arabidopsis thaliana genome, 392 gene models were predicted to be peroxisome targeted. These predictions were extensively tested in vivo, resulting in a high experimental verification rate of Arabidopsis proteins previously not known to be peroxisomal. The prediction methods were able to correctly infer novel PTS1 tripeptides, which even included novel residues. Twenty-three newly predicted PTS1 tripeptides were experimentally confirmed, and a high variability of the plant PTS1 motif was discovered. These prediction methods will be instrumental in identifying low-abundance and stress-inducible peroxisomal proteins and defining the entire peroxisomal proteome of Arabidopsis and agronomically important crop plants.

  6. Motion prediction of a non-cooperative space target

    NASA Astrophysics Data System (ADS)

    Zhou, Bang-Zhao; Cai, Guo-Ping; Liu, Yun-Meng; Liu, Pan

    2018-01-01

    Capturing a non-cooperative space target is a tremendously challenging research topic. Effective acquisition of the target's motion information is the prerequisite for target capture. In this paper, motion prediction of a free-floating non-cooperative target in space is studied and a motion prediction algorithm is proposed. To predict the motion of the free-floating non-cooperative target, dynamic parameters of the target, such as inertia, angular momentum, and kinetic energy, must first be identified (estimated); the predicted motion of the target can then be acquired by substituting these identified parameters into Euler's equations of the target. Accurate prediction needs precise identification. This paper presents an effective method to identify these dynamic parameters of a free-floating non-cooperative target. The method has two steps: (1) a rough estimate of the parameters is computed from the motion observation data of the target, and (2) the best estimate of the parameters is found by an optimization method. In the optimization problem, the objective function is based on the difference between the observed and the predicted motion, and the interior-point method (IPM) is chosen as the optimization algorithm; it starts at the rough estimate obtained in the first step and finds a global minimum of the objective function guided by the objective function's gradient. As a result, the IPM search for the global minimum is fast, and an accurate identification can be obtained in time. The numerical results show that the proposed motion prediction algorithm is able to predict the motion of the target.
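    The prediction step described above, substituting identified parameters into Euler's equations, can be sketched as follows. This is an illustration under stated assumptions rather than the paper's algorithm: the principal moments of inertia and initial body rates are hypothetical values, the target is assumed torque-free, a classical fourth-order Runge-Kutta integrator is swapped in for whatever integrator the authors used, and the identification (optimization) step is omitted entirely.

```python
def euler_rates(inertia, omega):
    """Torque-free Euler equations in the principal-axis body frame."""
    i1, i2, i3 = inertia
    w1, w2, w3 = omega
    return ((i2 - i3) * w2 * w3 / i1,
            (i3 - i1) * w3 * w1 / i2,
            (i1 - i2) * w1 * w2 / i3)

def rk4_step(inertia, omega, dt):
    """Advance the body rates by one classical Runge-Kutta step."""
    def shifted(base, rate, h):
        return tuple(b + h * r for b, r in zip(base, rate))
    k1 = euler_rates(inertia, omega)
    k2 = euler_rates(inertia, shifted(omega, k1, dt / 2))
    k3 = euler_rates(inertia, shifted(omega, k2, dt / 2))
    k4 = euler_rates(inertia, shifted(omega, k3, dt))
    return tuple(w + dt / 6 * (a + 2 * b + 2 * c + d)
                 for w, a, b, c, d in zip(omega, k1, k2, k3, k4))

def predict(inertia, omega0, dt, steps):
    """Propagate the angular velocity forward from omega0."""
    omega = omega0
    for _ in range(steps):
        omega = rk4_step(inertia, omega, dt)
    return omega

def kinetic_energy(inertia, omega):
    return 0.5 * sum(i * w * w for i, w in zip(inertia, omega))

def angular_momentum_sq(inertia, omega):
    return sum((i * w) ** 2 for i, w in zip(inertia, omega))

inertia = (10.0, 20.0, 30.0)  # hypothetical principal moments, kg m^2
omega0 = (0.1, 0.2, 0.05)     # hypothetical initial body rates, rad/s
omega_t = predict(inertia, omega0, dt=0.01, steps=1000)
print(omega_t)
```

    For a torque-free body, rotational kinetic energy and the magnitude of angular momentum are conserved, so comparing `kinetic_energy` and `angular_momentum_sq` before and after propagation is a convenient sanity check on the integration.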

  7. How reliable are ligand-centric methods for Target Fishing?

    NASA Astrophysics Data System (ADS)

    Peon, Antonio; Dang, Cuong; Ballester, Pedro

    2016-04-01

    Computational methods for Target Fishing (TF), also known as Target Prediction or Polypharmacology Prediction, can be used to discover new targets for small-molecule drugs. This may result in repositioning the drug in a new indication or improving our current understanding of its efficacy and side effects. While there is a substantial body of research on TF methods, there is still a need to improve their validation, which is often limited to a small part of the available targets and not easily interpretable by the user. Here we discuss how target-centric TF methods are inherently limited by the number of targets that they can possibly predict (this number is by construction much larger in ligand-centric techniques). We also propose a new benchmark to validate TF methods, which is particularly suited to analysing how predictive performance varies with the query molecule. On average over approved drugs, we estimate that only five predicted targets will have to be tested to find two true targets with submicromolar potency (although strong variability in performance is observed). In addition, we find that an approved drug currently has an average of eight known targets, which reinforces the notion that polypharmacology is a common and strong event. Furthermore, with the assistance of a control group of randomly selected molecules, we show that the targets of approved drugs are generally harder to predict.

  8. Prediction of miRNA targets.

    PubMed

    Oulas, Anastasis; Karathanasis, Nestoras; Louloupi, Annita; Pavlopoulos, Georgios A; Poirazi, Panayiota; Kalantidis, Kriton; Iliopoulos, Ioannis

    2015-01-01

    Computational methods for miRNA target prediction are currently undergoing extensive review and evaluation. There is still a great need for improvement of these tools and bioinformatics approaches are looking towards high-throughput experiments in order to validate predictions. The combination of large-scale techniques with computational tools will not only provide greater credence to computational predictions but also lead to the better understanding of specific biological questions. Current miRNA target prediction tools utilize probabilistic learning algorithms, machine learning methods and even empirical biologically defined rules in order to build models based on experimentally verified miRNA targets. Large-scale protein downregulation assays and next-generation sequencing (NGS) are now being used to validate methodologies and compare the performance of existing tools. Tools that exhibit greater correlation between computational predictions and protein downregulation or RNA downregulation are considered the state of the art. Moreover, efficiency in prediction of miRNA targets that are concurrently verified experimentally provides additional validity to computational predictions and further highlights the competitive advantage of specific tools and their efficacy in extracting biologically significant results. In this review paper, we discuss the computational methods for miRNA target prediction and provide a detailed comparison of methodologies and features utilized by each specific tool. Moreover, we provide an overview of current state-of-the-art high-throughput methods used in miRNA target prediction.

  9. Drug-Target Interaction Prediction through Label Propagation with Linear Neighborhood Information.

    PubMed

    Zhang, Wen; Chen, Yanlin; Li, Dingfang

    2017-11-25

    Interactions between drugs and target proteins provide important information for drug discovery. To date, experiments have identified only a small number of drug-target interactions. Therefore, the development of computational methods for drug-target interaction prediction is an urgent task of theoretical interest and practical significance. In this paper, we propose a label propagation method with linear neighborhood information (LPLNI) for predicting unobserved drug-target interactions. First, we calculate drug-drug linear neighborhood similarity in the feature spaces by considering how to reconstruct data points from their neighbors. Then, we take the similarities as the manifold of drugs and assume that the manifold is unchanged in the interaction space. Finally, we predict unobserved interactions between known drugs and targets by using the drug-drug linear neighborhood similarity and the known drug-target interactions. The experiments show that LPLNI can use only known drug-target interactions to make high-accuracy predictions on four benchmark datasets. Furthermore, we consider incorporating chemical structures into LPLNI models. Experimental results demonstrate that the model with integrated information (LPLNI-II) performs better than other state-of-the-art methods. The known drug-target interactions are an important information source for computational predictions. The usefulness of the proposed method is demonstrated by cross validation and a case study.
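    The propagation step can be illustrated with the standard label-propagation iteration F <- alpha * W * F + (1 - alpha) * Y, where W is a row-normalized drug-drug similarity matrix and Y is the known drug-target interaction matrix. The sketch below assumes this update rule rather than taking it from the abstract, and the matrix W is a hand-written hypothetical stand-in for the linear neighborhood similarity, which is not computed here.

```python
def propagate(W, Y, alpha=0.5, iters=200):
    """Iterate F <- alpha * W @ F + (1 - alpha) * Y using plain lists.

    W: n x n row-normalized drug-drug similarity (assumed given).
    Y: n x m binary matrix of known drug-target interactions.
    """
    n, m = len(Y), len(Y[0])
    F = [[float(v) for v in row] for row in Y]
    for _ in range(iters):
        F = [[alpha * sum(W[i][k] * F[k][j] for k in range(n))
              + (1 - alpha) * Y[i][j]
              for j in range(m)]
             for i in range(n)]
    return F

# Hypothetical example: drug 1 has no known targets but is similar
# to drug 0 (which hits target 0) and drug 2 (which hits target 1).
W = [[0.0, 1.0, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 1.0, 0.0]]
Y = [[1, 0],
     [0, 0],
     [0, 1]]
F = propagate(W, Y)
for row in F:
    print(row)
```

    In this toy case the unlabeled drug 1 ends up with equal nonzero scores for both targets, inherited from its two neighbors, while each labeled drug keeps its highest score on its known target.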

  10. New support vector machine-based method for microRNA target prediction.

    PubMed

    Li, L; Gao, Q; Mao, X; Cao, Y

    2014-06-09

    MicroRNA (miRNA) plays important roles in cell differentiation, proliferation, growth, mobility, and apoptosis. An accurate list of precise target genes is necessary in order to fully understand the importance of miRNAs in animal development and disease. Several computational methods have been proposed for miRNA target-gene identification. However, these methods still have limitations with respect to their sensitivity and accuracy. Thus, we developed a new miRNA target-prediction method based on the support vector machine (SVM) model. The model supplies information of two binding sites (primary and secondary) for a radial basis function kernel as a similarity measure for SVM features. The information is categorized based on structural, thermodynamic, and sequence conservation. Using high-confidence datasets selected from public miRNA target databases, we obtained a human miRNA target SVM classifier model with high performance and provided an efficient tool for human miRNA target gene identification. Experiments have shown that our method is a reliable tool for miRNA target-gene prediction, and a successful application of an SVM classifier. Compared with other methods, the method proposed here improves the sensitivity and accuracy of miRNA prediction. Its performance can be further improved by providing more training examples.

  11. Predicting miRNA targets for head and neck squamous cell carcinoma using an ensemble method.

    PubMed

    Gao, Hong; Jin, Hui; Li, Guijun

    2018-01-01

    This study aimed to uncover potential microRNA (miRNA) targets in head and neck squamous cell carcinoma (HNSCC) using an ensemble method which combined 3 different methods: Pearson's correlation coefficient (PCC), Lasso and a causal inference method (i.e., intervention calculus when the directed acyclic graph (DAG) is absent [IDA]), based on Borda count election. The Borda count election method was used to integrate the top 100 predicted targets of each miRNA generated by individual methods. Afterwards, to validate the performance ability of our method, we checked the TarBase v6.0, miRecords v2013, miRWalk v2.0 and miRTarBase v4.5 databases to validate predictions for miRNAs. Pathway enrichment analysis of target genes in the top 1,000 miRNA-messenger RNA (mRNA) interactions was conducted to focus on significant KEGG pathways. Finally, we extracted target genes based on occurrence frequency ≥3. Based on an absolute value of PCC >0.7, we found 33 miRNAs and 288 mRNAs for further analysis. We extracted 10 target genes with predicted frequencies not less than 3. The target gene MYO5C possessed the highest frequency, which was predicted by 7 different miRNAs. Significantly, a total of 8 pathways were identified; the pathways of cytokine-cytokine receptor interaction and chemokine signaling pathway were the most significant. We successfully predicted target genes and pathways for HNSCC relying on miRNA expression data, mRNA expression profile, an ensemble method and pathway information. Our results may offer new information for the diagnosis and estimation of the prognosis of HNSCC.
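    The Borda count integration described above can be sketched as follows. The scoring convention (a candidate at 0-based position r in a top-k list earns k - r points), the tie-break by gene name, and the three short rankings are assumptions for illustration; only MYO5C is a gene named in the abstract, and the other identifiers and all rank orders are made up.

```python
from collections import defaultdict

def borda_merge(rankings, top_k):
    """Merge ranked lists by Borda count.

    Each list awards top_k points to its first entry, top_k - 1 to its
    second, and so on; genes absent from a list earn nothing from it.
    Ties are broken alphabetically so the output is deterministic.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        for position, gene in enumerate(ranking[:top_k]):
            scores[gene] += top_k - position
    return sorted(scores, key=lambda g: (-scores[g], g))

# Made-up rankings standing in for the PCC, Lasso, and IDA outputs.
pcc = ["MYO5C", "GENE_A", "GENE_B"]
lasso = ["GENE_A", "MYO5C", "GENE_C"]
ida = ["MYO5C", "GENE_C", "GENE_A"]
merged = borda_merge([pcc, lasso, ida], top_k=3)
print(merged)
```

    A gene that ranks consistently high across methods accumulates more points than one that tops a single list, which is the property that makes the election robust to any one method's idiosyncrasies.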

  12. Drug-Target Interactions: Prediction Methods and Applications.

    PubMed

    Anusuya, Shanmugam; Kesherwani, Manish; Priya, K Vishnu; Vimala, Antonydhason; Shanmugam, Gnanendra; Velmurugan, Devadasan; Gromiha, M Michael

    2018-01-01

    Identifying the interactions between drugs and target proteins is a key step in drug discovery. This not only aids in understanding the disease mechanism, but also helps to identify unexpected therapeutic activity or adverse side effects of drugs. Hence, drug-target interaction prediction has become an essential tool in the field of drug repurposing. The availability of heterogeneous biological data on known drug-target interactions has enabled many researchers to develop various computational methods to decipher unknown drug-target interactions. This review provides an overview of these computational methods for predicting drug-target interactions, along with available webservers and databases for drug-target interactions. Further, the applicability of drug-target interactions in various diseases for identifying lead compounds is outlined.

  13. Drug-target interaction prediction using ensemble learning and dimensionality reduction.

    PubMed

    Ezzat, Ali; Wu, Min; Li, Xiao-Li; Kwoh, Chee-Keong

    2017-10-01

    Experimental prediction of drug-target interactions is expensive, time-consuming and tedious. Fortunately, computational methods help narrow down the search space for interaction candidates to be further examined via wet-lab techniques. Nowadays, the number of attributes/features for drugs and targets, as well as the amount of their interactions, are increasing, making these computational methods inefficient or occasionally prohibitive. This motivates us to derive a reduced feature set for prediction. In addition, since ensemble learning techniques are widely used to improve the classification performance, it is also worthwhile to design an ensemble learning framework to enhance the performance for drug-target interaction prediction. In this paper, we propose a framework for drug-target interaction prediction leveraging both feature dimensionality reduction and ensemble learning. First, we conducted feature subspacing to inject diversity into the classifier ensemble. Second, we applied three different dimensionality reduction methods to the subspaced features. Third, we trained homogeneous base learners with the reduced features and then aggregated their scores to derive the final predictions. For base learners, we selected two classifiers, namely Decision Tree and Kernel Ridge Regression, resulting in two variants of ensemble models, EnsemDT and EnsemKRR, respectively. In our experiments, we utilized AUC (Area under ROC Curve) as an evaluation metric. We compared our proposed methods with various state-of-the-art methods under 5-fold cross validation. Experimental results showed EnsemKRR achieving the highest AUC (94.3%) for predicting drug-target interactions. In addition, dimensionality reduction helped improve the performance of EnsemDT. In conclusion, our proposed methods produced significant improvements for drug-target interaction prediction.
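    Two steps of the framework above, aggregating base-learner scores and evaluating with AUC, can be sketched in a few lines. The sketch assumes simple score averaging as the aggregation rule and computes AUC as the fraction of (interacting, non-interacting) pairs ranked correctly, with ties counted as one half; the labels and member scores are made-up numbers, and feature subspacing and dimensionality reduction are not shown.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison: the probability
    that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_scores(member_scores):
    """Aggregate an ensemble by averaging each sample's member scores."""
    return [sum(col) / len(col) for col in zip(*member_scores)]

# Made-up labels (1 = interacting pair) and two base-learner score lists.
labels = [1, 1, 0, 0, 0]
member_scores = [
    [0.9, 0.4, 0.5, 0.2, 0.1],
    [0.6, 0.7, 0.3, 0.8, 0.2],
]
ensemble = average_scores(member_scores)
print(auc(labels, member_scores[0]),
      auc(labels, member_scores[1]),
      auc(labels, ensemble))
```

    In this toy example each member misranks a different pair, so the averaged scores separate the classes better than either member alone; averaging is not guaranteed to help when members make the same errors.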

  14. Benchmark data sets for structure-based computational target prediction.

    PubMed

    Schomburg, Karen T; Rarey, Matthias

    2014-08-25

    Structure-based computational target prediction methods identify potential targets for a bioactive compound. Methods based on protein-ligand docking still face many challenges, the greatest of which is probably the ranking of true targets in a large data set of protein structures. Currently, no standard data sets for evaluation exist, rendering comparison and demonstration of improvements of methods cumbersome. Therefore, we propose two data sets and evaluation strategies for a meaningful evaluation of new target prediction methods, i.e., a small data set consisting of three target classes for detailed proof-of-concept and selectivity studies and a large data set consisting of 7992 protein structures and 72 drug-like ligands allowing statistical evaluation with performance metrics on a drug-like chemical space. Both data sets are built from openly available resources, and any information needed to perform the described experiments is reported. We describe the composition of the data sets, the setup of screening experiments, and the evaluation strategy. Performance metrics capable of measuring the early recognition of enrichments, such as AUC, BEDROC, and NSLR, are proposed. We apply a sequence-based target prediction method to the large data set to analyze its content of nontrivial evaluation cases. The proposed data sets are used for method evaluation of our new inverse screening method iRAISE. The small data set reveals the method's capability and limitations to selectively distinguish between rather similar protein structures. The large data set simulates real target identification scenarios. iRAISE achieves excellent or good enrichment in 55% of cases, a median AUC of 0.67, and RMSDs below 2.0 Å for 74%, and was able to predict the first true target in 59 out of 72 cases in the top 2% of the protein data set of about 8000 structures.

  15. Predicting New Indications for Approved Drugs Using a Proteo-Chemometric Method

    PubMed Central

    Dakshanamurthy, Sivanesan; Issa, Naiem T; Assefnia, Shahin; Seshasayee, Ashwini; Peters, Oakland J; Madhavan, Subha; Uren, Aykut; Brown, Milton L; Byers, Stephen W

    2012-01-01

    The most effective way to move from target identification to the clinic is to identify already approved drugs with the potential for activating or inhibiting unintended targets (repurposing or repositioning). This is usually achieved by high throughput chemical screening, transcriptome matching or simple in silico ligand docking. We now describe a novel rapid computational proteo-chemometric method called “Train, Match, Fit, Streamline” (TMFS) to map new drug-target interaction space and predict new uses. The TMFS method combines shape, topology and chemical signatures, including docking score and functional contact points of the ligand, to predict potential drug-target interactions with remarkable accuracy. Using the TMFS method, we performed extensive molecular fit computations on 3,671 FDA approved drugs across 2,335 human protein crystal structures. The TMFS method predicts drug-target associations with 91% accuracy for the majority of drugs. Over 58% of the known best ligands for each target were correctly predicted as top ranked, followed by 66%, 76%, 84% and 91% for agents ranked in the top 10, 20, 30 and 40, respectively, out of all 3,671 drugs. Drugs ranked in the top 1–40, that have not been experimentally validated for a particular target now become candidates for repositioning. Furthermore, we used the TMFS method to discover that mebendazole, an anti-parasitic with recently discovered and unexpected anti-cancer properties, has the structural potential to inhibit VEGFR2. We confirmed experimentally that mebendazole inhibits VEGFR2 kinase activity as well as angiogenesis at doses comparable with its known effects on hookworm. TMFS also predicted, and was confirmed with surface plasmon resonance, that dimethyl celecoxib and the anti-inflammatory agent celecoxib can bind cadherin-11, an adhesion molecule important in rheumatoid arthritis and poor prognosis malignancies for which no targeted therapies exist. 
    We anticipate that expanding our TMFS method to the >27,000 clinically active agents available worldwide across all targets will be most useful in the repositioning of existing drugs for new therapeutic targets. PMID:22780961

  16. Identifying Drug-Target Interactions with Decision Templates.

    PubMed

    Yan, Xiao-Ying; Zhang, Shao-Wu

    2018-01-01

    During the development of new drugs, identification of drug-target interactions is of primary concern. However, chemical and biological experiments are limited in coverage and costly in both time and money. Based on drug similarity and target similarity, chemogenomic methods can predict potential drug-target interactions (DTIs) on a large scale and do not require target structures or ligand entries. Similarity metrics should also reflect cases in which drugs with different structures interact with common targets and targets with dissimilar sequences interact with the same drugs. In addition, though several other similarity metrics have been developed to predict DTIs, the combination of multiple similarity metrics (especially heterogeneous similarities) is too naïve to sufficiently explore the multiple similarities. In this paper, based on Gene Ontology and pathway annotation, we introduce two novel target similarity metrics to address the above issues. More importantly, we propose a more effective strategy that uses decision templates to integrate multiple classifiers designed with multiple similarity metrics. In the scenarios of predicting existing targets for new drugs and predicting approved drugs for new protein targets, the results on the DTI benchmark datasets show that our target similarity metrics enhance the predictive accuracy. Moreover, the elaborate fusion strategy of multiple classifiers has better predictive power than the naïve combination of multiple similarity metrics. Compared with two other state-of-the-art approaches on the four popular benchmark datasets of binary drug-target interactions, our method achieves the best results in terms of AUC and AUPR for predicting available targets for new drugs (S2) and predicting approved drugs for new protein targets (S3). These results demonstrate that our method can effectively predict drug-target interactions.
    The software package is freely available at https://github.com/NwpuSY/DT_all.git for academic users.

  17. Deep-Learning-Based Drug-Target Interaction Prediction.

    PubMed

    Wen, Ming; Zhang, Zhimin; Niu, Shaoyu; Sha, Haozhi; Yang, Ruihan; Yun, Yonghuan; Lu, Hongmei

    2017-04-07

    Identifying interactions between known drugs and targets is a major challenge in drug repositioning. In silico prediction of drug-target interaction (DTI) can speed up the expensive and time-consuming experimental work by providing the most potent DTIs. In silico prediction of DTI can also provide insights about potential drug-drug interactions and promote the exploration of drug side effects. Traditionally, the performance of DTI prediction depends heavily on the descriptors used to represent the drugs and the target proteins. In this paper, to accurately predict new DTIs between approved drugs and targets without separating the targets into different classes, we developed a deep-learning-based algorithmic framework named DeepDTIs. It first abstracts representations from raw input descriptors using unsupervised pretraining and then applies known label pairs of interaction to build a classification model. DeepDTIs matches or outperforms other state-of-the-art methods. DeepDTIs can further be used to predict whether a new drug targets some existing targets or whether a new target interacts with some existing drugs.

  18. Predicting Drug-Target Interactions With Multi-Information Fusion.

    PubMed

    Peng, Lihong; Liao, Bo; Zhu, Wen; Li, Zejun; Li, Keqin

    2017-03-01

    Identifying potential associations between drugs and targets is a critical prerequisite for modern drug discovery and repurposing. However, predicting these associations is difficult because of the limitations of existing computational methods. Most models only consider chemical structures and protein sequences, and other models are oversimplified. Moreover, datasets used for analysis contain only true-positive interactions, and experimentally validated negative samples are unavailable. To overcome these limitations, we developed a semi-supervised learning framework called NormMulInf through collaborative filtering theory by using labeled and unlabeled interaction information. The proposed method initially determines similarity measures, such as similarities among samples and local correlations among the labels of the samples, by integrating biological information. The similarity information is then integrated into a robust principal component analysis model, which is solved using augmented Lagrange multipliers. Experimental results on four classes of drug-target interaction networks suggest that the proposed approach can accurately classify and predict drug-target interactions. Some of the predicted interactions are reported in public databases. The proposed method can also predict possible targets for new drugs and can be used to determine whether atropine may interact with alpha1B- and beta1-adrenergic receptors. Furthermore, the developed technique identifies potential drugs for new targets and can be used to assess whether olanzapine and propiomazine may target 5HT2B. Finally, the proposed method can potentially address limitations on studies of multitarget drugs and multidrug targets.

  19. Drug Target Prediction and Repositioning Using an Integrated Network-Based Approach

    PubMed Central

    Emig, Dorothea; Ivliev, Alexander; Pustovalova, Olga; Lancashire, Lee; Bureeva, Svetlana; Nikolsky, Yuri; Bessarabova, Marina

    2013-01-01

    The discovery of novel drug targets is a significant challenge in drug development. Although the human genome comprises approximately 30,000 genes, proteins encoded by fewer than 400 are used as drug targets in the treatment of diseases. Therefore, novel drug targets are extremely valuable as the source for first in class drugs. On the other hand, many of the currently known drug targets are functionally pleiotropic and involved in multiple pathologies. Several of them are exploited for treating multiple diseases, which highlights the need for methods to reliably reposition drug targets to new indications. Network-based methods have been successfully applied to prioritize novel disease-associated genes. In recent years, several such algorithms have been developed, some focusing on local network properties only, and others taking the complete network topology into account. Common to all approaches is the understanding that novel disease-associated candidates are in close overall proximity to known disease genes. However, the relevance of these methods to the prediction of novel drug targets has not yet been assessed. Here, we present a network-based approach for the prediction of drug targets for a given disease. The method allows both repositioning drug targets known for other diseases to the given disease and the prediction of unexploited drug targets which are not used for treatment of any disease. Our approach takes as input a disease gene expression signature and a high-quality interaction network and outputs a prioritized list of drug targets. We demonstrate the high performance of our method and highlight the usefulness of the predictions in three case studies. We present novel drug targets for scleroderma and different types of cancer with their underlying biological processes. Furthermore, we demonstrate the ability of our method to identify non-suspected repositioning candidates using diabetes type 1 as an example. PMID:23593264

  20. RobOKoD: microbial strain design for (over)production of target compounds.

    PubMed

    Stanford, Natalie J; Millard, Pierre; Swainston, Neil

    2015-01-01

    Sustainable production of target compounds such as biofuels and high-value chemicals for pharmaceutical, agrochemical, and chemical industries is becoming an increasing priority given their current dependency upon diminishing petrochemical resources. Designing these strains is difficult, with current methods focusing primarily on knocking out genes, dismissing other vital steps of strain design including the overexpression and dampening of genes. The design predictions from current methods also do not translate well into successful strains in the laboratory. Here, we introduce RobOKoD (Robust, Overexpression, Knockout and Dampening), a method for predicting strain designs for overproduction of targets. The method uses flux variability analysis to profile each reaction within the system under differing production percentages of target compound and biomass. Using these profiles, reactions are identified as potential knockout, overexpression, or dampening targets. The identified reactions are ranked according to their suitability, providing flexibility in strain design for users. The software was tested by designing a butanol-producing Escherichia coli strain, and was compared against the popular OptKnock and RobustKnock methods. RobOKoD shows favorable design predictions when predictions from these methods are compared to a successful, experimentally validated butanol-producing strain. Overall, RobOKoD provides users with rankings of predicted beneficial genetic interventions with which to support optimized strain design.

  1. RobOKoD: microbial strain design for (over)production of target compounds

    PubMed Central

    Stanford, Natalie J.; Millard, Pierre; Swainston, Neil

    2015-01-01

    Sustainable production of target compounds such as biofuels and high-value chemicals for pharmaceutical, agrochemical, and chemical industries is becoming an increasing priority given their current dependency upon diminishing petrochemical resources. Designing these strains is difficult, with current methods focusing primarily on knocking out genes, dismissing other vital steps of strain design including the overexpression and dampening of genes. The design predictions from current methods also do not translate well into successful strains in the laboratory. Here, we introduce RobOKoD (Robust, Overexpression, Knockout and Dampening), a method for predicting strain designs for overproduction of targets. The method uses flux variability analysis to profile each reaction within the system under differing production percentages of target compound and biomass. Using these profiles, reactions are identified as potential knockout, overexpression, or dampening targets. The identified reactions are ranked according to their suitability, providing flexibility in strain design for users. The software was tested by designing a butanol-producing Escherichia coli strain, and was compared against the popular OptKnock and RobustKnock methods. RobOKoD shows favorable design predictions when predictions from these methods are compared to a successful, experimentally validated butanol-producing strain. Overall, RobOKoD provides users with rankings of predicted beneficial genetic interventions with which to support optimized strain design. PMID:25853130

  2. LOCALIZER: subcellular localization prediction of both plant and effector proteins in the plant cell

    PubMed Central

    Sperschneider, Jana; Catanzariti, Ann-Maree; DeBoer, Kathleen; Petre, Benjamin; Gardiner, Donald M.; Singh, Karam B.; Dodds, Peter N.; Taylor, Jennifer M.

    2017-01-01

    Pathogens secrete effector proteins and many operate inside plant cells to enable infection. Some effectors have been found to enter subcellular compartments by mimicking host targeting sequences. Although many computational methods exist to predict plant protein subcellular localization, they perform poorly for effectors. We introduce LOCALIZER for predicting plant and effector protein localization to chloroplasts, mitochondria, and nuclei. LOCALIZER shows greater prediction accuracy for chloroplast and mitochondrial targeting compared to other methods for 652 plant proteins. For 107 eukaryotic effectors, LOCALIZER outperforms other methods and predicts a previously unrecognized chloroplast transit peptide for the ToxA effector, which we show translocates into tobacco chloroplasts. Secretome-wide predictions and confocal microscopy reveal that rust fungi might have evolved multiple effectors that target chloroplasts or nuclei. LOCALIZER is the first method for predicting effector localisation in plants and is a valuable tool for prioritizing effector candidates for functional investigations. LOCALIZER is available at http://localizer.csiro.au/. PMID:28300209

  3. Prediction of polypharmacological profiles of drugs by the integration of chemical, side effect, and therapeutic space.

    PubMed

    Cheng, Feixiong; Li, Weihua; Wu, Zengrui; Wang, Xichuan; Zhang, Chen; Li, Jie; Liu, Guixia; Tang, Yun

    2013-04-22

    Prediction of polypharmacological profiles of drugs enables us to investigate drug side effects and to find new indications (i.e. drug repositioning), which could reduce costs while increasing the productivity of drug discovery. Here we describe a new computational framework to predict polypharmacological profiles of drugs by the integration of chemical, side effect, and therapeutic space. On the basis of our previously developed drug side effect database, MetaADEDB, a drug side effect similarity inference (DSESI) method was developed for drug-target interaction (DTI) prediction on a known DTI network connecting 621 approved drugs and 893 target proteins. The area under the receiver operating characteristic curve was 0.882 ± 0.011, averaged over 100 simulated tests of 10-fold cross-validation for the DSESI method, which is comparable with the drug structural similarity inference and drug therapeutic similarity inference methods. Seven newly predicted candidate target proteins for seven approved drugs were confirmed by published experiments, with a hit rate of more than 15.9%. Moreover, network visualization of drug-target interactions and off-target side effect associations provides new mechanisms of action for three approved antipsychotic drugs in a case study. The results indicate that the proposed methods could be helpful for prediction of polypharmacological profiles of drugs.

  4. A Prediction Model for Functional Outcomes in Spinal Cord Disorder Patients Using Gaussian Process Regression.

    PubMed

    Lee, Sunghoon Ivan; Mortazavi, Bobak; Hoffman, Haydn A; Lu, Derek S; Li, Charles; Paak, Brian H; Garst, Jordan H; Razaghy, Mehrdad; Espinal, Marie; Park, Eunjeong; Lu, Daniel C; Sarrafzadeh, Majid

    2016-01-01

    Predicting the functional outcomes of spinal cord disorder patients after medical treatments, such as a surgical operation, has always been of great interest. Accurate posttreatment prediction is especially beneficial for clinicians, patients, care givers, and therapists. This paper introduces a prediction method for postoperative functional outcomes by a novel use of Gaussian process regression. The proposed method specifically considers the restricted value range of the target variables by modeling the Gaussian process based on a truncated Normal distribution, which significantly improves the prediction results. The prediction has been made in assistance with target tracking examinations using a highly portable and inexpensive handgrip device, which greatly contributes to the prediction performance. The proposed method has been validated through a dataset collected from a clinical cohort pilot involving 15 patients with cervical spinal cord disorder. The results show that the proposed method can accurately predict postoperative functional outcomes, Oswestry disability index and target tracking scores, based on the patient's preoperative information with a mean absolute error of 0.079 and 0.014 (out of 1.0), respectively.

  5. RFDT: A Rotation Forest-based Predictor for Predicting Drug-Target Interactions Using Drug Structure and Protein Sequence Information.

    PubMed

    Wang, Lei; You, Zhu-Hong; Chen, Xing; Yan, Xin; Liu, Gang; Zhang, Wei

    2018-01-01

    Identification of interactions between drugs and target proteins plays an important role in discovering new drug candidates. However, identifying drug-target interactions experimentally remains extremely time-consuming, expensive and challenging. Therefore, new computational methods to predict potential drug-target interactions (DTIs) are urgently needed. In this article, a novel computational model is developed for predicting potential drug-target interactions under the theory that each drug-target interaction pair can be represented by the structural properties of drugs and evolutionary information derived from proteins. Specifically, the protein sequences are encoded as a Position-Specific Scoring Matrix (PSSM) descriptor, which contains biological evolutionary information, and the drug molecules are encoded as a fingerprint feature vector, which represents the existence of certain functional groups or fragments. Four benchmark datasets involving enzymes, ion channels, GPCRs and nuclear receptors are independently used for establishing predictive models with the Rotation Forest (RF) model. The proposed method achieved prediction accuracies of 91.3%, 89.1%, 84.1% and 71.1% on the four datasets, respectively. To make our method more persuasive, we compared our classifier with the state-of-the-art Support Vector Machine (SVM) classifier. We also compared the proposed method with other excellent methods. Experimental results demonstrate that the proposed method is effective in the prediction of DTIs and can provide assistance for new drug research and development. Copyright © Bentham Science Publishers.

  6. Predicting drug-target interaction for new drugs using enhanced similarity measures and super-target clustering.

    PubMed

    Shi, Jian-Yu; Yiu, Siu-Ming; Li, Yiming; Leung, Henry C M; Chin, Francis Y L

    2015-07-15

    Predicting drug-target interaction using computational approaches is an important step in drug discovery and repositioning. To predict whether there will be an interaction between a drug and a target, most existing methods identify similar drugs and targets in the database. The prediction is then made based on the known interactions of these drugs and targets. This idea is promising. However, there are two shortcomings that have not yet been addressed appropriately. Firstly, most of the methods only use 2D chemical structures and protein sequences to measure the similarity of drugs and targets respectively. However, this information may not fully capture the characteristics determining whether a drug will interact with a target. Secondly, there are very few known interactions, i.e. many interactions are "missing" in the database. Existing approaches are biased towards known interactions and have no good solutions to handle possibly missing interactions which affect the accuracy of the prediction. In this paper, we enhance the similarity measures to include non-structural (and non-sequence-based) information and introduce the concept of a "super-target" to handle the problem of possibly missing interactions. Based on evaluations on real data, we show that our similarity measure is better than the existing measures and our approach is able to achieve higher accuracy than the two best existing algorithms, WNN-GIP and KBMF2K. Our approach is available at http://web.hku.hk/~liym1018/projects/drug/drug.html or http://www.bmlnwpu.org/us/tools/PredictingDTI_S2/METHODS.html. Copyright © 2015 Elsevier Inc. All rights reserved.

  7. DDR: efficient computational method to predict drug-target interactions using graph mining and machine learning approaches.

    PubMed

    Olayan, Rawan S; Ashoor, Haitham; Bajic, Vladimir B

    2018-04-01

    Finding drug-target interactions (DTIs) computationally is a convenient strategy to identify new DTIs at low cost with reasonable accuracy. However, current DTI prediction methods suffer from a high false positive prediction rate. We developed DDR, a novel method that improves DTI prediction accuracy. DDR is based on the use of a heterogeneous graph that contains known DTIs with multiple similarities between drugs and multiple similarities between target proteins. DDR applies a non-linear similarity fusion method to combine different similarities. Before fusion, DDR performs a pre-processing step where a subset of similarities is selected in a heuristic process to obtain an optimized combination of similarities. Then, DDR applies a random forest model using different graph-based features extracted from the DTI heterogeneous graph. Using 5 repeats of 10-fold cross-validation, three testing setups, and the weighted average of area under the precision-recall curve (AUPR) scores, we show that DDR significantly reduces the AUPR score error relative to the next best state-of-the-art method for predicting DTIs by 34% when the drugs are new, by 23% when targets are new and by 34% when the drugs and the targets are known but not all DTIs between them are known. Using independent sources of evidence, we verify as correct 22 out of the top 25 DDR novel predictions. This suggests that DDR can be used as an efficient method to identify correct DTIs. The data and code are provided at https://bitbucket.org/RSO24/ddr/. Contact: vladimir.bajic@kaust.edu.sa. Supplementary data are available at Bioinformatics online.

  8. Ensemble Methods for MiRNA Target Prediction from Expression Data.

    PubMed

    Le, Thuc Duy; Zhang, Junpeng; Liu, Lin; Li, Jiuyong

    2015-01-01

    microRNAs (miRNAs) are short regulatory RNAs that are involved in several diseases, including cancers. Identifying miRNA functions is very important in understanding disease mechanisms and determining the efficacy of drugs. An increasing number of computational methods have been developed to explore miRNA functions by inferring the miRNA-mRNA regulatory relationships from data. Each of the methods is developed based on some assumptions and constraints, for instance, assuming linear relationships between variables. For such reasons, computational methods are often subject to the problem of inconsistent performance across different datasets. On the other hand, ensemble methods integrate the results from individual methods and have been proved to outperform each of their individual component methods in theory. In this paper, we investigate the performance of some ensemble methods over the commonly used miRNA target prediction methods. We apply eight different popular miRNA target prediction methods to three cancer datasets, and compare their performance with the ensemble methods which integrate the results from each combination of the individual methods. The validation results using experimentally confirmed databases show that the results of the ensemble methods complement those obtained by the individual methods and the ensemble methods perform better than the individual methods across different datasets. The ensemble method Pearson+IDA+Lasso, which combines methods from different approaches (a correlation method, a causal inference method, and a regression method), is the best-performing ensemble method in this study. Further analysis of the results of this ensemble method shows that it can obtain more targets which could not be found by any of the single methods, and the discovered targets are more statistically significant and functionally enriched. The source codes, datasets, miRNA target predictions by all methods, and the ground truth for validation are available in the Supplementary materials.
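
    The abstract above does not say how the individual methods' results are integrated. As a rough illustration only, the sketch below assumes an average-rank (Borda-style) combination; the function name, the toy scores and the candidate names are all hypothetical and are not taken from the paper.

```python
def average_rank_ensemble(score_tables):
    """Combine several methods' scores by averaging per-method ranks.

    score_tables: list of dicts mapping a candidate target to a score,
    where a higher score means a stronger predicted miRNA-mRNA link.
    Only candidates scored by every method are ranked (an assumption).
    """
    candidates = sorted(set.intersection(*(set(t) for t in score_tables)))
    rank_sum = {c: 0 for c in candidates}
    for table in score_tables:
        ordered = sorted(candidates, key=lambda c: table[c], reverse=True)
        for rank, candidate in enumerate(ordered, start=1):
            rank_sum[candidate] += rank
    # A lower average rank means stronger consensus support.
    return sorted(candidates, key=lambda c: rank_sum[c] / len(score_tables))


# Hypothetical scores from three methods (e.g. correlation, causal, regression).
pearson = {"mRNA_A": 0.4, "mRNA_B": 0.9, "mRNA_C": 0.1}
ida = {"mRNA_A": 0.8, "mRNA_B": 0.3, "mRNA_C": 0.5}
lasso = {"mRNA_A": 0.2, "mRNA_B": 0.7, "mRNA_C": 0.6}

consensus = average_rank_ensemble([pearson, ida, lasso])
print(consensus)
```

    In this toy case mRNA_B is ranked first by two of the three methods and therefore leads the consensus list, even though one method ranks it last.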

  9. Ensemble Methods for MiRNA Target Prediction from Expression Data

    PubMed Central

    Le, Thuc Duy; Zhang, Junpeng; Liu, Lin; Li, Jiuyong

    2015-01-01

    Background: microRNAs (miRNAs) are short regulatory RNAs that are involved in several diseases, including cancers. Identifying miRNA functions is very important in understanding disease mechanisms and determining the efficacy of drugs. An increasing number of computational methods have been developed to explore miRNA functions by inferring the miRNA-mRNA regulatory relationships from data. Each of the methods is developed based on some assumptions and constraints, for instance, assuming linear relationships between variables. For such reasons, computational methods are often subject to the problem of inconsistent performance across different datasets. On the other hand, ensemble methods integrate the results from individual methods and have been proved to outperform each of their individual component methods in theory. Results: In this paper, we investigate the performance of some ensemble methods over the commonly used miRNA target prediction methods. We apply eight different popular miRNA target prediction methods to three cancer datasets, and compare their performance with the ensemble methods which integrate the results from each combination of the individual methods. The validation results using experimentally confirmed databases show that the results of the ensemble methods complement those obtained by the individual methods and the ensemble methods perform better than the individual methods across different datasets. The ensemble method Pearson+IDA+Lasso, which combines methods from different approaches (a correlation method, a causal inference method, and a regression method), is the best-performing ensemble method in this study. Further analysis of the results of this ensemble method shows that it can obtain more targets which could not be found by any of the single methods, and the discovered targets are more statistically significant and functionally enriched. The source codes, datasets, miRNA target predictions by all methods, and the ground truth for validation are available in the Supplementary materials. PMID:26114448

  10. Prediction of Drug-Target Interaction Networks from the Integration of Protein Sequences and Drug Chemical Structures.

    PubMed

    Meng, Fan-Rong; You, Zhu-Hong; Chen, Xing; Zhou, Yong; An, Ji-Yong

    2017-07-05

    Knowledge of drug-target interactions (DTIs) plays an important role in discovering new drug candidates. Unfortunately, experimental methods for determining DTIs have unavoidable shortcomings, including being time-consuming and expensive. This motivates us to develop an effective computational method to predict DTIs based on protein sequence. In this paper, we propose a novel computational approach based on protein sequence, namely PDTPS (Predicting Drug Targets with Protein Sequence), to predict DTIs. The PDTPS method combines Bi-gram probabilities (BIGP), Position Specific Scoring Matrix (PSSM), and Principal Component Analysis (PCA) with Relevance Vector Machine (RVM). To evaluate the prediction capacity of PDTPS, experiments were carried out on enzyme, ion channel, GPCR, and nuclear receptor datasets using five-fold cross-validation tests. The proposed PDTPS method achieved average accuracies of 97.73%, 93.12%, 86.78%, and 87.78% on the enzyme, ion channel, GPCR and nuclear receptor datasets, respectively. The experimental results showed that our method has good prediction performance. Furthermore, to further evaluate the prediction performance of the proposed PDTPS method, we compared it with the state-of-the-art support vector machine (SVM) classifier on the enzyme and ion channel datasets, and with other existing methods on the four datasets. The promising comparison results further demonstrate the efficiency and robustness of the proposed PDTPS method. This makes it a useful tool, suitable for predicting DTIs as well as other bioinformatics tasks.
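
    The abstract names bi-gram probabilities computed from a PSSM but gives no formula. The sketch below assumes the commonly used definition, in which entry (m, n) sums the product of column m at position i and column n at position i+1 over the sequence; whether PDTPS uses exactly this form is an assumption, and the tiny two-column matrix is a made-up stand-in for a real 20-column PSSM.

```python
def pssm_bigram(pssm):
    """Flattened bi-gram feature vector from a PSSM.

    pssm: list of rows, one per sequence position, each row holding one
    value per residue type. Entry (m, n) of the bi-gram matrix is
    sum_i pssm[i][m] * pssm[i + 1][n] (assumed definition).
    """
    n_cols = len(pssm[0])
    bigram = [[0.0] * n_cols for _ in range(n_cols)]
    for i in range(len(pssm) - 1):
        for m in range(n_cols):
            for n in range(n_cols):
                bigram[m][n] += pssm[i][m] * pssm[i + 1][n]
    # Row-major flattening: a 20-column PSSM yields 400 features.
    return [value for row in bigram for value in row]


# Hypothetical 3-position, 2-column matrix for illustration only.
toy_pssm = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
features = pssm_bigram(toy_pssm)
print(features)
```

    The feature length depends only on the number of columns, not on the sequence length, which is what makes such a descriptor usable as fixed-size input to a dimensionality reduction step and a classifier.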

  11. TarPmiR: a new approach for microRNA target site prediction.

    PubMed

    Ding, Jun; Li, Xiaoman; Hu, Haiyan

    2016-09-15

    The identification of microRNA (miRNA) target sites is fundamentally important for studying gene regulation. There are dozens of computational methods available for miRNA target site prediction. Despite their existence, we still cannot reliably identify miRNA target sites, partially due to our limited understanding of the characteristics of miRNA target sites. The recently published CLASH (crosslinking ligation and sequencing of hybrids) data provide an unprecedented opportunity to study the characteristics of miRNA target sites and improve miRNA target site prediction methods. Applying four different machine learning approaches to the CLASH data, we identified seven new features of miRNA target sites. Combining these new features with those commonly used by existing miRNA target prediction algorithms, we developed an approach called TarPmiR for miRNA target site prediction. Testing on two human and one mouse non-CLASH datasets, we showed that TarPmiR predicted more than 74.2% of true miRNA target sites in each dataset. Compared with three existing approaches, TarPmiR achieved both better recall and better precision. The TarPmiR software is freely available at http://hulab.ucf.edu/research/projects/miRNA/TarPmiR/. Contacts: haihu@cs.ucf.edu or xiaoman@mail.ucf.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  12. An Optimized Transient Dual Luciferase Assay for Quantifying MicroRNA Directed Repression of Targeted Sequences

    PubMed Central

    Moyle, Richard L.; Carvalhais, Lilia C.; Pretorius, Lara-Simone; Nowak, Ekaterina; Subramaniam, Gayathery; Dalton-Morgan, Jessica; Schenk, Peer M.

    2017-01-01

    Studies investigating the action of small RNAs on computationally predicted target genes require some form of experimental validation. Classical molecular methods of validating microRNA action on target genes are laborious, while approaches that tag predicted target sequences to qualitative reporter genes encounter technical limitations. The aim of this study was to address the challenge of experimentally validating large numbers of computationally predicted microRNA-target transcript interactions using an optimized, quantitative, cost-effective, and scalable approach. The presented method combines transient expression via agroinfiltration of Nicotiana benthamiana leaves with a quantitative dual luciferase reporter system, where firefly luciferase is used to report the microRNA-target sequence interaction and Renilla luciferase is used as an internal standard to normalize expression between replicates. We report the appropriate concentration of N. benthamiana leaf extracts and dilution factor to apply in order to avoid inhibition of firefly LUC activity. Furthermore, the optimal ratio of microRNA precursor expression construct to reporter construct and duration of the incubation period post-agroinfiltration were determined. The optimized dual luciferase assay provides an efficient, repeatable and scalable method to validate and quantify microRNA action on predicted target sequences. The optimized assay was used to validate five predicted targets of rice microRNA miR529b, with as few as six technical replicates. The assay can be extended to assess other small RNA-target sequence interactions, including assessing the functionality of an artificial miRNA or an RNAi construct on a targeted sequence. PMID:28979287

  13. Predicting Drug-Target Interactions for New Drug Compounds Using a Weighted Nearest Neighbor Profile.

    PubMed

    van Laarhoven, Twan; Marchiori, Elena

    2013-01-01

    In silico discovery of interactions between drug compounds and target proteins is of core importance for improving the efficiency of the laborious and costly experimental determination of drug-target interaction. Drug-target interaction data are available for many classes of pharmaceutically useful target proteins including enzymes, ion channels, GPCRs and nuclear receptors. However, current drug-target interaction databases contain only a small number of drug-target pairs that are experimentally validated interactions. In particular, for some drug compounds (or targets) there is no available interaction. This motivates the need for developing methods that predict interacting pairs with high accuracy also for these 'new' drug compounds (or targets). We show that a simple weighted nearest neighbor procedure is highly effective for this task. We integrate this procedure into a recent machine learning method for drug-target interaction we developed in previous work. Experimental results indicate that the resulting method predicts true interactions with high accuracy also for new drug compounds and achieves results comparable to or better than those of recent state-of-the-art algorithms. Software is publicly available at http://cs.ru.nl/~tvanlaarhoven/drugtarget2013/.
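
    A weighted nearest neighbor profile for a drug with no known interactions can be sketched as follows. This is one plausible reading rather than the authors' released code: known drugs are sorted by similarity to the new drug and their interaction profiles are summed with geometrically decaying weights. The parameter name `decay`, its value of 0.5, and the toy data are assumptions.

```python
def weighted_nn_profile(similarities, profiles, decay=0.5):
    """Infer an interaction profile for a new drug from known drugs.

    similarities: similarity of the new drug to each known drug.
    profiles: 0/1 interaction profile (one entry per target) per known drug.
    The i-th most similar known drug receives weight decay ** i.
    """
    order = sorted(range(len(similarities)),
                   key=lambda i: similarities[i], reverse=True)
    n_targets = len(profiles[0])
    predicted = [0.0] * n_targets
    for rank, index in enumerate(order):
        weight = decay ** rank
        for t in range(n_targets):
            predicted[t] += weight * profiles[index][t]
    return predicted


# Hypothetical data: three known drugs, three targets.
similarities = [0.2, 0.9, 0.5]
profiles = [[1, 0, 0], [0, 1, 0], [0, 1, 1]]

predicted = weighted_nn_profile(similarities, profiles)
print(predicted)
```

    The second target scores highest because the two most similar known drugs both interact with it; the inferred profile can then serve as a stand-in for the missing row in a downstream prediction method.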

  14. Improving consensus contact prediction via server correlation reduction.

    PubMed

    Gao, Xin; Bu, Dongbo; Xu, Jinbo; Li, Ming

    2009-05-06

    Protein inter-residue contacts play a crucial role in the determination and prediction of protein structures. Previous studies on contact prediction indicate that although template-based consensus methods outperform sequence-based methods on targets with typical templates, such consensus methods perform poorly on new fold targets. However, we find out that even for new fold targets, the models generated by threading programs can contain many true contacts. The challenge is how to identify them. In this paper, we develop an integer linear programming model for consensus contact prediction. In contrast to the simple majority voting method assuming that all the individual servers are equally important and independent, the newly developed method evaluates their correlation by using maximum likelihood estimation and extracts independent latent servers from them by using principal component analysis. An integer linear programming method is then applied to assign a weight to each latent server to maximize the difference between true contacts and false ones. The proposed method is tested on the CASP7 data set. If the top L/5 predicted contacts are evaluated where L is the protein size, the average accuracy is 73%, which is much higher than that of any previously reported study. Moreover, if only the 15 new fold CASP7 targets are considered, our method achieves an average accuracy of 37%, which is much better than that of the majority voting method, SVM-LOMETS, SVM-SEQ, and SAM-T06. These methods demonstrate an average accuracy of 13.0%, 10.8%, 25.8% and 21.2%, respectively. Reducing server correlation and optimally combining independent latent servers show a significant improvement over the traditional consensus methods. This approach can hopefully provide a powerful tool for protein structure refinement and prediction use.
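
    The "top L/5" figure quoted above is the fraction of the L/5 highest-scoring predicted contacts that are true contacts, where L is the protein length. A minimal sketch of that evaluation, with hypothetical residue pairs and scores and an assumed convention of rounding L/5 down:

```python
def top_l_over_5_accuracy(scored_contacts, true_contacts, length):
    """Accuracy of the top L/5 predicted contacts.

    scored_contacts: list of ((i, j), score) predictions.
    true_contacts: iterable of (i, j) residue pairs that are in contact.
    length: protein length L; the top L // 5 predictions are evaluated.
    """
    k = max(1, length // 5)
    ranked = sorted(scored_contacts, key=lambda item: item[1], reverse=True)
    top_pairs = [pair for pair, _ in ranked[:k]]
    if not top_pairs:
        return 0.0
    # Treat (i, j) and (j, i) as the same contact.
    truth = {tuple(sorted(pair)) for pair in true_contacts}
    hits = sum(1 for pair in top_pairs if tuple(sorted(pair)) in truth)
    return hits / len(top_pairs)


# Hypothetical protein of length 20, so the top 4 predictions are scored.
predictions = [((1, 9), 0.95), ((2, 14), 0.90), ((3, 12), 0.80),
               ((5, 18), 0.70), ((4, 11), 0.10)]
native = [(1, 9), (12, 3), (4, 11)]

accuracy = top_l_over_5_accuracy(predictions, native, length=20)
print(accuracy)
```

    The true contact (4, 11) does not count here because its low score keeps it outside the top four, which is why the measure rewards ranking quality and not just coverage.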

  15. Prediction of Drug-Target Interactions and Drug Repositioning via Network-Based Inference

    PubMed Central

    Jiang, Jing; Lu, Weiqiang; Li, Weihua; Liu, Guixia; Zhou, Weixing; Huang, Jin; Tang, Yun

    2012-01-01

    Drug-target interaction (DTI) is the basis of drug discovery and design. It is time-consuming and costly to determine DTIs experimentally. Hence, it is necessary to develop computational methods for the prediction of potential DTIs. Based on complex network theory, three supervised inference methods were developed here to predict DTIs and used for drug repositioning, namely drug-based similarity inference (DBSI), target-based similarity inference (TBSI) and network-based inference (NBI). Among them, NBI performed best on four benchmark data sets. A drug-target network was then created with NBI based on 12,483 FDA-approved and experimental drug-target binary links, and some new DTIs were further predicted. In vitro assays confirmed that five old drugs, namely montelukast, diclofenac, simvastatin, ketoconazole, and itraconazole, showed polypharmacological features on estrogen receptors or dipeptidyl peptidase-IV, with half-maximal inhibitory or effective concentrations ranging from 0.2 to 10 µM. Moreover, simvastatin and ketoconazole showed potent antiproliferative activities on the human MDA-MB-231 breast cancer cell line in MTT assays. The results indicated that these methods could be powerful tools in prediction of DTIs and drug repositioning. PMID:22589709
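
    Network-based inference on a bipartite drug-target graph is often described as a two-step resource diffusion: a drug's known targets pass their resource equally to all drugs linked to them, and those drugs pass it equally back to their targets. The sketch below follows that general description; it is not claimed to reproduce the paper's exact implementation, and the small adjacency matrix is invented.

```python
def nbi_scores(adjacency, drug_index):
    """Two-step resource diffusion scores for one drug.

    adjacency: rows are drugs, columns are targets, entries 0 or 1.
    Returns one score per target; unlinked targets with high scores
    are candidate new interactions for the chosen drug.
    """
    n_drugs, n_targets = len(adjacency), len(adjacency[0])
    target_degree = [sum(adjacency[d][t] for d in range(n_drugs))
                     for t in range(n_targets)]
    drug_degree = [sum(row) for row in adjacency]
    initial = adjacency[drug_index]

    # Step 1: each target splits its resource equally among its drugs.
    drug_resource = []
    for d in range(n_drugs):
        total = 0.0
        for t in range(n_targets):
            if adjacency[d][t] and target_degree[t]:
                total += initial[t] / target_degree[t]
        drug_resource.append(total)

    # Step 2: each drug splits its resource equally among its targets.
    scores = []
    for t in range(n_targets):
        total = 0.0
        for d in range(n_drugs):
            if adjacency[d][t] and drug_degree[d]:
                total += drug_resource[d] / drug_degree[d]
        scores.append(total)
    return scores


# Hypothetical network: 3 drugs x 3 targets.
network = [[1, 1, 0],
           [0, 1, 1],
           [1, 0, 0]]

scores = nbi_scores(network, drug_index=0)
print(scores)
```

    Drug 0 has no known link to the third target, yet that target receives a non-zero score through the drug that shares the second target with it. The total resource is conserved across the two steps.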

  16. A novel multi-target regression framework for time-series prediction of drug efficacy.

    PubMed

    Li, Haiqing; Zhang, Wei; Chen, Ying; Guo, Yumeng; Li, Guo-Zheng; Zhu, Xiaoxin

    2017-01-18

    Excavating from small samples is a challenging pharmacokinetic problem, where statistical methods can be applied. Pharmacokinetic data is special due to the small samples of high dimensionality, which makes it difficult to adopt conventional methods to predict the efficacy of traditional Chinese medicine (TCM) prescription. The main purpose of our study is to obtain some knowledge of the correlation in TCM prescription. Here, a novel method named Multi-target Regression Framework to deal with the problem of efficacy prediction is proposed. We employ the correlation between the values of different time sequences and add predictive targets of previous time as features to predict the value of current time. Several experiments are conducted to test the validity of our method and the results of leave-one-out cross-validation clearly manifest the competitiveness of our framework. Compared with linear regression, artificial neural networks, and partial least squares, support vector regression combined with our framework demonstrates the best performance, and appears to be more suitable for this task.
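
    The idea of adding earlier-time targets as features can be illustrated by how a training set for one time point might be assembled. The sketch assumes that all earlier time points are appended to the base features; the abstract does not specify this detail, and the function name and numbers are hypothetical.

```python
def build_chained_rows(base_features, target_series, time_index):
    """Training rows for predicting the target at one time point.

    base_features: one feature list per sample.
    target_series: one list of target values over time per sample.
    Each row is (base features + targets at earlier times, current target).
    """
    rows = []
    for features, series in zip(base_features, target_series):
        inputs = list(features) + list(series[:time_index])
        rows.append((inputs, series[time_index]))
    return rows


# Hypothetical data: two samples, two base features, three time points.
base_features = [[1.0, 0.5], [0.2, 0.3]]
target_series = [[10.0, 8.0, 5.0], [7.0, 6.0, 4.0]]

rows_t2 = build_chained_rows(base_features, target_series, time_index=2)
print(rows_t2)
```

    Any regressor (the abstract reports support vector regression as the best performer) could then be fitted per time point on such rows; at prediction time the earlier targets would have to come from the model's own earlier predictions.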

  17. A novel multi-target regression framework for time-series prediction of drug efficacy

    PubMed Central

    Li, Haiqing; Zhang, Wei; Chen, Ying; Guo, Yumeng; Li, Guo-Zheng; Zhu, Xiaoxin

    2017-01-01

    Excavating from small samples is a challenging pharmacokinetic problem, where statistical methods can be applied. Pharmacokinetic data is special due to the small samples of high dimensionality, which makes it difficult to adopt conventional methods to predict the efficacy of traditional Chinese medicine (TCM) prescription. The main purpose of our study is to obtain some knowledge of the correlation in TCM prescription. Here, a novel method named Multi-target Regression Framework to deal with the problem of efficacy prediction is proposed. We employ the correlation between the values of different time sequences and add predictive targets of previous time as features to predict the value of current time. Several experiments are conducted to test the validity of our method and the results of leave-one-out cross-validation clearly manifest the competitiveness of our framework. Compared with linear regression, artificial neural networks, and partial least squares, support vector regression combined with our framework demonstrates the best performance, and appears to be more suitable for this task. PMID:28098186

  18. Boys with a simple delayed puberty reach their target height.

    PubMed

    Cools, B L M; Rooman, R; Op De Beeck, L; Du Caju, M V L

    2008-01-01

    Final height in boys with delayed puberty is thought to be below target height. This conclusion, however, is based on studies that included patients with genetic short stature. We therefore studied final height in a group of 33 untreated boys with delayed puberty with a target height >-1.5 SDS. Standing height, sitting height, weight and arm span width were measured in each patient. Final height was predicted by the method of Greulich and Pyle using the tables of Bailey and Pinneau for retarded boys at their bone age (PAH1) and the tables of Bailey and Pinneau for average boys plus six months (PAH2). Mean final height (175.8 +/- 6.5 cm) was appropriate for the mean target height (174.7 +/- 4.5 cm). The prediction method of Bailey and Pinneau overestimated the final height by 1.4 cm and the modified prediction method slightly underestimated the final height (-0.15 cm). Boys with untreated delayed puberty reach a final height appropriate for their target height. Final height was best predicted by the method of Bailey and Pinneau using the tables for average boys at their bone age plus six months. Copyright 2008 S. Karger AG, Basel.

  19. TargetSpy: a supervised machine learning approach for microRNA target prediction.

    PubMed

    Sturm, Martin; Hackenberg, Michael; Langenberger, David; Frishman, Dmitrij

    2010-05-28

    Virtually all currently available microRNA target site prediction algorithms require the presence of a (conserved) seed match to the 5' end of the microRNA. Recently however, it has been shown that this requirement might be too stringent, leading to a substantial number of missed target sites. We developed TargetSpy, a novel computational approach for predicting target sites regardless of the presence of a seed match. It is based on machine learning and automatic feature selection using a wide spectrum of compositional, structural, and base pairing features covering current biological knowledge. Our model does not rely on evolutionary conservation, which allows the detection of species-specific interactions and makes TargetSpy suitable for analyzing unconserved genomic sequences. In order to allow for an unbiased comparison of TargetSpy to other methods, we classified all algorithms into three groups: I) no seed match requirement, II) seed match requirement, and III) conserved seed match requirement. TargetSpy predictions for classes II and III are generated by appropriate postfiltering. On a human dataset revealing fold-change in protein production for five selected microRNAs our method shows superior performance in all classes. In Drosophila melanogaster not only our class II and III predictions are on par with other algorithms, but notably the class I (no-seed) predictions are just marginally less accurate. We estimate that TargetSpy predicts between 26 and 112 functional target sites without a seed match per microRNA that are missed by all other currently available algorithms. Only a few algorithms can predict target sites without demanding a seed match and TargetSpy demonstrates a substantial improvement in prediction accuracy in that class. Furthermore, when conservation and the presence of a seed match are required, the performance is comparable with state-of-the-art algorithms. TargetSpy was trained on mouse and performs well in human and drosophila, suggesting that it may be applicable to a broad range of species. Moreover, we have demonstrated that the application of machine learning techniques in combination with upcoming deep sequencing data results in a powerful microRNA target site prediction tool http://www.targetspy.org.
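
    To make the "seed match requirement" of class II concrete, here is a minimal sketch of a seed-match filter. It assumes the widely used convention that the seed is nucleotides 2-8 of the microRNA and that a site is a perfect Watson-Crick complement of it; the abstract does not state which seed definition TargetSpy's postfiltering uses, and G:U wobble pairs are ignored here. The function names and the example sequences are illustrative only.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna, start=2, end=8):
    """Sequence an mRNA must contain (5'->3') to pair with the miRNA seed.

    start and end are 1-based positions on the miRNA (assumed 2-8).
    """
    seed = mirna[start - 1:end]
    # The site is the reverse complement of the seed.
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def find_seed_sites(mirna, mrna):
    """0-based start positions of perfect seed matches in the mRNA."""
    site = seed_site(mirna)
    last_start = len(mrna) - len(site)
    return [i for i in range(last_start + 1) if mrna[i:i + len(site)] == site]


example_mirna = "UGAGGUAGUAGGUUGUAUAGUU"  # example 22-nt miRNA sequence
example_mrna = "AAACUACCUCAAA"            # made-up target fragment
print(seed_site(example_mirna))
print(find_seed_sites(example_mirna, example_mrna))
```

    A class II style filter would keep only predicted sites for which such a match exists; class I predictions skip this check entirely.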

  20. TargetSpy: a supervised machine learning approach for microRNA target prediction

    PubMed Central

    2010-01-01

    Background: Virtually all currently available microRNA target site prediction algorithms require the presence of a (conserved) seed match to the 5' end of the microRNA. Recently however, it has been shown that this requirement might be too stringent, leading to a substantial number of missed target sites. Results: We developed TargetSpy, a novel computational approach for predicting target sites regardless of the presence of a seed match. It is based on machine learning and automatic feature selection using a wide spectrum of compositional, structural, and base pairing features covering current biological knowledge. Our model does not rely on evolutionary conservation, which allows the detection of species-specific interactions and makes TargetSpy suitable for analyzing unconserved genomic sequences. In order to allow for an unbiased comparison of TargetSpy to other methods, we classified all algorithms into three groups: I) no seed match requirement, II) seed match requirement, and III) conserved seed match requirement. TargetSpy predictions for classes II and III are generated by appropriate postfiltering. On a human dataset revealing fold-change in protein production for five selected microRNAs our method shows superior performance in all classes. In Drosophila melanogaster not only our class II and III predictions are on par with other algorithms, but notably the class I (no-seed) predictions are just marginally less accurate. We estimate that TargetSpy predicts between 26 and 112 functional target sites without a seed match per microRNA that are missed by all other currently available algorithms. Conclusion: Only a few algorithms can predict target sites without demanding a seed match and TargetSpy demonstrates a substantial improvement in prediction accuracy in that class. Furthermore, when conservation and the presence of a seed match are required, the performance is comparable with state-of-the-art algorithms. TargetSpy was trained on mouse and performs well in human and drosophila, suggesting that it may be applicable to a broad range of species. Moreover, we have demonstrated that the application of machine learning techniques in combination with upcoming deep sequencing data results in a powerful microRNA target site prediction tool http://www.targetspy.org. PMID:20509939

  1. TargetM6A: Identifying N6-Methyladenosine Sites From RNA Sequences via Position-Specific Nucleotide Propensities and a Support Vector Machine.

    PubMed

    Li, Guang-Qing; Liu, Zi; Shen, Hong-Bin; Yu, Dong-Jun

    2016-10-01

    As one of the most ubiquitous post-transcriptional modifications of RNA, N6-methyladenosine (m6A) plays an essential role in many vital biological processes. The identification of m6A sites in RNAs is significantly important for both basic biomedical research and practical drug development. In this study, we designed a computational-based method, called TargetM6A, to rapidly and accurately target m6A sites solely from the primary RNA sequences. Two new features, i.e., position-specific nucleotide/dinucleotide propensities (PSNP/PSDP), are introduced and combined with the traditional nucleotide composition (NC) feature to formulate RNA sequences. The extracted features are further optimized to obtain a much more compact and discriminative feature subset by applying an incremental feature selection (IFS) procedure. Based on the optimized feature subset, we trained TargetM6A on the training dataset with a support vector machine (SVM) as the prediction engine. We compared the proposed TargetM6A method with existing methods for predicting m6A sites by performing stringent jackknife tests and independent validation tests on benchmark datasets. The experimental results show that the proposed TargetM6A method outperformed the existing methods for predicting m6A sites and remarkably improved the prediction performances, with MCC = 0.526 and AUC = 0.818. We also provided a user-friendly web server for TargetM6A, which is publicly accessible for academic use at http://csbio.njust.edu.cn/bioinf/TargetM6A.

  2. Random walks on mutual microRNA-target gene interaction network improve the prediction of disease-associated microRNAs.

    PubMed

    Le, Duc-Hau; Verbeke, Lieven; Son, Le Hoang; Chu, Dinh-Toi; Pham, Van-Huy

    2017-11-14

    MicroRNAs (miRNAs) have been shown to play an important role in pathological initiation, progression and maintenance. Because identification in the laboratory of disease-related miRNAs is not straightforward, numerous network-based methods have been developed to predict novel miRNAs in silico. Homogeneous networks (in which every node is a miRNA) based on the targets shared between miRNAs have been widely used to predict their role in disease phenotypes. Although such homogeneous networks can predict potential disease-associated miRNAs, they do not consider the roles of the target genes of the miRNAs. Here, we introduce a novel method based on a heterogeneous network that not only considers miRNAs but also the corresponding target genes in the network model. Instead of constructing homogeneous miRNA networks, we built heterogeneous miRNA networks consisting of both miRNAs and their target genes, using databases of known miRNA-target gene interactions. In addition, as recent studies demonstrated reciprocal regulatory relations between miRNAs and their target genes, we considered these heterogeneous miRNA networks to be undirected, assuming mutual miRNA-target interactions. Next, we introduced a novel method (RWRMTN) operating on these mutual heterogeneous miRNA networks to rank candidate disease-related miRNAs using a random walk with restart (RWR) based algorithm. Using both known disease-associated miRNAs and their target genes as seed nodes, the method can identify additional miRNAs involved in the disease phenotype. Experiments indicated that RWRMTN outperformed two existing state-of-the-art methods: RWRMDA, a network-based method that also uses a RWR on homogeneous (rather than heterogeneous) miRNA networks, and RLSMDA, a machine learning-based method. Interestingly, we could relate this performance gain to the emergence of "disease modules" in the heterogeneous miRNA networks used as input for the algorithm. Moreover, we could demonstrate that RWRMTN is stable, performing well when using both experimentally validated and predicted miRNA-target gene interaction data for network construction. Finally, using RWRMTN, we identified 76 novel miRNAs associated with 23 disease phenotypes which were present in a recent database of known disease-miRNA associations. Summarizing, using random walks on mutual miRNA-target networks improves the prediction of novel disease-associated miRNAs because of the existence of "disease modules" in these networks.
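
    A random walk with restart on an undirected miRNA-target network can be sketched with a plain power iteration. This is a generic RWR under stated assumptions (uniform transition probabilities over neighbours, equal initial weight on all seed nodes, restart probability 0.5); it is not the RWRMTN implementation, and the node labels below are made up.

```python
def random_walk_with_restart(edges, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that, at each step,
    jumps back to a seed node with probability `restart`.

    edges: iterable of (node, node) pairs, treated as undirected.
    seeds: set of nodes that must all appear in `edges`.
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    nodes = sorted(neighbors)

    # Initial (and restart) distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            # The walker moves from u to each neighbour with equal probability.
            share = (1.0 - restart) * p[u] / len(neighbors[u])
            for v in neighbors[u]:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Toy undirected network: miR-a - gene1 - miR-b - gene2 - miR-c
toy_edges = [("miR-a", "gene1"), ("miR-b", "gene1"),
             ("miR-b", "gene2"), ("miR-c", "gene2")]
toy_seeds = {"miR-a", "gene1"}  # a known disease miRNA and one of its targets
ranking = random_walk_with_restart(toy_edges, toy_seeds)
candidates = sorted((n for n in ranking if n.startswith("miR") and n not in toy_seeds),
                    key=lambda n: ranking[n], reverse=True)
print(candidates)
```

    Candidate miRNAs closer to the seed nodes receive higher steady-state probability, which is the ranking signal the abstract describes.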

  3. SELF-BLM: Prediction of drug-target interactions via self-training SVM.

    PubMed

    Keum, Jongsoo; Nam, Hojung

    2017-01-01

    Predicting drug-target interactions is important for the development of novel drugs and the repositioning of drugs. To predict such interactions, there are a number of methods based on drug and target protein similarity. Although these methods, such as the bipartite local model (BLM), show promise, they often categorize unknown interactions as negative interactions. Therefore, these methods are not ideal for finding potential drug-target interactions that have not yet been validated as positive interactions. Thus, here we propose a method that integrates machine learning techniques, such as self-training support vector machine (SVM) and BLM, to develop a self-training bipartite local model (SELF-BLM) that facilitates the identification of potential interactions. The method first categorizes unlabeled interactions and negative interactions among unknown interactions using a clustering method. Then, using the BLM method and self-training SVM, the unlabeled interactions are self-trained and final local classification models are constructed. When applied to four classes of proteins that include enzymes, G-protein coupled receptors (GPCRs), ion channels, and nuclear receptors, SELF-BLM showed the best performance for predicting not only known interactions but also potential interactions in three protein classes compared to other related studies. The implemented software and supporting data are available at https://github.com/GIST-CSBL/SELF-BLM.

  4. Improvement of Predictive Ability by Uniform Coverage of the Target Genetic Space

    PubMed Central

    Bustos-Korts, Daniela; Malosetti, Marcos; Chapman, Scott; Biddulph, Ben; van Eeuwijk, Fred

    2016-01-01

    Genome-enabled prediction provides breeders with the means to increase the number of genotypes that can be evaluated for selection. One of the major challenges in genome-enabled prediction is how to construct a training set of genotypes from a calibration set that represents the target population of genotypes, where the calibration set is composed of a training and validation set. A random sampling protocol of genotypes from the calibration set will lead to low quality coverage of the total genetic space by the training set when the calibration set contains population structure. As a consequence, predictive ability will be affected negatively, because some parts of the genotypic diversity in the target population will be under-represented in the training set, whereas other parts will be over-represented. Therefore, we propose a training set construction method that uniformly samples the genetic space spanned by the target population of genotypes, thereby increasing predictive ability. To evaluate our method, we constructed training sets alongside the identification of corresponding genomic prediction models for four genotype panels that differed in the amount of population structure they contained (maize Flint, maize Dent, wheat, and rice). Training sets were constructed using uniform sampling, stratified-uniform sampling, stratified sampling and random sampling. We compared these methods with a method that maximizes the generalized coefficient of determination (CD). Several training set sizes were considered. We investigated four genomic prediction models: multi-locus QTL models, GBLUP models, combinations of QTL and GBLUPs, and Reproducing Kernel Hilbert Space (RKHS) models. For the maize and wheat panels, construction of the training set under uniform sampling led to a larger predictive ability than under stratified and random sampling. The results of our methods were similar to those of the CD method. For the rice panel, all training set construction methods led to similar predictive ability, a reflection of the very strong population structure in this panel. PMID:27672112

  5. DrugE-Rank: improving drug–target interaction prediction of new candidate drugs or targets by ensemble learning to rank

    PubMed Central

    Yuan, Qingjun; Gao, Junning; Wu, Dongliang; Zhang, Shihua; Mamitsuka, Hiroshi; Zhu, Shanfeng

    2016-01-01

    Motivation: Identifying drug–target interactions is an important task in drug discovery. To reduce heavy time and financial cost in experimental way, many computational approaches have been proposed. Although these approaches have used many different principles, their performance is far from satisfactory, especially in predicting drug–target interactions of new candidate drugs or targets. Methods: Approaches based on machine learning for this problem can be divided into two types: feature-based and similarity-based methods. Learning to rank is the most powerful technique in the feature-based methods. Similarity-based methods are well accepted, due to their idea of connecting the chemical and genomic spaces, represented by drug and target similarities, respectively. We propose a new method, DrugE-Rank, to improve the prediction performance by nicely combining the advantages of the two different types of methods. That is, DrugE-Rank uses LTR, for which multiple well-known similarity-based methods can be used as components of ensemble learning. Results: The performance of DrugE-Rank is thoroughly examined by three main experiments using data from DrugBank: (i) cross-validation on FDA (US Food and Drug Administration) approved drugs before March 2014; (ii) independent test on FDA approved drugs after March 2014; and (iii) independent test on FDA experimental drugs. Experimental results show that DrugE-Rank outperforms competing methods significantly, especially achieving more than 30% improvement in Area under Prediction Recall curve for FDA approved new drugs and FDA experimental drugs. Availability: http://datamining-iip.fudan.edu.cn/service/DrugE-Rank Contact: zhusf@fudan.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27307615

  6. IFPTarget: A Customized Virtual Target Identification Method Based on Protein-Ligand Interaction Fingerprinting Analyses.

    PubMed

    Li, Guo-Bo; Yu, Zhu-Jun; Liu, Sha; Huang, Lu-Yi; Yang, Ling-Ling; Lohans, Christopher T; Yang, Sheng-Yong

    2017-07-24

    Small-molecule target identification is an important and challenging task for chemical biology and drug discovery. Structure-based virtual target identification has been widely used, which infers and prioritizes potential protein targets for the molecule of interest (MOI) principally via a scoring function. However, current "universal" scoring functions may not always accurately identify targets to which the MOI binds from the retrieved target database, in part due to a lack of consideration of the important binding features for an individual target. Here, we present IFPTarget, a customized virtual target identification method, which uses an interaction fingerprinting (IFP) method for target-specific interaction analyses and a comprehensive index (Cvalue) for target ranking. Evaluation results indicate that the IFP method enables substantially improved binding pose prediction, and Cvalue has an excellent performance in target ranking for the test set. When applied to screen against our established target library that contains 11,863 protein structures covering 2842 unique targets, IFPTarget could retrieve known targets within the top-ranked list and identified new potential targets for chemically diverse drugs. IFPTarget prediction led to the identification of the metallo-β-lactamase VIM-2 as a target for quercetin as validated by enzymatic inhibition assays. This study provides a new in silico target identification tool and will aid future efforts to develop new target-customized methods for target identification.

  7. Multiple grid arrangement improves ligand docking with unknown binding sites: Application to the inverse docking problem.

    PubMed

    Ban, Tomohiro; Ohue, Masahito; Akiyama, Yutaka

    2018-04-01

    The identification of comprehensive drug-target interactions is important in drug discovery. Although numerous computational methods have been developed over the years, a gold standard technique has not been established. Computational ligand docking and structure-based drug design allow researchers to predict the binding affinity between a compound and a target protein, and thus, they are often used to virtually screen compound libraries. In addition, docking techniques have also been applied to the virtual screening of target proteins (inverse docking) to predict target proteins of a drug candidate. Nevertheless, a more accurate docking method is currently required. In this study, we proposed a method in which a predicted ligand-binding site is covered by multiple grids, termed multiple grid arrangement. Notably, multiple grid arrangement facilitates the conformational search for a grid-based ligand docking software and can be applied to the state-of-the-art commercial docking software Glide (Schrödinger, LLC). We validated the proposed method by re-docking with the Astex diverse benchmark dataset and blind binding site situations, which improved the correct prediction rate of the top scoring docking pose from 27.1% to 34.1%; however, only a slight improvement in target prediction accuracy was observed with inverse docking scenarios. These findings highlight the limitations and challenges of current scoring functions and the need for more accurate docking methods. The proposed multiple grid arrangement method was implemented in Glide by modifying a cross-docking script for Glide, xglide.py. The script of our method is freely available online at http://www.bi.cs.titech.ac.jp/mga_glide/. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.

  8. Blind predictions of protein interfaces by docking calculations in CAPRI.

    PubMed

    Lensink, Marc F; Wodak, Shoshana J

    2010-11-15

    Reliable prediction of the amino acid residues involved in protein-protein interfaces can provide valuable insight into protein function, and inform mutagenesis studies, and drug design applications. A fast-growing number of methods are being proposed for predicting protein interfaces, using structural information, energetic criteria, or sequence conservation or by integrating multiple criteria and approaches. Overall however, their performance remains limited, especially when applied to nonobligate protein complexes, where the individual components are also stable on their own. Here, we evaluate interface predictions derived from protein-protein docking calculations. To this end we measure the overlap between the interfaces in models of protein complexes submitted by 76 participants in CAPRI (Critical Assessment of Predicted Interactions) and those of 46 observed interfaces in 20 CAPRI targets corresponding to nonobligate complexes. Our evaluation considers multiple models for each target interface, submitted by different participants, using a variety of docking methods. Although this results in a substantial variability in the prediction performance across participants and targets, clear trends emerge. Docking methods that perform best in our evaluation predict interfaces with average recall and precision levels of about 60%, for a small majority (60%) of the analyzed interfaces. These levels are significantly higher than those obtained for nonobligate complexes by most extant interface prediction methods. We find furthermore that a sizable fraction (24%) of the interfaces in models ranked as incorrect in the CAPRI assessment are actually correctly predicted (recall and precision ≥50%), and that these models contribute to 70% of the correct docking-based interface predictions overall. Our analysis proves that docking methods are much more successful in identifying interfaces than in predicting complexes, and suggests that these methods have an excellent potential of addressing the interface prediction challenge. © 2010 Wiley-Liss, Inc.
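
    The recall and precision figures quoted above can be illustrated with a small sketch. It assumes that an interface is represented as a set of residue identifiers and that "correctly predicted" means both measures are at least 50%, as stated in the abstract; the function names and residue numbers are invented for the example.

```python
def interface_precision_recall(predicted, observed):
    """Precision and recall of a predicted interface.

    predicted, observed: iterables of residue identifiers.
    precision = shared residues / predicted residues
    recall    = shared residues / observed residues
    """
    predicted, observed = set(predicted), set(observed)
    shared = len(predicted & observed)
    precision = shared / len(predicted) if predicted else 0.0
    recall = shared / len(observed) if observed else 0.0
    return precision, recall


def is_correct_interface(predicted, observed, threshold=0.5):
    """True when both recall and precision reach the threshold."""
    precision, recall = interface_precision_recall(predicted, observed)
    return precision >= threshold and recall >= threshold


# Invented residue numbers for one model and one observed interface.
model_residues = {10, 11, 12, 13, 40}
observed_residues = {11, 12, 13, 14, 15, 16}
print(interface_precision_recall(model_residues, observed_residues))
print(is_correct_interface(model_residues, observed_residues))
```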

  9. Large-Scale Off-Target Identification Using Fast and Accurate Dual Regularized One-Class Collaborative Filtering and Its Application to Drug Repurposing.

    PubMed

    Lim, Hansaim; Poleksic, Aleksandar; Yao, Yuan; Tong, Hanghang; He, Di; Zhuang, Luke; Meng, Patrick; Xie, Lei

    2016-10-01

    Target-based screening is one of the major approaches in drug discovery. Besides the intended target, unexpected drug off-target interactions often occur, and many of them have not been recognized and characterized. The off-target interactions can be responsible for either therapeutic or side effects. Thus, identifying the genome-wide off-targets of lead compounds or existing drugs will be critical for designing effective and safe drugs, and providing new opportunities for drug repurposing. Although many computational methods have been developed to predict drug-target interactions, they are either less accurate than the one that we are proposing here or computationally too intensive, thereby limiting their capability for large-scale off-target identification. In addition, the performances of most machine learning based algorithms have been mainly evaluated to predict off-target interactions in the same gene family for hundreds of chemicals. It is not clear how these algorithms perform in terms of detecting off-targets across gene families on a proteome scale. Here, we are presenting a fast and accurate off-target prediction method, REMAP, which is based on a dual regularized one-class collaborative filtering algorithm, to explore continuous chemical space, protein space, and their interactome on a large scale. When tested in a reliable, extensive, and cross-gene family benchmark, REMAP outperforms the state-of-the-art methods. Furthermore, REMAP is highly scalable. It can screen a dataset of 200 thousand chemicals against 20 thousand proteins within 2 hours. Using the reconstructed genome-wide target profile as the fingerprint of a chemical compound, we predicted that seven FDA-approved drugs can be repurposed as novel anti-cancer therapies. The anti-cancer activity of six of them is supported by experimental evidence. Thus, REMAP is a valuable addition to the existing in silico toolbox for drug target identification, drug repurposing, phenotypic screening, and side effect prediction. The software and benchmark are available at https://github.com/hansaimlim/REMAP.

  10. Large-Scale Off-Target Identification Using Fast and Accurate Dual Regularized One-Class Collaborative Filtering and Its Application to Drug Repurposing

    PubMed Central

    Poleksic, Aleksandar; Yao, Yuan; Tong, Hanghang; Meng, Patrick; Xie, Lei

    2016-01-01

    Target-based screening is one of the major approaches in drug discovery. Besides the intended target, unexpected drug off-target interactions often occur, and many of them have not been recognized and characterized. The off-target interactions can be responsible for either therapeutic or side effects. Thus, identifying the genome-wide off-targets of lead compounds or existing drugs will be critical for designing effective and safe drugs, and providing new opportunities for drug repurposing. Although many computational methods have been developed to predict drug-target interactions, they are either less accurate than the one that we are proposing here or computationally too intensive, thereby limiting their capability for large-scale off-target identification. In addition, the performances of most machine learning based algorithms have been mainly evaluated to predict off-target interactions in the same gene family for hundreds of chemicals. It is not clear how these algorithms perform in terms of detecting off-targets across gene families on a proteome scale. Here, we are presenting a fast and accurate off-target prediction method, REMAP, which is based on a dual regularized one-class collaborative filtering algorithm, to explore continuous chemical space, protein space, and their interactome on a large scale. When tested in a reliable, extensive, and cross-gene family benchmark, REMAP outperforms the state-of-the-art methods. Furthermore, REMAP is highly scalable. It can screen a dataset of 200 thousand chemicals against 20 thousand proteins within 2 hours. Using the reconstructed genome-wide target profile as the fingerprint of a chemical compound, we predicted that seven FDA-approved drugs can be repurposed as novel anti-cancer therapies. The anti-cancer activity of six of them is supported by experimental evidence. Thus, REMAP is a valuable addition to the existing in silico toolbox for drug target identification, drug repurposing, phenotypic screening, and side effect prediction. The software and benchmark are available at https://github.com/hansaimlim/REMAP. PMID:27716836

  11. Drug-target interaction prediction from PSSM based evolutionary information.

    PubMed

    Mousavian, Zaynab; Khakabimamaghani, Sahand; Kavousi, Kaveh; Masoudi-Nejad, Ali

    2016-01-01

    The labor-intensive and expensive experimental process of drug-target interaction prediction has motivated many researchers to focus on in silico prediction, which leads to the helpful information in supporting the experimental interaction data. Therefore, they have proposed several computational approaches for discovering new drug-target interactions. Several learning-based methods have been increasingly developed which can be categorized into two main groups: similarity-based and feature-based. In this paper, we firstly use the bi-gram features extracted from the Position Specific Scoring Matrix (PSSM) of proteins in predicting drug-target interactions. Our results demonstrate the high-confidence prediction ability of the Bigram-PSSM model in terms of several performance indicators specifically for enzymes and ion channels. Moreover, we investigate the impact of negative selection strategy on the performance of the prediction, which is not widely taken into account in the other relevant studies. This is important, as the number of non-interacting drug-target pairs is usually extremely large in comparison with the number of interacting ones in existing drug-target interaction data. An interesting observation is that different levels of performance reduction have been attained for four datasets when we change the sampling method from the random sampling to the balanced sampling. Copyright © 2015 Elsevier Inc. All rights reserved.

  12. Protein model quality assessment prediction by combining fragment comparisons and a consensus Cα contact potential

    PubMed Central

    Zhou, Hongyi; Skolnick, Jeffrey

    2009-01-01

    In this work, we develop a fully automated method for the quality assessment prediction of protein structural models generated by structure prediction approaches such as fold recognition servers, or ab initio methods. The approach is based on fragment comparisons and a consensus Cα contact potential derived from the set of models to be assessed and was tested on CASP7 server models. The average Pearson linear correlation coefficient between predicted quality and model GDT-score per target is 0.83 for the 98 targets which is better than those of other quality assessment methods that participated in CASP7. Our method also outperforms the other methods by about 3% as assessed by the total GDT-score of the selected top models. PMID:18004783
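
    The evaluation metric quoted above, an average per-target Pearson correlation between predicted quality and GDT-score, can be sketched as follows. The data layout (a mapping from target name to paired score lists) and the function names are assumptions made for illustration; the numbers are invented and are not CASP7 data.

```python
import math


def pearson(xs, ys):
    """Pearson linear correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def mean_per_target_correlation(by_target):
    """Average of the per-target correlations.

    by_target maps a target name to (predicted_quality, gdt_scores),
    one entry per model of that target.
    """
    values = [pearson(pred, gdt) for pred, gdt in by_target.values()]
    return sum(values) / len(values)


# Invented scores for two hypothetical targets with three models each.
toy_scores = {
    "target_A": ([0.2, 0.5, 0.9], [20.0, 50.0, 90.0]),
    "target_B": ([0.3, 0.6, 0.8], [80.0, 60.0, 30.0]),
}
print(mean_per_target_correlation(toy_scores))
```

    Averaging per target, rather than pooling all models, keeps easy and hard targets from dominating the score through their different GDT ranges.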

  13. Prediction of Protein Structure by Template-Based Modeling Combined with the UNRES Force Field.

    PubMed

    Krupa, Paweł; Mozolewska, Magdalena A; Joo, Keehyoung; Lee, Jooyoung; Czaplewski, Cezary; Liwo, Adam

    2015-06-22

    A new approach to the prediction of protein structures that uses distance and backbone virtual-bond dihedral angle restraints derived from template-based models and simulations with the united residue (UNRES) force field is proposed. The approach combines the accuracy and reliability of template-based methods for the segments of the target sequence with high similarity to those having known structures with the ability of UNRES to pack the domains correctly. Multiplexed replica-exchange molecular dynamics with restraints derived from template-based models of a given target, in which each restraint is weighted according to the accuracy of the prediction of the corresponding section of the molecule, is used to search the conformational space, and the weighted histogram analysis method and cluster analysis are applied to determine the families of the most probable conformations, from which candidate predictions are selected. To test the capability of the method to recover template-based models from restraints, five single-domain proteins with structures that have been well-predicted by template-based methods were used; it was found that the resulting structures were of the same quality as the best of the original models. To assess whether the new approach can improve template-based predictions with incorrectly predicted domain packing, four such targets were selected from the CASP10 targets; for three of them the new approach resulted in significantly better predictions compared with the original template-based models. The new approach can be used to predict the structures of proteins for which good templates can be found for sections of the sequence or an overall good template can be found for the entire sequence but the prediction quality is remarkably weaker in putative domain-linker regions.

  14. DrugE-Rank: improving drug-target interaction prediction of new candidate drugs or targets by ensemble learning to rank.

    PubMed

    Yuan, Qingjun; Gao, Junning; Wu, Dongliang; Zhang, Shihua; Mamitsuka, Hiroshi; Zhu, Shanfeng

    2016-06-15

    Identifying drug-target interactions is an important task in drug discovery. To reduce the heavy time and financial cost of experimental approaches, many computational approaches have been proposed. Although these approaches have used many different principles, their performance is far from satisfactory, especially in predicting drug-target interactions of new candidate drugs or targets. Approaches based on machine learning for this problem can be divided into two types: feature-based and similarity-based methods. Learning to rank (LTR) is the most powerful technique in the feature-based methods. Similarity-based methods are well accepted, due to their idea of connecting the chemical and genomic spaces, represented by drug and target similarities, respectively. We propose a new method, DrugE-Rank, to improve the prediction performance by combining the advantages of the two different types of methods. That is, DrugE-Rank uses LTR, for which multiple well-known similarity-based methods can be used as components of ensemble learning. The performance of DrugE-Rank is thoroughly examined by three main experiments using data from DrugBank: (i) cross-validation on FDA (US Food and Drug Administration) approved drugs before March 2014; (ii) independent test on FDA approved drugs after March 2014; and (iii) independent test on FDA experimental drugs. Experimental results show that DrugE-Rank outperforms competing methods significantly, especially achieving more than 30% improvement in area under the precision-recall curve for FDA approved new drugs and FDA experimental drugs. Availability: http://datamining-iip.fudan.edu.cn/service/DrugE-Rank. Contact: zhusf@fudan.edu.cn. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  15. Genome scale enzyme–metabolite and drug–target interaction predictions using the signature molecular descriptor

    DOE PAGES

    Faulon, Jean-Loup; Misra, Milind; Martin, Shawn; ...

    2007-11-23

    Motivation: Identifying protein enzymatic or pharmacological activities is an important area of research in biology and chemistry. Biological and chemical databases are increasingly being populated with linkages between protein sequences and chemical structures. Additionally, there is now sufficient information to apply machine-learning techniques to predict interactions between chemicals and proteins at a genome scale. Current machine-learning techniques use as input either protein sequences and structures or chemical information. We propose here a method to infer protein–chemical interactions using heterogeneous input consisting of both protein sequence and chemical information. Results: Our method relies on expressing proteins and chemicals with a common cheminformatics representation. We demonstrate our approach by predicting whether proteins can catalyze reactions not present in training sets. We also predict whether a given drug can bind a target, in the absence of prior binding information for that drug and target. Lastly, such predictions cannot be made with current machine-learning techniques requiring binding information for individual reactions or individual targets.

  16. A new approach to human microRNA target prediction using ensemble pruning and rotation forest.

    PubMed

    Mousavi, Reza; Eftekhari, Mahdi; Haghighi, Mehdi Ghezelbash

    2015-12-01

    MicroRNAs (miRNAs) are small non-coding RNAs that have important functions in gene regulation. Since finding miRNA targets experimentally is costly and time-consuming, the use of machine learning methods is a growing research area for miRNA target prediction. In this paper, a new approach is proposed that uses two popular ensemble strategies, i.e. Ensemble Pruning and Rotation Forest (EP-RTF), to predict human miRNA targets. For EP, the approach utilizes a Genetic Algorithm (GA). In other words, a subset of classifiers from the heterogeneous ensemble is first selected by GA. Next, the selected classifiers are trained based on the RTF method and then are combined using weighted majority voting. In addition to seeking a better subset of classifiers, the parameter of RTF is also optimized by GA. Findings of the present study confirm that the newly developed EP-RTF outperforms (in terms of classification accuracy, sensitivity, and specificity) the previously applied methods over four datasets in the field of human miRNA target prediction. Diversity-error diagrams reveal that the proposed ensemble approach constructs individual classifiers which are more accurate and usually more diverse than those of the other ensemble approaches. Given these experimental results, we highly recommend EP-RTF for improving the performance of miRNA target prediction.

  17. Plant microRNA-Target Interaction Identification Model Based on the Integration of Prediction Tools and Support Vector Machine

    PubMed Central

    Meng, Jun; Shi, Lin; Luan, Yushi

    2014-01-01

    Background: Confident identification of microRNA-target interactions is significant for studying the function of microRNA (miRNA). Although some computational miRNA target prediction methods have been proposed for plants, results of various methods tend to be inconsistent and usually lead to more false positives. To address these issues, we developed an integrated model for identifying plant miRNA–target interactions. Results: Three online miRNA target prediction toolkits and machine learning algorithms were integrated to identify and analyze Arabidopsis thaliana miRNA-target interactions. Principal component analysis (PCA) feature extraction and self-training technology were introduced to improve the performance. Results showed that the proposed model outperformed the previously existing methods. The results were validated by using Arabidopsis thaliana miRNA-target interactions supported by degradome sequencing. The proposed model constructed on Arabidopsis thaliana was run over Oryza sativa and Vitis vinifera to demonstrate that our model is effective for other plant species. Conclusions: The integrated model of online predictors and a local PCA-SVM classifier yielded credible and high-quality miRNA-target interactions. The supervised learning algorithm of the PCA-SVM classifier was employed in plant miRNA target identification for the first time. Its performance can be substantially improved if more experimentally proven training samples are provided. PMID:25051153

  18. Brainstorming: weighted voting prediction of inhibitors for protein targets.

    PubMed

    Plewczynski, Dariusz

    2011-09-01

    The "Brainstorming" approach presented in this paper is a weighted voting method that can improve the quality of predictions generated by several machine learning (ML) methods. First, an ensemble of heterogeneous ML algorithms is trained on available experimental data, then all solutions are gathered and a consensus is built between them. The final prediction is performed using a voting procedure, whereby the vote of each method is weighted according to a quality coefficient calculated using multivariable linear regression (MLR). The MLR optimization procedure is very fast, therefore no additional computational cost is introduced by using this jury approach. Here, brainstorming is applied to selecting actives from large collections of compounds relating to five diverse biological targets of medicinal interest, namely HIV-reverse transcriptase, cyclooxygenase-2, dihydrofolate reductase, estrogen receptor, and thrombin. The MDL Drug Data Report (MDDR) database was used for selecting known inhibitors for these protein targets, and experimental data was then used to train a set of machine learning methods. The benchmark dataset (available at http://bio.icm.edu.pl/∼darman/chemoinfo/benchmark.tar.gz ) can be used for further testing of various clustering and machine learning methods when predicting the biological activity of compounds. Depending on the protein target, the overall recall value is raised by at least 20% in comparison to any single machine learning method (including ensemble methods like random forest) and unweighted simple majority voting procedures.
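
    The voting step described in this record can be illustrated with a minimal sketch. It assumes binary votes (1 = active, 0 = inactive) and per-method weights that are already known; the function names, method names, and weight values below are hypothetical, and the weights are set by hand rather than fitted by multivariable linear regression as in the paper.

```python
from typing import Dict


def weighted_vote(votes: Dict[str, int],
                  weights: Dict[str, float],
                  threshold: float = 0.5) -> int:
    """Combine binary votes (1 = active, 0 = inactive) using per-method weights.

    Returns 1 when the weight-normalised share of "active" votes reaches
    the threshold, otherwise 0.
    """
    total = sum(weights[m] for m in votes)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    score = sum(weights[m] * votes[m] for m in votes) / total
    return 1 if score >= threshold else 0


def majority_vote(votes: Dict[str, int]) -> int:
    """Unweighted baseline: every method gets the same weight."""
    return weighted_vote(votes, {m: 1.0 for m in votes})


# Hypothetical example: one trusted method disagrees with two weaker ones.
votes = {"svm": 1, "random_forest": 0, "knn": 0}
weights = {"svm": 0.7, "random_forest": 0.2, "knn": 0.1}
print(weighted_vote(votes, weights))  # weighted share of "active" is 0.7
print(majority_vote(votes))           # unweighted share of "active" is 1/3
```

    With these illustrative numbers the weighted vote follows the highly weighted method, whereas the simple majority does not, which is the kind of difference the record contrasts against unweighted majority voting.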

  19. Synergistic target combination prediction from curated signaling networks: Machine learning meets systems biology and pharmacology.

    PubMed

    Chua, Huey Eng; Bhowmick, Sourav S; Tucker-Kellogg, Lisa

    2017-10-01

    Given a signaling network, the target combination prediction problem aims to predict efficacious and safe target combinations for combination therapy. State-of-the-art in silico methods use Monte Carlo simulated annealing (mcsa) to modify a candidate solution stochastically, and use the Metropolis criterion to accept or reject the proposed modifications. However, such stochastic modifications ignore the impact of the choice of targets and their activities on the combination's therapeutic effect and off-target effects, which directly affect the solution quality. In this paper, we present mascot, a method that addresses this limitation by leveraging two additional heuristic criteria to minimize off-target effects and achieve synergy for candidate modification. Specifically, off-target effects measure the unintended response of a signaling network to the target combination and are often associated with toxicity. Synergy occurs when a pair of targets exerts effects that are greater than the sum of their individual effects, and is generally a beneficial strategy for maximizing effect while minimizing toxicity. mascot leverages a machine learning-based target prioritization method, which prioritizes potential targets in a given disease-associated network to select more effective targets (better therapeutic effect and/or lower off-target effects), and Loewe additivity theory from pharmacology, which assesses the non-additive effects in a combination drug treatment to select synergistic target activities. Our experimental study on two disease-related signaling networks demonstrates the superiority of mascot in comparison to existing approaches. Copyright © 2017 Elsevier Inc. All rights reserved.

  20. Global analysis of bacterial transcription factors to predict cellular target processes.

    PubMed

    Doerks, Tobias; Andrade, Miguel A; Lathe, Warren; von Mering, Christian; Bork, Peer

    2004-03-01

    Whole-genome sequences are now available for >100 bacterial species, giving unprecedented power to comparative genomics approaches. We have applied genome-context methods to predict target processes that are regulated by transcription factors (TFs). Of 128 orthologous groups of proteins annotated as TFs, to date, 36 are functionally uncharacterized; in our analysis we predict a probable cellular target process or biochemical pathway for half of these functionally uncharacterized TFs.

  1. DIANA-microT web server: elucidating microRNA functions through target prediction.

    PubMed

    Maragkakis, M; Reczko, M; Simossis, V A; Alexiou, P; Papadopoulos, G L; Dalamagas, T; Giannopoulos, G; Goumas, G; Koukis, E; Kourtis, K; Vergoulis, T; Koziris, N; Sellis, T; Tsanakas, P; Hatzigeorgiou, A G

    2009-07-01

    Computational microRNA (miRNA) target prediction is one of the key means for deciphering the role of miRNAs in development and disease. Here, we present the DIANA-microT web server as the user interface to the DIANA-microT 3.0 miRNA target prediction algorithm. The web server provides extensive information for predicted miRNA:target gene interactions with a user-friendly interface, providing extensive connectivity to online biological resources. Target gene and miRNA functions may be elucidated through automated bibliographic searches, and functional information is accessible through Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The web server offers links to nomenclature, sequence and protein databases, and users can search for targeted genes using different nomenclatures or functional features, such as the genes' possible involvement in biological pathways. The target prediction algorithm supports parameters calculated individually for each miRNA:target gene interaction and provides a signal-to-noise ratio and a precision score that help in the evaluation of the significance of the predicted results. Using a set of miRNA targets recently identified through the pSILAC method, the performance of several computational target prediction programs was assessed. DIANA-microT 3.0 achieved the highest ratio of correctly predicted targets over all predicted targets, at 66%. The DIANA-microT web server is freely available at www.microrna.gr/microT.

  2. Comprehensive predictions of target proteins based on protein-chemical interaction using virtual screening and experimental verifications.

    PubMed

    Kobayashi, Hiroki; Harada, Hiroko; Nakamura, Masaomi; Futamura, Yushi; Ito, Akihiro; Yoshida, Minoru; Iemura, Shun-Ichiro; Shin-Ya, Kazuo; Doi, Takayuki; Takahashi, Takashi; Natsume, Tohru; Imoto, Masaya; Sakakibara, Yasubumi

    2012-04-05

    Identification of the target proteins of bioactive compounds is critical for elucidating the mode of action; however, target identification has been difficult in general, mostly due to the low sensitivity of detection using affinity chromatography followed by CBB staining and MS/MS analysis. We applied our protocol of predicting target proteins, which combines in silico screening and experimental verification, to incednine, which inhibits the anti-apoptotic function of Bcl-xL by an unknown mechanism. One hundred eighty-two target protein candidates were computationally predicted to bind to incednine by the statistical prediction method, and the predictions were verified by in vitro binding of incednine to seven proteins, whose expression can be confirmed in our cell system. As a result, the computational predictions achieved 40% accuracy, and we newly found 3 incednine-binding proteins. This study revealed that our proposed protocol of predicting target proteins by combining in silico screening and experimental verification is useful, and provides new insight into a strategy for identifying target proteins of small molecules.

  3. Using Deep Learning for Compound Selectivity Prediction.

    PubMed

    Zhang, Ruisheng; Li, Juan; Lu, Jingjing; Hu, Rongjing; Yuan, Yongna; Zhao, Zhili

    2016-01-01

    Compound selectivity prediction plays an important role in identifying potential compounds that bind to the target of interest with high affinity. However, there is still a shortage of efficient and accurate computational approaches to analyze and predict compound selectivity. In this paper, we propose two methods to improve compound selectivity prediction. We employ an improved multitask learning method in Neural Networks (NNs), which not only incorporates both activity and selectivity for other targets, but also uses a probabilistic classifier with a logistic regression. We further improve compound selectivity prediction by using the multitask learning method in Deep Belief Networks (DBNs), which can build a distributed representation model and improve the generalization of the shared tasks. In addition, we assign different weights to the auxiliary tasks that are related to the primary selectivity prediction task. In contrast to other related work, our methods greatly improve the accuracy of compound selectivity prediction; in particular, multitask learning in DBNs with modified weights obtains the best performance.

  4. Integrative genetic risk prediction using non-parametric empirical Bayes classification.

    PubMed

    Zhao, Sihai Dave

    2017-06-01

    Genetic risk prediction is an important component of individualized medicine, but prediction accuracies remain low for many complex diseases. A fundamental limitation is the sample sizes of the studies on which the prediction algorithms are trained. One way to increase the effective sample size is to integrate information from previously existing studies. However, it can be difficult to find existing data that examine the target disease of interest, especially if that disease is rare or poorly studied. Furthermore, individual-level genotype data from these auxiliary studies are typically difficult to obtain. This article proposes a new approach to integrative genetic risk prediction of complex diseases with binary phenotypes. It accommodates possible heterogeneity in the genetic etiologies of the target and auxiliary diseases using a tuning parameter-free non-parametric empirical Bayes procedure, and can be trained using only auxiliary summary statistics. Simulation studies show that the proposed method can provide superior predictive accuracy relative to non-integrative as well as integrative classifiers. The method is applied to a recent study of pediatric autoimmune diseases, where it substantially reduces prediction error for certain target/auxiliary disease combinations. The proposed method is implemented in the R package ssa. © 2016, The International Biometric Society.

  5. Drug-target interaction prediction: A Bayesian ranking approach.

    PubMed

    Peska, Ladislav; Buza, Krisztian; Koller, Júlia

    2017-12-01

    In silico prediction of drug-target interactions (DTI) could provide valuable information and speed up the process of drug repositioning - finding novel usage for existing drugs. In our work, we focus on machine learning algorithms supporting the drug-centric repositioning approach, which aims to find novel usage for existing or abandoned drugs. We aim at proposing a per-drug ranking-based method, which reflects the needs of drug-centric repositioning research better than conventional drug-target prediction approaches. We propose Bayesian Ranking Prediction of Drug-Target Interactions (BRDTI). The method is based on Bayesian Personalized Ranking matrix factorization (BPR), which has been shown to be an excellent approach for various preference learning tasks; however, it has not been used for DTI prediction previously. In order to successfully deal with DTI challenges, we extended BPR by proposing: (i) the incorporation of target bias, (ii) a technique to handle new drugs and (iii) content alignment to take structural similarities of drugs and targets into account. Evaluation on five benchmark datasets shows that BRDTI outperforms several state-of-the-art approaches in terms of per-drug nDCG and AUC. BRDTI results w.r.t. nDCG are 0.929, 0.953, 0.948, 0.897 and 0.690 for G-Protein Coupled Receptors (GPCR), Ion Channels (IC), Nuclear Receptors (NR), Enzymes (E) and Kinase (K) datasets respectively. Additionally, BRDTI significantly outperformed other methods (BLM-NII, WNN-GIP, NetLapRLS and CMF) w.r.t. nDCG in 17 out of 20 cases. Furthermore, BRDTI was also shown to be able to predict novel drug-target interactions not contained in the original datasets. The average recall at top-10 predicted targets for each drug was 0.762, 0.560, 1.000 and 0.404 for GPCR, IC, NR, and E datasets respectively. Based on the evaluation, we can conclude that BRDTI is an appropriate choice for researchers looking for an in silico DTI prediction technique to be used in drug-centric repositioning scenarios. BRDTI Software and supplementary materials are available online at www.ksi.mff.cuni.cz/∼peska/BRDTI. Copyright © 2017 Elsevier B.V. All rights reserved.
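
    The per-drug nDCG metric reported in this record can be sketched as follows. This is a minimal illustration, assuming binary relevance (1 = known interaction, 0 = unknown), linear gain, and a log2 rank discount; the exact nDCG variant used by the authors is not stated in the abstract, and the function names and the drug/ranking data are hypothetical.

```python
import math
from typing import Dict, Sequence


def dcg(relevances: Sequence[float]) -> float:
    """Discounted cumulative gain of relevances listed in ranked order."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(ranked_relevances: Sequence[float]) -> float:
    """DCG normalised by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(ranked_relevances, reverse=True))
    return dcg(ranked_relevances) / ideal if ideal > 0 else 0.0


def per_drug_ndcg(rankings: Dict[str, Sequence[float]]) -> float:
    """Average nDCG over drugs; each drug has its own ranked target list."""
    if not rankings:
        return 0.0
    return sum(ndcg(r) for r in rankings.values()) / len(rankings)


# Hypothetical rankings: relevance of each predicted target, best rank first.
rankings = {
    "drug_A": [1, 0, 0, 1],  # one true target ranked first, one ranked last
    "drug_B": [0, 1, 0, 0],  # the only true target is ranked second
}
print(per_drug_ndcg(rankings))
```

    Averaging per drug, rather than pooling all drug-target pairs, matches the drug-centric view in the abstract: each drug's ranked target list is scored on its own.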

  6. MultiMiTar: a novel multi objective optimization based miRNA-target prediction method.

    PubMed

    Mitra, Ramkrishna; Bandyopadhyay, Sanghamitra

    2011-01-01

    Machine learning based miRNA-target prediction algorithms often fail to obtain a balanced prediction accuracy in terms of both sensitivity and specificity due to the lack of gold-standard negative examples, of relevant features specific to the miRNA-targeting site context, and of an efficient feature selection process. Moreover, all the sequence, structure and machine learning based algorithms are unable to distribute the true positive predictions preferentially at the top of the ranked list; hence the algorithms become unreliable for biologists. In addition, these algorithms fail to obtain a considerable combination of precision and recall for the target transcripts that are translationally repressed at the protein level. In this article, we introduce an efficient miRNA-target prediction system, MultiMiTar, a Support Vector Machine (SVM) based classifier integrated with a multiobjective metaheuristic based feature selection technique. The robust performance of the proposed method is mainly the result of using high-quality negative examples and selecting biologically relevant features specific to the miRNA-targeting site context. The features are selected by using a novel feature selection technique, AMOSA-SVM, that integrates the multiobjective optimization technique Archived Multi-Objective Simulated Annealing (AMOSA) and SVM. MultiMiTar is found to achieve a much higher Matthews correlation coefficient (MCC) of 0.583 and average class-wise accuracy (ACA) of 0.8 compared to the other target prediction methods for a completely independent test data set. The MCC and ACA values obtained by these other algorithms range from -0.269 to 0.155 and 0.321 to 0.582, respectively. Moreover, it shows a more balanced result in terms of precision and sensitivity (recall) for the translationally repressed data set as compared to all the other existing methods. An important aspect is that the true positive predictions are distributed preferentially at the top of the ranked list, which makes MultiMiTar reliable for biologists. MultiMiTar is now available as an online tool at www.isical.ac.in/~bioinfo_miu/multimitar.htm. MultiMiTar software can be downloaded from www.isical.ac.in/~bioinfo_miu/multimitar-download.htm.
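
    The two summary metrics quoted in this record can be computed from a confusion matrix as sketched below. MCC follows its standard definition; ACA is taken here to be the mean of sensitivity and specificity, which is an assumption about the authors' definition. The function names and the counts in the example are hypothetical and are not taken from the paper.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def average_classwise_accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Mean of sensitivity and specificity (assumed definition of ACA)."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return (sensitivity + specificity) / 2


# Hypothetical confusion matrix: 50 positives and 50 negatives.
print(mcc(tp=40, tn=30, fp=20, fn=10))
print(average_classwise_accuracy(tp=40, tn=30, fp=20, fn=10))
```

    Both metrics weigh the positive and negative classes together, which is why they suit the record's emphasis on balanced sensitivity and specificity.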

  7. Spot Weight Adaptation for Moving Target in Spot Scanning Proton Therapy.

    PubMed

    Morel, Paul; Wu, Xiaodong; Blin, Guillaume; Vialette, Stéphane; Flynn, Ryan; Hyer, Daniel; Wang, Dongxu

    2015-01-01

    This study describes a real-time spot weight adaptation method in spot-scanning proton therapy for a moving target or moving patient, so that the resultant dose distribution closely matches the planned dose distribution. The method proposed in this study adapts the weight [Monitor Unit (MU)] of the delivering pencil beam to that of the target spot it will actually hit during patient/target motion. The target spot that a certain delivering pencil beam may hit relies on patient monitoring and/or motion modeling using four-dimensional (4D) CT. After the adapted delivery, the required total weight (MU) for this target spot is then subtracted from the planned value. With continuous patient motion and continuous spot scanning, the planned doses to all target spots will eventually all be fulfilled. In a proof-of-principle test, a lung case was presented with realistic temporal and motion parameters; the resultant dose distribution using spot weight adaptation was compared to that without using this method. The impact of the real-time patient/target position tracking or prediction was also investigated. For moderate motion (i.e., mean amplitude 0.5 cm), without spot weight adaptation D95% to the planning target volume (PTV) was only 81.5% of the prescription (RX) dose; with spot weight adaptation, PTV D95% achieves 97.7% RX. For large motion amplitude (i.e., 1.5 cm), without spot weight adaptation PTV D95% is only 42.9% of RX; with spot weight adaptation, PTV D95% achieves 97.7% RX. Larger errors in patient/target position tracking or prediction led to worse final target coverage; an error of 3 mm or smaller in patient/target position tracking is preferred. The proposed spot weight adaptation method was able to deliver the planned dose distribution and maintain target coverage when patient motion was involved. The successful implementation of this method would rely on accurate monitoring or prediction of patient/target motion.

  8. A Systematic Prediction of Drug-Target Interactions Using Molecular Fingerprints and Protein Sequences.

    PubMed

    Huang, Yu-An; You, Zhu-Hong; Chen, Xing

    2018-01-01

    Drug-Target Interactions (DTI) play a crucial role in discovering new drug candidates and finding new proteins to target for drug development. Although the number of detected DTI obtained by high-throughput techniques has been increasing, the number of known DTI is still limited. On the other hand, the experimental methods for detecting the interactions among drugs and proteins are costly and inefficient. Therefore, computational approaches for predicting DTI have been drawing increasing attention in recent years. In this paper, we report a novel computational model for predicting DTI using an extremely randomized trees model and protein amino acid information. More specifically, the protein sequence is represented as a Pseudo Substitution Matrix Representation (Pseudo-SMR) descriptor in which the influence of biological evolutionary information is retained. For the representation of drug molecules, a novel fingerprint feature vector is utilized to describe their substructure information. Then the DTI pair is characterized by concatenating the two vector spaces of protein sequence and drug substructure. Finally, the proposed method is evaluated for predicting DTI on four benchmark datasets: Enzyme, Ion Channel, GPCRs and Nuclear Receptor. The experimental results demonstrate that this method achieves promising prediction accuracies of 89.85%, 87.87%, 82.99% and 81.67%, respectively. For further evaluation, we compared the performance of the Extremely Randomized Trees model with that of the state-of-the-art Support Vector Machine classifier. We also compared the proposed model with existing computational models, and confirmed 15 potential drug-target interactions by searching existing databases. The experimental results show that the proposed method is feasible and promising for predicting drug-target interactions for new drug candidate screening based on sizeable features.

  9. MOST: most-similar ligand based approach to target prediction.

    PubMed

    Huang, Tao; Mi, Hong; Lin, Cheng-Yuan; Zhao, Ling; Zhong, Linda L D; Liu, Feng-Bin; Zhang, Ge; Lu, Ai-Ping; Bian, Zhao-Xiang

    2017-03-11

    Many computational approaches have been used for target prediction, including machine learning, reverse docking, bioactivity spectra analysis, and chemical similarity searching. Recent studies have suggested that chemical similarity searching may be driven by the most-similar ligand. However, the extent of bioactivity of most-similar ligands has been oversimplified or even neglected in these studies, and this has impaired the prediction power. Here we propose the MOst-Similar ligand-based Target inference approach, namely MOST, which uses fingerprint similarity and explicit bioactivity of the most-similar ligands to predict targets of the query compound. Performance of MOST was evaluated by using combinations of different fingerprint schemes, machine learning methods, and bioactivity representations. In sevenfold cross-validation with a benchmark Ki dataset from CHEMBL release 19 containing 61,937 bioactivity data of 173 human targets, MOST achieved high average prediction accuracy (0.95 for pKi ≥ 5, and 0.87 for pKi ≥ 6). Morgan fingerprint was shown to be slightly better than FP2. Logistic Regression and Random Forest methods performed better than Naïve Bayes. In a temporal validation, the Ki dataset from CHEMBL19 was used to train models and predict the bioactivity of newly deposited ligands in CHEMBL20. MOST also performed well with high accuracy (0.90 for pKi ≥ 5, and 0.76 for pKi ≥ 6) when Logistic Regression and Morgan fingerprint were employed. Furthermore, the p values associated with explicit bioactivity were found to be a robust index for removing false positive predictions. Implicit bioactivity did not offer this capability. Finally, p values generated with Logistic Regression, Morgan fingerprint and explicit activity were integrated with a false discovery rate (FDR) control procedure to reduce false positives in the multiple-target prediction scenario, and the success of this strategy was demonstrated with a case of fluanisone. In the case of aloe-emodin's laxative effect, MOST predicted that acetylcholinesterase was the mechanism-of-action target; in vivo studies validated this prediction. Using the MOST approach can result in highly accurate and robust target prediction. Integrated with an FDR control procedure, MOST provides a reliable framework for multiple-target inference. It has prospective applications in drug repurposing and mechanism-of-action target prediction.
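
    The FDR control step mentioned in this record can be illustrated with a short sketch. The abstract does not name the procedure used, so the Benjamini-Hochberg step-up procedure below is an assumption chosen because it is the most common FDR method; the function name and the p values are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], alpha: float = 0.05) -> List[int]:
    """Return the indices of predictions kept at FDR level alpha.

    Step-up rule: find the largest rank k (1-based, p values sorted
    ascending) with p_(k) <= alpha * k / m, then keep the k smallest.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


# Hypothetical p values for four candidate targets of one query compound.
p_values = [0.001, 0.04, 0.03, 0.5]
print(benjamini_hochberg(p_values, alpha=0.05))
```

    In a multiple-target setting, each candidate target contributes one p value, so a correction of this kind limits the expected share of false positives among the targets that are finally reported.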

  10. Experimental and statistical post-validation of positive example EST sequences carrying peroxisome targeting signals type 1 (PTS1)

    PubMed Central

    Lingner, Thomas; Kataya, Amr R. A.; Reumann, Sigrun

    2012-01-01

    We recently developed the first algorithms specifically for plants to predict proteins carrying peroxisome targeting signals type 1 (PTS1) from genome sequences. As validated experimentally, the prediction methods are able to correctly predict unknown peroxisomal Arabidopsis proteins and to infer novel PTS1 tripeptides. The high prediction performance is primarily determined by the large number and sequence diversity of the underlying positive example sequences, which were mainly derived from EST databases. However, a few constructs remained cytosolic in experimental validation studies, indicating sequencing errors in some ESTs. To identify erroneous sequences, we validated subcellular targeting of additional positive example sequences in the present study. Moreover, we analyzed the distribution of prediction scores separately for each orthologous group of PTS1 proteins, which generally resembled normal distributions with group-specific mean values. The cytosolic sequences commonly represented outliers of low prediction scores and were located at the very tail of a fitted normal distribution. Three statistical methods for identifying outliers were compared in terms of sensitivity and specificity. Their combined application allows elimination of erroneous ESTs from positive example data sets. This new post-validation method will further improve the prediction accuracy of both PTS1 and PTS2 protein prediction models for plants, fungi, and mammals. PMID:22415050

  11. Experimental and statistical post-validation of positive example EST sequences carrying peroxisome targeting signals type 1 (PTS1).

    PubMed

    Lingner, Thomas; Kataya, Amr R A; Reumann, Sigrun

    2012-02-01

    We recently developed the first algorithms specifically for plants to predict proteins carrying peroxisome targeting signals type 1 (PTS1) from genome sequences. As validated experimentally, the prediction methods are able to correctly predict unknown peroxisomal Arabidopsis proteins and to infer novel PTS1 tripeptides. The high prediction performance is primarily determined by the large number and sequence diversity of the underlying positive example sequences, which were mainly derived from EST databases. However, a few constructs remained cytosolic in experimental validation studies, indicating sequencing errors in some ESTs. To identify erroneous sequences, we validated subcellular targeting of additional positive example sequences in the present study. Moreover, we analyzed the distribution of prediction scores separately for each orthologous group of PTS1 proteins, which generally resembled normal distributions with group-specific mean values. The cytosolic sequences commonly represented outliers of low prediction scores and were located at the very tail of a fitted normal distribution. Three statistical methods for identifying outliers were compared in terms of sensitivity and specificity. Their combined application allows elimination of erroneous ESTs from positive example data sets. This new post-validation method will further improve the prediction accuracy of both PTS1 and PTS2 protein prediction models for plants, fungi, and mammals.
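
    The idea of flagging low-score outliers in the tail of a fitted normal distribution can be sketched as below. The three statistical methods compared in the study are not named in the abstract, so the one-sided z-score rule here is only an assumed stand-in; the function name, the cutoff of -2.0, and the scores are hypothetical.

```python
import statistics
from typing import List, Sequence


def low_score_outliers(scores: Sequence[float], z_cutoff: float = -2.0) -> List[int]:
    """Return indices of scores lying far in the lower tail.

    A normal distribution is fitted to one orthologous group's prediction
    scores (sample mean and sample standard deviation), and a score is
    flagged when its z-score falls below z_cutoff.
    """
    if len(scores) < 2:
        return []
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)
    if sd == 0:
        return []
    return [i for i, s in enumerate(scores) if (s - mean) / sd < z_cutoff]


# Hypothetical prediction scores for one orthologous group of PTS1 proteins;
# the last sequence scores far below the rest of its group.
group_scores = [0.82, 0.85, 0.80, 0.88, 0.84, 0.83, 0.86, 0.81, 0.30]
print(low_score_outliers(group_scores))
```

    Fitting the distribution per orthologous group, as the record describes, matters because each group has its own mean score; a single global cutoff would not account for that.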

  12. Recommendation Techniques for Drug-Target Interaction Prediction and Drug Repositioning.

    PubMed

    Alaimo, Salvatore; Giugno, Rosalba; Pulvirenti, Alfredo

    2016-01-01

    The usage of computational methods in drug discovery is a common practice. More recently, by exploiting the wealth of biological knowledge bases, a novel approach called drug repositioning has arisen. Several computational methods are available, and these try to make a high-level integration of all the knowledge in order to discover unknown mechanisms. In this chapter, we review drug-target interaction prediction methods based on a recommendation system. We also give some extensions which go beyond the bipartite network case.

  13. General overview on structure prediction of twilight-zone proteins.

    PubMed

    Khor, Bee Yin; Tye, Gee Jun; Lim, Theam Soon; Choong, Yee Siew

    2015-09-04

    Protein structure prediction from amino acid sequence has been one of the most challenging aspects in computational structural biology despite significant progress in recent years shown by critical assessment of protein structure prediction (CASP) experiments. When experimentally determined structures are unavailable, the predicted structures may serve as starting points to study a protein. If the target protein contains a homologous region, a high-resolution (typically <1.5 Å) model can be built via comparative modelling. However, when confronted with low sequence similarity of the target protein (also known as twilight-zone protein, where sequence identity with available templates is less than 30%), the protein structure prediction has to be initiated from scratch. Traditionally, twilight-zone proteins can be predicted via threading or ab initio methods. Based on the current trend, a combination of different methods brings improved success in the prediction of twilight-zone proteins. In this mini review, the methods, progress and challenges for the prediction of twilight-zone proteins are discussed.

  14. Tracking Multiple Video Targets with an Improved GM-PHD Tracker

    PubMed Central

    Zhou, Xiaolong; Yu, Hui; Liu, Honghai; Li, Youfu

    2015-01-01

    Tracking multiple moving targets from a video plays an important role in many vision-based robotic applications. In this paper, we propose an improved Gaussian mixture probability hypothesis density (GM-PHD) tracker with weight penalization to effectively and accurately track multiple moving targets from a video. First, an entropy-based birth intensity estimation method is incorporated to eliminate the false positives caused by noisy video data. Then, a weight-penalized method with multi-feature fusion is proposed to accurately track the targets in close movement. For targets without occlusion, a weight matrix that contains all updated weights between the predicted target states and the measurements is constructed, and a simple, but effective method based on total weight and predicted target state is proposed to search the ambiguous weights in the weight matrix. The ambiguous weights are then penalized according to the fused target features that include spatial-colour appearance, histogram of oriented gradient and target area and further re-normalized to form a new weight matrix. With this new weight matrix, the tracker can correctly track the targets in close movement without occlusion. For targets with occlusion, a robust game-theoretical method is used. Finally, the experiments conducted on various video scenarios validate the effectiveness of the proposed penalization method and show the superior performance of our tracker over the state of the art. PMID:26633422

  15. miRTar2GO: a novel rule-based model learning method for cell line specific microRNA target prediction that integrates Ago2 CLIP-Seq and validated microRNA-target interaction data.

    PubMed

    Ahadi, Alireza; Sablok, Gaurav; Hutvagner, Gyorgy

    2017-04-07

    MicroRNAs (miRNAs) are ∼19-22 nucleotides (nt) long regulatory RNAs that regulate gene expression by recognizing and binding to complementary sequences on mRNAs. The key step in revealing the function of a miRNA, is the identification of miRNA target genes. Recent biochemical advances including PAR-CLIP and HITS-CLIP allow for improved miRNA target predictions and are widely used to validate miRNA targets. Here, we present miRTar2GO, which is a model, trained on the common rules of miRNA-target interactions, Argonaute (Ago) CLIP-Seq data and experimentally validated miRNA target interactions. miRTar2GO is designed to predict miRNA target sites using more relaxed miRNA-target binding characteristics. More importantly, miRTar2GO allows for the prediction of cell-type specific miRNA targets. We have evaluated miRTar2GO against other widely used miRNA target prediction algorithms and demonstrated that miRTar2GO produced significantly higher F1 and G scores. Target predictions, binding specifications, results of the pathway analysis and gene ontology enrichment of miRNA targets are freely available at http://www.mirtar2go.org. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  16. Research on cross-project software defect prediction based on transfer learning

    NASA Astrophysics Data System (ADS)

    Chen, Ya; Ding, Xiaoming

    2018-04-01

    To address two challenges in cross-project software defect prediction, namely the distribution differences between the source and target project datasets and the class imbalance in the dataset, this paper proposes a cross-project software defect prediction method based on transfer learning, named NTrA. First, the class imbalance of the source project data is addressed with the Augmented Neighborhood Cleaning Algorithm. Second, the data gravity method is used to assign different weights based on the attribute similarity of the source project and target project data. Finally, a defect prediction model is constructed using the TrAdaBoost algorithm. Experiments were conducted using data from NASA and SOFTLAB, both taken from a published PROMISE dataset. The results show that the method achieved good recall and F-measure values and good prediction results.

  17. A large-scale evaluation of computational protein function prediction

    PubMed Central

    Radivojac, Predrag; Clark, Wyatt T; Ronnen Oron, Tal; Schnoes, Alexandra M; Wittkop, Tobias; Sokolov, Artem; Graim, Kiley; Funk, Christopher; Verspoor, Karin; Ben-Hur, Asa; Pandey, Gaurav; Yunes, Jeffrey M; Talwalkar, Ameet S; Repo, Susanna; Souza, Michael L; Piovesan, Damiano; Casadio, Rita; Wang, Zheng; Cheng, Jianlin; Fang, Hai; Gough, Julian; Koskinen, Patrik; Törönen, Petri; Nokso-Koivisto, Jussi; Holm, Liisa; Cozzetto, Domenico; Buchan, Daniel W A; Bryson, Kevin; Jones, David T; Limaye, Bhakti; Inamdar, Harshal; Datta, Avik; Manjari, Sunitha K; Joshi, Rajendra; Chitale, Meghana; Kihara, Daisuke; Lisewski, Andreas M; Erdin, Serkan; Venner, Eric; Lichtarge, Olivier; Rentzsch, Robert; Yang, Haixuan; Romero, Alfonso E; Bhat, Prajwal; Paccanaro, Alberto; Hamp, Tobias; Kassner, Rebecca; Seemayer, Stefan; Vicedo, Esmeralda; Schaefer, Christian; Achten, Dominik; Auer, Florian; Böhm, Ariane; Braun, Tatjana; Hecht, Maximilian; Heron, Mark; Hönigschmid, Peter; Hopf, Thomas; Kaufmann, Stefanie; Kiening, Michael; Krompass, Denis; Landerer, Cedric; Mahlich, Yannick; Roos, Manfred; Björne, Jari; Salakoski, Tapio; Wong, Andrew; Shatkay, Hagit; Gatzmann, Fanny; Sommer, Ingolf; Wass, Mark N; Sternberg, Michael J E; Škunca, Nives; Supek, Fran; Bošnjak, Matko; Panov, Panče; Džeroski, Sašo; Šmuc, Tomislav; Kourmpetis, Yiannis A I; van Dijk, Aalt D J; ter Braak, Cajo J F; Zhou, Yuanpeng; Gong, Qingtian; Dong, Xinran; Tian, Weidong; Falda, Marco; Fontana, Paolo; Lavezzo, Enrico; Di Camillo, Barbara; Toppo, Stefano; Lan, Liang; Djuric, Nemanja; Guo, Yuhong; Vucetic, Slobodan; Bairoch, Amos; Linial, Michal; Babbitt, Patricia C; Brenner, Steven E; Orengo, Christine; Rost, Burkhard; Mooney, Sean D; Friedberg, Iddo

    2013-01-01

    Automated annotation of protein function is challenging. As the number of sequenced genomes rapidly grows, the overwhelming majority of protein products can only be annotated computationally. If computational predictions are to be relied upon, it is crucial that the accuracy of these methods be high. Here we report the results from the first large-scale community-based Critical Assessment of protein Function Annotation (CAFA) experiment. Fifty-four methods representing the state-of-the-art for protein function prediction were evaluated on a target set of 866 proteins from eleven organisms. Two findings stand out: (i) today’s best protein function prediction algorithms significantly outperformed widely-used first-generation methods, with large gains on all types of targets; and (ii) although the top methods perform well enough to guide experiments, there is significant need for improvement of currently available tools. PMID:23353650

  18. Global vision of druggability issues: applications and perspectives.

    PubMed

    Abi Hussein, Hiba; Geneix, Colette; Petitjean, Michel; Borrel, Alexandre; Flatters, Delphine; Camproux, Anne-Claude

    2017-02-01

    During the preliminary stage of a drug discovery project, the lack of druggability information and poor target selection are the main causes of frequent failures. Elaborating on accurate computational druggability prediction methods is a requirement for prioritizing target selection, designing new drugs and avoiding side effects. In this review, we describe a survey of recently reported druggability prediction methods mainly based on networks, statistical pocket druggability predictions and virtual screening. An application for a frequent mutation of p53 tumor suppressor is presented, illustrating the complementarity of druggability prediction approaches, the remaining challenges and potential new drug development perspectives. Copyright © 2016 Elsevier Ltd. All rights reserved.

  19. Frnakenstein: multiple target inverse RNA folding.

    PubMed

    Lyngsø, Rune B; Anderson, James W J; Sizikova, Elena; Badugu, Amarendra; Hyland, Tomas; Hein, Jotun

    2012-10-09

    RNA secondary structure prediction, or folding, is a classic problem in bioinformatics: given a sequence of nucleotides, the aim is to predict the base pairs formed in its three dimensional conformation. The inverse problem of designing a sequence folding into a particular target structure has only more recently received notable interest. With a growing appreciation and understanding of the functional and structural properties of RNA motifs, and a growing interest in utilising biomolecules in nano-scale designs, the interest in the inverse RNA folding problem is bound to increase. However, whereas the RNA folding problem from an algorithmic viewpoint has an elegant and efficient solution, the inverse RNA folding problem appears to be hard. In this paper we present a genetic algorithm approach to solve the inverse folding problem. The main aim of the development was to address the hitherto mostly ignored extension of solving the inverse folding problem, the multi-target inverse folding problem, while simultaneously designing a method with superior performance when measured on the quality of designed sequences. The genetic algorithm has been implemented as a Python program called Frnakenstein. It was benchmarked against four existing methods and several data sets totalling 769 real and predicted single structure targets, and on 292 two structure targets. It performed as well as or better at finding sequences which folded in silico into the target structure than all existing methods, without the heavy bias towards CG base pairs that was observed for all other top performing methods. On the two structure targets it also performed well, generating a perfect design for about 80% of the targets. Our method illustrates that successful designs for the inverse RNA folding problem do not necessarily have to rely on heavy biases in base pair and unpaired base distributions. The design problem seems to become more difficult on larger structures when the target structures are real structures, while no deterioration was observed for predicted structures. Design for two structure targets is considerably more difficult, but far from impossible, demonstrating the feasibility of automated design of artificial riboswitches. The Python implementation is available at http://www.stats.ox.ac.uk/research/genome/software/frnakenstein.

  20. Frnakenstein: multiple target inverse RNA folding

    PubMed Central

    2012-01-01

    Background: RNA secondary structure prediction, or folding, is a classic problem in bioinformatics: given a sequence of nucleotides, the aim is to predict the base pairs formed in its three dimensional conformation. The inverse problem of designing a sequence folding into a particular target structure has only more recently received notable interest. With a growing appreciation and understanding of the functional and structural properties of RNA motifs, and a growing interest in utilising biomolecules in nano-scale designs, the interest in the inverse RNA folding problem is bound to increase. However, whereas the RNA folding problem from an algorithmic viewpoint has an elegant and efficient solution, the inverse RNA folding problem appears to be hard. Results: In this paper we present a genetic algorithm approach to solve the inverse folding problem. The main aim of the development was to address the hitherto mostly ignored extension of solving the inverse folding problem, the multi-target inverse folding problem, while simultaneously designing a method with superior performance when measured on the quality of designed sequences. The genetic algorithm has been implemented as a Python program called Frnakenstein. It was benchmarked against four existing methods and several data sets totalling 769 real and predicted single structure targets, and on 292 two structure targets. It performed as well as or better at finding sequences which folded in silico into the target structure than all existing methods, without the heavy bias towards CG base pairs that was observed for all other top performing methods. On the two structure targets it also performed well, generating a perfect design for about 80% of the targets. Conclusions: Our method illustrates that successful designs for the inverse RNA folding problem do not necessarily have to rely on heavy biases in base pair and unpaired base distributions. The design problem seems to become more difficult on larger structures when the target structures are real structures, while no deterioration was observed for predicted structures. Design for two structure targets is considerably more difficult, but far from impossible, demonstrating the feasibility of automated design of artificial riboswitches. The Python implementation is available at http://www.stats.ox.ac.uk/research/genome/software/frnakenstein. PMID:23043260

  1. Prediction-based Dynamic Energy Management in Wireless Sensor Networks

    PubMed Central

    Wang, Xue; Ma, Jun-Jie; Wang, Sheng; Bi, Dao-Wei

    2007-01-01

    Energy consumption is a critical constraint in wireless sensor networks. Focusing on the energy efficiency problem of wireless sensor networks, this paper proposes a method of prediction-based dynamic energy management. A particle filter was introduced to predict a target state, which was adopted to awaken wireless sensor nodes so that their sleep time was prolonged. With the distributed computing capability of nodes, an optimization approach of distributed genetic algorithm and simulated annealing was proposed to minimize the energy consumption of measurement. Considering the application of target tracking, we implemented target position prediction, node sleep scheduling and optimal sensing node selection. Moreover, a routing scheme of forwarding nodes was presented to achieve extra energy conservation. Experimental results of target tracking verified that energy-efficiency is enhanced by prediction-based dynamic energy management.

  2. Predicting Drug-Target Interaction Networks Based on Functional Groups and Biological Features

    PubMed Central

    Shi, Xiao-He; Hu, Le-Le; Kong, Xiangyin; Cai, Yu-Dong; Chou, Kuo-Chen

    2010-01-01

    Background: Study of drug-target interaction networks is an important topic for drug development. It is both time-consuming and costly to determine compound-protein interactions or potential drug-target interactions by experiments alone. As a complement, the in silico prediction methods can provide us with very useful information in a timely manner. Methods/Principal Findings: To realize this, drug compounds are encoded with functional groups and proteins encoded by biological features including biochemical and physicochemical properties. The optimal feature selection procedures are adopted by means of the mRMR (Maximum Relevance Minimum Redundancy) method. Instead of classifying the proteins as a whole family, target proteins are divided into four groups: enzymes, ion channels, G-protein-coupled receptors and nuclear receptors. Thus, four independent predictors are established using the Nearest Neighbor algorithm as their operation engine, with each to predict the interactions between drugs and one of the four protein groups. As a result, the overall success rates by the jackknife cross-validation tests achieved with the four predictors are 85.48%, 80.78%, 78.49%, and 85.66%, respectively. Conclusion/Significance: Our results indicate that the network prediction system thus established is quite promising and encouraging. PMID:20300175
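
    A jackknife (leave-one-out) test of a nearest-neighbor predictor, as mentioned above, can be sketched in a few lines. This is a toy illustration under stated assumptions: the Euclidean distance, the two-dimensional feature vectors and the function names are hypothetical, and the actual encoding of drug-protein pairs is not reproduced here.

```python
import math


def nearest_neighbor_label(query, examples):
    """Return the label of the example closest to `query`.

    `examples` is a list of (feature_vector, label) pairs; Euclidean
    distance is an assumption made for this sketch.
    """
    _, label = min(examples, key=lambda ex: math.dist(query, ex[0]))
    return label


def jackknife_success_rate(examples):
    """Leave each example out in turn and predict it from all the others."""
    hits = 0
    for i, (features, label) in enumerate(examples):
        rest = examples[:i] + examples[i + 1:]
        hits += nearest_neighbor_label(features, rest) == label
    return hits / len(examples)


# Toy data: label 1 = interacting pair, label 0 = non-interacting pair.
toy_pairs = [
    ((0.0, 0.0), 1), ((0.1, 0.0), 1), ((0.0, 0.1), 1),
    ((1.0, 1.0), 0), ((0.9, 1.0), 0), ((1.0, 0.9), 0),
]
rate = jackknife_success_rate(toy_pairs)
print(rate)
```

    In the setting described in the abstract, one such predictor would be built per protein group, each evaluated separately.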

  3. Prediction of beta-turns and beta-turn types by a novel bidirectional Elman-type recurrent neural network with multiple output layers (MOLEBRNN).

    PubMed

    Kirschner, Andreas; Frishman, Dmitrij

    2008-10-01

    Prediction of beta-turns from amino acid sequences has long been recognized as an important problem in structural bioinformatics due to their frequent occurrence as well as their structural and functional significance. Because various structural features of proteins are intercorrelated, secondary structure information has been often employed as an additional input for machine learning algorithms while predicting beta-turns. Here we present a novel bidirectional Elman-type recurrent neural network with multiple output layers (MOLEBRNN) capable of predicting multiple mutually dependent structural motifs and demonstrate its efficiency in recognizing three aspects of protein structure: beta-turns, beta-turn types, and secondary structure. The advantage of our method compared to other predictors is that it does not require any external input except for sequence profiles because interdependencies between different structural features are taken into account implicitly during the learning process. In a sevenfold cross-validation experiment on a standard test dataset our method exhibits a total prediction accuracy of 77.9% and a Matthews correlation coefficient of 0.45, the highest performance reported so far. It also outperforms other known methods in delineating individual turn types. We demonstrate how simultaneous prediction of multiple targets influences prediction performance on single targets. The MOLEBRNN presented here is a generic method applicable in a variety of research fields where multiple mutually dependent target classes need to be predicted. http://webclu.bio.wzw.tum.de/predator-web/.
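
    For readers unfamiliar with the Matthews correlation coefficient quoted above, the standard definition from confusion-matrix counts is shown below. The function name and the example counts are illustrative only and are not taken from the paper.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (chance level)
    to +1 (perfect prediction).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the coefficient is set to 0 when a margin is empty.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


print(matthews_corrcoef(tp=50, tn=50, fp=0, fn=0))    # perfect classifier
print(matthews_corrcoef(tp=10, tn=10, fp=10, fn=10))  # chance-level classifier
```

    Unlike plain accuracy, this measure stays informative when one class (for example, non-turn residues) is much more frequent than the other.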

  4. Sub-Model Partial Least Squares for Improved Accuracy in Quantitative Laser Induced Breakdown Spectroscopy

    NASA Astrophysics Data System (ADS)

    Anderson, R. B.; Clegg, S. M.; Frydenvang, J.

    2015-12-01

    One of the primary challenges faced by the ChemCam instrument on the Curiosity Mars rover is developing a regression model that can accurately predict the composition of the wide range of target types encountered (basalts, calcium sulfate, feldspar, oxides, etc.). The original calibration used 69 rock standards to train a partial least squares (PLS) model for each major element. By expanding the suite of calibration samples to >400 targets spanning a wider range of compositions, the accuracy of the model was improved, but some targets with "extreme" compositions (e.g. pure minerals) were still poorly predicted. We have therefore developed a simple method, referred to as "submodel PLS", to improve the performance of PLS across a wide range of target compositions. In addition to generating a "full" (0-100 wt.%) PLS model for the element of interest, we also generate several overlapping submodels (e.g. for SiO2, we generate "low" (0-50 wt.%), "mid" (30-70 wt.%), and "high" (60-100 wt.%) models). The submodels are generally more accurate than the "full" model for samples within their range because they are able to adjust for matrix effects that are specific to that range. To predict the composition of an unknown target, we first predict the composition with the submodels and the "full" model. Then, based on the predicted composition from the "full" model, the appropriate submodel prediction can be used (e.g. if the full model predicts a low composition, use the "low" model result, which is likely to be more accurate). For samples with "full" predictions that occur in a region of overlap between submodels, the submodel predictions are "blended" using a simple linear weighted sum. The submodel PLS method shows improvements in most of the major elements predicted by ChemCam and reduces the occurrence of negative predictions for low wt.% targets. Submodel PLS is currently being used in conjunction with ICA regression for the major element compositions of ChemCam data.
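
    The selection and blending step described above can be sketched as follows, using the SiO2 ranges quoted in the abstract ("low" 0-50, "mid" 30-70, "high" 60-100 wt.%). The abstract only states that overlapping predictions are combined with "a simple linear weighted sum", so the exact ramp used here, the function name and the example values are assumptions.

```python
def blend_submodels(full, low, mid, high):
    """Combine submodel predictions (wt.%) using the full-model prediction.

    `full` is the prediction of the 0-100 wt.% model; `low`, `mid` and
    `high` are the predictions of the three submodels for the same target.
    """
    def ramp(x, start, end):
        # Linear weight rising from 0 at `start` to 1 at `end`, clamped to [0, 1].
        return min(1.0, max(0.0, (x - start) / (end - start)))

    if full < 50:
        # Below 30: pure "low"; 30-50: blend "low" into "mid".
        w = ramp(full, 30, 50)
        return (1 - w) * low + w * mid
    # 50-60: pure "mid"; 60-70: blend "mid" into "high"; above 70: pure "high".
    w = ramp(full, 60, 70)
    return (1 - w) * mid + w * high


# Full model predicts 40 wt.%, which lies in the low/mid overlap,
# so the two submodel predictions receive equal weight here.
print(blend_submodels(full=40, low=42.0, mid=38.0, high=55.0))
```

    With this choice of ramp the blended value changes continuously as the full-model prediction moves across a submodel boundary, which avoids jumps between neighbouring submodels.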

  5. DeepMirTar: a deep-learning approach for predicting human miRNA targets.

    PubMed

    Wen, Ming; Cong, Peisheng; Zhang, Zhimin; Lu, Hongmei; Li, Tonghua

    2018-06-01

    MicroRNAs (miRNAs) are small noncoding RNAs that function in RNA silencing and post-transcriptional regulation of gene expression by targeting messenger RNAs (mRNAs). Because the underlying mechanisms associated with miRNA binding to mRNA are not fully understood, a major challenge of miRNA studies involves the identification of miRNA-target sites on mRNA. In silico prediction of miRNA-target sites can expedite costly and time-consuming experimental work by providing the most promising miRNA-target-site candidates. In this study, we reported the design and implementation of DeepMirTar, a deep-learning-based approach for accurately predicting human miRNA targets at the site level. The predicted miRNA-target sites are those having canonical or non-canonical seed, and features, including high-level expert-designed, low-level expert-designed, and raw-data-level, were used to represent the miRNA-target site. Comparison with other state-of-the-art machine-learning methods and existing miRNA-target-prediction tools indicated that DeepMirTar improved overall predictive performance. DeepMirTar is freely available at https://github.com/Bjoux2/DeepMirTar_SdA. Contact: lith@tongji.edu.cn, hongmeilu@csu.edu.cn. Supplementary data are available at Bioinformatics online.

  6. In silico pharmacology for drug discovery: applications to targets and beyond

    PubMed Central

    Ekins, S; Mestres, J; Testa, B

    2007-01-01

    Computational (in silico) methods have been developed and widely applied to pharmacology hypothesis development and testing. These in silico methods include databases, quantitative structure-activity relationships, similarity searching, pharmacophores, homology models and other molecular modeling, machine learning, data mining, network analysis tools and data analysis tools that use a computer. Such methods have seen frequent use in the discovery and optimization of novel molecules with affinity to a target, the clarification of absorption, distribution, metabolism, excretion and toxicity properties as well as physicochemical characterization. The first part of this review discussed the methods that have been used for virtual ligand and target-based screening and profiling to predict biological activity. The aim of this second part of the review is to illustrate some of the varied applications of in silico methods for pharmacology in terms of the targets addressed. We will also discuss some of the advantages and disadvantages of in silico methods with respect to in vitro and in vivo methods for pharmacology research. Our conclusion is that the in silico pharmacology paradigm is ongoing and presents a rich array of opportunities that will assist in expediting the discovery of new targets, and ultimately lead to compounds with predicted biological activity for these novel targets. PMID:17549046

  7. Cell fate reprogramming by control of intracellular network dynamics

    NASA Astrophysics Data System (ADS)

    Zanudo, Jorge G. T.; Albert, Reka

    Identifying control strategies for biological networks is paramount for practical applications that involve reprogramming a cell's fate, such as disease therapeutics and stem cell reprogramming. Although the topic of controlling the dynamics of a system has a long history in control theory, most of this work is not directly applicable to intracellular networks. Here we present a network control method that integrates the structural and functional information available for intracellular networks to predict control targets. Formulated in a logical dynamic scheme, our control method takes advantage of certain function-dependent network components and their relation to steady states in order to identify control targets, which are guaranteed to drive any initial state to the target state with 100% effectiveness and need to be applied only transiently for the system to reach and stay in the desired state. We illustrate our method's potential to find intervention targets for cancer treatment and cell differentiation by applying it to a leukemia signaling network and to the network controlling the differentiation of T cells. We find that the predicted control targets are effective in a broad dynamic framework. Moreover, several of the predicted interventions are supported by experiments. This work was supported by NSF Grant PHY 1205840.

  8. Improved protein model quality assessments by changing the target function.

    PubMed

    Uziela, Karolis; Menéndez Hurtado, David; Shu, Nanjiang; Wallner, Björn; Elofsson, Arne

    2018-06-01

    Protein modeling quality is an important part of protein structure prediction. We have for more than a decade developed a set of methods for this problem. We have used various types of description of the protein and different machine learning methodologies. However, common to all these methods has been the target function used for training. The target function in ProQ describes the local quality of a residue in a protein model. In all versions of ProQ the target function has been the S-score. However, other quality estimation functions also exist, which can be divided into superposition- and contact-based methods. The superposition-based methods, such as S-score, are based on a rigid body superposition of a protein model and the native structure, while the contact-based methods compare the local environment of each residue. Here, we examine the effects of retraining our latest predictor, ProQ3D, using identical inputs but different target functions. We find that the contact-based methods are easier to predict and that predictors trained on these measures provide some advantages when it comes to identifying the best model. One possible reason for this is that contact based methods are better at estimating the quality of multi-domain targets. However, training on the S-score gives the best correlation with the GDT_TS score, which is commonly used in CASP to score the global model quality. To take the advantage of both of these features we provide an updated version of ProQ3D that predicts local and global model quality estimates based on different quality estimates. © 2018 Wiley Periodicals, Inc.

  9. TargetMiner: microRNA target prediction with systematic identification of tissue-specific negative examples.

    PubMed

    Bandyopadhyay, Sanghamitra; Mitra, Ramkrishna

    2009-10-15

    Prediction of microRNA (miRNA) target mRNAs using machine learning approaches is an important area of research. However, most of the methods suffer from either high false positive or false negative rates. One reason for this is the marked deficiency of negative examples or miRNA non-target pairs. Systematic identification of non-target mRNAs is still not addressed properly, and therefore, current machine learning approaches are compelled to rely on artificially generated negative examples for training. In this article, we have identified approximately 300 tissue-specific negative examples using a novel approach that involves expression profiling of both miRNAs and mRNAs, miRNA-mRNA structural interactions and seed-site conservation. The newly generated negative examples are validated with a pSILAC dataset, which confirms that the identified non-targets are indeed non-targets. These high-throughput tissue-specific negative examples and a set of experimentally verified positive examples are then used to build a system called TargetMiner, a support vector machine (SVM)-based classifier. In addition to assessing the prediction accuracy on cross-validation experiments, TargetMiner has been validated with a completely independent experimental test dataset. Our method outperforms 10 existing target prediction algorithms and provides a good balance between sensitivity and specificity that is not reflected in the existing methods. We achieve a significantly higher sensitivity and specificity of 69% and 67.8% based on a pool of 90 features and 76.5% and 66.1% using a set of 30 selected features on the completely independent test dataset. In order to establish the effectiveness of the systematically generated negative examples, the SVM is trained using a different set of negative data generated using the method in Yousef et al. A significantly higher false positive rate (70.6%) is observed when tested on the independent set, while all other factors are kept the same. Again, when an existing method (NBmiRTar) is executed with our proposed negative data, we observe an improvement in its performance. These clearly establish the effectiveness of the proposed approach of selecting the negative examples systematically. TargetMiner is now available as an online tool at www.isical.ac.in/~bioinfo_miu

  10. HomoTarget: a new algorithm for prediction of microRNA targets in Homo sapiens.

    PubMed

    Ahmadi, Hamed; Ahmadi, Ali; Azimzadeh-Jamalkandi, Sadegh; Shoorehdeli, Mahdi Aliyari; Salehzadeh-Yazdi, Ali; Bidkhori, Gholamreza; Masoudi-Nejad, Ali

    2013-02-01

    MiRNAs play an essential role in the networks of gene regulation by inhibiting the translation of target mRNAs. Several computational approaches have been proposed for the prediction of miRNA target-genes. Reports reveal a large fraction of under-predicted or falsely predicted target genes. Thus, there is an imperative need to develop a computational method by which the target mRNAs of existing miRNAs can be correctly identified. In this study, a combined pattern recognition neural network (PRNN) and principal component analysis (PCA) architecture has been proposed in order to model the complicated relationship between miRNAs and their target mRNAs in humans. The results of several types of intelligent classifiers and our proposed model were compared, showing that our algorithm outperformed them with higher sensitivity and specificity. Using the recent release of the miRBase database to find potential targets of miRNAs, this model incorporated twelve structural, thermodynamic and positional features of miRNA:mRNA binding sites to select target candidates. Copyright © 2012 Elsevier Inc. All rights reserved.

  11. Improved genome-scale multi-target virtual screening via a novel collaborative filtering approach to cold-start problem

    PubMed Central

    Lim, Hansaim; Gray, Paul; Xie, Lei; Poleksic, Aleksandar

    2016-01-01

    Conventional one-drug-one-gene approach has been of limited success in modern drug discovery. Polypharmacology, which focuses on searching for multi-targeted drugs to perturb disease-causing networks instead of designing selective ligands to target individual proteins, has emerged as a new drug discovery paradigm. Although many methods for single-target virtual screening have been developed to improve the efficiency of drug discovery, few of these algorithms are designed for polypharmacology. Here, we present a novel theoretical framework and a corresponding algorithm for genome-scale multi-target virtual screening based on the one-class collaborative filtering technique. Our method overcomes the sparseness of the protein-chemical interaction data by means of interaction matrix weighting and dual regularization from both chemicals and proteins. While the statistical foundation behind our method is general enough to encompass genome-wide drug off-target prediction, the program is specifically tailored to find protein targets for new chemicals with little to no available interaction data. We extensively evaluate our method using a number of the most widely accepted gene-specific and cross-gene family benchmarks and demonstrate that our method outperforms other state-of-the-art algorithms for predicting the interaction of new chemicals with multiple proteins. Thus, the proposed algorithm may provide a powerful tool for multi-target drug design. PMID:27958331

  12. Improved genome-scale multi-target virtual screening via a novel collaborative filtering approach to cold-start problem.

    PubMed

    Lim, Hansaim; Gray, Paul; Xie, Lei; Poleksic, Aleksandar

    2016-12-13

    Conventional one-drug-one-gene approach has been of limited success in modern drug discovery. Polypharmacology, which focuses on searching for multi-targeted drugs to perturb disease-causing networks instead of designing selective ligands to target individual proteins, has emerged as a new drug discovery paradigm. Although many methods for single-target virtual screening have been developed to improve the efficiency of drug discovery, few of these algorithms are designed for polypharmacology. Here, we present a novel theoretical framework and a corresponding algorithm for genome-scale multi-target virtual screening based on the one-class collaborative filtering technique. Our method overcomes the sparseness of the protein-chemical interaction data by means of interaction matrix weighting and dual regularization from both chemicals and proteins. While the statistical foundation behind our method is general enough to encompass genome-wide drug off-target prediction, the program is specifically tailored to find protein targets for new chemicals with little to no available interaction data. We extensively evaluate our method using a number of the most widely accepted gene-specific and cross-gene family benchmarks and demonstrate that our method outperforms other state-of-the-art algorithms for predicting the interaction of new chemicals with multiple proteins. Thus, the proposed algorithm may provide a powerful tool for multi-target drug design.

  13. Predicted Biological Activity of Purchasable Chemical Space

    PubMed Central

    2017-01-01

    Whereas 400 million distinct compounds are now purchasable within the span of a few weeks, the biological activities of most are unknown. To facilitate access to new chemistry for biology, we have combined the Similarity Ensemble Approach (SEA) with the maximum Tanimoto similarity to the nearest bioactive to predict activity for every commercially available molecule in ZINC. This method, which we label SEA+TC, outperforms both SEA and a naïve-Bayesian classifier via predictive performance on a 5-fold cross-validation of ChEMBL’s bioactivity data set (version 21). Using this method, predictions for over 40% of compounds (>160 million) have either high significance (pSEA ≥ 40), high similarity (ECFP4MaxTc ≥ 0.4), or both, for one or more of 1382 targets well described by ligands in the literature. Using a further 1347 less-well-described targets, we predict activities for an additional 11 million compounds. To gauge whether these predictions are sensible, we investigate 75 predictions for 50 drugs lacking a binding affinity annotation in ChEMBL. The 535 million predictions for over 171 million compounds at 2629 targets are linked to purchasing information and evidence to support each prediction and are freely available via https://zinc15.docking.org and https://files.docking.org. PMID:29193970
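
    The maximum Tanimoto similarity to the nearest bioactive can be illustrated with a minimal sketch. It assumes each fingerprint is represented as a set of on-bit indices. The bit sets below are hypothetical, a real ECFP4 fingerprint would come from a cheminformatics toolkit, and the SEA significance calculation is not reproduced here.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bit indices."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def max_tc(query, bioactives):
    """Highest Tanimoto similarity between a query and any reference bioactive."""
    return max((tanimoto(query, ref) for ref in bioactives), default=0.0)


# Hypothetical on-bit sets, invented for illustration only.
query = {1, 2, 3, 4}
bioactives = [{1, 2, 3, 5}, {7, 8}]
best = max_tc(query, bioactives)
is_high_similarity = best >= 0.4  # the similarity cutoff quoted in the abstract
print(best, is_high_similarity)
```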

  14. Modularity of Protein Folds as a Tool for Template-Free Modeling of Structures.

    PubMed

    Vallat, Brinda; Madrid-Aliste, Carlos; Fiser, Andras

    2015-08-01

    Predicting the three-dimensional structure of proteins from their amino acid sequences remains a challenging problem in molecular biology. While the current structural coverage of proteins is almost exclusively provided by template-based techniques, the modeling of the rest of the protein sequences increasingly requires template-free methods. However, template-free modeling methods are much less reliable and are usually applicable for smaller proteins, leaving much space for improvement. We present here a novel computational method that uses a library of supersecondary structure fragments, known as Smotifs, to model protein structures. The library of Smotifs has saturated over time, providing a theoretical foundation for efficient modeling. The method relies on weak sequence signals from remotely related protein structures to create a library of Smotif fragments specific to the target protein sequence. This Smotif library is exploited in a fragment assembly protocol to sample decoys, which are assessed by a composite scoring function. Since the Smotif fragments are larger in size compared to the ones used in other fragment-based methods, the proposed modeling algorithm, SmotifTF, can employ an exhaustive sampling during decoy assembly. SmotifTF successfully predicts the overall fold of the target proteins in about 50% of the test cases and performs competitively when compared to other state-of-the-art prediction methods, especially when sequence signal to remote homologs is diminishing. Smotif-based modeling is complementary to current prediction methods and provides a promising direction in addressing the structure prediction problem, especially when targeting larger proteins for modeling.

  15. Predicting drug-target interactions by dual-network integrated logistic matrix factorization

    NASA Astrophysics Data System (ADS)

    Hao, Ming; Bryant, Stephen H.; Wang, Yanli

    2017-01-01

    In this work, we propose a dual-network integrated logistic matrix factorization (DNILMF) algorithm to predict potential drug-target interactions (DTI). The prediction procedure consists of four steps: (1) inferring new drug/target profiles and constructing profile kernel matrix; (2) diffusing drug profile kernel matrix with drug structure kernel matrix; (3) diffusing target profile kernel matrix with target sequence kernel matrix; and (4) building DNILMF model and smoothing new drug/target predictions based on their neighbors. We compare our algorithm with the state-of-the-art method based on the benchmark dataset. Results indicate that the DNILMF algorithm outperforms the previously reported approaches in terms of AUPR (area under precision-recall curve) and AUC (area under curve of receiver operating characteristic) based on the 5 trials of 10-fold cross-validation. We conclude that the performance improvement depends not only on the proposed objective function, but also on the nonlinear diffusion technique used, which is important but understudied in the DTI prediction field. In addition, we also compile a new DTI dataset for increasing the diversity of currently available benchmark datasets. The top prediction results for the new dataset are confirmed by experimental studies or supported by other computational research.

  16. Realizing drug repositioning by adapting a recommendation system to handle the process.

    PubMed

    Ozsoy, Makbule Guclin; Özyer, Tansel; Polat, Faruk; Alhajj, Reda

    2018-04-12

    Drug repositioning is the process of identifying new targets for known drugs. It can be used to overcome problems associated with traditional drug discovery by adapting existing drugs to treat newly discovered diseases. Thus, it may reduce associated risk, cost and time required to identify and verify new drugs. Nowadays, drug repositioning has received more attention from industry and academia. To tackle this problem, researchers have applied many different computational methods and have used various features of drugs and diseases. In this study, we contribute to the ongoing research efforts by combining multiple features, namely chemical structures, protein interactions and side-effects to predict new indications of target drugs. To achieve our target, we realize drug repositioning as a recommendation process and this leads to a new perspective in tackling the problem. The utilized recommendation method is based on Pareto dominance and collaborative filtering. It can also integrate multiple data-sources and multiple features. For the computation part, we applied several settings and we compared their performance. Evaluation results show that the proposed method can achieve more concentrated predictions with high precision, where nearly half of the predictions are true. Compared to other state-of-the-art methods described in the literature, the proposed method is better at making correct predictions, as shown by its higher precision. The reported results demonstrate the applicability and effectiveness of recommendation methods for drug repositioning.

  17. Evaluation of free modeling targets in CASP11 and ROLL.

    PubMed

    Kinch, Lisa N; Li, Wenlin; Monastyrskyy, Bohdan; Kryshtafovych, Andriy; Grishin, Nick V

    2016-09-01

    We present an assessment of 'template-free modeling' (FM) in CASP11 and ROLL. Community-wide server performance suggested the use of automated scores similar to previous CASPs would provide a good system of evaluating performance, even in the absence of comprehensive manual assessment. The CASP11 FM category included several outstanding examples, including successful prediction by the Baker group of a 256-residue target (T0806-D1) that lacked sequence similarity to any existing template. The top server model prediction by Zhang's Quark, which was apparently selected and refined by several manual groups, encompassed the entire fold of target T0837-D1. Methods from the same two groups tended to dominate overall CASP11 FM and ROLL rankings. Comparison of top FM predictions with those from the previous CASP experiment revealed progress in the category, particularly reflected in high prediction accuracy for larger protein domains. FM prediction models for two cases were sufficient to provide functional insights that were otherwise not obtainable by traditional sequence analysis methods. Importantly, CASP11 abstracts revealed that alignment-based contact prediction methods brought about much of the CASP11 progress, producing both of the functionally relevant models as well as several of the other outstanding structure predictions. These methodological advances enabled de novo modeling of much larger domain structures than was previously possible and allowed prediction of functional sites. Proteins 2016; 84(Suppl 1):51-66. © 2015 Wiley Periodicals, Inc.

  18. Drug repositioning for enzyme modulator based on human metabolite-likeness.

    PubMed

    Lee, Yoon Hyeok; Choi, Hojae; Park, Seongyong; Lee, Boah; Yi, Gwan-Su

    2017-05-31

    Recently, the metabolite-likeness of the drug space has emerged and has opened a new possibility for exploring human metabolite-like candidates in drug discovery. However, the applicability of metabolite-likeness in drug discovery has been largely unexplored. Moreover, there are no reports on its applications for the repositioning of drugs to possible enzyme modulators, although enzyme-drug relations could be directly inferred from the similarity relationships between an enzyme's metabolites and drugs. We constructed a drug-metabolite structural similarity matrix, which contains 1,861 FDA-approved drugs and 1,110 human intermediary metabolites scored with the Tanimoto similarity. To verify the metabolite-likeness measure for drug repositioning, we analyzed 17 known antimetabolite drugs that resemble the innate metabolites of their eleven target enzymes as the gold standard positives. Highly scored drugs were selected as possible modulators of enzymes for their corresponding metabolites. Then, we assessed the performance of metabolite-likeness with a receiver operating characteristic analysis and compared it with other drug-target prediction methods. We set the similarity threshold for drug repositioning candidates of new enzyme modulators based on maximization of Youden's index. We also carried out literature surveys for supporting the drug repositioning results based on the metabolite-likeness. In this paper, we applied metabolite-likeness to repurpose FDA-approved drugs to disease-associated enzyme modulators that resemble human innate metabolites. All antimetabolite drugs were mapped with their known 11 target enzymes with statistically significant similarity values to the corresponding metabolites. The comparison with other drug-target prediction methods showed the higher performance of metabolite-likeness for predicting enzyme modulators. After that, drugs with a similarity score higher than 0.654 were selected as possible modulators of enzymes for their corresponding metabolites. In addition, we showed that drug repositioning results of 10 enzymes were concordant with the literature evidence. This study introduced a method to predict the repositioning of known drugs to possible modulators of disease-associated enzymes using human metabolite-likeness. We demonstrated that this approach works correctly with known antimetabolite drugs and showed that the proposed method has better performance compared to other drug-target prediction methods in terms of enzyme modulator prediction. This study, as a proof of concept, showed how to apply metabolite-likeness to drug repositioning, as well as its potential for further expansion as we acquire more disease-associated metabolite-target protein relations.
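
    Choosing a similarity threshold by maximizing Youden's index (J = sensitivity + specificity - 1) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the scores and the labels are invented here, and the code does not reproduce the authors' data or their reported cutoff.

```python
def youden_threshold(scores, labels):
    """Return (threshold, J) maximizing Youden's index; a score >= threshold is called positive."""
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative example")
    best_t, best_j = None, float("-inf")
    for t in sorted(set(scores)):  # every observed score is a candidate cutoff
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        j = tp / pos + tn / neg - 1.0  # sensitivity + specificity - 1
        if j > best_j:  # ties keep the lowest threshold
            best_t, best_j = t, j
    return best_t, best_j


# Toy similarity scores and gold-standard labels, invented for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.2]
labels = [1, 1, 1, 0, 1, 0]
best_t, best_j = youden_threshold(scores, labels)
print(best_t, best_j)
```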

  19. Target and Tissue Selectivity Prediction by Integrated Mechanistic Pharmacokinetic-Target Binding and Quantitative Structure Activity Modeling.

    PubMed

    Vlot, Anna H C; de Witte, Wilhelmus E A; Danhof, Meindert; van der Graaf, Piet H; van Westen, Gerard J P; de Lange, Elizabeth C M

    2017-12-04

    Selectivity is an important attribute of effective and safe drugs, and prediction of in vivo target and tissue selectivity would likely improve drug development success rates. However, a lack of understanding of the underlying (pharmacological) mechanisms and availability of directly applicable predictive methods complicates the prediction of selectivity. We explore the value of combining physiologically based pharmacokinetic (PBPK) modeling with quantitative structure-activity relationship (QSAR) modeling to predict the influence of the target dissociation constant (KD) and the target dissociation rate constant on target and tissue selectivity. The KD values of CB1 ligands in the ChEMBL database are predicted by QSAR random forest (RF) modeling for the CB1 receptor and known off-targets (TRPV1, mGlu5, 5-HT1a). Of these CB1 ligands, rimonabant, CP-55940, and Δ8-tetrahydrocannabinol, one of the active ingredients of cannabis, were selected for simulations of target occupancy for CB1, TRPV1, mGlu5, and 5-HT1a in three brain regions, to illustrate the principles of the combined PBPK-QSAR modeling. Our combined PBPK and target binding modeling demonstrated that the optimal values of the KD and koff for target and tissue selectivity were dependent on target concentration and tissue distribution kinetics. Interestingly, if the target concentration is high and the perfusion of the target site is low, the optimal KD value is often not the lowest KD value, suggesting that optimization towards high drug-target affinity can decrease the benefit-risk ratio. The presented integrative structure-pharmacokinetic-pharmacodynamic modeling provides an improved understanding of tissue and target selectivity.
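
    For orientation, the textbook relations between the dissociation constant, the rate constants and equilibrium target occupancy can be sketched as below. This is not the PBPK-QSAR model from the abstract: it assumes simple one-step reversible binding at equilibrium with free ligand in excess of target, which is precisely the simplification the authors show can fail when target concentration is high and perfusion is low. The function names and numerical values are invented for illustration.

```python
def dissociation_constant(k_off, k_on):
    """K_D = k_off / k_on for one-step reversible binding (k_off in 1/s, k_on in 1/(M*s))."""
    return k_off / k_on


def equilibrium_occupancy(conc, k_d):
    """Fraction of target bound at equilibrium, assuming free ligand concentration conc (M)."""
    return conc / (conc + k_d)


# Illustrative values only; they do not describe any ligand in the study.
k_d = dissociation_constant(k_off=0.01, k_on=1e6)  # about 1e-8 M, i.e. 10 nM
occ_at_kd = equilibrium_occupancy(conc=1e-8, k_d=k_d)  # ligand at K_D gives half occupancy
occ_at_10x = equilibrium_occupancy(conc=1e-7, k_d=k_d)  # ten-fold excess over K_D
print(k_d, occ_at_kd, occ_at_10x)
```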

  20. A Template-Based Protein Structure Reconstruction Method Using Deep Autoencoder Learning.

    PubMed

    Li, Haiou; Lyu, Qiang; Cheng, Jianlin

    2016-12-01

    Protein structure prediction is an important problem in computational biology, and is widely applied to various biomedical problems such as protein function study, protein design, and drug design. In this work, we developed a novel deep learning approach based on a deeply stacked denoising autoencoder for protein structure reconstruction. We applied our approach to a template-based protein structure prediction using only the 3D structural coordinates of homologous template proteins as input. The templates were identified for a target protein by a PSI-BLAST search. 3DRobot (a program that automatically generates diverse and well-packed protein structure decoys) was used to generate initial decoy models for the target from the templates. A stacked denoising autoencoder was trained on the decoys to obtain a deep learning model for the target protein. The trained deep model was then used to reconstruct the final structural model for the target sequence. With target proteins that have highly similar template proteins as benchmarks, the GDT-TS score of the predicted structures is greater than 0.7, suggesting that the deep autoencoder is a promising method for protein structure reconstruction.

  1. APOLLO: a quality assessment service for single and multiple protein models.

    PubMed

    Wang, Zheng; Eickholt, Jesse; Cheng, Jianlin

    2011-06-15

    We built a web server named APOLLO, which can evaluate the absolute global and local qualities of a single protein model using machine learning methods or the global and local qualities of a pool of models using a pair-wise comparison approach. Based on our evaluations on 107 CASP9 (Critical Assessment of Techniques for Protein Structure Prediction) targets, the predicted quality scores generated from our machine learning and pair-wise methods have an average per-target correlation of 0.671 and 0.917, respectively, with the true model quality scores. Based on our test on 92 CASP9 targets, our predicted absolute local qualities have an average difference of 2.60 Å with the actual distances to native structure. http://sysbio.rnet.missouri.edu/apollo/. Single and pair-wise global quality assessment software is also available at the site.

  2. High-order graph matching based feature selection for Alzheimer's disease identification.

    PubMed

    Liu, Feng; Suk, Heung-Il; Wee, Chong-Yaw; Chen, Huafu; Shen, Dinggang

    2013-01-01

    One of the main limitations of l1-norm feature selection is that it focuses on estimating the target vector for each sample individually without considering relations with other samples. However, it is believed that the geometrical relation among target vectors in the training set may provide useful information, and it would be natural to expect that the predicted vectors have similar geometric relations as the target vectors. To overcome these limitations, we formulate this as a graph-matching feature selection problem between a predicted graph and a target graph. In the predicted graph, a node is represented by a predicted vector that may describe regional gray matter volume or cortical thickness features, and in the target graph, a node is represented by a target vector that includes the class label and clinical scores. In particular, we devise new regularization terms in sparse representation to impose high-order graph matching between the target vectors and the predicted ones. Finally, the selected regional gray matter volume and cortical thickness features are fused in kernel space for classification. Using the ADNI dataset, we evaluate the effectiveness of the proposed method and obtain accuracies of 92.17% and 81.57% in AD and MCI classification, respectively.

  3. Modeling protein complexes with BiGGER.

    PubMed

    Krippahl, Ludwig; Moura, José J; Palma, P Nuno

    2003-07-01

    This article describes the method and results of our participation in the Critical Assessment of PRediction of Interactions (CAPRI) experiment, using the protein docking program BiGGER (Bimolecular complex Generation with Global Evaluation and Ranking) (Palma et al., Proteins 2000;39:372-384). Of five target complexes (CAPRI targets 2, 4, 5, 6, and 7), only one was successfully predicted (target 6), but BiGGER generated reasonable models for targets 4, 5, and 7, which could have been identified if additional biochemical information had been available. Copyright 2003 Wiley-Liss, Inc.

  4. BUSCA: an integrative web server to predict subcellular localization of proteins.

    PubMed

    Savojardo, Castrense; Martelli, Pier Luigi; Fariselli, Piero; Profiti, Giuseppe; Casadio, Rita

    2018-04-30

    Here, we present BUSCA (http://busca.biocomp.unibo.it), a novel web server that integrates different computational tools for predicting protein subcellular localization. BUSCA combines methods for identifying signal and transit peptides (DeepSig and TPpred3), GPI-anchors (PredGPI) and transmembrane domains (ENSEMBLE3.0 and BetAware) with tools for discriminating subcellular localization of both globular and membrane proteins (BaCelLo, MemLoci and SChloro). Outcomes from the different tools are processed and integrated for annotating subcellular localization of both eukaryotic and bacterial protein sequences. We benchmark BUSCA against protein targets derived from recent CAFA experiments and other specific data sets, reporting performance at the state-of-the-art. BUSCA scores better than all other evaluated methods on 2732 targets from CAFA2, with an F1 value equal to 0.49, and is among the best methods when predicting targets from CAFA3. We propose BUSCA as an integrated and accurate resource for the annotation of protein subcellular localization.
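
    The F1 value reported here is the harmonic mean of precision and recall. A minimal sketch of that definition follows. The counts are invented for illustration, and the code does not reproduce the CAFA evaluation protocol.

```python
def f1_score(tp, fp, fn):
    """F1 = 2 * precision * recall / (precision + recall), written in terms of raw counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy counts, invented for illustration only.
tp, fp, fn = 6, 2, 4
precision = tp / (tp + fp)  # fraction of predicted annotations that are correct
recall = tp / (tp + fn)     # fraction of true annotations that are recovered
f1 = f1_score(tp, fp, fn)
print(precision, recall, f1)
```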

  5. Modified linear predictive coding approach for moving target tracking by Doppler radar

    NASA Astrophysics Data System (ADS)

    Ding, Yipeng; Lin, Xiaoyi; Sun, Ke-Hui; Xu, Xue-Mei; Liu, Xi-Yao

    2016-07-01

    Doppler radar is a cost-effective tool for moving target tracking, which can support a large range of civilian and military applications. A modified linear predictive coding (LPC) approach is proposed to increase the target localization accuracy of the Doppler radar. Based on the time-frequency analysis of the received echo, the proposed approach first estimates the noise statistical parameters in real time and constructs an adaptive filter to suppress the noise interference. Then, a linear predictive model is applied to extend the available data, which can help improve the resolution of the target localization result. Compared with the traditional LPC method, which decides the extension data length empirically, the proposed approach develops an error array to evaluate the prediction accuracy and thus adjust the optimum extension data length adaptively. Finally, the prediction error array is superimposed with the predictor output to correct the prediction error. A series of experiments are conducted to illustrate the validity and performance of the proposed techniques.

  6. A tale of two sequences: microRNA-target chimeric reads.

    PubMed

    Broughton, James P; Pasquinelli, Amy E

    2016-04-04

    In animals, a functional interaction between a microRNA (miRNA) and its target RNA requires only partial base pairing. The limited number of base pair interactions required for miRNA targeting provides miRNAs with broad regulatory potential and also makes target prediction challenging. Computational approaches to target prediction have focused on identifying miRNA target sites based on known sequence features that are important for canonical targeting and may miss non-canonical targets. Current state-of-the-art experimental approaches, such as CLIP-seq (cross-linking immunoprecipitation with sequencing), PAR-CLIP (photoactivatable-ribonucleoside-enhanced CLIP), and iCLIP (individual-nucleotide resolution CLIP), require inference of which miRNA is bound at each site. Recently, the development of methods to ligate miRNAs to their target RNAs during the preparation of sequencing libraries has provided a new tool for the identification of miRNA target sites. The chimeric, or hybrid, miRNA-target reads that are produced by these methods unambiguously identify the miRNA bound at a specific target site. The information provided by these chimeric reads has revealed extensive non-canonical interactions between miRNAs and their target mRNAs, and identified many novel interactions between miRNAs and noncoding RNAs.

  7. Method to determine transcriptional regulation pathways in organisms

    DOEpatents

    Gardner, Timothy S.; Collins, James J.; Hayete, Boris; Faith, Jeremiah

    2012-11-06

    The invention relates to computer-implemented methods and systems for identifying regulatory relationships between expressed regulating polypeptides and targets of the regulatory activities of such regulating polypeptides. More specifically, the invention provides a new method for identifying regulatory dependencies between biochemical species in a cell. In particular embodiments, provided are computer-implemented methods for identifying a regulatory interaction between a transcription factor and a gene target of the transcription factor, or between a transcription factor and a set of gene targets of the transcription factor. Further provided are genome-scale methods for predicting regulatory interactions between a set of transcription factors and a corresponding set of transcriptional target substrates thereof.

  8. Computational Prediction of Neutralization Epitopes Targeted by Human Anti-V3 HIV Monoclonal Antibodies

    PubMed Central

    Shmelkov, Evgeny; Krachmarov, Chavdar; Grigoryan, Arsen V.; Pinter, Abraham; Statnikov, Alexander; Cardozo, Timothy

    2014-01-01

    The extreme diversity of HIV-1 strains presents a formidable challenge for HIV-1 vaccine design. Although antibodies (Abs) can neutralize HIV-1 and potentially protect against infection, antibodies that target the immunogenic viral surface protein gp120 have widely variable and poorly predictable cross-strain reactivity. Here, we developed a novel computational approach, the Method of Dynamic Epitopes, for identification of neutralization epitopes targeted by anti-HIV-1 monoclonal antibodies (mAbs). Our data demonstrate that this approach, based purely on calculated energetics and 3D structural information, accurately predicts the presence of neutralization epitopes targeted by V3-specific mAbs 2219 and 447-52D in any HIV-1 strain. The method was used to calculate the range of conservation of these specific epitopes across all circulating HIV-1 viruses. Accurately identifying an Ab-targeted neutralization epitope in a virus by computational means enables easy prediction of the breadth of reactivity of specific mAbs across the diversity of thousands of different circulating HIV-1 variants and facilitates rational design and selection of immunogens mimicking specific mAb-targeted epitopes in a multivalent HIV-1 vaccine. The defined epitopes can also be used for the purpose of epitope-specific analyses of breakthrough sequences recorded in vaccine clinical trials. Thus, our study is a prototype for a valuable tool for rational HIV-1 vaccine design. PMID:24587168

  9. Cooperative gene regulation by microRNA pairs and their identification using a computational workflow

    PubMed Central

    Schmitz, Ulf; Lai, Xin; Winter, Felix; Wolkenhauer, Olaf; Vera, Julio; Gupta, Shailendra K.

    2014-01-01

    MicroRNAs (miRNAs) are an integral part of gene regulation at the post-transcriptional level. Recently, it has been shown that pairs of miRNAs can repress the translation of a target mRNA in a cooperative manner, which leads to an enhanced effectiveness and specificity in target repression. However, it remains unclear which miRNA pairs can synergize and which genes are targets of cooperative miRNA regulation. In this paper, we present a computational workflow for the prediction and analysis of cooperating miRNAs and their mutual target genes, which we refer to as RNA triplexes. The workflow integrates methods of miRNA target prediction, triplex structure analysis, molecular dynamics simulations and mathematical modeling for a reliable prediction of functional RNA triplexes and target repression efficiency. In a case study we analyzed the human genome and identified several thousand targets of cooperative gene regulation. Our results suggest that miRNA cooperativity is a frequent mechanism for an enhanced target repression by pairs of miRNAs facilitating distinctive and fine-tuned target gene expression patterns. Human RNA triplexes predicted and characterized in this study are organized in a web resource at www.sbi.uni-rostock.de/triplexrna/. PMID:24875477

  10. Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model

    PubMed Central

    Li, Zhen; Zhang, Renyu

    2017-01-01

    Motivation: Protein contacts contain key information for the understanding of protein structure and function and thus, contact prediction from sequence is an important problem. Recently, exciting progress has been made on this problem, but the predicted contacts for proteins without many sequence homologs are still of low quality and not very useful for de novo structure prediction. Method: This paper presents a new deep learning method that predicts contacts by integrating both evolutionary coupling (EC) and sequence conservation information through an ultra-deep neural network formed by two deep residual neural networks. The first residual network conducts a series of 1-dimensional convolutional transformation of sequential features; the second residual network conducts a series of 2-dimensional convolutional transformation of pairwise information including output of the first residual network, EC information and pairwise potential. By using very deep residual networks, we can accurately model contact occurrence patterns and complex sequence-structure relationship and thus, obtain higher-quality contact prediction regardless of how many sequence homologs are available for proteins in question. Results: Our method greatly outperforms existing methods and leads to much more accurate contact-assisted folding. Tested on 105 CASP11 targets, 76 past CAMEO hard targets, and 398 membrane proteins, the average top L long-range prediction accuracy obtained by our method, one representative EC method CCMpred and the CASP11 winner MetaPSICOV is 0.47, 0.21 and 0.30, respectively; the average top L/10 long-range accuracy of our method, CCMpred and MetaPSICOV is 0.77, 0.47 and 0.59, respectively. Ab initio folding using our predicted contacts as restraints but without any force fields can yield correct folds (i.e., TMscore>0.6) for 203 of the 579 test proteins, while that using MetaPSICOV- and CCMpred-predicted contacts can do so for only 79 and 62 of them, respectively. Our contact-assisted models also have much better quality than template-based models, especially for membrane proteins. The 3D models built from our contact prediction have TMscore>0.5 for 208 of the 398 membrane proteins, while those from homology modeling have TMscore>0.5 for only 10 of them. Further, even if trained mostly by soluble proteins, our deep learning method works very well on membrane proteins. In the recent blind CAMEO benchmark, our fully-automated web server implementing this method successfully folded 6 targets with a new fold and only 0.3L-2.3L effective sequence homologs, including one β protein of 182 residues, one α+β protein of 125 residues, one α protein of 140 residues, one α protein of 217 residues, one α/β of 260 residues and one α protein of 462 residues. Our method also achieved the highest F1 score on free-modeling targets in the latest CASP (Critical Assessment of Structure Prediction), although it was not fully implemented back then. Availability: http://raptorx.uchicago.edu/ContactMap/ PMID:28056090
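
    The "top L" and "top L/10" long-range accuracies quoted above are usually computed by ranking predicted residue pairs by confidence and checking the best-scoring ones against the native contacts. The sketch below assumes the common convention that a long-range pair has a sequence separation of at least 24 residues. The function name, the toy probabilities and the toy native contacts are invented here and are not taken from the paper.

```python
def top_contact_accuracy(pred, true_contacts, seq_len, k=1, min_sep=24):
    """Fraction of the top seq_len // k long-range predictions that are true contacts.

    pred maps residue pairs (i, j) to predicted contact probabilities;
    min_sep=24 is the assumed long-range cutoff on sequence separation.
    """
    long_range = [(p, pair) for pair, p in pred.items()
                  if abs(pair[0] - pair[1]) >= min_sep]
    long_range.sort(key=lambda item: item[0], reverse=True)  # most confident first
    n_top = max(1, seq_len // k)
    top = long_range[:n_top]
    if not top:
        return 0.0
    hits = sum(1 for _, pair in top if pair in true_contacts)
    return hits / len(top)


# Toy protein of length 40 with hypothetical predictions and native contacts.
pred = {(1, 30): 0.9, (2, 35): 0.8, (5, 40): 0.7,
        (3, 28): 0.6, (10, 36): 0.5, (4, 8): 0.95}  # (4, 8) is short-range and ignored
true_contacts = {(1, 30), (5, 40), (10, 36)}
acc_l10 = top_contact_accuracy(pred, true_contacts, seq_len=40, k=10)  # top 4 pairs
print(acc_l10)
```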

  11. Assessment of CAPRI predictions in rounds 3-5 shows progress in docking procedures.

    PubMed

    Méndez, Raúl; Leplae, Raphaël; Lensink, Marc F; Wodak, Shoshana J

    2005-08-01

    The current status of docking procedures for predicting protein-protein interactions starting from their three-dimensional (3D) structure is reassessed by evaluating blind predictions, performed during 2003-2004 as part of Rounds 3-5 of the community-wide experiment on Critical Assessment of PRedicted Interactions (CAPRI). Ten newly determined structures of protein-protein complexes were used as targets for these rounds. They comprised 2 enzyme-inhibitor complexes, 2 antigen-antibody complexes, 2 complexes involved in cellular signaling, 2 homo-oligomers, and a complex between 2 components of the bacterial cellulosome. For most targets, the predictors were given the experimental structures of 1 unbound and 1 bound component, with the latter in a random orientation. For some, the structure of the free component was derived from that of a related protein, requiring the use of homology modeling. In some of the targets, significant differences in conformation were displayed between the bound and unbound components, representing a major challenge for the docking procedures. For 1 target, predictions could not go to completion. In total, 1866 predictions submitted by 30 groups were evaluated. Over one-third of these groups applied completely novel docking algorithms and scoring functions, with several of them specifically addressing the challenge of dealing with side-chain and backbone flexibility. The quality of the predicted interactions was evaluated by comparison to the experimental structures of the targets, made available for the evaluation, using the well-agreed-upon criteria used previously. Twenty-four groups, which for the first time included an automatic Web server, produced predictions ranking from acceptable to highly accurate for all targets, including those where the structures of the bound and unbound forms differed substantially. 
These results and a brief survey of the methods used by participants of CAPRI Rounds 3-5 suggest that genuine progress in the performance of docking methods is being achieved, with CAPRI acting as the catalyst.

  12. Large-Scale Chemical Similarity Networks for Target Profiling of Compounds Identified in Cell-Based Chemical Screens

    PubMed Central

    Lo, Yu-Chen; Senese, Silvia; Li, Chien-Ming; Hu, Qiyang; Huang, Yong; Damoiseaux, Robert; Torres, Jorge Z.

    2015-01-01

    Target identification is one of the most critical steps following cell-based phenotypic chemical screens aimed at identifying compounds with potential uses in cell biology and for developing novel disease therapies. Current in silico target identification methods, including chemical similarity database searches, are limited to single or sequential ligand analysis that have limited capabilities for accurate deconvolution of a large number of compounds with diverse chemical structures. Here, we present CSNAP (Chemical Similarity Network Analysis Pulldown), a new computational target identification method that utilizes chemical similarity networks for large-scale chemotype (consensus chemical pattern) recognition and drug target profiling. Our benchmark study showed that CSNAP can achieve an overall higher accuracy (>80%) of target prediction with respect to representative chemotypes in large (>200) compound sets, in comparison to the SEA approach (60–70%). Additionally, CSNAP is capable of integrating with biological knowledge-based databases (Uniprot, GO) and high-throughput biology platforms (proteomic, genetic, etc) for system-wise drug target validation. To demonstrate the utility of the CSNAP approach, we combined CSNAP's target prediction with experimental ligand evaluation to identify the major mitotic targets of hit compounds from a cell-based chemical screen and we highlight novel compounds targeting microtubules, an important cancer therapeutic target. The CSNAP method is freely available and can be accessed from the CSNAP web server (http://services.mbi.ucla.edu/CSNAP/). PMID:25826798
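
    The network idea described above can be illustrated with a minimal sketch. It is not the CSNAP implementation: the Tanimoto coefficient on fingerprint bit sets, the 0.7 threshold, the neighbour-vote scoring, and all compound and target names are assumptions chosen for illustration.

```python
from itertools import combinations

def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two fingerprint bit sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def similarity_network(fingerprints, threshold=0.7):
    """Link every pair of compounds whose similarity is >= threshold."""
    graph = {name: set() for name in fingerprints}
    for u, v in combinations(fingerprints, 2):
        if tanimoto(fingerprints[u], fingerprints[v]) >= threshold:
            graph[u].add(v)
            graph[v].add(u)
    return graph

def predict_targets(graph, known_targets, query):
    """Rank targets by how many network neighbours of the query carry them."""
    votes = {}
    for neighbour in graph[query]:
        for target in known_targets.get(neighbour, ()):
            votes[target] = votes.get(target, 0) + 1
    return sorted(votes, key=lambda t: (-votes[t], t))

# Hypothetical toy data: "q" is an unannotated screening hit.
fingerprints = {
    "q": frozenset({1, 2, 3, 4}),
    "a": frozenset({1, 2, 3, 4, 5}),
    "b": frozenset({1, 2, 3}),
    "c": frozenset({7, 8, 9}),
}
known_targets = {"a": ["tubulin"], "b": ["tubulin", "kinase X"], "c": ["protease Y"]}
network = similarity_network(fingerprints)
ranked = predict_targets(network, known_targets, "q")
```

    Voting over neighbours rather than taking the single nearest ligand is what lets a cluster of related compounds share one consensus annotation.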

  13. Ab Initio Protein Structure Prediction Using Chunk-TASSER

    PubMed Central

    Zhou, Hongyi; Skolnick, Jeffrey

    2007-01-01

    We have developed an ab initio protein structure prediction method called chunk-TASSER that uses ab initio folded supersecondary structure chunks of a given target as well as threading templates for obtaining contact potentials and distance restraints. The predicted chunks, selected on the basis of a new fragment comparison method, are folded by a fragment insertion method. Full-length models are built and refined by the TASSER methodology, which searches conformational space via parallel hyperbolic Monte Carlo. We employ an optimized reduced force field that includes knowledge-based statistical potentials and restraints derived from the chunks as well as threading templates. The method is tested on a dataset of 425 hard target proteins ≤250 amino acids in length. The average TM-scores of the best of top five models per target are 0.266, 0.336, and 0.362 by the threading algorithm SP3, original TASSER and chunk-TASSER, respectively. For a subset of 80 proteins with predicted α-helix content ≥50%, these averages are 0.284, 0.356, and 0.403, respectively. The percentages of proteins with the best of top five models having TM-score ≥0.4 (a statistically significant threshold for structural similarity) are 3.76, 20.94, and 28.94% by SP3, TASSER, and chunk-TASSER, respectively, overall, while for the subset of 80 predominantly helical proteins, these percentages are 2.50, 23.75, and 41.25%. Thus, chunk-TASSER shows a significant improvement over TASSER for modeling hard targets where no good template can be identified. We also tested chunk-TASSER on 21 medium/hard targets <200 amino-acids-long from CASP7. Chunk-TASSER is ∼11% (10%) better than TASSER for the total TM-score of the first (best of top five) models. Chunk-TASSER is fully automated and can be used in proteome scale protein structure prediction. PMID:17496016
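
    For readers unfamiliar with the TM-score quoted above, the following sketch evaluates the standard TM-score sum for one fixed superposition. The full score is the maximum of this quantity over all superpositions, which is omitted here; the function name and the 0.5 Å floor on d0 are assumptions of this sketch.

```python
def tm_score_for_superposition(distances, target_length):
    """TM-score sum for one superposition.

    distances: distances (in angstroms) between aligned residue pairs.
    target_length: number of residues in the target, used for normalisation.
    """
    if target_length <= 15:
        raise ValueError("the d0 formula needs target_length > 15")
    # Length-dependent distance scale d0 from the TM-score definition.
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    d0 = max(d0, 0.5)  # assumed floor for very short chains
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length

# A perfect match of all 100 residues scores 1.0;
# a perfect match of only half of them scores 0.5.
full = tm_score_for_superposition([0.0] * 100, 100)
half = tm_score_for_superposition([0.0] * 50, 100)
```

    Because the sum is divided by the target length rather than the number of aligned residues, unaligned regions lower the score.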

  14. Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model.

    PubMed

    Wang, Sheng; Sun, Siqi; Li, Zhen; Zhang, Renyu; Xu, Jinbo

    2017-01-01

    Protein contacts contain key information for the understanding of protein structure and function and thus, contact prediction from sequence is an important problem. Recently exciting progress has been made on this problem, but the predicted contacts for proteins without many sequence homologs is still of low quality and not very useful for de novo structure prediction. This paper presents a new deep learning method that predicts contacts by integrating both evolutionary coupling (EC) and sequence conservation information through an ultra-deep neural network formed by two deep residual neural networks. The first residual network conducts a series of 1-dimensional convolutional transformation of sequential features; the second residual network conducts a series of 2-dimensional convolutional transformation of pairwise information including output of the first residual network, EC information and pairwise potential. By using very deep residual networks, we can accurately model contact occurrence patterns and complex sequence-structure relationship and thus, obtain higher-quality contact prediction regardless of how many sequence homologs are available for proteins in question. Our method greatly outperforms existing methods and leads to much more accurate contact-assisted folding. Tested on 105 CASP11 targets, 76 past CAMEO hard targets, and 398 membrane proteins, the average top L long-range prediction accuracy obtained by our method, one representative EC method CCMpred and the CASP11 winner MetaPSICOV is 0.47, 0.21 and 0.30, respectively; the average top L/10 long-range accuracy of our method, CCMpred and MetaPSICOV is 0.77, 0.47 and 0.59, respectively. Ab initio folding using our predicted contacts as restraints but without any force fields can yield correct folds (i.e., TMscore>0.6) for 203 of the 579 test proteins, while that using MetaPSICOV- and CCMpred-predicted contacts can do so for only 79 and 62 of them, respectively. 
Our contact-assisted models also have much better quality than template-based models especially for membrane proteins. The 3D models built from our contact prediction have TMscore>0.5 for 208 of the 398 membrane proteins, while those from homology modeling have TMscore>0.5 for only 10 of them. Further, even if trained mostly by soluble proteins, our deep learning method works very well on membrane proteins. In the recent blind CAMEO benchmark, our fully-automated web server implementing this method successfully folded 6 targets with a new fold and only 0.3L-2.3L effective sequence homologs, including one β protein of 182 residues, one α+β protein of 125 residues, one α protein of 140 residues, one α protein of 217 residues, one α/β of 260 residues and one α protein of 462 residues. Our method also achieved the highest F1 score on free-modeling targets in the latest CASP (Critical Assessment of Structure Prediction), although it was not fully implemented back then. http://raptorx.uchicago.edu/ContactMap/.
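
    The "top L/k long-range accuracy" reported above can be computed along the following lines. This is a sketch under assumptions the abstract does not state: long-range is taken as a sequence separation of at least 24 residues, accuracy is the fraction of selected pairs that are true contacts, and the function name and toy data are hypothetical.

```python
def top_lk_precision(scores, true_contacts, length, k=1, min_sep=24):
    """Precision of the top L/k predicted long-range contacts.

    scores: dict mapping residue pairs (i, j), i < j, to predicted probability.
    true_contacts: set of (i, j) pairs in contact in the native structure.
    length: sequence length L.
    """
    long_range = [(p, pair) for pair, p in scores.items()
                  if pair[1] - pair[0] >= min_sep]
    long_range.sort(key=lambda item: item[0], reverse=True)
    top = long_range[:max(1, length // k)]
    if not top:
        return 0.0
    hits = sum(1 for _, pair in top if pair in true_contacts)
    return hits / len(top)

# Hypothetical toy example with L = 50, so top L/10 means the best 5 pairs.
scores = {(0, 30): 0.9, (1, 40): 0.8, (2, 45): 0.7, (3, 49): 0.6,
          (5, 35): 0.5, (0, 5): 0.99, (10, 48): 0.1}
native = {(0, 30), (2, 45), (0, 5)}
precision = top_lk_precision(scores, native, length=50, k=10)
```

    The short-range pair (0, 5) is ignored despite its high score, so only 2 of the 5 selected pairs are correct.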

  15. ProTox: a web server for the in silico prediction of rodent oral toxicity

    PubMed Central

    Drwal, Malgorzata N.; Banerjee, Priyanka; Dunkel, Mathias; Wettig, Martin R.; Preissner, Robert

    2014-01-01

    Animal trials are currently the major method for determining the possible toxic effects of drug candidates and cosmetics. In silico prediction methods represent an alternative approach and aim to rationalize the preclinical drug development, thus enabling the reduction of the associated time, costs and animal experiments. Here, we present ProTox, a web server for the prediction of rodent oral toxicity. The prediction method is based on the analysis of the similarity of compounds with known median lethal doses (LD50) and incorporates the identification of toxic fragments, therefore representing a novel approach in toxicity prediction. In addition, the web server includes an indication of possible toxicity targets which is based on an in-house collection of protein–ligand-based pharmacophore models (‘toxicophores’) for targets associated with adverse drug reactions. The ProTox web server is open to all users and can be accessed without registration at: http://tox.charite.de/tox. The only requirement for the prediction is the two-dimensional structure of the input compounds. All ProTox methods have been evaluated based on a diverse external validation set and displayed strong performance (sensitivity, specificity and precision of 76, 95 and 75%, respectively) and superiority over other toxicity prediction tools, indicating their possible applicability for other compound classes. PMID:24838562
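
    The three performance measures quoted above follow from a binary confusion matrix. The sketch below uses hypothetical counts, not the ProTox validation data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and precision from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # toxic compounds correctly flagged
        "specificity": tn / (tn + fp),  # non-toxic compounds correctly cleared
        "precision": tp / (tp + fp),    # flagged compounds that are truly toxic
    }

# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=30, fp=10, tn=90, fn=10)
```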

  16. Macromolecular target prediction by self-organizing feature maps.

    PubMed

    Schneider, Gisbert; Schneider, Petra

    2017-03-01

    Rational drug discovery would greatly benefit from a more nuanced appreciation of the activity of pharmacologically active compounds against a diverse panel of macromolecular targets. Already, computational target-prediction models assist medicinal chemists in library screening, de novo molecular design, optimization of active chemical agents, drug re-purposing, in the spotting of potential undesired off-target activities, and in the 'de-orphaning' of phenotypic screening hits. The self-organizing map (SOM) algorithm has been employed successfully for these and other purposes. Areas covered: The authors recapitulate contemporary artificial neural network methods for macromolecular target prediction, and present the basic SOM algorithm at a conceptual level. Specifically, they highlight consensus target-scoring by the employment of multiple SOMs, and discuss the opportunities and limitations of this technique. Expert opinion: Self-organizing feature maps represent a straightforward approach to ligand clustering and classification. Some of the appeal lies in their conceptual simplicity and broad applicability domain. Despite known algorithmic shortcomings, this computational target prediction concept has been proven to work in prospective settings with high success rates. It represents a prototypic technique for future advances in the in silico identification of the modes of action and macromolecular targets of bioactive molecules.

  17. Reduced Fragment Diversity for Alpha and Alpha-Beta Protein Structure Prediction using Rosetta.

    PubMed

    Abbass, Jad; Nebel, Jean-Christophe

    2017-01-01

    Protein structure prediction is considered a main challenge in computational biology. The biennial international competition, Critical Assessment of protein Structure Prediction (CASP), has shown in its eleventh experiment that free modelling target predictions are still beyond reliable accuracy; therefore, much effort should be made to improve ab initio methods. Arguably, Rosetta is considered the most competitive method when it comes to targets with no homologues. Relying on fragments of length 9 and 3 from known structures, Rosetta creates putative structures by assembling candidate fragments. Generally, the structure with the lowest energy score, also known as the first model, is chosen to be the "predicted one". A thorough study has been conducted on the role and diversity of 3-mers involved in Rosetta's model "refinement" phase. Usage of the standard number of 3-mers, i.e. 200, has been shown to degrade alpha and alpha-beta protein conformations initially achieved by assembling 9-mers. Therefore, a new prediction pipeline is proposed for Rosetta where the "refinement" phase is customised according to a target's structural class prediction. Over 8% improvement in terms of first model structure accuracy is reported for alpha and alpha-beta classes when decreasing the number of 3-mers.

  18. CisMapper: predicting regulatory interactions from transcription factor ChIP-seq data

    PubMed Central

    O'Connor, Timothy; Bodén, Mikael

    2017-01-01

    Identifying the genomic regions and regulatory factors that control the transcription of genes is an important, unsolved problem. The current method of choice predicts transcription factor (TF) binding sites using chromatin immunoprecipitation followed by sequencing (ChIP-seq), and then links the binding sites to putative target genes solely on the basis of the genomic distance between them. Evidence from chromatin conformation capture experiments shows that this approach is inadequate due to long-distance regulation via chromatin looping. We present CisMapper, which predicts the regulatory targets of a TF using the correlation between a histone mark at the TF's bound sites and the expression of each gene across a panel of tissues. Using both chromatin conformation capture and differential expression data, we show that CisMapper is more accurate at predicting the target genes of a TF than the distance-based approaches currently used, and is particularly advantageous for predicting the long-range regulatory interactions typical of tissue-specific gene expression. CisMapper also predicts which TF binding sites regulate a given gene more accurately than using genomic distance. Unlike distance-based methods, CisMapper can predict which transcription start site of a gene is regulated by a particular binding site of the TF. PMID:28204599
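
    The correlation-based linking described above can be sketched as follows. This is an illustration rather than the CisMapper code: Pearson correlation, ranking by the signed coefficient, and the tissue values and gene names are all assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0

def rank_target_genes(histone_signal, expression_by_gene):
    """Rank candidate genes by correlation with the histone mark at one bound site."""
    scored = {gene: pearson(histone_signal, expr)
              for gene, expr in expression_by_gene.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)

# Hypothetical data: one binding site and three genes measured in four tissues.
histone_signal = [1.0, 2.0, 3.0, 4.0]
expression_by_gene = {
    "geneA": [2.0, 4.0, 6.0, 8.0],   # tracks the mark
    "geneB": [4.0, 3.0, 2.0, 1.0],   # anti-correlated
    "geneC": [1.0, 1.0, 1.0, 1.0],   # flat
}
ranking = rank_target_genes(histone_signal, expression_by_gene)
```

    Nothing in the score depends on genomic distance, which is why a distal gene can outrank the nearest one.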

  19. A deep learning framework for improving long-range residue-residue contact prediction using a hierarchical strategy.

    PubMed

    Xiong, Dapeng; Zeng, Jianyang; Gong, Haipeng

    2017-09-01

    Residue-residue contacts are of great value for protein structure prediction, since contact information, especially from long-range residue pairs, can significantly reduce the complexity of conformational sampling for protein structure prediction in practice. Despite progress in the past decade on protein targets with abundant homologous sequences, accurate contact prediction for proteins with limited sequence information is still far from satisfactory. Methodologies for these hard targets still need further improvement. We present a computational program, DeepConPred, which includes a pipeline of two novel deep-learning-based methods (DeepCCon and DeepRCon) as well as a contact refinement step, to improve the prediction of long-range residue contacts from primary sequences. When compared with previous prediction approaches, our framework employed an effective scheme to identify optimal and important features for contact prediction, and was only trained with coevolutionary information derived from a limited number of homologous sequences to ensure robustness and usefulness for hard targets. Independent tests showed that 59.33%/49.97%, 64.39%/54.01% and 70.00%/59.81% of the top L/5, top L/10 and top 5 predictions were correct for CASP10/CASP11 proteins, respectively. In general, our algorithm ranked as one of the best methods for CASP targets. All source data and codes are available at http://166.111.152.91/Downloads.html. Contact: hgong@tsinghua.edu.cn or zengjy321@tsinghua.edu.cn. Supplementary data are available at Bioinformatics online.

  20. AlzhCPI: A knowledge base for predicting chemical-protein interactions towards Alzheimer's disease.

    PubMed

    Fang, Jiansong; Wang, Ling; Li, Yecheng; Lian, Wenwen; Pang, Xiaocong; Wang, Hong; Yuan, Dongsheng; Wang, Qi; Liu, Ai-Lin; Du, Guan-Hua

    2017-01-01

    Alzheimer's disease (AD) is a complicated progressive neurodegenerative disorder. To confront AD, scientists are searching for multi-target-directed ligands (MTDLs) to delay disease progression. The in silico prediction of chemical-protein interactions (CPI) can accelerate target identification and drug discovery. Previously, we developed 100 binary classifiers to predict the CPI for 25 key targets against AD using the multi-target quantitative structure-activity relationship (mt-QSAR) method. In this investigation, we aimed to apply the mt-QSAR method to enlarge the model library to predict CPI towards AD. Another 104 binary classifiers were further constructed to predict the CPI for 26 preclinical AD targets based on the naive Bayesian (NB) and recursive partitioning (RP) algorithms. The internal 5-fold cross-validation and external test set validation were applied to evaluate the performance of the training sets and test set, respectively. The area under the receiver operating characteristic curve (ROC) for the test sets ranged from 0.629 to 1.0, with an average of 0.903. In addition, we developed a web server named AlzhCPI to integrate the comprehensive information of approximately 204 binary classifiers, which has potential applications in network pharmacology and drug repositioning. AlzhCPI is available online at http://rcidm.org/AlzhCPI/index.html. To illustrate the applicability of AlzhCPI, the developed system was employed for the systems pharmacology-based investigation of shichangpu against AD to enhance the understanding of the mechanisms of action of shichangpu from a holistic perspective.

  1. Forecasting Occurrences of Activities.

    PubMed

    Minor, Bryan; Cook, Diane J

    2017-07-01

    While activity recognition has been shown to be valuable for pervasive computing applications, less work has focused on techniques for forecasting the future occurrence of activities. We present an activity forecasting method to predict the time that will elapse until a target activity occurs. This method generates an activity forecast using a regression tree classifier and offers an advantage over sequence prediction methods in that it can predict expected time until an activity occurs. We evaluate this algorithm on real-world smart home datasets and provide evidence that our proposed approach is most effective at predicting activity timings.

  2. Prediction of intracellular exposure bridges the gap between target- and cell-based drug discovery

    PubMed Central

    Gordon, Laurie J.; Wayne, Gareth J.; Almqvist, Helena; Axelsson, Hanna; Seashore-Ludlow, Brinton; Treyer, Andrea; Lundbäck, Thomas; West, Andy; Hann, Michael M.; Artursson, Per

    2017-01-01

    Inadequate target exposure is a major cause of high attrition in drug discovery. Here, we show that a label-free method for quantifying the intracellular bioavailability (Fic) of drug molecules predicts drug access to intracellular targets and hence, pharmacological effect. We determined Fic in multiple cellular assays and cell types representing different targets from a number of therapeutic areas, including cancer, inflammation, and dementia. Both cytosolic targets and targets localized in subcellular compartments were investigated. Fic gives insights on membrane-permeable compounds in terms of cellular potency and intracellular target engagement, compared with biochemical potency measurements alone. Knowledge of the amount of drug that is locally available to bind intracellular targets provides a powerful tool for compound selection in early drug discovery. PMID:28701380

  3. A new method of small target detection based on neural network

    NASA Astrophysics Data System (ADS)

    Hu, Jing; Hu, Yongli; Lu, Xinxin

    2018-02-01

    The detection and tracking of moving dim targets in infrared images has been a research hotspot for many years. The target in each frame occupies only several pixels and carries no shape or structure information. Moreover, an infrared small target is often submerged in a complicated background with a low signal-to-clutter ratio, making detection very difficult. Different backgrounds exhibit different statistical properties, which makes detecting the target extremely complex. If the threshold segmentation is not reasonable, there may be more noise points in the final detection, which is unfavorable for detecting the trajectory of the target. Single-frame target detection may not be able to obtain the desired target and may cause a high false alarm rate. We believe that combining spatial detection of suspicious targets in each frame with temporal association for target tracking will increase the reliability of tracking dim targets. The detection of dim targets is mainly divided into two parts. In the first part, we adopt a bilateral filtering method for background suppression; after threshold segmentation, the suspicious targets in each frame are extracted. We then use an LSTM (long short-term memory) neural network to predict the coordinates of the target in the next frame. It is a brand-new method based on the movement characteristics of the target in sequence images, which can respond to changes in the relationship between past and future values. Simulation results demonstrate that the proposed algorithm can effectively predict the trajectory of the moving small target and works efficiently and robustly with a low false alarm rate.

  4. Predicting Drug Combination Index and Simulating the Network-Regulation Dynamics by Mathematical Modeling of Drug-Targeted EGFR-ERK Signaling Pathway

    NASA Astrophysics Data System (ADS)

    Huang, Lu; Jiang, Yuyang; Chen, Yuzong

    2017-01-01

    Synergistic drug combinations enable enhanced therapeutics. Their discovery typically involves the measurement and assessment of drug combination index (CI), which can be facilitated by the development and applications of in-silico CI predictive tools. In this work, we developed and tested the ability of a mathematical model of drug-targeted EGFR-ERK pathway in predicting CIs and in analyzing multiple synergistic drug combinations against observations. Our mathematical model was validated against the literature reported signaling, drug response dynamics, and EGFR-MEK drug combination effect. The predicted CIs and combination therapeutic effects of the EGFR-BRaf, BRaf-MEK, FTI-MEK, and FTI-BRaf inhibitor combinations showed consistent synergism. Our results suggest that existing pathway models may be potentially extended for developing drug-targeted pathway models to predict drug combination CI values, isobolograms, and drug-response surfaces as well as to analyze the dynamics of individual and combinations of drugs. With our model, the efficacy of potential drug combinations can be predicted. Our method complements the developed in-silico methods (e.g. the chemogenomic profile and the statistically-inferenced network models) by predicting drug combination effects from the perspectives of pathway dynamics using experimental or validated molecular kinetic constants, thereby facilitating the collective prediction of drug combination effects in diverse ranges of disease systems.
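
    The abstract does not state which definition of the combination index it uses. The sketch below assumes the common Loewe-additivity form associated with Chou and Talalay, CI = d1/Dx1 + d2/Dx2, where d1 and d2 are the doses used in combination and Dx1 and Dx2 are the single-agent doses producing the same effect. The tolerance band around 1 and the dose values are hypothetical.

```python
def combination_index(d1, d2, dx1, dx2):
    """CI = d1/Dx1 + d2/Dx2 for two drugs at a fixed effect level."""
    return d1 / dx1 + d2 / dx2

def classify_interaction(ci, tolerance=0.1):
    """Label a CI value; the tolerance band around 1 is an assumed convention."""
    if ci < 1.0 - tolerance:
        return "synergism"
    if ci > 1.0 + tolerance:
        return "antagonism"
    return "additive"

# Hypothetical doses: each drug alone needs 4 and 8 units for the chosen effect,
# while the combination reaches it with 1 and 2 units.
ci = combination_index(d1=1.0, d2=2.0, dx1=4.0, dx2=8.0)
label = classify_interaction(ci)
```

    In a pathway model such as the one described, the equal-effect doses would come from simulated dose-response curves rather than from measurements.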

  5. Biological and functional relevance of CASP predictions

    PubMed Central

    Liu, Tianyun; Ish‐Shalom, Shirbi; Torng, Wen; Lafita, Aleix; Bock, Christian; Mort, Matthew; Cooper, David N; Bliven, Spencer; Capitani, Guido; Mooney, Sean D.

    2017-01-01

    Our goal is to answer the question: compared with experimental structures, how useful are predicted models for functional annotation? We assessed the functional utility of predicted models by comparing the performances of a suite of methods for functional characterization on the predictions and the experimental structures. We identified 28 sites in 25 protein targets to perform functional assessment. These 28 sites included nine sites with known ligand binding (holo‐sites), nine sites that are expected or suggested by experimental authors for small molecule binding (apo‐sites), and ten sites containing important motifs, loops, or key residues with important disease‐associated mutations. We evaluated the utility of the predictions by comparing their microenvironments to the experimental structures. Overall structural quality correlates with functional utility. However, the best‐ranked predictions (global) may not have the best functional quality (local). Our assessment provides an ability to discriminate between predictions with high structural quality. When assessing ligand‐binding sites, most prediction methods have higher performance on apo‐sites than holo‐sites. Some servers show consistently high performance for certain types of functional sites. Finally, many functional sites are associated with protein‐protein interaction. We also analyzed biologically relevant features from the protein assemblies of two targets where the active site spanned the protein‐protein interface. For the assembly targets, we find that the features in the models are mainly determined by the choice of template. PMID:28975675

  6. Transcriptional network inference from functional similarity and expression data: a global supervised approach.

    PubMed

    Ambroise, Jérôme; Robert, Annie; Macq, Benoit; Gala, Jean-Luc

    2012-01-06

    An important challenge in systems biology is the inference of biological networks from postgenomic data. Among these biological networks, a gene transcriptional regulatory network focuses on interactions existing between transcription factors (TFs) and their corresponding target genes. A large number of reverse engineering algorithms were proposed to infer such networks from gene expression profiles, but most current methods have relatively low predictive performances. In this paper, we introduce the novel TNIFSED method (Transcriptional Network Inference from Functional Similarity and Expression Data), which infers a transcriptional network from the integration of correlations and partial correlations of gene expression profiles and gene functional similarities through a supervised classifier. In the current work, TNIFSED was applied to predict the transcriptional network in Escherichia coli and in Saccharomyces cerevisiae, using datasets of 445 and 170 Affymetrix arrays, respectively. Using the area under the curve of the receiver operating characteristics and the F-measure as indicators, we showed the predictive performance of TNIFSED to be better than unsupervised state-of-the-art methods. TNIFSED performed slightly worse than the supervised SIRENE algorithm at identifying the target genes of TFs that already have a wide range of identified target genes, but better for TFs having only a few identified target genes. Our results indicate that TNIFSED is complementary to the SIRENE algorithm, and particularly suitable to discover target genes of "orphan" TFs.

  7. A method for predicting target drug efficiency in cancer based on the analysis of signaling pathway activation.

    PubMed

    Artemov, Artem; Aliper, Alexander; Korzinkin, Michael; Lezhnina, Ksenia; Jellen, Leslie; Zhukov, Nikolay; Roumiantsev, Sergey; Gaifullin, Nurshat; Zhavoronkov, Alex; Borisov, Nicolas; Buzdin, Anton

    2015-10-06

    A new generation of anticancer therapeutics called target drugs has quickly developed in the 21st century. These drugs are tailored to inhibit cancer cell growth, proliferation, and viability by specific interactions with one or a few target proteins. However, despite formally known molecular targets for every "target" drug, patient response to treatment remains largely individual and unpredictable. Choosing the most effective personalized treatment remains a major challenge in oncology and is still largely trial and error. Here we present a novel approach for predicting target drug efficacy based on the gene expression signature of the individual tumor sample(s). The enclosed bioinformatic algorithm detects activation of intracellular regulatory pathways in the tumor in comparison to the corresponding normal tissues. According to the nature of the molecular targets of a drug, it predicts whether the drug can prevent cancer growth and survival in each individual case by blocking the abnormally activated tumor-promoting pathways or by reinforcing internal tumor suppressor cascades. To validate the method, we compared the distribution of predicted drug efficacy scores for five drugs (Sorafenib, Bevacizumab, Cetuximab, Imatinib, Sunitinib) and seven cancer types (Clear Cell Renal Cell Carcinoma, Colon cancer, Lung adenocarcinoma, non-Hodgkin Lymphoma, Thyroid cancer and Sarcoma) with the available clinical trials data for the respective cancer types and drugs. The percent of responders to a drug treatment correlated significantly (Pearson's correlation 0.77, p = 0.023) with the percent of tumors showing high drug scores calculated with the current algorithm.

  8. A cross docking pipeline for improving pose prediction and virtual screening performance

    NASA Astrophysics Data System (ADS)

    Kumar, Ashutosh; Zhang, Kam Y. J.

    2018-01-01

    Pose prediction and virtual screening performance of a molecular docking method depend on the choice of protein structures used for docking. Multiple structures for a target protein are often used to take into account the receptor flexibility and problems associated with a single receptor structure. However, the use of multiple receptor structures is computationally expensive when docking a large library of small molecules. Here, we propose a new cross-docking pipeline suitable to dock a large library of molecules while taking advantage of multiple target protein structures. Our method involves the selection of a suitable receptor for each ligand in a screening library utilizing ligand 3D shape similarity with crystallographic ligands. We have prospectively evaluated our method in D3R Grand Challenge 2 and demonstrated that our cross-docking pipeline can achieve similar or better performance than using either single or multiple-receptor structures. Moreover, our method displayed not only decent pose prediction performance but also better virtual screening performance over several other methods.

  9. An integrated structure- and system-based framework to identify new targets of metabolites and known drugs

    PubMed Central

    Naveed, Hammad; Hameed, Umar S.; Harrus, Deborah; Bourguet, William; Arold, Stefan T.; Gao, Xin

    2015-01-01

    Motivation: The inherent promiscuity of small molecules towards protein targets impedes our understanding of healthy versus diseased metabolism. This promiscuity also poses a challenge for the pharmaceutical industry as identifying all protein targets is important to assess (side) effects and repositioning opportunities for a drug. Results: Here, we present a novel integrated structure- and system-based approach of drug-target prediction (iDTP) to enable the large-scale discovery of new targets for small molecules, such as pharmaceutical drugs, co-factors and metabolites (collectively called ‘drugs’). For a given drug, our method uses sequence order–independent structure alignment, hierarchical clustering and probabilistic sequence similarity to construct a probabilistic pocket ensemble (PPE) that captures promiscuous structural features of different binding sites on known targets. A drug’s PPE is combined with an approximation of its delivery profile to reduce false positives. In our cross-validation study, we use iDTP to predict the known targets of 11 drugs, with 63% sensitivity and 81% specificity. We then predicted novel targets for these drugs—two that are of high pharmacological interest, the peroxisome proliferator-activated receptor gamma and the oncogene B-cell lymphoma 2, were successfully validated through in vitro binding experiments. Our method is broadly applicable for the prediction of protein-small molecule interactions with several novel applications to biological research and drug development. Availability and implementation: The program, datasets and results are freely available to academic users at http://sfb.kaust.edu.sa/Pages/Software.aspx. Contact: xin.gao@kaust.edu.sa and stefan.arold@kaust.edu.sa Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26286808
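
    The sensitivity and specificity figures quoted above follow from a confusion matrix. The sketch below uses the standard definitions, which are assumed here to match those used in the paper; the function name and the toy labels are illustrative only.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int],
                            y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels coded as 1 and 0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # known targets recovered
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # known targets missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-targets rejected
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-targets predicted
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy labels, not data from the study.
sens, spec = sensitivity_specificity([1, 1, 1, 0, 0, 0, 0],
                                     [1, 1, 0, 0, 0, 0, 1])
```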

  10. Predicting Drug-Target Interactions Based on Small Positive Samples.

    PubMed

    Hu, Pengwei; Chan, Keith C C; Hu, Yanxing

    2018-01-01

    A basic task in drug discovery is to find new medication in the form of candidate compounds that act on a target protein. In other words, a drug has to interact with a target and such drug-target interaction (DTI) is not expected to be random. Significant and interesting patterns are expected to be hidden in them. If these patterns can be discovered, new drugs are expected to be more easily discoverable. Currently, a number of computational methods have been proposed to predict DTIs based on their similarity. However, such an approach does not allow biochemical features to be directly considered. As a result, some methods have been proposed to try to discover patterns in physicochemical interactions. Since the number of potential negative DTIs is very high both in absolute terms and in comparison to that of the known ones, these methods are rather computationally expensive and they can only rely on subsets, rather than the full set, of negative DTIs for training and validation. As there is always a relatively high chance for negative DTIs to be falsely identified and as only a partial subset of such DTIs is considered, existing approaches can be further improved to better predict DTIs. In this paper, we present a novel approach, called ODT (one class drug target interaction prediction), for such purpose. One main task of ODT is to discover association patterns between interacting drugs and proteins from the chemical structure of the former and the protein sequence network of the latter. ODT does so in two phases. First, the DTI-network is transformed to a representation by structural properties. Second, it applies a one-class classification algorithm to build a prediction model based only on known positive interactions. We compared the best AUROC scores of ODT with several state-of-the-art approaches on gold standard data. The prediction accuracy of ODT is superior to all the other methods on the GPCR and ion channel datasets. Performance evaluation of ODT shows that it can be potentially useful. It confirms that predicting potential or missing DTIs based on the known interactions is a promising direction to solve problems related to the use of uncertain and unreliable negative samples and those related to the great demand in computational resources. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
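
    The idea of training on positive interactions only can be sketched with a deliberately simple one-class model. This is not the ODT algorithm: a centroid-and-radius rule is swapped in for illustration, and the class name and the two-dimensional toy features are hypothetical.

```python
import math
from typing import List, Sequence


class CentroidOneClass:
    """Toy one-class classifier: accept points no farther from the centroid
    of the positive training examples than the farthest training example."""

    def fit(self, positives: Sequence[Sequence[float]]) -> "CentroidOneClass":
        n, dim = len(positives), len(positives[0])
        self.centroid: List[float] = [
            sum(x[j] for x in positives) / n for j in range(dim)
        ]
        self.radius: float = max(math.dist(x, self.centroid) for x in positives)
        return self

    def predict(self, x: Sequence[float]) -> bool:
        """True if x looks like a positive (interacting) drug-target pair."""
        return math.dist(x, self.centroid) <= self.radius


# Hypothetical feature vectors of known interacting pairs; no negatives are used.
model = CentroidOneClass().fit([(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)])
inside = model.predict((1.0, 1.0))
outside = model.predict((5.0, 5.0))
```

    Because no negative examples enter training, unreliable "non-interacting" labels cannot bias the model, which is the motivation stated in the abstract.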

  11. Key Technology of Real-Time Road Navigation Method Based on Intelligent Data Research

    PubMed Central

    Tang, Haijing; Liang, Yu; Huang, Zhongnan; Wang, Taoyi; He, Lin; Du, Yicong; Ding, Gangyi

    2016-01-01

    The effect of traffic flow prediction plays an important role in routing selection. Traditional traffic flow forecasting methods mainly include linear, nonlinear, neural network, and time series analysis methods. However, all of them have some shortcomings. This paper analyzes the existing algorithms on traffic flow prediction and characteristics of city traffic flow and proposes a road traffic flow prediction method based on transfer probability. This method first analyzes the transfer probability of the roads upstream of the target road and then predicts the traffic flow at the next time step using the traffic flow equation. The Newton interior-point method is used to obtain the optimal parameter values. Finally, the proposed model is used to predict the traffic flow at the next time step. Compared with existing prediction methods, the proposed model shows good performance: it obtains the optimal parameter values faster and has higher prediction accuracy, so it can be used for real-time traffic flow prediction. PMID:27872637

  12. An Approach for Predicting Essential Genes Using Multiple Homology Mapping and Machine Learning Algorithms.

    PubMed

    Hua, Hong-Li; Zhang, Fa-Zhan; Labena, Abraham Alemayehu; Dong, Chuan; Jin, Yan-Ting; Guo, Feng-Biao

    Investigation of essential genes is significant for understanding the minimal gene set of a cell and discovering potential drug targets. In this study, a novel approach based on multiple homology mapping and machine learning was introduced to predict essential genes. We focused on 25 bacteria which have characterized essential genes. The predictions yielded the highest area under the receiver operating characteristic (ROC) curve (AUC) of 0.9716 through a tenfold cross-validation test. Proper features were utilized to construct models to make predictions in distantly related bacteria. The accuracy of predictions was evaluated via the consistency of predictions and known essential genes of target species. The highest AUC of 0.9552 and average AUC of 0.8314 were achieved when making predictions across organisms. An independent dataset from Synechococcus elongatus, which was released recently, was obtained for further assessment of the performance of our model. The AUC score of predictions is 0.7855, which is higher than other methods. This research shows that features obtained by homology mapping alone can achieve comparable or even better results than integrated features. Meanwhile, the work indicates that a machine learning-based method can assign more efficient weight coefficients than an empirical formula based on biological knowledge.
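
    AUC is the headline metric in this and several other records. As a minimal sketch, it can be computed as the probability that a randomly chosen positive outscores a randomly chosen negative, with ties counted as one half. The function name and the toy scores are illustrative and not taken from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ranked correctly.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

    The pairwise form is quadratic in the number of examples; it is chosen here for clarity rather than speed.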

  13. Advanced systems biology methods in drug discovery and translational biomedicine.

    PubMed

    Zou, Jun; Zheng, Ming-Wu; Li, Gen; Su, Zhi-Guang

    2013-01-01

    Systems biology is in an exponential development stage in recent years and has been widely utilized in biomedicine to better understand the molecular basis of human disease and the mechanism of drug action. Here, we discuss the fundamental concept of systems biology and its two computational methods that have been commonly used, that is, network analysis and dynamical modeling. The applications of systems biology in elucidating human disease are highlighted, consisting of human disease networks, treatment response prediction, investigation of disease mechanisms, and disease-associated gene prediction. In addition, important advances in drug discovery, to which systems biology makes significant contributions, are discussed, including drug-target networks, prediction of drug-target interactions, investigation of drug adverse effects, drug repositioning, and drug combination prediction. The systems biology methods and applications covered in this review provide a framework for addressing disease mechanism and approaching drug discovery, which will facilitate the translation of research findings into clinical benefits such as novel biomarkers and promising therapies.

  14. Effect of missing data on multitask prediction methods.

    PubMed

    de la Vega de León, Antonio; Chen, Beining; Gillet, Valerie J

    2018-05-22

    There has been a growing interest in multitask prediction in chemoinformatics, helped by the increasing use of deep neural networks in this field. This technique is applied to multitarget data sets, where compounds have been tested against different targets, with the aim of developing models to predict a profile of biological activities for a given compound. However, multitarget data sets tend to be sparse; i.e., not all compound-target combinations have experimental values. There has been little research on the effect of missing data on the performance of multitask methods. We have used two complete data sets to simulate sparseness by removing data from the training set. Different models to remove the data were compared. These sparse sets were used to train two different multitask methods, deep neural networks and Macau, which is a Bayesian probabilistic matrix factorization technique. Results from both methods were remarkably similar and showed that the performance decrease because of missing data is at first small before accelerating after large amounts of data are removed. This work provides a first approximation to assess how much data is required to produce good performance in multitask prediction exercises.
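
    Simulating sparseness from a complete data set can be sketched as follows. The abstract compares several removal models; only uniform random removal is shown here, as an assumption, and the function name and toy matrix are hypothetical.

```python
import random
from typing import List, Optional, Sequence


def mask_random(matrix: Sequence[Sequence[float]],
                fraction: float,
                seed: int = 0) -> List[List[Optional[float]]]:
    """Return a copy of a complete compound-by-target matrix with a given
    fraction of cells replaced by None (treated as missing measurements)."""
    rng = random.Random(seed)
    cells = [(i, j) for i, row in enumerate(matrix) for j in range(len(row))]
    removed = set(rng.sample(cells, round(fraction * len(cells))))
    return [
        [None if (i, j) in removed else value for j, value in enumerate(row)]
        for i, row in enumerate(matrix)
    ]


# Toy 4 compounds x 5 targets activity matrix; half of the values are removed.
complete = [[float(i + j) for j in range(5)] for i in range(4)]
sparse = mask_random(complete, fraction=0.5)
n_missing = sum(value is None for row in sparse for value in row)
```

    Repeating this over increasing fractions, and retraining on each sparse copy, gives the kind of performance-versus-sparseness curve the abstract describes.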

  15. Report on the sixth blind test of organic crystal structure prediction methods

    PubMed Central

    Reilly, Anthony M.; Cooper, Richard I.; Adjiman, Claire S.; Bhattacharya, Saswata; Boese, A. Daniel; Brandenburg, Jan Gerit; Bygrave, Peter J.; Bylsma, Rita; Campbell, Josh E.; Car, Roberto; Case, David H.; Chadha, Renu; Cole, Jason C.; Cosburn, Katherine; Cuppen, Herma M.; Curtis, Farren; Day, Graeme M.; DiStasio Jr, Robert A.; Dzyabchenko, Alexander; van Eijck, Bouke P.; Elking, Dennis M.; van den Ende, Joost A.; Facelli, Julio C.; Ferraro, Marta B.; Fusti-Molnar, Laszlo; Gatsiou, Christina-Anna; Gee, Thomas S.; de Gelder, René; Ghiringhelli, Luca M.; Goto, Hitoshi; Grimme, Stefan; Guo, Rui; Hofmann, Detlef W. M.; Hoja, Johannes; Hylton, Rebecca K.; Iuzzolino, Luca; Jankiewicz, Wojciech; de Jong, Daniël T.; Kendrick, John; de Klerk, Niek J. J.; Ko, Hsin-Yu; Kuleshova, Liudmila N.; Li, Xiayue; Lohani, Sanjaya; Leusen, Frank J. J.; Lund, Albert M.; Lv, Jian; Ma, Yanming; Marom, Noa; Masunov, Artëm E.; McCabe, Patrick; McMahon, David P.; Meekes, Hugo; Metz, Michael P.; Misquitta, Alston J.; Mohamed, Sharmarke; Monserrat, Bartomeu; Needs, Richard J.; Neumann, Marcus A.; Nyman, Jonas; Obata, Shigeaki; Oberhofer, Harald; Oganov, Artem R.; Orendt, Anita M.; Pagola, Gabriel I.; Pantelides, Constantinos C.; Pickard, Chris J.; Podeszwa, Rafal; Price, Louise S.; Price, Sarah L.; Pulido, Angeles; Read, Murray G.; Reuter, Karsten; Schneider, Elia; Schober, Christoph; Shields, Gregory P.; Singh, Pawanpreet; Sugden, Isaac J.; Szalewicz, Krzysztof; Taylor, Christopher R.; Tkatchenko, Alexandre; Tuckerman, Mark E.; Vacarro, Francesca; Vasileiadis, Manolis; Vazquez-Mayagoitia, Alvaro; Vogt, Leslie; Wang, Yanchao; Watson, Rona E.; de Wijs, Gilles A.; Yang, Jack; Zhu, Qiang; Groom, Colin R.

    2016-01-01

    The sixth blind test of organic crystal structure prediction (CSP) methods has been held, with five target systems: a small nearly rigid molecule, a polymorphic former drug candidate, a chloride salt hydrate, a co-crystal and a bulky flexible molecule. This blind test has seen substantial growth in the number of participants, with the broad range of prediction methods giving a unique insight into the state of the art in the field. Significant progress has been seen in treating flexible molecules, usage of hierarchical approaches to ranking structures, the application of density-functional approximations, and the establishment of new workflows and ‘best practices’ for performing CSP calculations. All of the targets, apart from a single potentially disordered Z′ = 2 polymorph of the drug candidate, were predicted by at least one submission. Despite many remaining challenges, it is clear that CSP methods are becoming more applicable to a wider range of real systems, including salts, hydrates and larger flexible molecules. The results also highlight the potential for CSP calculations to complement and augment experimental studies of organic solid forms. PMID:27484368

  16. Combining on-chip synthesis of a focused combinatorial library with computational target prediction reveals imidazopyridine GPCR ligands.

    PubMed

    Reutlinger, Michael; Rodrigues, Tiago; Schneider, Petra; Schneider, Gisbert

    2014-01-07

    Using the example of the Ugi three-component reaction we report a fast and efficient microfluidic-assisted entry into the imidazopyridine scaffold, where building block prioritization was coupled to a new computational method for predicting ligand-target associations. We identified an innovative GPCR-modulating combinatorial chemotype featuring ligand-efficient adenosine A1/2B and adrenergic α1A/B receptor antagonists. Our results suggest the tight integration of microfluidics-assisted synthesis with computer-based target prediction as a viable approach to rapidly generate bioactivity-focused combinatorial compound libraries with high success rates. Copyright © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  17. Tertiary structure-based analysis of microRNA–target interactions

    PubMed Central

    Gan, Hin Hark; Gunsalus, Kristin C.

    2013-01-01

    Current computational analysis of microRNA interactions is based largely on primary and secondary structure analysis. Computationally efficient tertiary structure-based methods are needed to enable more realistic modeling of the molecular interactions underlying miRNA-mediated translational repression. We incorporate algorithms for predicting duplex RNA structures, ionic strength effects, duplex entropy and free energy, and docking of duplex–Argonaute protein complexes into a pipeline to model and predict miRNA–target duplex binding energies. To ensure modeling accuracy and computational efficiency, we use an all-atom description of RNA and a continuum description of ionic interactions using the Poisson–Boltzmann equation. Our method predicts the conformations of two constructs of Caenorhabditis elegans let-7 miRNA–target duplexes to an accuracy of ∼3.8 Å root mean square distance of their NMR structures. We also show that the computed duplex formation enthalpies, entropies, and free energies for eight miRNA–target duplexes agree with titration calorimetry data. Analysis of duplex–Argonaute docking shows that structural distortions arising from single-base-pair mismatches in the seed region influence the activity of the complex by destabilizing both duplex hybridization and its association with Argonaute. Collectively, these results demonstrate that tertiary structure-based modeling of miRNA interactions can reveal structural mechanisms not accessible with current secondary structure-based methods. PMID:23417009

  18. A novel approach for predicting microRNA-disease associations by unbalanced bi-random walk on heterogeneous network.

    PubMed

    Luo, Jiawei; Xiao, Qiu

    2017-02-01

    MicroRNAs (miRNAs) play a critical role by regulating their targets at the post-transcriptional level. Identification of potential miRNA-disease associations will aid in deciphering the pathogenesis of human polygenic diseases. Several computational models have been developed to uncover novel miRNA-disease associations based on the predicted target genes. However, due to the insufficient number of experimentally validated miRNA-target interactions as well as the relatively high false-positive and false-negative rates of predicted target genes, it is still challenging for these prediction models to obtain remarkable performances. The purpose of this study is to prioritize miRNA candidates for diseases. We first construct a heterogeneous network, which consists of a disease similarity network, a miRNA functional similarity network and a known miRNA-disease association network. Then, an unbalanced bi-random walk-based algorithm on the heterogeneous network (BRWH) is adopted to discover potential associations by exploiting bipartite subgraphs. Based on 5-fold cross validation, the proposed network-based method achieves AUC values ranging from 0.782 to 0.907 for the 22 human diseases and an average AUC of almost 0.846. The experiments indicated that BRWH can achieve better performances compared with several popular methods. In addition, case studies of some common diseases further demonstrated the superior performance of our proposed method on prioritizing disease-related miRNA candidates. Copyright © 2017 Elsevier Inc. All rights reserved.
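
    A bi-random walk with unequal step counts on the two similarity networks can be sketched as below. This is a generic illustration, not the BRWH implementation: row normalization, the restart weight alpha, the step counts and all names are assumptions.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def row_normalize(m: Matrix) -> Matrix:
    return [[v / sum(row) if sum(row) else 0.0 for v in row] for row in m]


def blend(walked: Matrix, prior: Matrix, alpha: float) -> Matrix:
    """alpha * walked + (1 - alpha) * prior, element by element."""
    return [[alpha * w + (1 - alpha) * p for w, p in zip(wr, pr)]
            for wr, pr in zip(walked, prior)]


def bi_random_walk(mirna_sim: Matrix, disease_sim: Matrix, assoc: Matrix,
                   alpha: float = 0.8, left_steps: int = 2,
                   right_steps: int = 3) -> Matrix:
    """Score miRNA-disease pairs by walking on both similarity networks."""
    total = sum(sum(row) for row in assoc)
    if total == 0:
        raise ValueError("at least one known association is required")
    m, d = row_normalize(mirna_sim), row_normalize(disease_sim)
    prior = [[v / total for v in row] for row in assoc]
    r = [row[:] for row in prior]
    for step in range(1, max(left_steps, right_steps) + 1):
        walks = []
        if step <= left_steps:   # walk on the miRNA similarity network
            walks.append(blend(matmul(m, r), prior, alpha))
        if step <= right_steps:  # walk on the disease similarity network
            walks.append(blend(matmul(r, d), prior, alpha))
        r = [[sum(w[i][j] for w in walks) / len(walks)
              for j in range(len(prior[0]))] for i in range(len(prior))]
    return r


# Toy network: miRNA 1 has no known association but resembles miRNA 0.
mirna_sim = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.1], [0.1, 0.1, 1.0]]
disease_sim = [[1.0, 0.5], [0.5, 1.0]]
assoc = [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]]
scores = bi_random_walk(mirna_sim, disease_sim, assoc)
```

    In the toy network, miRNA 1 ends up with a higher score for disease 0 than for disease 1, because it is most similar to miRNA 0, which is known to be associated with disease 0. The "unbalanced" aspect is the use of different step counts on the two sides.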

  19. Pressor mechanism evaluation for phytochemical compounds using in silico compound-protein interaction prediction.

    PubMed

    He, Min; Cao, Dong-Sheng; Liang, Yi-Zeng; Li, Ya-Ping; Liu, Ping-Le; Xu, Qing-Song; Huang, Ren-Bin

    2013-10-01

    In this study, a method was applied to evaluate pressor mechanisms through compound-protein interactions. Our method assumed that compounds with different pressor mechanisms should bind to different target proteins, and that these mechanisms could therefore be differentiated using compound-protein interactions. Twenty-six phytochemical components and 46 tested target proteins related to blood pressure (BP) elevation were collected. Then, in silico compound-protein interaction prediction probabilities were calculated using a random forest model, which has been implemented in a web server, and the credibility was judged using related literature and other methods. Further, a heat map was constructed; it clearly showed different prediction probabilities, accompanied by hierarchical clustering analysis results. A compound-protein interaction network was then depicted according to the results, showing the connectivity layout of phytochemical components with different target proteins within the BP elevation network, which guided the hypothesis generation of poly-pharmacology. Lastly, principal components analysis (PCA) was carried out upon the prediction probabilities, and pressor targets could be divided into three large classes: neurotransmitter receptors, hormone receptors and monoamine oxidases. In addition, steroid glycosides seem to be close to the region of hormone receptors, and a weak difference existed between them. This work explored the possibility for pharmacological or toxicological mechanism classification using compound-protein interactions. Such approaches could also be used to deduce pharmacological or toxicological mechanisms for uncharacterized compounds. Copyright © 2013 Elsevier Inc. All rights reserved.

  20. Head-target tracking control of well drilling

    NASA Astrophysics Data System (ADS)

    Agzamov, Z. V.

    2018-05-01

    The method of directional drilling trajectory control for oil and gas wells using predictive models is considered in the paper. The developed method does not apply optimization and therefore does not require high-performance computing. Nevertheless, it allows following the well-plan with high precision, taking into account process input saturation. Controller output is calculated both from the present target reference point of the well-plan and from a well trajectory prediction using the analytical model. This method allows following a well-plan not only in angular but also in Cartesian coordinates. Simulation of the control system has confirmed the high precision and operation performance under a wide range of random disturbance action.

  1. Mortality prediction system for heart failure with orthogonal relief and dynamic radius means.

    PubMed

    Wang, Zhe; Yao, Lijuan; Li, Dongdong; Ruan, Tong; Liu, Min; Gao, Ju

    2018-07-01

    This paper constructs a mortality prediction system based on a real-world dataset. This mortality prediction system aims to predict mortality in heart failure (HF) patients. Effective mortality prediction can improve resource allocation and clinical outcomes, avoiding inappropriate overtreatment of low-mortality patients and discharging of high-mortality patients. This system covers three mortality prediction targets: prediction of in-hospital mortality, prediction of 30-day mortality and prediction of 1-year mortality. HF data are collected from the Shanghai Shuguang hospital. A total of 10,203 in-patient records are extracted from encounters occurring between March 2009 and April 2016. The records involve 4682 patients, including 539 death cases. A feature selection method called the Orthogonal Relief (OR) algorithm is first used to reduce the dimensionality. Then, a classification algorithm named Dynamic Radius Means (DRM) is proposed to predict the mortality in HF patients. The comparative experimental results demonstrate that the mortality prediction system achieves high performance in all targets by DRM. It is noteworthy that the performance of in-hospital mortality prediction achieves 87.3% in AUC (35.07% improvement). Moreover, the AUCs of 30-day and 1-year mortality prediction reach 88.45% and 84.84%, respectively. Notably, the system remains effective and does not deteriorate when the dimensionality of the samples is sharply reduced. The proposed system with its own method DRM can predict mortality in HF patients and achieve high performance in all three mortality targets. Furthermore, an effective feature selection strategy can boost the system. This system shows its importance in real-world applications, assisting clinicians in HF treatment by providing crucial decision information. Copyright © 2018 Elsevier B.V. All rights reserved.

  2. ProTox: a web server for the in silico prediction of rodent oral toxicity.

    PubMed

    Drwal, Malgorzata N; Banerjee, Priyanka; Dunkel, Mathias; Wettig, Martin R; Preissner, Robert

    2014-07-01

    Animal trials are currently the major method for determining the possible toxic effects of drug candidates and cosmetics. In silico prediction methods represent an alternative approach and aim to rationalize the preclinical drug development, thus enabling the reduction of the associated time, costs and animal experiments. Here, we present ProTox, a web server for the prediction of rodent oral toxicity. The prediction method is based on the analysis of the similarity of compounds with known median lethal doses (LD50) and incorporates the identification of toxic fragments, therefore representing a novel approach in toxicity prediction. In addition, the web server includes an indication of possible toxicity targets which is based on an in-house collection of protein-ligand-based pharmacophore models ('toxicophores') for targets associated with adverse drug reactions. The ProTox web server is open to all users and can be accessed without registration at: http://tox.charite.de/tox. The only requirement for the prediction is the two-dimensional structure of the input compounds. All ProTox methods have been evaluated based on a diverse external validation set and displayed strong performance (sensitivity, specificity and precision of 76, 95 and 75%, respectively) and superiority over other toxicity prediction tools, indicating their possible applicability for other compound classes. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  3. Addressing recent docking challenges: A hybrid strategy to integrate template-based and free protein-protein docking.

    PubMed

    Yan, Yumeng; Wen, Zeyu; Wang, Xinxiang; Huang, Sheng-You

    2017-03-01

    Protein-protein docking is an important computational tool for predicting protein-protein interactions. With the rapid development of proteomics projects, more and more experimental binding information ranging from mutagenesis data to three-dimensional structures of protein complexes is becoming available. Therefore, how to appropriately incorporate the biological information into traditional ab initio docking has been an important issue and challenge in the field of protein-protein docking. To address these challenges, we have developed a Hybrid DOCKing protocol of template-based and template-free approaches, referred to as HDOCK. The basic procedure of HDOCK is to model the structures of individual components based on the template complex by a template-based method if a template is available; otherwise, the component structures will be modeled based on monomer proteins by regular homology modeling. Then, the complex structure of the component models is predicted by traditional protein-protein docking. With the HDOCK protocol, we have participated in the CAPRI experiment for rounds 28-35. Out of the 25 CASP-CAPRI targets for oligomer modeling, our HDOCK protocol predicted correct models for 16 targets, ranking among the top algorithms in this challenge. Our docking method also made correct predictions on other CAPRI challenges such as protein-peptide binding for 6 out of 8 targets and water predictions for 2 out of 2 targets. The advantage of our hybrid docking approach over pure template-based docking was further confirmed by a comparative evaluation on 20 CASP-CAPRI targets. Proteins 2017; 85:497-512. © 2016 Wiley Periodicals, Inc.

  4. Perturbation biology nominates upstream-downstream drug combinations in RAF inhibitor resistant melanoma cells.

    PubMed

    Korkut, Anil; Wang, Weiqing; Demir, Emek; Aksoy, Bülent Arman; Jing, Xiaohong; Molinelli, Evan J; Babur, Özgün; Bemis, Debra L; Onur Sumer, Selcuk; Solit, David B; Pratilas, Christine A; Sander, Chris

    2015-08-18

    Resistance to targeted cancer therapies is an important clinical problem. The discovery of anti-resistance drug combinations is challenging as resistance can arise by diverse escape mechanisms. To address this challenge, we improved and applied the experimental-computational perturbation biology method. Using statistical inference, we build network models from high-throughput measurements of molecular and phenotypic responses to combinatorial targeted perturbations. The models are computationally executed to predict the effects of thousands of untested perturbations. In RAF-inhibitor resistant melanoma cells, we measured 143 proteomic/phenotypic entities under 89 perturbation conditions and predicted c-Myc as an effective therapeutic co-target with BRAF or MEK. Experiments using the BET bromodomain inhibitor JQ1 affecting the level of c-Myc protein and protein kinase inhibitors targeting the ERK pathway confirmed the prediction. In conclusion, we propose an anti-cancer strategy of co-targeting a specific upstream alteration and a general downstream point of vulnerability to prevent or overcome resistance to targeted drugs.

  5. Towards crystal structure prediction of complex organic compounds – a report on the fifth blind test

    PubMed Central

    Bardwell, David A.; Adjiman, Claire S.; Arnautova, Yelena A.; Bartashevich, Ekaterina; Boerrigter, Stephan X. M.; Braun, Doris E.; Cruz-Cabeza, Aurora J.; Day, Graeme M.; Della Valle, Raffaele G.; Desiraju, Gautam R.; van Eijck, Bouke P.; Facelli, Julio C.; Ferraro, Marta B.; Grillo, Damian; Habgood, Matthew; Hofmann, Detlef W. M.; Hofmann, Fridolin; Jose, K. V. Jovan; Karamertzanis, Panagiotis G.; Kazantsev, Andrei V.; Kendrick, John; Kuleshova, Liudmila N.; Leusen, Frank J. J.; Maleev, Andrey V.; Misquitta, Alston J.; Mohamed, Sharmarke; Needs, Richard J.; Neumann, Marcus A.; Nikylov, Denis; Orendt, Anita M.; Pal, Rumpa; Pantelides, Constantinos C.; Pickard, Chris J.; Price, Louise S.; Price, Sarah L.; Scheraga, Harold A.; van de Streek, Jacco; Thakur, Tejender S.; Tiwari, Siddharth; Venuti, Elisabetta; Zhitkov, Ilia K.

    2011-01-01

    Following on from the success of the previous crystal structure prediction blind tests (CSP1999, CSP2001, CSP2004 and CSP2007), a fifth such collaborative project (CSP2010) was organized at the Cambridge Crystallographic Data Centre. A range of methodologies was used by the participating groups in order to evaluate the ability of the current computational methods to predict the crystal structures of the six organic molecules chosen as targets for this blind test. The first four targets, two rigid molecules, one semi-flexible molecule and a 1:1 salt, matched the criteria for the targets from CSP2007, while the last two targets belonged to two new challenging categories – a larger, much more flexible molecule and a hydrate with more than one polymorph. Each group submitted three predictions for each target it attempted. There was at least one successful prediction for each target, and two groups were able to successfully predict the structure of the large flexible molecule as their first place submission. The results show that while not as many groups successfully predicted the structures of the three smallest molecules as in CSP2007, there is now evidence that methodologies such as dispersion-corrected density functional theory (DFT-D) are able to reliably do so. The results also highlight the many challenges posed by more complex systems and show that there are still issues to be overcome. PMID:22101543

  6. HomPPI: a class of sequence homology based protein-protein interface prediction methods

    PubMed Central

    2011-01-01

    Background Although homology-based methods are among the most widely used methods for predicting the structure and function of proteins, the question as to whether interface sequence conservation can be effectively exploited in predicting protein-protein interfaces has been a subject of debate. Results We studied more than 300,000 pair-wise alignments of protein sequences from structurally characterized protein complexes, including both obligate and transient complexes. We identified sequence similarity criteria required for accurate homology-based inference of interface residues in a query protein sequence. Based on these analyses, we developed HomPPI, a class of sequence homology-based methods for predicting protein-protein interface residues. We present two variants of HomPPI: (i) NPS-HomPPI (Non partner-specific HomPPI), which can be used to predict interface residues of a query protein in the absence of knowledge of the interaction partner; and (ii) PS-HomPPI (Partner-specific HomPPI), which can be used to predict the interface residues of a query protein with a specific target protein. Our experiments on a benchmark dataset of obligate homodimeric complexes show that NPS-HomPPI can reliably predict protein-protein interface residues in a given protein, with an average correlation coefficient (CC) of 0.76, sensitivity of 0.83, and specificity of 0.78, when sequence homologs of the query protein can be reliably identified. NPS-HomPPI also reliably predicts the interface residues of intrinsically disordered proteins. Our experiments suggest that NPS-HomPPI is competitive with several state-of-the-art interface prediction servers including those that exploit the structure of the query proteins. 
The partner-specific classifier, PS-HomPPI, can, on a large dataset of transient complexes, predict the interface residues of a query protein with a specific target, with a CC of 0.65, sensitivity of 0.69, and specificity of 0.70, when homologs of both the query and the target can be reliably identified. The HomPPI web server is available at http://homppi.cs.iastate.edu/. Conclusions Sequence homology-based methods offer a class of computationally efficient and reliable approaches for predicting the protein-protein interface residues that participate in either obligate or transient interactions. For query proteins involved in transient interactions, the reliability of interface residue prediction can be improved by exploiting knowledge of putative interaction partners. PMID:21682895

  7. Nuclease Target Site Selection for Maximizing On-target Activity and Minimizing Off-target Effects in Genome Editing

    PubMed Central

    Lee, Ciaran M; Cradick, Thomas J; Fine, Eli J; Bao, Gang

    2016-01-01

    The rapid advancement in targeted genome editing using engineered nucleases such as ZFNs, TALENs, and CRISPR/Cas9 systems has resulted in a suite of powerful methods that allows researchers to target any genomic locus of interest. A complementary set of design tools has been developed to aid researchers with nuclease design, target site selection, and experimental validation. Here, we review the various tools available for target selection in designing engineered nucleases, and for quantifying nuclease activity and specificity, including web-based search tools and experimental methods. We also elucidate challenges in target selection, especially in predicting off-target effects, and discuss future directions in precision genome editing and its applications. PMID:26750397

  8. Prediction of potential disease-associated microRNAs based on random walk.

    PubMed

    Xuan, Ping; Han, Ke; Guo, Yahong; Li, Jin; Li, Xia; Zhong, Yingli; Zhang, Zhaogong; Ding, Jian

    2015-06-01

    Identifying microRNAs associated with diseases (disease miRNAs) is helpful for exploring the pathogenesis of diseases. Because miRNAs fulfill function via the regulation of their target genes and because the current number of experimentally validated targets is insufficient, some existing methods have inferred potential disease miRNAs based on the predicted targets. It is difficult for these methods to achieve excellent performance due to the high false-positive and false-negative rates for the target prediction results. Alternatively, several methods have constructed a network composed of miRNAs based on their associated diseases and have exploited the information within the network to predict the disease miRNAs. However, these methods have failed to take into account the prior information regarding the network nodes and the respective local topological structures of the different categories of nodes. Therefore, it is essential to develop a method that exploits the more useful information to predict reliable disease miRNA candidates. miRNAs with similar functions are normally associated with similar diseases and vice versa. Therefore, the functional similarity between a pair of miRNAs is calculated based on their associated diseases to construct a miRNA network. We present a new prediction method based on random walk on the network. For the diseases with some known related miRNAs, the network nodes are divided into labeled nodes and unlabeled nodes, and the transition matrices are established for the two categories of nodes. Furthermore, different categories of nodes have different transition weights. In this way, the prior information of nodes can be completely exploited. Simultaneously, the various ranges of topologies around the different categories of nodes are integrated. In addition, how far the walker can go away from the labeled nodes is controlled by restarting the walking. This is helpful for relieving the negative effect of noisy data. 
For the diseases without any known related miRNAs, we extend the walking on a miRNA-disease bilayer network. During the prediction process, the similarity between diseases, the similarity between miRNAs, the known miRNA-disease associations and the topology information of the bilayer network are exploited. Moreover, the importance of information from different layers of network is considered. Our method achieves superior performance for 18 human diseases with AUC values ranging from 0.786 to 0.945. Moreover, case studies on breast neoplasms, lung neoplasms, prostatic neoplasms and 32 diseases further confirm the ability of our method to discover potential disease miRNAs. A web service for the prediction and analysis of disease miRNAs is available at http://bioinfolab.stx.hk/midp/. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
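
    The restart mechanism described in this abstract can be illustrated with a generic random walk with restart on a single similarity network. This is a minimal sketch, not the authors' method: it uses one column-normalized transition matrix rather than separate matrices and weights for labeled and unlabeled nodes, and the function name, the `restart` value of 0.3, and the toy network are all assumptions made for illustration.

```python
import numpy as np

def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for a walker that
    jumps back to the seed nodes with probability `restart` at each step."""
    adjacency = np.asarray(adjacency, dtype=float)
    col_sums = adjacency.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # isolated nodes: avoid division by zero
    transition = adjacency / col_sums      # each column sums to 1
    p0 = np.zeros(adjacency.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)     # walker starts uniformly on the seeds
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * (transition @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:  # L1 change below tolerance: converged
            return p_next
        p = p_next
    return p

# Hypothetical 4-node chain 0-1-2-3 with node 0 as the only labeled (seed) node.
chain = [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [0, 0, 1, 0]]
scores = random_walk_with_restart(chain, seeds=[0])
```

    A larger `restart` keeps the walker closer to the labeled nodes, which is the lever the abstract describes for limiting the influence of noisy, distant parts of the network.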

  9. A fragmentation and reassembly method for ab initio phasing.

    PubMed

    Shrestha, Rojan; Zhang, Kam Y J

    2015-02-01

    Ab initio phasing with de novo models has become a viable approach for structural solution from protein crystallographic diffraction data. This approach takes advantage of the known protein sequence information, predicts de novo models and uses them for structure determination by molecular replacement. However, even the current state-of-the-art de novo modelling method has a limit as to the accuracy of the model predicted, which is sometimes insufficient to be used as a template for successful molecular replacement. A fragment-assembly phasing method has been developed that starts from an ensemble of low-accuracy de novo models, disassembles them into fragments, places them independently in the crystallographic unit cell by molecular replacement and then reassembles them into a whole structure that can provide sufficient phase information to enable complete structure determination by automated model building. Tests on ten protein targets showed that the method could solve structures for eight of these targets, although the predicted de novo models cannot be used as templates for successful molecular replacement since the best model for each target is on average more than 4.0 Å away from the native structure. The method has extended the applicability of the ab initio phasing by de novo models approach. The method can be used to solve structures when the best de novo models are still of low accuracy.

  10. In silico target prediction for elucidating the mode of action of herbicides including prospective validation.

    PubMed

    Chiddarwar, Rucha K; Rohrer, Sebastian G; Wolf, Antje; Tresch, Stefan; Wollenhaupt, Sabrina; Bender, Andreas

    2017-01-01

    The rapid emergence of pesticide resistance has given rise to a demand for herbicides with new modes of action (MoA). In the agrochemical sector, with the availability of experimental high throughput screening (HTS) data, it is now possible to utilize in silico target prediction methods in the early discovery phase to suggest the MoA of a compound via data mining of bioactivity data. While having been established in the pharmaceutical context, in the agrochemical area this approach poses rather different challenges, as we have found in this work, partially due to different chemistry, but even more so due to different (usually smaller) amounts of data, and different ways of conducting HTS. With the aim to apply computational methods for facilitating herbicide target identification, 48,000 bioactivity data points against 16 herbicide targets were processed to train Laplacian modified Naïve Bayesian (NB) classification models. The herbicide target prediction model ("HerbiMod") is an ensemble of 16 binary classification models which are evaluated by internal, external and prospective validation sets. In addition to the experimental inactives, 10,000 random agrochemical inactives were included in the training process, which was shown to improve the overall balanced accuracy of our models by up to 40%. For all the models, performance in terms of balanced accuracy of ≥ 80% was achieved in five-fold cross validation. Ranking target predictions was addressed by means of z-scores which improved predictivity over using raw scores alone. An external testset of 247 compounds from ChEMBL and a prospective testset of 394 compounds from BASF SE tested against five well studied herbicide targets (ACC, ALS, HPPD, PDS and PROTOX) were used for further validation. 
Only 4% of the compounds in the external testset lay in the applicability domain, so extrapolation (and correct prediction) was impossible, which on the one hand was surprising and on the other hand illustrated the utility of using applicability domains in the first place. However, performance better than 60% in balanced accuracy was achieved on the prospective testset, where all the compounds fell within the applicability domain, and which hence underlines the possibility of using target prediction also in the area of agrochemicals. Copyright © 2016 Elsevier Inc. All rights reserved.

  11. Geometric parameter analysis to predetermine optimal radiosurgery technique for the treatment of arteriovenous malformation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mestrovic, Ante; Clark, Brenda G.; Department of Medical Physics, British Columbia Cancer Agency, Vancouver, British Columbia

    2005-11-01

    Purpose: To develop a method of predicting the values of dose distribution parameters of different radiosurgery techniques for treatment of arteriovenous malformation (AVM) based on internal geometric parameters. Methods and Materials: For each of 18 previously treated AVM patients, four treatment plans were created: circular collimator arcs, dynamic conformal arcs, fixed conformal fields, and intensity-modulated radiosurgery. An algorithm was developed to characterize the target and critical structure shape complexity and the position of the critical structures with respect to the target. Multiple regression was employed to establish the correlation between the internal geometric parameters and the dose distribution for different treatment techniques. The results from the model were applied to predict the dosimetric outcomes of different radiosurgery techniques and select the optimal radiosurgery technique for a number of AVM patients. Results: Several internal geometric parameters showing statistically significant correlation (p < 0.05) with the treatment planning results for each technique were identified. The target volume and the average minimum distance between the target and the critical structures were the most effective predictors for normal tissue dose distribution. The structure overlap volume with the target and the mean distance between the target and the critical structure were the most effective predictors for critical structure dose distribution. The predicted values of dose distribution parameters of different radiosurgery techniques were in close agreement with the original data. Conclusions: A statistical model has been described that successfully predicts the values of dose distribution parameters of different radiosurgery techniques and may be used to predetermine the optimal technique on a patient-to-patient basis.

  12. Novel Modeling of Combinatorial miRNA Targeting Identifies SNP with Potential Role in Bone Density

    PubMed Central

    Coronnello, Claudia; Hartmaier, Ryan; Arora, Arshi; Huleihel, Luai; Pandit, Kusum V.; Bais, Abha S.; Butterworth, Michael; Kaminski, Naftali; Stormo, Gary D.; Oesterreich, Steffi; Benos, Panayiotis V.

    2012-01-01

    MicroRNAs (miRNAs) are post-transcriptional regulators that bind to their target mRNAs through base complementarity. Predicting miRNA targets is a challenging task and various studies showed that existing algorithms suffer from high number of false predictions and low to moderate overlap in their predictions. Until recently, very few algorithms considered the dynamic nature of the interactions, including the effect of less specific interactions, the miRNA expression level, and the effect of combinatorial miRNA binding. Addressing these issues can result in a more accurate miRNA:mRNA modeling with many applications, including efficient miRNA-related SNP evaluation. We present a novel thermodynamic model based on the Fermi-Dirac equation that incorporates miRNA expression in the prediction of target occupancy and we show that it improves the performance of two popular single miRNA target finders. Modeling combinatorial miRNA targeting is a natural extension of this model. Two other algorithms show improved prediction efficiency when combinatorial binding models were considered. ComiR (Combinatorial miRNA targeting), a novel algorithm we developed, incorporates the improved predictions of the four target finders into a single probabilistic score using ensemble learning. Combining target scores of multiple miRNAs using ComiR improves predictions over the naïve method for target combination. ComiR scoring scheme can be used for identification of SNPs affecting miRNA binding. As proof of principle, ComiR identified rs17737058 as disruptive to the miR-488-5p:NCOA1 interaction, which we confirmed in vitro. We also found rs17737058 to be significantly associated with decreased bone mineral density (BMD) in two independent cohorts indicating that the miR-488-5p/NCOA1 regulatory axis is likely critical in maintaining BMD in women. 
With increasing availability of comprehensive high-throughput datasets from patients ComiR is expected to become an essential tool for miRNA-related studies. PMID:23284279
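
    The abstract names the Fermi-Dirac equation but does not give its parameterization. As a hedged illustration only, the sketch below uses the textbook Fermi-Dirac form for the occupancy of a single binding site, with the miRNA abundance entering through a chemical potential term. The function name, the choice of kcal/mol units, the temperature of 310 K, and the example energies are assumptions and are not taken from the ComiR paper.

```python
import math

# Gas constant in kcal/(mol*K) multiplied by an assumed temperature of 310 K.
RT_KCAL_PER_MOL = 1.987e-3 * 310.0

def site_occupancy(binding_energy, chemical_potential, rt=RT_KCAL_PER_MOL):
    """Textbook Fermi-Dirac occupancy of one site (energies in kcal/mol).

    Occupancy is 0.5 when the binding energy equals the chemical potential,
    approaches 1 for much more favorable (more negative) binding energies,
    and approaches 0 for much weaker ones.
    """
    return 1.0 / (1.0 + math.exp((binding_energy - chemical_potential) / rt))

# Hypothetical numbers: the same site at low and high miRNA abundance.
low_expression = site_occupancy(binding_energy=-18.0, chemical_potential=-20.0)
high_expression = site_occupancy(binding_energy=-18.0, chemical_potential=-16.0)
```

    In this form, raising the chemical potential (higher miRNA expression) increases the predicted occupancy of a site with a fixed binding energy, which matches the qualitative role the abstract assigns to expression level.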

  13. [Application of Kohonen Self-Organizing Feature Maps in QSAR of human ADMET and kinase data sets].

    PubMed

    Hegymegi-Barakonyi, Bálint; Orfi, László; Kéri, György; Kövesdi, István

    2013-01-01

    QSAR predictions have been proven very useful in a large number of studies for drug design, such as kinase inhibitor design as targets for cancer therapy, however the overall predictability often remains unsatisfactory. To improve predictability of ADMET features and kinase inhibitory data, we present a new method using Kohonen's Self-Organizing Feature Map (SOFM) to cluster molecules based on explanatory variables (X) and separate dissimilar ones. We calculated SOFM clusters for a large number of molecules with human ADMET and kinase inhibitory data, and we showed that chemically similar molecules were in the same SOFM cluster, and within such clusters the QSAR models had significantly better predictability. We used also target variables (Y, e.g. ADMET) jointly with X variables to create a novel type of clustering. With our method, cells of loosely coupled XY data could be identified and separated into different model building sets.

  14. Computational Methods in Drug Discovery

    PubMed Central

    Sliwoski, Gregory; Kothiwale, Sandeepkumar; Meiler, Jens

    2014-01-01

    Computer-aided drug discovery/design methods have played a major role in the development of therapeutically important small molecules for over three decades. These methods are broadly classified as either structure-based or ligand-based methods. Structure-based methods are in principle analogous to high-throughput screening in that both target and ligand structure information is imperative. Structure-based approaches include ligand docking, pharmacophore, and ligand design methods. The article discusses theory behind the most important methods and recent successful applications. Ligand-based methods use only ligand information for predicting activity depending on its similarity/dissimilarity to previously known active ligands. We review widely used ligand-based methods such as ligand-based pharmacophores, molecular descriptors, and quantitative structure-activity relationships. In addition, important tools such as target/ligand data bases, homology modeling, ligand fingerprint methods, etc., necessary for successful implementation of various computer-aided drug discovery/design methods in a drug discovery campaign are discussed. Finally, computational methods for toxicity prediction and optimization for favorable physiologic properties are discussed with successful examples from literature. PMID:24381236

  15. Biological and functional relevance of CASP predictions.

    PubMed

    Liu, Tianyun; Ish-Shalom, Shirbi; Torng, Wen; Lafita, Aleix; Bock, Christian; Mort, Matthew; Cooper, David N; Bliven, Spencer; Capitani, Guido; Mooney, Sean D; Altman, Russ B

    2018-03-01

    Our goal is to answer the question: compared with experimental structures, how useful are predicted models for functional annotation? We assessed the functional utility of predicted models by comparing the performances of a suite of methods for functional characterization on the predictions and the experimental structures. We identified 28 sites in 25 protein targets to perform functional assessment. These 28 sites included nine sites with known ligand binding (holo-sites), nine sites that are expected or suggested by experimental authors for small molecule binding (apo-sites), and ten sites containing important motifs, loops, or key residues with important disease-associated mutations. We evaluated the utility of the predictions by comparing their microenvironments to the experimental structures. Overall structural quality correlates with functional utility. However, the best-ranked predictions (global) may not have the best functional quality (local). Our assessment provides an ability to discriminate between predictions with high structural quality. When assessing ligand-binding sites, most prediction methods have higher performance on apo-sites than holo-sites. Some servers show consistently high performance for certain types of functional sites. Finally, many functional sites are associated with protein-protein interaction. We also analyzed biologically relevant features from the protein assemblies of two targets where the active site spanned the protein-protein interface. For the assembly targets, we find that the features in the models are mainly determined by the choice of template. © 2017 The Authors Proteins: Structure, Function and Bioinformatics Published by Wiley Periodicals, Inc.

  16. Novel drug target identification for the treatment of dementia using multi-relational association mining.

    PubMed

    Nguyen, Thanh-Phuong; Priami, Corrado; Caberlotto, Laura

    2015-07-08

    Dementia is a neurodegenerative condition of the brain in which there is a progressive and permanent loss of cognitive and mental performance. Despite the fact that the number of people with dementia worldwide is steadily increasing and regardless of the advances in the molecular characterization of the disease, current medical treatments for dementia are purely symptomatic and hardly effective. We present a novel multi-relational association mining method that integrates the huge amount of scientific data accumulated in recent years to predict potential novel targets for innovative therapeutic treatment of dementia. Owing to its ability to process large volumes of heterogeneous data, our method achieves high performance and predicts numerous drug targets including several serine/threonine kinases and a G-protein coupled receptor. The predicted drug targets are mainly functionally related to metabolism, cell surface receptor signaling pathways, immune response, apoptosis, and long-term memory. Among the highly represented kinase family and among the G-protein coupled receptors, DLG4 (PSD-95) and the bradykinin receptor 2 are also highlighted for their proposed role in memory and cognition, as described in previous studies. These novel putative targets hold promise for the development of novel therapeutic approaches for the treatment of dementia.

  17. Novel drug target identification for the treatment of dementia using multi-relational association mining

    PubMed Central

    Nguyen, Thanh-Phuong; Priami, Corrado; Caberlotto, Laura

    2015-01-01

    Dementia is a neurodegenerative condition of the brain in which there is a progressive and permanent loss of cognitive and mental performance. Despite the fact that the number of people with dementia worldwide is steadily increasing and regardless of the advances in the molecular characterization of the disease, current medical treatments for dementia are purely symptomatic and hardly effective. We present a novel multi-relational association mining method that integrates the huge amount of scientific data accumulated in recent years to predict potential novel targets for innovative therapeutic treatment of dementia. Owing to its ability to process large volumes of heterogeneous data, our method achieves high performance and predicts numerous drug targets including several serine/threonine kinases and a G-protein coupled receptor. The predicted drug targets are mainly functionally related to metabolism, cell surface receptor signaling pathways, immune response, apoptosis, and long-term memory. Among the highly represented kinase family and among the G-protein coupled receptors, DLG4 (PSD-95) and the bradykinin receptor 2 are also highlighted for their proposed role in memory and cognition, as described in previous studies. These novel putative targets hold promise for the development of novel therapeutic approaches for the treatment of dementia. PMID:26154857

  18. A method for measuring coherent elastic neutrino-nucleus scattering at a far off-axis high-energy neutrino beam target

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brice, S. J.; Cooper, R. L.; DeJongh, F.

    2014-04-03

    We present an experimental method for measuring the process of coherent elastic neutrino-nucleus scattering (CENNS). This method uses a detector situated transverse to a high-energy neutrino beam production target. This detector would be sensitive to the low-energy neutrinos arising from decay-at-rest pions in the target. We discuss the physics motivation for making this measurement and outline the predicted backgrounds and sensitivities using this approach. We report a measurement of neutron backgrounds as found in an off-axis surface location of the Fermilab Booster Neutrino Beam (BNB) target. The results indicate that the Fermilab BNB target is a favorable location for a CENNS experiment.

  19. Imbalanced target prediction with pattern discovery on clinical data repositories.

    PubMed

    Chan, Tak-Ming; Li, Yuxi; Chiau, Choo-Chiap; Zhu, Jane; Jiang, Jie; Huo, Yong

    2017-04-20

    Clinical data repositories (CDR) have great potential to improve outcome prediction and risk modeling. However, most clinical studies require careful study design, dedicated data collection efforts, and sophisticated modeling techniques before a hypothesis can be tested. We aim to bridge this gap, so that clinical domain users can perform first-hand prediction on existing repository data without complicated handling, and obtain insightful patterns of imbalanced targets for a formal study before it is conducted. We specifically target interpretability for domain users, so that the model can be conveniently explained and applied in clinical practice. We propose an interpretable pattern model which is noise (missing) tolerant for practice data. To address the challenge of imbalanced targets of interest in clinical research, e.g., deaths less than a few percent, the geometric mean of sensitivity and specificity (G-mean) optimization criterion is employed, with which a simple but effective heuristic algorithm is developed. We compared pattern discovery to clinically interpretable methods on two retrospective clinical datasets. They contain 14.9% deaths in 1 year in the thoracic dataset and 9.1% deaths in the cardiac dataset, respectively. In spite of the imbalance challenge shown on other methods, pattern discovery consistently shows competitive cross-validated prediction performance. Compared to logistic regression, Naïve Bayes, and decision tree, pattern discovery achieves statistically significant (p-values < 0.01, Wilcoxon signed rank test) favorable averaged testing G-means and F1-scores (harmonic mean of precision and sensitivity). Without requiring sophisticated technical processing of data and tweaking, the prediction performance of pattern discovery is consistently comparable to the best achievable performance. Pattern discovery has been demonstrated to be robust and valuable for target prediction on existing clinical data repositories with imbalance and noise. 
The prediction results and interpretable patterns can provide insights in an agile and inexpensive way for the potential formal studies.
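
    The two evaluation criteria named in this abstract, the G-mean (geometric mean of sensitivity and specificity) and the F1-score (harmonic mean of precision and sensitivity), can be computed directly from a confusion matrix. The sketch below follows those definitions; the function name and the confusion-matrix counts are made up for illustration and are not results from the paper.

```python
import math

def gmean_and_f1(tp, fp, tn, fn):
    """Return (G-mean, F1) from confusion-matrix counts.

    Any ratio with a zero denominator is treated as 0.0.
    """
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    g_mean = math.sqrt(sensitivity * specificity)
    if precision + sensitivity:
        f1 = 2 * precision * sensitivity / (precision + sensitivity)
    else:
        f1 = 0.0
    return g_mean, f1

# A classifier that always predicts the majority class (no deaths):
# accuracy would be 91%, but both criteria drop to zero.
majority = gmean_and_f1(tp=0, fp=0, tn=91, fn=9)

# A hypothetical classifier that finds 8 of 10 positives at 80% specificity.
balanced = gmean_and_f1(tp=8, fp=18, tn=72, fn=2)
```

    The first case shows why the G-mean suits imbalanced targets: a model that ignores the minority class scores zero regardless of how high its raw accuracy is.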

  20. Economic evaluation of targeted cancer interventions: critical review and recommendations.

    PubMed

    Elkin, Elena B; Marshall, Deborah A; Kulin, Nathalie A; Ferrusi, Ilia L; Hassett, Michael J; Ladabaum, Uri; Phillips, Kathryn A

    2011-10-01

    Scientific advances have improved our ability to target cancer interventions to individuals who will benefit most and spare the risks and costs to those who will derive little benefit or even be harmed. Several approaches are currently used for targeting interventions for cancer risk reduction, screening, and treatment, including risk prediction algorithms for identifying high-risk subgroups and diagnostic tests for tumor markers and germline genetic mutations. Economic evaluation can inform decisions about the use of targeted interventions, which may be more costly than traditional strategies. However, assessing the impact of a targeted intervention on costs and health outcomes requires explicit consideration of the method of targeting. In this study, we describe the importance of this principle by reviewing published cost-effectiveness analyses of targeted interventions in breast cancer. Few studies we identified explicitly evaluated the relationships among the method of targeting, the accuracy of the targeting test, and outcomes of the targeted intervention. Those that did found that characteristics of targeting tests had a substantial impact on outcomes. We posit that the method of targeting and the outcomes of a targeted intervention are inextricably linked and recommend that cost-effectiveness analyses of targeted interventions explicitly consider costs and outcomes of the method of targeting.

  1. PatchSurfers: Two methods for local molecular property-based binding ligand prediction.

    PubMed

    Shin, Woong-Hee; Bures, Mark Gregory; Kihara, Daisuke

    2016-01-15

    Protein function prediction is an active area of research in computational biology. Function prediction can help biologists make hypotheses for characterization of genes and help interpret biological assays, and thus is a productive area for collaboration between experimental and computational biologists. Among various function prediction methods, predicting binding ligand molecules for a target protein is an important class because ligand binding events for a protein are usually closely intertwined with the protein's biological function, and also because predicted binding ligands can often be directly tested by biochemical assays. Binding ligand prediction methods can be classified into two types: those which are based on protein-protein (or pocket-pocket) comparison, and those that compare a target pocket directly to ligands. Recently, our group proposed two computational binding ligand prediction methods, Patch-Surfer, which is a pocket-pocket comparison method, and PL-PatchSurfer, which compares a pocket to ligand molecules. The two programs apply surface patch-based descriptions to calculate similarity or complementarity between molecules. A surface patch is characterized by physicochemical properties such as shape, hydrophobicity, and electrostatic potentials. These properties on the surface are represented using three-dimensional Zernike descriptors (3DZD), which are based on a series expansion of a three-dimensional function. Utilizing 3DZD for describing the physicochemical properties has two main advantages: (1) rotational invariance and (2) fast comparison. Here, we introduce Patch-Surfer and PL-PatchSurfer with an emphasis on PL-PatchSurfer, which is more recently developed. Illustrative examples of PL-PatchSurfer performance on binding ligand prediction as well as virtual drug screening are also provided. Copyright © 2015 Elsevier Inc. All rights reserved.

  2. Predicting Heart Rate at the Ventilatory Threshold for Aerobic Exercise Prescription in Persons With Chronic Stroke.

    PubMed

    Boyne, Pierce; Buhr, Sarah; Rockwell, Bradley; Khoury, Jane; Carl, Daniel; Gerson, Myron; Kissela, Brett; Dunning, Kari

    2015-10-01

    Treadmill aerobic exercise improves gait, aerobic capacity, and cardiovascular health after stroke, but a lack of specificity in current guidelines could lead to underdosing or overdosing of aerobic intensity. The ventilatory threshold (VT) has been recommended as an optimal, specific starting point for continuous aerobic exercise. However, VT measurement is not available in clinical stroke settings. Therefore, the purpose of this study was to identify an accurate method to predict heart rate at the VT (HRVT) for use as a surrogate for VT. A cross-sectional design was employed. Using symptom-limited graded exercise test (GXT) data from 17 subjects more than 6 months poststroke, prediction methods for HRVT were derived by traditional target HR calculations (percentage of HRpeak achieved during GXT, percentage of peak HR reserve [HRRpeak], percentage of age-predicted maximal HR, and percentage of age-predicted maximal HR reserve) and by regression analysis. The validity of the prediction methods was then tested among 8 additional subjects. All prediction methods were validated by the second sample, so data were pooled to calculate refined prediction equations. HRVT was accurately predicted by 80% HRpeak (R, 0.62; standard deviation of error [SDerror], 7 bpm), 62% HRRpeak (R, 0.66; SDerror, 7 bpm), and regression models that included HRpeak (R, 0.62-0.75; SDerror, 5-6 bpm). Derived regression equations, 80% HRpeak and 62% HRRpeak, provide a specific target intensity for initial aerobic exercise prescription that should minimize underdosing and overdosing for persons with chronic stroke. The specificity of these methods may lead to more efficient and effective treatment for poststroke deconditioning. Video Abstract available for more insights from the authors (see Supplemental Digital Content 1, http://links.lww.com/JNPT/A114).
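
    The two percentage-based targets reported above can be written out as arithmetic. The abstract does not define heart rate reserve, so this sketch assumes the conventional definition (peak heart rate minus resting heart rate, with the resting rate added back to obtain the target). The function names and the example heart rates are hypothetical, and the regression equations mentioned in the abstract are not reproduced because their coefficients are not given.

```python
def hrvt_from_percent_peak(hr_peak, fraction=0.80):
    """Estimate HRVT as a fraction of the peak heart rate from the GXT (bpm)."""
    return fraction * hr_peak

def hrvt_from_percent_reserve(hr_peak, hr_rest, fraction=0.62):
    """Estimate HRVT as a fraction of peak heart rate reserve (bpm).

    Assumes reserve = hr_peak - hr_rest, with hr_rest added back.
    """
    return hr_rest + fraction * (hr_peak - hr_rest)

# Hypothetical subject: GXT peak of 150 bpm, resting heart rate of 70 bpm.
by_peak = hrvt_from_percent_peak(150)            # 0.80 * 150
by_reserve = hrvt_from_percent_reserve(150, 70)  # 70 + 0.62 * 80
```

    For this hypothetical subject the two estimates differ by well under the 7 bpm standard deviation of error reported for each method.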

  3. Location Prediction Based on Transition Probability Matrices Constructing from Sequential Rules for Spatial-Temporal K-Anonymity Dataset

    PubMed Central

    Liu, Zhao; Zhu, Yunhong; Wu, Chenxue

    2016-01-01

    Spatial-temporal k-anonymity has become a mainstream approach among techniques for protection of users’ privacy in location-based services (LBS) applications, and has been applied to several variants such as LBS snapshot queries and continuous queries. Analyzing large-scale spatial-temporal anonymity sets may benefit several LBS applications. In this paper, we propose two location prediction methods based on transition probability matrices constructed from sequential rules for spatial-temporal k-anonymity datasets. First, we define single-step sequential rules mined from sequential spatial-temporal k-anonymity datasets generated from continuous LBS queries for multiple users. We then construct transition probability matrices from the mined single-step sequential rules, and normalize the transition probabilities in the transition matrices. Next, we regard a mobility model for an LBS requester as a stationary stochastic process and compute the n-step transition probability matrices by raising the normalized transition probability matrices to the power n. Furthermore, we propose two location prediction methods: rough prediction and accurate prediction. The former obtains the probabilities of arriving at target locations along simple paths that include only current locations, target locations and transition steps. By iteratively combining the probabilities for simple paths with n steps and the probabilities for detailed paths with n-1 steps, the latter method calculates transition probabilities for detailed paths with n steps from current locations to target locations. Finally, we conduct extensive experiments, which verify the correctness and flexibility of our proposed algorithm. PMID:27508502
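
    The normalization and n-step computation described above can be sketched with NumPy. This illustrates only the generic Markov-chain step (row-normalize, then raise to the power n). How the authors mine sequential rules and turn them into counts is not shown, and the count matrix, the three locations, and the function names below are invented for the example.

```python
import numpy as np

def normalize_rows(counts):
    """Turn a matrix of transition counts (or rule supports) into
    row-stochastic transition probabilities."""
    counts = np.asarray(counts, dtype=float)
    row_sums = counts.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0   # leave all-zero rows as zeros
    return counts / row_sums

def n_step_matrix(transition, n):
    """Entry (i, j) is the probability of being at location j
    exactly n steps after leaving location i."""
    return np.linalg.matrix_power(transition, n)

# Hypothetical counts of single-step moves between three anonymized regions.
counts = [[0, 2, 2],
          [1, 0, 3],
          [4, 0, 0]]
one_step = normalize_rows(counts)
two_step = n_step_matrix(one_step, 2)
# two_step[0, 0] is the probability of returning to region 0 after two steps.
```

    This corresponds to what the abstract calls rough prediction, where only the start, the target and the number of steps matter. The accurate prediction method, which tracks the intermediate locations of each path, is not covered by this sketch.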

  4. Protein structure modeling and refinement by global optimization in CASP12.

    PubMed

    Hong, Seung Hwan; Joung, InSuk; Flores-Canales, Jose C; Manavalan, Balachandran; Cheng, Qianyi; Heo, Seungryong; Kim, Jong Yun; Lee, Sun Young; Nam, Mikyung; Joo, Keehyoung; Lee, In-Ho; Lee, Sung Jong; Lee, Jooyoung

    2018-03-01

    For protein structure modeling in the CASP12 experiment, we have developed a new protocol based on our previous CASP11 approach. The global optimization method of conformational space annealing (CSA) was applied to 3 stages of modeling: multiple sequence-structure alignment, three-dimensional (3D) chain building, and side-chain re-modeling. For better template selection and model selection, we updated our model quality assessment (QA) method with the newly developed SVMQA (support vector machine for quality assessment). For 3D chain building, we updated our energy function by including restraints generated from predicted residue-residue contacts. New energy terms for the predicted secondary structure and predicted solvent accessible surface area were also introduced. For difficult targets, we proposed a new method, LEEab, where the template term played a less significant role than it did in LEE, complemented by increased contributions from other terms such as the predicted contact term. For TBM (template-based modeling) targets, LEE performed better than LEEab, but for FM targets, LEEab was better. For model refinement, we modified our CASP11 molecular dynamics (MD) based protocol by using explicit solvents and tuning down restraint weights. Refinement results from MD simulations that used a new augmented statistical energy term in the force field were quite promising. Finally, when using inaccurate information (such as the predicted contacts), it was important to use the Lorentzian function for which the maximal penalty arising from wrong information is always bounded. © 2017 Wiley Periodicals, Inc.
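
    The closing remark about bounded penalties can be made concrete with a small Python comparison. The abstract does not give the functional form used in the protocol, so the Lorentzian-shaped term below, its `width` parameter, and the harmonic term it is compared with are assumptions chosen only to show why such a penalty saturates.

```python
def harmonic_penalty(violation):
    """Quadratic restraint: grows without bound as the violation grows."""
    return violation ** 2


def lorentzian_penalty(violation, width=1.0):
    """Lorentzian-shaped restraint: approaches 1 but never exceeds it.

    Assumption: this particular form and the width parameter are illustrative.
    """
    return violation ** 2 / (violation ** 2 + width ** 2)


# Compare the two penalties for small and very large restraint violations.
for violation in (0.5, 2.0, 20.0):
    print(violation, harmonic_penalty(violation), lorentzian_penalty(violation))
```

    With a wrongly predicted contact, the violation can be large however good the model is. The harmonic term then dominates the energy, whereas the bounded term contributes at most 1.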

  5. Sphinx: merging knowledge-based and ab initio approaches to improve protein loop prediction

    PubMed Central

    Marks, Claire; Nowak, Jaroslaw; Klostermann, Stefan; Georges, Guy; Dunbar, James; Shi, Jiye; Kelm, Sebastian

    2017-01-01

    Abstract Motivation: Loops are often vital for protein function, however, their irregular structures make them difficult to model accurately. Current loop modelling algorithms can mostly be divided into two categories: knowledge-based, where databases of fragments are searched to find suitable conformations and ab initio, where conformations are generated computationally. Existing knowledge-based methods only use fragments that are the same length as the target, even though loops of slightly different lengths may adopt similar conformations. Here, we present a novel method, Sphinx, which combines ab initio techniques with the potential extra structural information contained within loops of a different length to improve structure prediction. Results: We show that Sphinx is able to generate high-accuracy predictions and decoy sets enriched with near-native loop conformations, performing better than the ab initio algorithm on which it is based. In addition, it is able to provide predictions for every target, unlike some knowledge-based methods. Sphinx can be used successfully for the difficult problem of antibody H3 prediction, outperforming RosettaAntibody, one of the leading H3-specific ab initio methods, both in accuracy and speed. Availability and Implementation: Sphinx is available at http://opig.stats.ox.ac.uk/webapps/sphinx. Contact: deane@stats.ox.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28453681

  6. Sphinx: merging knowledge-based and ab initio approaches to improve protein loop prediction.

    PubMed

    Marks, Claire; Nowak, Jaroslaw; Klostermann, Stefan; Georges, Guy; Dunbar, James; Shi, Jiye; Kelm, Sebastian; Deane, Charlotte M

    2017-05-01

    Loops are often vital for protein function, however, their irregular structures make them difficult to model accurately. Current loop modelling algorithms can mostly be divided into two categories: knowledge-based, where databases of fragments are searched to find suitable conformations and ab initio, where conformations are generated computationally. Existing knowledge-based methods only use fragments that are the same length as the target, even though loops of slightly different lengths may adopt similar conformations. Here, we present a novel method, Sphinx, which combines ab initio techniques with the potential extra structural information contained within loops of a different length to improve structure prediction. We show that Sphinx is able to generate high-accuracy predictions and decoy sets enriched with near-native loop conformations, performing better than the ab initio algorithm on which it is based. In addition, it is able to provide predictions for every target, unlike some knowledge-based methods. Sphinx can be used successfully for the difficult problem of antibody H3 prediction, outperforming RosettaAntibody, one of the leading H3-specific ab initio methods, both in accuracy and speed. Sphinx is available at http://opig.stats.ox.ac.uk/webapps/sphinx. Contact: deane@stats.ox.ac.uk. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.

  7. Docking and scoring protein complexes: CAPRI 3rd Edition.

    PubMed

    Lensink, Marc F; Méndez, Raúl; Wodak, Shoshana J

    2007-12-01

    The performance of methods for predicting protein-protein interactions at the atomic scale is assessed by evaluating blind predictions performed during 2005-2007 as part of Rounds 6-12 of the community-wide experiment on Critical Assessment of PRedicted Interactions (CAPRI). These Rounds also included a new scoring experiment, where a larger set of models contributed by the predictors was made available to groups developing scoring functions. These groups scored the uploaded set and submitted their own best models for assessment. The structures of nine protein complexes including one homodimer were used as targets. These targets represent biologically relevant interactions involved in gene expression, signal transduction, RNA, or protein processing and membrane maintenance. For all the targets except one, predictions started from the experimentally determined structures of the free (unbound) components or from models derived by homology, making it mandatory for docking methods to model the conformational changes that often accompany association. In total, 63 groups and eight automatic servers, a substantial increase from previous years, submitted docking predictions, of which 1994 were evaluated here. Fifteen groups submitted 305 models for five targets in the scoring experiment. Assessment of the predictions reveals that 31 different groups produced models of acceptable and medium accuracy-but only one high accuracy submission-for all the targets, except the homodimer. In the latter, none of the docking procedures reproduced the large conformational adjustment required for correct assembly, underscoring yet again that handling protein flexibility remains a major challenge. In the scoring experiment, a large fraction of the groups attained the set goal of singling out the correct association modes from incorrect solutions in the limited ensembles of contributed models. But in general they seemed unable to identify the best models, indicating that current scoring methods are probably not sensitive enough. With the increased focus on protein assemblies, in particular by structural genomics efforts, the growing community of CAPRI predictors is engaged more actively than ever in the development of better scoring functions and means of modeling conformational flexibility, which hold promise for much progress in the future. (c) 2007 Wiley-Liss, Inc.

  8. The clustering-based case-based reasoning for imbalanced business failure prediction: a hybrid approach through integrating unsupervised process with supervised process

    NASA Astrophysics Data System (ADS)

    Li, Hui; Yu, Jun-Ling; Yu, Le-An; Sun, Jie

    2014-05-01

    Case-based reasoning (CBR) is one of the main forecasting methods in business forecasting, which performs well in prediction and holds the ability of giving explanations for the results. In business failure prediction (BFP), the number of failed enterprises is relatively small, compared with the number of non-failed ones. However, the loss is huge when an enterprise fails. Therefore, it is necessary to develop methods (trained on imbalanced samples) which forecast well for this small proportion of failed enterprises and performs accurately on total accuracy meanwhile. Commonly used methods constructed on the assumption of balanced samples do not perform well in predicting minority samples on imbalanced samples consisting of the minority/failed enterprises and the majority/non-failed ones. This article develops a new method called clustering-based CBR (CBCBR), which integrates clustering analysis, an unsupervised process, with CBR, a supervised process, to enhance the efficiency of retrieving information from both minority and majority in CBR. In CBCBR, various case classes are firstly generated through hierarchical clustering inside stored experienced cases, and class centres are calculated out by integrating cases information in the same clustered class. When predicting the label of a target case, its nearest clustered case class is firstly retrieved by ranking similarities between the target case and each clustered case class centre. Then, nearest neighbours of the target case in the determined clustered case class are retrieved. Finally, labels of the nearest experienced cases are used in prediction. In the empirical experiment with two imbalanced samples from China, the performance of CBCBR was compared with the classical CBR, a support vector machine, a logistic regression and a multi-variant discriminate analysis. The results show that compared with the other four methods, CBCBR performed significantly better in terms of sensitivity for identifying the minority samples and generated high total accuracy meanwhile. The proposed approach makes CBR useful in imbalanced forecasting.
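
    The two-stage retrieval described in this abstract (nearest cluster centre first, then nearest neighbours inside that cluster) might look roughly like the Python sketch below. Several things are assumptions made for illustration: the clusters are supplied ready-made instead of being produced by hierarchical clustering, similarity is taken to be Euclidean distance, labels are combined by majority vote, and the feature values are invented.

```python
import math
from collections import Counter


def centre(cases):
    """Mean feature vector of a cluster of (features, label) cases."""
    dims = len(cases[0][0])
    return [sum(features[d] for features, _ in cases) / len(cases)
            for d in range(dims)]


def predict(target, clusters, k=3):
    """Predict a label for `target` using cluster-then-neighbour retrieval."""
    # Step 1: retrieve the cluster whose centre is closest to the target case.
    nearest = min(clusters, key=lambda cases: math.dist(target, centre(cases)))
    # Step 2: retrieve the k nearest stored cases inside that cluster only.
    neighbours = sorted(nearest, key=lambda case: math.dist(target, case[0]))[:k]
    # Step 3: combine the neighbours' labels (assumption: simple majority vote).
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical stored cases with two financial ratios each, already clustered.
clusters = [
    [([0.9, 0.8], "non-failed"), ([1.0, 0.7], "non-failed"), ([0.8, 0.9], "non-failed")],
    [([0.1, 0.2], "failed"), ([0.2, 0.1], "failed"), ([0.3, 0.3], "non-failed")],
]
print(predict([0.15, 0.15], clusters))
```

    Restricting the neighbour search to one cluster is what keeps the many non-failed cases elsewhere in the case base from outvoting the few failed ones near the target.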

  9. SU-G-JeP3-09: Tumor Location Prediction Using Natural Respiratory Volume for Respiratory Gated Radiation Therapy (RGRT): System Verification Study

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kim, M; Jung, J; Yoon, D

    Purpose: Respiratory gated radiation therapy (RGRT) gives accurate results when a patient’s breathing is stable and regular. Thus, the patient should be fully aware during respiratory pattern training before undergoing the RGRT treatment. In order to bypass the process of respiratory pattern training, we propose a target location prediction system for RGRT that uses only natural respiratory volume, and confirm its application. Methods: In order to verify the proposed target location prediction system, an in-house phantom set was used. This set involves a chest phantom including target, external markers, and motion generator. Natural respiratory volume signals were generated using the random function in MATLAB code. In the chest phantom, the target takes a linear motion based on the respiratory signal. After a four-dimensional computed tomography (4DCT) scan of the in-house phantom, the motion trajectory was derived as a linear equation. The accuracy of the linear equation was compared with that of the motion algorithm used by the operating motion generator. In addition, we attempted target location prediction using random respiratory volume values. Results: The correspondence rate of the linear equation derived from the 4DCT images with the motion algorithm of the motion generator was 99.41%. In addition, the average error rate of target location prediction was 1.23% for 26 cases. Conclusion: We confirmed the applicability of our proposed target location prediction system for RGRT using natural respiratory volume. If additional clinical studies can be conducted, a more accurate prediction system can be realized without requiring respiratory pattern training.

  10. RaptorX-Angle: real-value prediction of protein backbone dihedral angles through a hybrid method of clustering and deep learning.

    PubMed

    Gao, Yujuan; Wang, Sheng; Deng, Minghua; Xu, Jinbo

    2018-05-08

    Protein dihedral angles provide a detailed description of protein local conformation. Predicted dihedral angles can be used to narrow down the conformational space of the whole polypeptide chain significantly, thus aiding protein tertiary structure prediction. However, direct angle prediction from sequence alone is challenging. In this article, we present a novel method (named RaptorX-Angle) to predict real-valued angles by combining clustering and deep learning. Tested on a subset of PDB25 and the targets in the latest two Critical Assessment of protein Structure Prediction (CASP), our method outperforms the existing state-of-art method SPIDER2 in terms of Pearson Correlation Coefficient (PCC) and Mean Absolute Error (MAE). Our result also shows approximately linear relationship between the real prediction errors and our estimated bounds. That is, the real prediction error can be well approximated by our estimated bounds. Our study provides an alternative and more accurate prediction of dihedral angles, which may facilitate protein structure prediction and functional study.
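
    Because dihedral angles are periodic, a mean absolute error over angles is normally computed on wrapped differences, so that 179° and -179° count as 2° apart and not 358°. The Python sketch below shows that convention. It is an assumption about how such an MAE would typically be computed, not a description of the RaptorX-Angle evaluation code, and the angle values are invented.

```python
def angular_error(predicted, actual):
    """Smallest absolute difference between two angles given in degrees."""
    diff = abs(predicted - actual) % 360.0
    return min(diff, 360.0 - diff)


def mean_absolute_error(predicted, actual):
    """MAE over paired angle lists, using wrapped differences."""
    errors = [angular_error(p, a) for p, a in zip(predicted, actual)]
    return sum(errors) / len(errors)


# Hypothetical predicted and observed phi angles for two residues.
predicted_phi = [179.0, -60.0]
observed_phi = [-179.0, -50.0]
mae = mean_absolute_error(predicted_phi, observed_phi)
print(mae)
```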

  11. Application of Visual Attention in Seismic Attribute Analysis

    NASA Astrophysics Data System (ADS)

    He, M.; Gu, H.; Wang, F.

    2016-12-01

    It has been shown that seismic attributes can be used to predict reservoirs. Combining multi-attribute analysis with geological statistics, data mining, and artificial intelligence has further promoted the development of seismic attribute analysis. However, existing methods tend to have multiple solutions and insufficient generalization ability, which is mainly due to the complex relationship between seismic data and geological information, and undoubtedly owing partly to the methods applied. Visual attention is a mechanism model of the human visual system that can concentrate on a few significant visual objects rapidly, even in a mixed scene. The model has a good ability for target detection and recognition. In our study, the targets to be predicted are treated as visual objects, and an object representation based on well data is made in the attribute dimensions. Then, in the same attribute space, the representation serves as a criterion to search for potential targets outside the wells. This method does not need to predict properties by building a complicated relation between attributes and reservoir properties; instead, it refers to the standard determined beforehand. It therefore has good generalization ability, and the problem of multiple solutions can be reduced by defining a similarity threshold.

  12. Identification of Histone Deacetylase (HDAC) as a drug target against MRSA via interolog method of protein-protein interaction prediction.

    PubMed

    Uddin, Reaz; Tariq, Syeda Sumayya; Azam, Syed Sikander; Wadood, Abdul; Moin, Syed Tarique

    2017-08-30

    Protein-Protein Interactions (PPIs) lie at the core of significant biological functions and form the foundation of host-pathogen relationships. Hence, the current study aims to use computational biology techniques to predict host-pathogen Protein-Protein Interactions (HP-PPIs) between MRSA and Humans as potential drug targets, ultimately proposing new possible inhibitors against them. This study is based on the Interolog method, which implies that homologous proteins retain their ability to interact. A distant homolog approach based on the Interolog method was employed to infer MRSA protein homologs in Humans using PSI-BLAST. In addition, the protein interaction partners of these homologs as listed in the Database of Interacting Proteins (DIP) were predicted to interact with MRSA as well. Moreover, a direct approach using BLAST was also applied to gain further confidence in the strategy. Consequently, the common HP-PPIs predicted by both approaches are suggested as potential drug targets (22%), whereas the unique HP-PPIs estimated only through the distant homolog approach are presented as novel drug targets (12%). Furthermore, the most repeated entry in our results was found to be MRSA Histone Deacetylase (HDAC), which was then modeled using SWISS-MODEL. Finally, small molecules from ZINC, selected randomly, were docked against HDAC using AutoDock and are suggested as potential binders (inhibitors) based on their energetic profiles. Thus the current study provides a basis for further in-depth analysis of such data, covering not only MRSA but other deadly pathogens as well. Copyright © 2017 Elsevier B.V. All rights reserved.
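
    The interolog step in this abstract, together with the split into "common" and "unique" predictions, can be sketched with plain Python sets. All protein identifiers and mappings below are invented placeholders. In the study the homolog mappings come from PSI-BLAST and BLAST searches and the host interactions from DIP, none of which is reproduced here.

```python
def predict_interologs(homologs, host_interactions):
    """Predict (pathogen protein, host protein) pairs by interolog transfer.

    homologs: pathogen protein -> list of host homologs.
    host_interactions: host protein -> list of its known host partners.
    """
    predicted = set()
    for pathogen_protein, host_homologs in homologs.items():
        for homolog in host_homologs:
            # Partners of the host homolog are assumed to bind the pathogen protein.
            for partner in host_interactions.get(homolog, ()):
                predicted.add((pathogen_protein, partner))
    return predicted


# Hypothetical inputs: distant-homolog hits, direct hits, and host PPIs.
distant_homologs = {"mrsa_P1": ["human_H1", "human_H2"], "mrsa_P2": ["human_H3"]}
direct_homologs = {"mrsa_P1": ["human_H1"]}
host_ppis = {"human_H1": ["human_X"], "human_H2": ["human_Y"], "human_H3": ["human_Z"]}

distant = predict_interologs(distant_homologs, host_ppis)
direct = predict_interologs(direct_homologs, host_ppis)
common = distant & direct          # supported by both approaches
unique_distant = distant - direct  # found only by the distant-homolog approach
print(sorted(common), sorted(unique_distant))
```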

  13. Perturbation Theory/Machine Learning Model of ChEMBL Data for Dopamine Targets: Docking, Synthesis, and Assay of New l-Prolyl-l-leucyl-glycinamide Peptidomimetics.

    PubMed

    Ferreira da Costa, Joana; Silva, David; Caamaño, Olga; Brea, José M; Loza, Maria Isabel; Munteanu, Cristian R; Pazos, Alejandro; García-Mera, Xerardo; González-Díaz, Humbert

    2018-06-25

    Predicting drug-protein interactions (DPIs) for target proteins involved in dopamine pathways is a very important goal in medicinal chemistry. We can tackle this problem using Molecular Docking or Machine Learning (ML) models for one specific protein. Unfortunately, these models fail to account for large and complex big data sets of preclinical assays reported in public databases. This includes multiple conditions of assays, such as different experimental parameters, biological assays, target proteins, cell lines, organism of the target, or organism of assay. On the other hand, perturbation theory (PT) models allow us to predict the properties of a query compound or molecular system in experimental assays with multiple boundary conditions based on a previously known case of reference. In this work, we report the first PTML (PT + ML) study of a large ChEMBL data set of preclinical assays of compounds targeting dopamine pathway proteins. The best PTML model found predicts 50000 cases with accuracy of 70-91% in training and external validation series. We also compared the linear PTML model with alternative PTML models trained with multiple nonlinear methods (artificial neural network (ANN), Random Forest, Deep Learning, etc.). Some of the nonlinear methods outperform the linear model but at the cost of a notable increment of the complexity of the model. We illustrated the practical use of the new model with a proof-of-concept theoretical-experimental study. We reported for the first time the organic synthesis, chemical characterization, and pharmacological assay of a new series of l-prolyl-l-leucyl-glycinamide (PLG) peptidomimetic compounds. In addition, we performed a molecular docking study for some of these compounds with the software AutoDock Vina. The work ends with a PTML model predictive study of the outcomes of the new compounds in a large number of assays. Therefore, this study offers a new computational methodology for predicting the outcome for any compound in new assays. This PTML method focuses on the prediction with a simple linear model of multiple pharmacological parameters (IC50, EC50, Ki, etc.) for compounds in assays involving different cell lines used, organisms of the protein target, or organism of assay for proteins in the dopamine pathway.

  14. Relative binding affinity prediction of farnesoid X receptor in the D3R Grand Challenge 2 using FEP.

    PubMed

    Schindler, Christina; Rippmann, Friedrich; Kuhn, Daniel

    2018-01-01

    Physics-based free energy simulations have increasingly become an important tool for predicting binding affinity and the recent introduction of automated protocols has also paved the way towards a more widespread use in the pharmaceutical industry. The D3R 2016 Grand Challenge 2 provided an opportunity to blindly test the commercial free energy calculation protocol FEP+ and assess its performance relative to other affinity prediction methods. The present D3R free energy prediction challenge was built around two experimental data sets involving inhibitors of farnesoid X receptor (FXR) which is a promising anticancer drug target. The FXR binding site is predominantly hydrophobic with few conserved interaction motifs and strong induced fit effects making it a challenging target for molecular modeling and drug design. For both data sets, we achieved reasonable prediction accuracy (RMSD ≈ 1.4 kcal/mol, rank 3-4 according to RMSD out of 20 submissions) comparable to that of state-of-the-art methods in the field. Our D3R results boosted our confidence in the method and strengthen our desire to expand its applications in future in-house drug design projects.

  15. Relative binding affinity prediction of farnesoid X receptor in the D3R Grand Challenge 2 using FEP+

    NASA Astrophysics Data System (ADS)

    Schindler, Christina; Rippmann, Friedrich; Kuhn, Daniel

    2018-01-01

    Physics-based free energy simulations have increasingly become an important tool for predicting binding affinity and the recent introduction of automated protocols has also paved the way towards a more widespread use in the pharmaceutical industry. The D3R 2016 Grand Challenge 2 provided an opportunity to blindly test the commercial free energy calculation protocol FEP+ and assess its performance relative to other affinity prediction methods. The present D3R free energy prediction challenge was built around two experimental data sets involving inhibitors of farnesoid X receptor (FXR) which is a promising anticancer drug target. The FXR binding site is predominantly hydrophobic with few conserved interaction motifs and strong induced fit effects making it a challenging target for molecular modeling and drug design. For both data sets, we achieved reasonable prediction accuracy (RMSD ≈ 1.4 kcal/mol, rank 3-4 according to RMSD out of 20 submissions) comparable to that of state-of-the-art methods in the field. Our D3R results boosted our confidence in the method and strengthen our desire to expand its applications in future in-house drug design projects.

  16. Progress Toward Efficient Laminar Flow Analysis and Design

    NASA Technical Reports Server (NTRS)

    Campbell, Richard L.; Campbell, Matthew L.; Streit, Thomas

    2011-01-01

    A multi-fidelity system of computer codes for the analysis and design of vehicles having extensive areas of laminar flow is under development at the NASA Langley Research Center. The overall approach consists of the loose coupling of a flow solver, a transition prediction method and a design module using shell scripts, along with interface modules to prepare the input for each method. This approach allows the user to select the flow solver and transition prediction module, as well as run mode for each code, based on the fidelity most compatible with the problem and available resources. The design module can be any method that designs to a specified target pressure distribution. In addition to the interface modules, two new components have been developed: 1) an efficient, empirical transition prediction module (MATTC) that provides n-factor growth distributions without requiring boundary layer information; and 2) an automated target pressure generation code (ATPG) that develops a target pressure distribution that meets a variety of flow and geometry constraints. The ATPG code also includes empirical estimates of several drag components to allow the optimization of the target pressure distribution. The current system has been developed for the design of subsonic and transonic airfoils and wings, but may be extendable to other speed ranges and components. Several analysis and design examples are included to demonstrate the current capabilities of the system.

  17. Multimodal manifold-regularized transfer learning for MCI conversion prediction.

    PubMed

    Cheng, Bo; Liu, Mingxia; Suk, Heung-Il; Shen, Dinggang; Zhang, Daoqiang

    2015-12-01

    As the early stage of Alzheimer's disease (AD), mild cognitive impairment (MCI) has high chance to convert to AD. Effective prediction of such conversion from MCI to AD is of great importance for early diagnosis of AD and also for evaluating AD risk pre-symptomatically. Unlike most previous methods that used only the samples from a target domain to train a classifier, in this paper, we propose a novel multimodal manifold-regularized transfer learning (M2TL) method that jointly utilizes samples from another domain (e.g., AD vs. normal controls (NC)) as well as unlabeled samples to boost the performance of the MCI conversion prediction. Specifically, the proposed M2TL method includes two key components. The first one is a kernel-based maximum mean discrepancy criterion, which helps eliminate the potential negative effect induced by the distributional difference between the auxiliary domain (i.e., AD and NC) and the target domain (i.e., MCI converters (MCI-C) and MCI non-converters (MCI-NC)). The second one is a semi-supervised multimodal manifold-regularized least squares classification method, where the target-domain samples, the auxiliary-domain samples, and the unlabeled samples can be jointly used for training our classifier. Furthermore, with the integration of a group sparsity constraint into our objective function, the proposed M2TL has a capability of selecting the informative samples to build a robust classifier. Experimental results on the Alzheimer's Disease Neuroimaging Initiative (ADNI) database validate the effectiveness of the proposed method by significantly improving the classification accuracy of 80.1 % for MCI conversion prediction, and also outperforming the state-of-the-art methods.
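
    The kernel-based maximum mean discrepancy (MMD) mentioned above measures how far apart two sample distributions are in a kernel feature space. The Python sketch below computes the standard biased estimate of the squared MMD with a Gaussian (RBF) kernel. It illustrates only the discrepancy measure, not the M2TL objective, and the kernel choice, `gamma` value, and toy feature vectors are assumptions.

```python
import math


def rbf(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def mean_kernel(xs, ys, gamma):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(rbf(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(source, target, gamma=1.0):
    """Biased estimate of the squared MMD between two samples."""
    return (mean_kernel(source, source, gamma)
            + mean_kernel(target, target, gamma)
            - 2.0 * mean_kernel(source, target, gamma))


# Hypothetical two-dimensional features from an auxiliary and a target domain.
auxiliary = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3]]
shifted_target = [[1.0, 1.1], [1.2, 0.9], [0.9, 1.3]]
same = mmd_squared(auxiliary, auxiliary)
different = mmd_squared(auxiliary, shifted_target)
print(same, different)
```

    A value near zero indicates that the two samples look alike under the chosen kernel. A transfer learning method can penalize large values to limit the effect of distributional differences between domains.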

  18. Collaborative filtering on a family of biological targets.

    PubMed

    Erhan, Dumitru; L'heureux, Pierre-Jean; Yue, Shi Yi; Bengio, Yoshua

    2006-01-01

    Building a QSAR model of a new biological target for which few screening data are available is a statistical challenge. However, the new target may be part of a bigger family, for which we have more screening data. Collaborative filtering or, more generally, multi-task learning, is a machine learning approach that improves the generalization performance of an algorithm by using information from related tasks as an inductive bias. We use collaborative filtering techniques for building predictive models that link multiple targets to multiple examples. The more commonalities between the targets, the better the multi-target model that can be built. We show an example of a multi-target neural network that can use family information to produce a predictive model of an undersampled target. We evaluate JRank, a kernel-based method designed for collaborative filtering. We show their performance on compound prioritization for an HTS campaign and the underlying shared representation between targets. JRank outperformed the neural network both in the single- and multi-target models.

  19. Computational prediction of host-pathogen protein-protein interactions.

    PubMed

    Dyer, Matthew D; Murali, T M; Sobral, Bruno W

    2007-07-01

    Infectious diseases such as malaria result in millions of deaths each year. An important aspect of any host-pathogen system is the mechanism by which a pathogen can infect its host. One method of infection is via protein-protein interactions (PPIs) where pathogen proteins target host proteins. Developing computational methods that identify which PPIs enable a pathogen to infect a host has great implications in identifying potential targets for therapeutics. We present a method that integrates known intra-species PPIs with protein-domain profiles to predict PPIs between host and pathogen proteins. Given a set of intra-species PPIs, we identify the functional domains in each of the interacting proteins. For every pair of functional domains, we use Bayesian statistics to assess the probability that two proteins with that pair of domains will interact. We apply our method to the Homo sapiens-Plasmodium falciparum host-pathogen system. Our system predicts 516 PPIs between proteins from these two organisms. We show that pairs of human proteins we predict to interact with the same Plasmodium protein are close to each other in the human PPI network and that Plasmodium pairs predicted to interact with same human protein are co-expressed in DNA microarray datasets measured during various stages of the Plasmodium life cycle. Finally, we identify functionally enriched sub-networks spanned by the predicted interactions and discuss the plausibility of our predictions. Supplementary data are available at http://staff.vbi.vt.edu/dyermd/publications/dyer2007a.html. Supplementary data are available at Bioinformatics online.

  20. SChloro: directing Viridiplantae proteins to six chloroplastic sub-compartments.

    PubMed

    Savojardo, Castrense; Martelli, Pier Luigi; Fariselli, Piero; Casadio, Rita

    2017-02-01

    Chloroplasts are organelles found in plants and involved in several important cell processes. Similarly to other compartments in the cell, chloroplasts have an internal structure comprising several sub-compartments, where different proteins are targeted to perform their functions. Given the relation between protein function and localization, the availability of effective computational tools to predict protein sub-organelle localizations is crucial for large-scale functional studies. In this paper we present SChloro, a novel machine-learning approach to predict protein sub-chloroplastic localization, based on targeting signal detection and membrane protein information. The proposed approach performs multi-label predictions discriminating six chloroplastic sub-compartments that include inner membrane, outer membrane, stroma, thylakoid lumen, plastoglobule and thylakoid membrane. In comparative benchmarks, the proposed method outperforms current state-of-the-art methods in both single- and multi-compartment predictions, with an overall multi-label accuracy of 74%. The results demonstrate the relevance of the approach that is eligible as a good candidate for integration into more general large-scale annotation pipelines of protein subcellular localization. The method is available as web server at http://schloro.biocomp.unibo.it. Contact: gigi@biocomp.unibo.it.

  1. Multi-Instance Metric Transfer Learning for Genome-Wide Protein Function Prediction.

    PubMed

    Xu, Yonghui; Min, Huaqing; Wu, Qingyao; Song, Hengjie; Ye, Bicui

    2017-02-06

    Multi-Instance (MI) learning has been proven to be effective for the genome-wide protein function prediction problems where each training example is associated with multiple instances. Many studies in this literature attempted to find an appropriate Multi-Instance Learning (MIL) method for genome-wide protein function prediction under a usual assumption, the underlying distribution from testing data (target domain, i.e., TD) is the same as that from training data (source domain, i.e., SD). However, this assumption may be violated in real practice. To tackle this problem, in this paper, we propose a Multi-Instance Metric Transfer Learning (MIMTL) approach for genome-wide protein function prediction. In MIMTL, we first transfer the source domain distribution to the target domain distribution by utilizing the bag weights. Then, we construct a distance metric learning method with the reweighted bags. At last, we develop an alternative optimization scheme for MIMTL. Comprehensive experimental evidence on seven real-world organisms verifies the effectiveness and efficiency of the proposed MIMTL approach over several state-of-the-art methods.

  2. A machine learning approach for predicting CRISPR-Cas9 cleavage efficiencies and patterns underlying its mechanism of action.

    PubMed

    Abadi, Shiran; Yan, Winston X; Amar, David; Mayrose, Itay

    2017-10-01

    The adaptation of the CRISPR-Cas9 system as a genome editing technique has generated much excitement in recent years owing to its ability to manipulate targeted genes and genomic regions that are complementary to a programmed single guide RNA (sgRNA). However, the efficacy of a specific sgRNA is not uniquely defined by exact sequence homology to the target site, thus unintended off-targets might additionally be cleaved. Current methods for sgRNA design are mainly concerned with predicting off-targets for a given sgRNA using basic sequence features and employ elementary rules for ranking possible sgRNAs. Here, we introduce CRISTA (CRISPR Target Assessment), a novel algorithm within the machine learning framework that determines the propensity of a genomic site to be cleaved by a given sgRNA. We show that the predictions made with CRISTA are more accurate than other available methodologies. We further demonstrate that the occurrence of bulges is not a rare phenomenon and should be accounted for in the prediction process. Beyond predicting cleavage efficiencies, the learning process provides inferences regarding patterns that underlie the mechanism of action of the CRISPR-Cas9 system. We discover that attributes that describe the spatial structure and rigidity of the entire genomic site as well as those surrounding the PAM region are a major component of the prediction capabilities.

  3. Knowledge-transfer learning for prediction of matrix metalloprotease substrate-cleavage sites.

    PubMed

    Wang, Yanan; Song, Jiangning; Marquez-Lago, Tatiana T; Leier, André; Li, Chen; Lithgow, Trevor; Webb, Geoffrey I; Shen, Hong-Bin

    2017-07-18

    Matrix Metalloproteases (MMPs) are an important family of proteases that play crucial roles in key cellular and disease processes. Therefore, MMPs constitute important targets for drug design, development and delivery. Advanced proteomic technologies have identified type-specific target substrates; however, the complete repertoire of MMP substrates remains uncharacterized. Indeed, computational prediction of substrate-cleavage sites associated with MMPs is a challenging problem. This holds especially true when considering MMPs with few experimentally verified cleavage sites, such as for MMP-2, -3, -7, and -8. To fill this gap, we propose a new knowledge-transfer computational framework which effectively utilizes the hidden shared knowledge from some MMP types to enhance predictions of other, distinct target substrate-cleavage sites. Our computational framework uses support vector machines combined with transfer machine learning and feature selection. To demonstrate the value of the model, we extracted a variety of substrate sequence-derived features and compared the performance of our method using both 5-fold cross-validation and independent tests. The results show that our transfer-learning-based method provides a robust performance, which is at least comparable to traditional feature-selection methods for prediction of MMP-2, -3, -7, -8, -9 and -12 substrate-cleavage sites on independent tests. The results also demonstrate that our proposed computational framework provides a useful alternative for the characterization of sequence-level determinants of MMP-substrate specificity.

  4. Comprehensive prediction of drug-protein interactions and side effects for the human proteome

    PubMed Central

    Zhou, Hongyi; Gao, Mu; Skolnick, Jeffrey

    2015-01-01

    Identifying unexpected drug-protein interactions is crucial for drug repurposing. We develop a comprehensive proteome scale approach that predicts human protein targets and side effects of drugs. For drug-protein interaction prediction, FINDSITEcomb, whose average precision is ~30% and recall ~27%, is employed. For side effect prediction, a new method is developed with a precision of ~57% and a recall of ~24%. Our predictions show that drugs are quite promiscuous, with the average (median) number of human targets per drug of 329 (38), while a given protein interacts with 57 drugs. The result implies that drug side effects are inevitable and existing drugs may be useful for repurposing, with only ~1,000 human proteins likely causing serious side effects. A killing index derived from serious side effects has a strong correlation with FDA approved drugs being withdrawn. Therefore, it provides a pre-filter for new drug development. The methodology is free to the academic community on the DR. PRODIS (DRugome, PROteome, and DISeasome) webserver at http://cssb.biology.gatech.edu/dr.prodis/. DR. PRODIS provides protein targets of drugs, drugs for a given protein target, associated diseases and side effects of drugs, as well as an interface for the virtual target screening of new compounds. PMID:26057345

  5. The value of nodal information in predicting lung cancer relapse using 4DPET/4DCT

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Li, Heyse, E-mail: heyse.li@mail.utoronto.ca; Becker, Nathan; Raman, Srinivas

    2015-08-15

    Purpose: There is evidence that computed tomography (CT) and positron emission tomography (PET) imaging metrics are prognostic and predictive in nonsmall cell lung cancer (NSCLC) treatment outcomes. However, few studies have explored the use of standardized uptake value (SUV)-based image features of nodal regions as predictive features. The authors investigated and compared the use of tumor and node image features extracted from the radiotherapy target volumes to predict relapse in a cohort of NSCLC patients undergoing chemoradiation treatment. Methods: A prospective cohort of 25 patients with locally advanced NSCLC underwent 4DPET/4DCT imaging for radiation planning. Thirty-seven image features were derived from the CT-defined volumes and SUVs of the PET image from both the tumor and nodal target regions. The machine learning methods of logistic regression and repeated stratified five-fold cross-validation (CV) were used to predict local and overall relapses in 2 yr. The authors used well-known feature selection methods (Spearman’s rank correlation, recursive feature elimination) within each fold of CV. Classifiers were ranked on their Matthews correlation coefficient (MCC) after CV. Area under the curve, sensitivity, and specificity values are also presented. Results: For predicting local relapse, the best classifier found had a mean MCC of 0.07 and was composed of eight tumor features. For predicting overall relapse, the best classifier found had a mean MCC of 0.29 and was composed of a single feature: the volume greater than 0.5 times the maximum SUV (N). Conclusions: The best classifier for predicting local relapse had only tumor features. In contrast, the best classifier for predicting overall relapse included a node feature. Overall, the methods showed that nodes add value in predicting overall relapse but not local relapse.
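    The ranking criterion in the record above, the mean MCC over cross-validation folds, can be illustrated with a minimal sketch. The function names `matthews_corrcoef` and `mean_mcc`, the 0/1 label encoding, and the convention of returning 0 when the denominator vanishes are assumptions made for illustration; the abstract does not describe the authors' implementation.

```python
import math

def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = relapse, 0 = no relapse)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: report 0 when a row or column of the
        # confusion matrix is empty and the coefficient is undefined.
        return 0.0
    return (tp * tn - fp * fn) / denom

def mean_mcc(folds):
    """Average MCC over (y_true, y_pred) pairs, one pair per held-out fold."""
    return sum(matthews_corrcoef(t, p) for t, p in folds) / len(folds)
```

    An MCC of 1 means perfect agreement, 0 is no better than chance, and -1 is complete disagreement, which is why a mean MCC of 0.07 indicates a classifier with almost no predictive value.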

  6. Perturbation biology nominates upstream–downstream drug combinations in RAF inhibitor resistant melanoma cells

    PubMed Central

    Korkut, Anil; Wang, Weiqing; Demir, Emek; Aksoy, Bülent Arman; Jing, Xiaohong; Molinelli, Evan J; Babur, Özgün; Bemis, Debra L; Onur Sumer, Selcuk; Solit, David B; Pratilas, Christine A; Sander, Chris

    2015-01-01

    Resistance to targeted cancer therapies is an important clinical problem. The discovery of anti-resistance drug combinations is challenging as resistance can arise by diverse escape mechanisms. To address this challenge, we improved and applied the experimental-computational perturbation biology method. Using statistical inference, we build network models from high-throughput measurements of molecular and phenotypic responses to combinatorial targeted perturbations. The models are computationally executed to predict the effects of thousands of untested perturbations. In RAF-inhibitor resistant melanoma cells, we measured 143 proteomic/phenotypic entities under 89 perturbation conditions and predicted c-Myc as an effective therapeutic co-target with BRAF or MEK. Experiments using the BET bromodomain inhibitor JQ1 affecting the level of c-Myc protein and protein kinase inhibitors targeting the ERK pathway confirmed the prediction. In conclusion, we propose an anti-cancer strategy of co-targeting a specific upstream alteration and a general downstream point of vulnerability to prevent or overcome resistance to targeted drugs. DOI: http://dx.doi.org/10.7554/eLife.04640.001 PMID:26284497

  7. EMUDRA: Ensemble of Multiple Drug Repositioning Approaches to Improve Prediction Accuracy.

    PubMed

    Zhou, Xianxiao; Wang, Minghui; Katsyv, Igor; Irie, Hanna; Zhang, Bin

    2018-04-24

    Availability of large-scale genomic, epigenetic and proteomic data in complex diseases makes it possible to objectively and comprehensively identify therapeutic targets that can lead to new therapies. The Connectivity Map has been widely used to explore novel indications of existing drugs. However, the prediction accuracy of the existing methods, such as the Kolmogorov-Smirnov statistic, remains low. Here we present a novel high-performance drug repositioning approach that improves over the state-of-the-art methods. We first designed an expression weighted cosine method (EWCos) to minimize the influence of the uninformative expression changes and then developed an ensemble approach termed EMUDRA (Ensemble of Multiple Drug Repositioning Approaches) to integrate EWCos and three existing state-of-the-art methods. EMUDRA significantly outperformed individual drug repositioning methods when applied to simulated and independent evaluation datasets. Using EMUDRA, we predicted and experimentally validated the antibiotic rifabutin as an inhibitor of cell growth in triple negative breast cancer. EMUDRA can identify drugs that more effectively target disease gene signatures and will thus be a useful tool for identifying novel therapies for complex diseases and predicting new indications for existing drugs. The EMUDRA R package is available at doi:10.7303/syn11510888. Contact: bin.zhang@mssm.edu or zhangb@hotmail.com. Supplementary data are available at Bioinformatics online.

  8. Computational Prediction of the Global Functional Genomic Landscape: Applications, Methods and Challenges

    PubMed Central

    Zhou, Weiqiang; Sherwood, Ben; Ji, Hongkai

    2017-01-01

    Technological advances have led to an explosive growth of high-throughput functional genomic data. Exploiting the correlation among different data types, it is possible to predict one functional genomic data type from other data types. Prediction tools are valuable in understanding the relationship among different functional genomic signals. They also provide a cost-efficient solution to inferring the unknown functional genomic profiles when experimental data are unavailable due to resource or technological constraints. The predicted data may be used for generating hypotheses, prioritizing targets, interpreting disease variants, facilitating data integration, quality control, and many other purposes. This article reviews various applications of prediction methods in functional genomics, discusses analytical challenges, and highlights some common and effective strategies used to develop prediction methods for functional genomic data. PMID:28076869

  9. SMOQ: a tool for predicting the absolute residue-specific quality of a single protein model with support vector machines

    PubMed Central

    2014-01-01

    Background It is important to predict the quality of a protein structural model before its native structure is known. The method that can predict the absolute local quality of individual residues in a single protein model is rare, yet particularly needed for using, ranking and refining protein models. Results We developed a machine learning tool (SMOQ) that can predict the distance deviation of each residue in a single protein model. SMOQ uses support vector machines (SVM) with protein sequence and structural features (i.e. basic feature set), including amino acid sequence, secondary structures, solvent accessibilities, and residue-residue contacts to make predictions. We also trained a SVM model with two new additional features (profiles and SOV scores) on 20 CASP8 targets and found that including them can only improve the performance when real deviations between native and model are higher than 5Å. The SMOQ tool finally released uses the basic feature set trained on 85 CASP8 targets. Moreover, SMOQ implemented a way to convert predicted local quality scores into a global quality score. SMOQ was tested on the 84 CASP9 single-domain targets. The average difference between the residue-specific distance deviation predicted by our method and the actual distance deviation on the test data is 2.637Å. The global quality prediction accuracy of the tool is comparable to other good tools on the same benchmark. Conclusion SMOQ is a useful tool for protein single model quality assessment. Its source code and executable are available at: http://sysbio.rnet.missouri.edu/multicom_toolbox/. PMID:24776231

  10. SMOQ: a tool for predicting the absolute residue-specific quality of a single protein model with support vector machines.

    PubMed

    Cao, Renzhi; Wang, Zheng; Wang, Yiheng; Cheng, Jianlin

    2014-04-28

    It is important to predict the quality of a protein structural model before its native structure is known. The method that can predict the absolute local quality of individual residues in a single protein model is rare, yet particularly needed for using, ranking and refining protein models. We developed a machine learning tool (SMOQ) that can predict the distance deviation of each residue in a single protein model. SMOQ uses support vector machines (SVM) with protein sequence and structural features (i.e. basic feature set), including amino acid sequence, secondary structures, solvent accessibilities, and residue-residue contacts to make predictions. We also trained a SVM model with two new additional features (profiles and SOV scores) on 20 CASP8 targets and found that including them can only improve the performance when real deviations between native and model are higher than 5Å. The SMOQ tool finally released uses the basic feature set trained on 85 CASP8 targets. Moreover, SMOQ implemented a way to convert predicted local quality scores into a global quality score. SMOQ was tested on the 84 CASP9 single-domain targets. The average difference between the residue-specific distance deviation predicted by our method and the actual distance deviation on the test data is 2.637Å. The global quality prediction accuracy of the tool is comparable to other good tools on the same benchmark. SMOQ is a useful tool for protein single model quality assessment. Its source code and executable are available at: http://sysbio.rnet.missouri.edu/multicom_toolbox/.

  11. Performance of multiple docking and refinement methods in the pose prediction D3R prospective Grand Challenge 2016

    NASA Astrophysics Data System (ADS)

    Fradera, Xavier; Verras, Andreas; Hu, Yuan; Wang, Deping; Wang, Hongwu; Fells, James I.; Armacost, Kira A.; Crespo, Alejandro; Sherborne, Brad; Wang, Huijun; Peng, Zhengwei; Gao, Ying-Duo

    2018-01-01

    We describe the performance of multiple pose prediction methods for the D3R 2016 Grand Challenge. The pose prediction challenge includes 36 ligands, which represent 4 chemotypes and some miscellaneous structures against the FXR ligand binding domain. In this study we use a mix of fully automated methods as well as human-guided methods with considerations of both the challenge data and publicly available data. The methods include ensemble docking, colony entropy pose prediction, target selection by molecular similarity, molecular dynamics guided pose refinement, and pose selection by visual inspection. We evaluated the success of our predictions by method, chemotype, and relevance of publicly available data. For the overall data set, ensemble docking, visual inspection, and molecular dynamics guided pose prediction performed the best with overall mean RMSDs of 2.4, 2.2, and 2.2 Å respectively. For several individual challenge molecules, the best performing method is evaluated in light of that particular ligand. We also describe the protein, ligand, and public information data preparations that are typical of our binding mode prediction workflow.

  12. SU-F-T-450: The Investigation of Radiotherapy Quality Assurance and Automatic Treatment Planning Based On the Kernel Density Estimation Method

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fan, J; Fan, J; Hu, W

    Purpose: To develop a fast automatic algorithm based on two-dimensional kernel density estimation (2D KDE) to predict the dose-volume histogram (DVH), which can be employed for the investigation of radiotherapy quality assurance and automatic treatment planning. Methods: We propose a machine learning method that uses previous treatment plans to predict the DVH. The key to the approach is the framing of the DVH in a probabilistic setting. The training consists of estimating, from the patients in the training set, the joint probability distribution of the dose and the predictive features. The joint distribution provides an estimation of the conditional probability of the dose given the values of the predictive features. For the new patient, the prediction consists of estimating the distribution of the predictive features and marginalizing the conditional probability from the training over this. Integrating the resulting probability distribution for the dose yields an estimation of the DVH. The 2D KDE is implemented to predict the joint probability distribution of the training set and the distribution of the predictive features for the new patient. Two variables, the signed minimal distance from each OAR (organ at risk) voxel to the target boundary and its opening angle with respect to the origin of the voxel coordinates, are considered as the predictive features to represent the OAR-target spatial relationship. The feasibility of our method has been demonstrated with rectum, breast and head-and-neck cancer cases by comparing the predicted DVHs with the planned ones. Results: The predicted and planned DVHs were consistent for each cancer type, and the average relative point-wise difference was about 5%, within the clinically acceptable range. Conclusion: According to the result of this study, our method can be used to predict a clinically acceptable DVH and to evaluate the quality and consistency of treatment planning.
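    The probabilistic steps in the Methods of the record above can be written out as a short worked derivation. The notation is an assumption introduced here, not the authors': D is the dose, x the vector of predictive features, a hat marks a KDE estimate from the training set, and a star marks a quantity for the new patient. The last line assumes the cumulative DVH convention (fraction of the organ volume receiving at least dose d).

```latex
\begin{align}
  % Conditional dose distribution from the KDE-estimated joint density
  \hat{p}(D \mid x) &= \frac{\hat{p}(D, x)}{\int \hat{p}(D', x)\, dD'} \\
  % Marginalize over the new patient's feature distribution
  \hat{p}^{*}(D) &= \int \hat{p}(D \mid x)\, \hat{p}^{*}(x)\, dx \\
  % Integrate the dose distribution to obtain the (cumulative) DVH
  \mathrm{DVH}^{*}(d) &= \int_{d}^{\infty} \hat{p}^{*}(D)\, dD
\end{align}
```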

  13. Machine learning approaches for estimation of prediction interval for the model output.

    PubMed

    Shrestha, Durga L; Solomatine, Dimitri P

    2006-03-01

    A novel method for estimating prediction uncertainty using machine learning techniques is presented. Uncertainty is expressed in the form of the two quantiles (constituting the prediction interval) of the underlying distribution of prediction errors. The idea is to partition the input space into different zones or clusters having similar model errors using fuzzy c-means clustering. The prediction interval is constructed for each cluster on the basis of empirical distributions of the errors associated with all instances belonging to the cluster under consideration and propagated from each cluster to the examples according to their membership grades in each cluster. Then a regression model is built for in-sample data using computed prediction limits as targets, and finally, this model is applied to estimate the prediction intervals (limits) for out-of-sample data. The method was tested on artificial and real hydrologic data sets using various machine learning techniques. Preliminary results show that the method is superior to other methods estimating the prediction interval. A new method for evaluating performance for estimating prediction interval is proposed as well.
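    The cluster-to-example propagation step in the record above can be sketched as follows. This is a simplified illustration, not the authors' procedure: it assumes that the errors have already been grouped by cluster and that the membership grades come from a separate fuzzy c-means run. The helper names, the linear-interpolation quantile, and the error convention (observed minus predicted) are likewise assumptions.

```python
def empirical_quantile(values, q):
    """Empirical quantile with linear interpolation between order statistics."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])

def cluster_limits(errors_by_cluster, alpha=0.1):
    """Lower and upper error quantiles per cluster for a (1 - alpha) interval."""
    return [
        (empirical_quantile(errs, alpha / 2), empirical_quantile(errs, 1 - alpha / 2))
        for errs in errors_by_cluster
    ]

def example_interval(prediction, memberships, limits):
    """Weight the cluster limits by the example's membership grades
    (assumed to sum to 1) and shift them by the model prediction."""
    lower = sum(m * lo for m, (lo, _) in zip(memberships, limits))
    upper = sum(m * hi for m, (_, hi) in zip(memberships, limits))
    return prediction + lower, prediction + upper
```

    The limits computed this way for in-sample data would then serve as regression targets for the second model that the abstract describes, which estimates intervals for out-of-sample data.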

  14. Global proteomics profiling improves drug sensitivity prediction: results from a multi-omics, pan-cancer modeling approach.

    PubMed

    Ali, Mehreen; Khan, Suleiman A; Wennerberg, Krister; Aittokallio, Tero

    2018-04-15

    Proteomics profiling is increasingly being used for molecular stratification of cancer patients and cell-line panels. However, systematic assessment of the predictive power of large-scale proteomic technologies across various drug classes and cancer types is currently lacking. To that end, we carried out the first pan-cancer, multi-omics comparative analysis of the relative performance of two proteomic technologies, targeted reverse phase protein array (RPPA) and global mass spectrometry (MS), in terms of their accuracy for predicting the sensitivity of cancer cells to both cytotoxic chemotherapeutics and molecularly targeted anticancer compounds. Our results in two cell-line panels demonstrate how MS profiling improves drug response predictions beyond that of the RPPA or the other omics profiles when used alone. However, frequent missing MS data values complicate its use in predictive modeling and required additional filtering, such as focusing on completely measured or known oncoproteins, to obtain maximal predictive performance. Rather strikingly, the two proteomics profiles provided complementary predictive signal both for the cytotoxic and targeted compounds. Further, information about the cellular-abundance of primary target proteins was found critical for predicting the response of targeted compounds, although the non-target features also contributed significantly to the predictive power. The clinical relevance of the selected protein markers was confirmed in cancer patient data. These results provide novel insights into the relative performance and optimal use of the widely applied proteomic technologies, MS and RPPA, which should prove useful in translational applications, such as defining the best combination of omics technologies and marker panels for understanding and predicting drug sensitivities in cancer patients. Processed datasets, R as well as Matlab implementations of the methods are available at https://github.com/mehr-een/bemkl-rbps. 
    Contact: mehreen.ali@helsinki.fi or tero.aittokallio@fimm.fi. Supplementary data are available at Bioinformatics online.

  15. In Silico Screening Based on Predictive Algorithms as a Design Tool for Exon Skipping Oligonucleotides in Duchenne Muscular Dystrophy

    PubMed Central

    Echigoya, Yusuke; Mouly, Vincent; Garcia, Luis; Yokota, Toshifumi; Duddy, William

    2015-01-01

    The use of antisense ‘splice-switching’ oligonucleotides to induce exon skipping represents a potential therapeutic approach to various human genetic diseases. It has achieved greatest maturity in exon skipping of the dystrophin transcript in Duchenne muscular dystrophy (DMD), for which several clinical trials are completed or ongoing, and a large body of data exists describing tested oligonucleotides and their efficacy. The rational design of an exon skipping oligonucleotide involves the choice of an antisense sequence, usually between 15 and 32 nucleotides, targeting the exon that is to be skipped. Although parameters describing the target site can be computationally estimated and several have been identified to correlate with efficacy, methods to predict efficacy are limited. Here, an in silico pre-screening approach is proposed, based on predictive statistical modelling. Previous DMD data were compiled together and, for each oligonucleotide, some 60 descriptors were considered. Statistical modelling approaches were applied to derive algorithms that predict exon skipping for a given target site. We confirmed (1) the binding energetics of the oligonucleotide to the RNA, and (2) the distance in bases of the target site from the splice acceptor site, as the two most predictive parameters, and we included these and several other parameters (while discounting many) into an in silico screening process, based on their capacity to predict high or low efficacy in either phosphorodiamidate morpholino oligomers (89% correctly predicted) and/or 2’O Methyl RNA oligonucleotides (76% correctly predicted). Predictions correlated strongly with in vitro testing for sixteen de novo PMO sequences targeting various positions on DMD exons 44 (R2 0.89) and 53 (R2 0.89), one of which represents a potential novel candidate for clinical trials. 
    We provide these algorithms together with a computational tool that facilitates screening to predict exon skipping efficacy at each position of a target exon. PMID:25816009

  16. Deriving Points of Departure and Performance Baselines for Predictive Modeling of Systemic Toxicity using ToxRefDB (SOT)

    EPA Science Inventory

    A primary goal of computational toxicology is to generate predictive models of toxicity. An elusive target of alternative test methods and models has been the accurate prediction of systemic toxicity points of departure (PoD). We aim not only to provide a large and valuable resou...

  17. Predicting the Types of Ion Channel-Targeted Conotoxins Based on AVC-SVM Model.

    PubMed

    Xianfang, Wang; Junmei, Wang; Xiaolei, Wang; Yue, Zhang

    2017-01-01

    The conotoxin proteins are disulfide-rich small peptides. Predicting the types of ion channel-targeted conotoxins has great value in the treatment of chronic diseases, epilepsy, and cardiovascular diseases. To solve the problem of information redundancy in current methods, a new model is presented to predict the types of ion channel-targeted conotoxins based on AVC (Analysis of Variance and Correlation) and SVM (Support Vector Machine). First, the F value is used to measure the significance of each feature for the result, and attributes with smaller F values are filtered out in a rough selection step. Second, the degree of redundancy is calculated using the Pearson correlation coefficient, and a threshold is set to filter out attributes with weak independence, yielding the refined feature set. Finally, SVM is used to predict the types of ion channel-targeted conotoxins. The experimental results show that the proposed AVC-SVM model reaches an overall accuracy of 91.98% and an average accuracy of 92.17% with a total of 68 parameters. The proposed model provides highly useful information for further experimental research. The prediction model will be freely accessible at our web server.
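    The two filtering stages in the record above (an F-value filter followed by a Pearson-correlation redundancy filter) can be sketched in plain Python. This is a minimal illustration under stated assumptions: the one-way ANOVA F statistic stands in for the "F value", the thresholds `f_min` and `r_max` are hypothetical parameters, and ties are broken by column index. The abstract does not give the authors' thresholds or implementation, and the final SVM step is omitted.

```python
import math

def f_value(values, labels):
    """One-way ANOVA F statistic of a single feature across class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand = sum(values) / n
    between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups.values())
    within = sum((v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g)
    if within == 0:
        return math.inf if between > 0 else 0.0
    return (between / (k - 1)) / (within / (n - k))

def pearson(x, y):
    """Pearson correlation coefficient; returns 0 for a constant column."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def avc_select(columns, labels, f_min, r_max):
    """Rough selection by F value, then refinement by pairwise correlation.

    columns: list of feature columns (one list of values per feature).
    Returns the indices of the retained features in ascending order.
    """
    scored = [(f_value(col, labels), j) for j, col in enumerate(columns)]
    scored.sort(key=lambda item: (-item[0], item[1]))  # highest F first
    kept = []
    for f, j in scored:
        if f < f_min:
            continue  # rough selection: drop weakly discriminative features
        if all(abs(pearson(columns[j], columns[i])) <= r_max for i in kept):
            kept.append(j)  # refinement: keep only non-redundant features
    return sorted(kept)
```

    Visiting features in descending order of F value means that, of two strongly correlated features, the more discriminative one is retained.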

  18. Predicting the Types of Ion Channel-Targeted Conotoxins Based on AVC-SVM Model

    PubMed Central

    Xiaolei, Wang

    2017-01-01

    The conotoxin proteins are disulfide-rich small peptides. Predicting the types of ion channel-targeted conotoxins has great value in the treatment of chronic diseases, epilepsy, and cardiovascular diseases. To solve the problem of information redundancy in current methods, a new model is presented to predict the types of ion channel-targeted conotoxins based on AVC (Analysis of Variance and Correlation) and SVM (Support Vector Machine). First, the F value is used to measure the significance of each feature for the result, and attributes with smaller F values are filtered out in a rough selection step. Second, the degree of redundancy is calculated using the Pearson correlation coefficient, and a threshold is set to filter out attributes with weak independence, yielding the refined feature set. Finally, SVM is used to predict the types of ion channel-targeted conotoxins. The experimental results show that the proposed AVC-SVM model reaches an overall accuracy of 91.98% and an average accuracy of 92.17% with a total of 68 parameters. The proposed model provides highly useful information for further experimental research. The prediction model will be freely accessible at our web server. PMID:28497044

  19. Fuel Cell Manufacturing Research and Development | Hydrogen and Fuel Cells

    Science.gov Websites

    methods to meet volume and cost targets for transportation and other applications. Fortunately, much can set Develop predictive models to help industry design better manufacturing processes and methods

  20. Improving compound-protein interaction prediction by building up highly credible negative samples.

    PubMed

    Liu, Hui; Sun, Jianjiang; Guan, Jihong; Zheng, Jie; Zhou, Shuigeng

    2015-06-15

    Computational prediction of compound-protein interactions (CPIs) is of great importance for drug design and development, as genome-scale experimental validation of CPIs is not only time-consuming but also prohibitively expensive. With the availability of an increasing number of validated interactions, the performance of computational prediction approaches is severely impeded by the lack of reliable negative CPI samples. A systematic method of screening reliable negative samples becomes critical to improving the performance of in silico prediction methods. This article aims at building up a set of highly credible negative samples of CPIs via an in silico screening method. As most existing computational models assume that similar compounds are likely to interact with similar target proteins and achieve remarkable performance, it is rational to identify potential negative samples based on the converse negative proposition that the proteins dissimilar to every known/predicted target of a compound are unlikely to be targeted by the compound and vice versa. We integrated various resources, including chemical structures, chemical expression profiles and side effects of compounds, amino acid sequences, protein-protein interaction network and functional annotations of proteins, into a systematic screening framework. We first tested the screened negative samples on six classical classifiers, and all these classifiers achieved remarkably higher performance on our negative samples than on randomly generated negative samples for both human and Caenorhabditis elegans. We then verified the negative samples on three existing prediction models, including bipartite local model, Gaussian kernel profile and Bayesian matrix factorization, and found that the performances of these models are also significantly improved on the screened negative samples. Moreover, we validated the screened negative samples on a drug bioactivity dataset.
    Finally, we derived two sets of new interactions by training a support vector machine classifier on the positive interactions annotated in DrugBank and our screened negative interactions. The screened negative samples and the predicted interactions provide the research community with a useful resource for identifying new drug targets and a helpful supplement to the current curated compound-protein databases. Supplementary files are available at: http://admis.fudan.edu.cn/negative-cpi/. © The Author 2015. Published by Oxford University Press.
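    The screening idea in the record above, that a protein dissimilar to every known target of a compound is a credible negative, can be sketched as follows. This is an illustrative simplification, not the published framework: it covers only the protein side of the proposition, takes a single caller-supplied `similarity` function in place of the integrated resources, and uses a hypothetical cutoff `max_sim`.

```python
def candidate_negatives(known_targets, proteins, similarity, max_sim=0.3):
    """Return (compound, protein) pairs that are plausible non-interactions.

    known_targets: dict mapping each compound to the set of its known targets.
    proteins:      iterable of all candidate proteins.
    similarity:    callable (protein_a, protein_b) -> score in [0, 1].
    max_sim:       hypothetical cutoff; a protein qualifies only if it is
                   less similar than this to every known target.
    """
    negatives = []
    for compound, targets in known_targets.items():
        if not targets:
            continue  # nothing to compare against, so make no claim
        for protein in proteins:
            if protein in targets:
                continue  # known positive pair
            if all(similarity(protein, t) < max_sim for t in targets):
                negatives.append((compound, protein))
    return negatives
```

    Compounds without any known target are skipped because the dissimilarity criterion is vacuous for them.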

  1. A new method for enhancer prediction based on deep belief network.

    PubMed

    Bu, Hongda; Gan, Yanglan; Wang, Yang; Zhou, Shuigeng; Guan, Jihong

    2017-10-16

    Studies have shown that enhancers are significant regulatory elements that play crucial roles in gene expression regulation. Since enhancers are independent of the orientation and distance to their target genes, it is challenging to accurately predict distal enhancers. In recent years, with the development of high-throughput ChIP-seq technologies, several computational techniques have emerged to predict enhancers using epigenetic or genomic features. Nevertheless, the inconsistency of computational models across different cell lines and the unsatisfactory prediction performance call for further research in this area. Here, we propose a new Deep Belief Network (DBN) based computational method for enhancer prediction, which is called EnhancerDBN. This method combines diverse features, composed of DNA sequence compositional features, DNA methylation and histone modifications. Our computational results indicate that 1) EnhancerDBN outperforms 13 existing methods in prediction, and 2) GC content and DNA methylation can serve as relevant features for enhancer prediction. Deep learning is effective in boosting the performance of enhancer prediction.

  2. Performance of MDockPP in CAPRI rounds 28-29 and 31-35 including the prediction of water-mediated interactions.

    PubMed

    Xu, Xianjin; Qiu, Liming; Yan, Chengfei; Ma, Zhiwei; Grinter, Sam Z; Zou, Xiaoqin

    2017-03-01

    Protein-protein interactions are either through direct contacts between two binding partners or mediated by structural waters. Both direct contacts and water-mediated interactions are crucial to the formation of a protein-protein complex. During the recent CAPRI rounds, a novel parallel searching strategy for predicting water-mediated interactions was introduced into our protein-protein docking method, MDockPP. Briefly, an FFT-based docking algorithm is employed in generating putative binding modes, and an iteratively derived statistical potential-based scoring function, ITScorePP, in conjunction with biological information is used to assess and rank the binding modes. Up to 10 binding modes are selected as the initial protein-protein complex structures for MD simulations in explicit solvent. Water molecules near the interface are clustered based on the snapshots extracted from independent equilibrated trajectories. Then, protein-ligand docking is employed for a parallel search for water molecules near the protein-protein interface. The water molecules generated by ligand docking and the clustered water molecules generated by MD simulations are merged, referred to as the predicted structural water molecules. Here, we report the performance of this protocol for CAPRI rounds 28-29 and 31-35 containing 20 valid docking targets and 11 scoring targets. In the docking experiments, we predicted correct binding modes for nine targets, including one high-accuracy, two medium-accuracy, and six acceptable predictions. Regarding the two targets for the prediction of water-mediated interactions, we achieved models ranked as "excellent" in accordance with the CAPRI evaluation criteria; one of these two targets is considered as a difficult target for structural water prediction. Proteins 2017; 85:424-434. © 2016 Wiley Periodicals, Inc.

  3. Molecular Target Homology as a Basis for Species Extrapolation to Assess the Ecological Risk of Veterinary Drugs

    EPA Science Inventory

    Increased identification of veterinary pharmaceutical contaminants in aquatic environments has raised concerns regarding potential adverse effects of these chemicals on non-target organisms. The purpose of this work was to develop a method for predictive species extrapolation ut...

  4. Predictive Models for Carcinogenicity and Mutagenicity: Frameworks, State-of-the-Art, and Perspectives

    EPA Science Inventory

    Mutagenicity and carcinogenicity are endpoints of major environmental and regulatory concern. These endpoints are also important targets for development of alternative methods for screening and prediction due to the large number of chemicals of potential concern and the tremendou...

  5. Predictive Models for Carcinogenicity and Mutagenicity: Frameworks, State-of-the-Art, and Perspectives

    EPA Science Inventory

    Mutagenicity and carcinogenicity are endpoints of major environmental and regulatory concern. These endpoints are also important targets for development of alternative methods for screening and prediction due to the large number of chemicals of potential concern and the tremendou...

  6. PredPPCrys: accurate prediction of sequence cloning, protein production, purification and crystallization propensity from protein sequences using multi-step heterogeneous feature fusion and selection.

    PubMed

    Wang, Huilin; Wang, Mingjun; Tan, Hao; Li, Yuan; Zhang, Ziding; Song, Jiangning

    2014-01-01

    X-ray crystallography is the primary approach to solve the three-dimensional structure of a protein. However, a major bottleneck of this method is the failure of multi-step experimental procedures to yield diffraction-quality crystals, including sequence cloning, protein material production, purification, crystallization and ultimately, structural determination. Accordingly, prediction of the propensity of a protein to successfully undergo these experimental procedures based on the protein sequence may help narrow down laborious experimental efforts and facilitate target selection. A number of bioinformatics methods based on protein sequence information have been developed for this purpose. However, our knowledge on the important determinants of propensity for a protein sequence to produce high diffraction-quality crystals remains largely incomplete. In practice, most of the existing methods display poorer performance when evaluated on larger and updated datasets. To address this problem, we constructed an up-to-date dataset as the benchmark, and subsequently developed a new approach termed 'PredPPCrys' using the support vector machine (SVM). Using a comprehensive set of multifaceted sequence-derived features in combination with a novel multi-step feature selection strategy, we identified and characterized the relative importance and contribution of each feature type to the prediction performance of five individual experimental steps required for successful crystallization. The resulting optimal candidate features were used as inputs to build the first-level SVM predictor (PredPPCrys I). Next, prediction outputs of PredPPCrys I were used as the input to build second-level SVM classifiers (PredPPCrys II), which led to significantly enhanced prediction performance. Benchmarking experiments indicated that our PredPPCrys method outperforms most existing procedures on both up-to-date and previous datasets. 
In addition, the predicted crystallization targets of currently non-crystallizable proteins were provided as compendium data, which are anticipated to facilitate target selection and design for the worldwide structural genomics consortium. PredPPCrys is freely available at http://www.structbioinfor.org/PredPPCrys.

  7. Predicting targets of compounds against neurological diseases using cheminformatic methodology

    NASA Astrophysics Data System (ADS)

    Nikolic, Katarina; Mavridis, Lazaros; Bautista-Aguilera, Oscar M.; Marco-Contelles, José; Stark, Holger; do Carmo Carreiras, Maria; Rossi, Ilaria; Massarelli, Paola; Agbaba, Danica; Ramsay, Rona R.; Mitchell, John B. O.

    2015-02-01

    Recently developed multi-targeted ligands are novel drug candidates able to interact with monoamine oxidase A and B; acetylcholinesterase and butyrylcholinesterase; or with histamine N-methyltransferase and histamine H3-receptor (H3R). These proteins are drug targets in the treatment of depression, Alzheimer's disease, obsessive disorders, and Parkinson's disease. A probabilistic method, the Parzen-Rosenblatt window approach, was used to build a "predictor" model using data collected from the ChEMBL database. The model can be used to predict both the primary pharmaceutical target and off-targets of a compound based on its structure. Molecular structures were represented based on the circular fingerprint methodology. The same approach was used to build a "predictor" model from the DrugBank dataset to determine the main pharmacological groups of the compound. The study of off-target interactions is now recognised as crucial to the understanding of both drug action and toxicology. Primary pharmaceutical targets and off-targets for the novel multi-target ligands were examined by use of the developed cheminformatic method. Several multi-target ligands were selected for further study, as compounds with possible additional beneficial pharmacological activities. The cheminformatic target identifications were in agreement with four 3D-QSAR (H3R/D1R/D2R/5-HT2aR) models and with in vitro assays for serotonin 5-HT1a and 5-HT2a receptor binding of the most promising ligand (71/MBA-VEG8).
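
    A Parzen-Rosenblatt window predictor of the kind named in the abstract above can be sketched in a few lines. This is a hedged illustration only: the abstract does not specify the kernel, so the Gaussian kernel on Tanimoto distance, the bandwidth of 0.3, the set-of-integers fingerprint representation and the target labels are all assumptions made for the example.

```python
# Hypothetical sketch: Parzen-window target ranking over set-based fingerprints.
import math


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def parzen_scores(query, training, bandwidth=0.3):
    """Average a Gaussian kernel of the Tanimoto distance over each target's
    known ligands. Assumes every target has at least one training fingerprint."""
    scores = {}
    for target, fingerprints in training.items():
        total = sum(
            math.exp(-((1.0 - tanimoto(query, fp)) ** 2) / (2.0 * bandwidth ** 2))
            for fp in fingerprints
        )
        scores[target] = total / len(fingerprints)
    return scores


def rank_targets(query, training, bandwidth=0.3):
    """Return target labels ordered from most to least likely for the query."""
    scores = parzen_scores(query, training, bandwidth)
    return sorted(scores, key=scores.get, reverse=True)


# Made-up fingerprints for two placeholder targets.
toy_training = {
    "target_A": [{1, 2, 3, 4}, {1, 2, 3, 5}],
    "target_B": [{7, 8, 9}, {7, 8, 10}],
}
toy_ranking = rank_targets({1, 2, 3, 6}, toy_training)
```

    The first entry of the ranking plays the role of the predicted primary target, and the remaining entries are candidate off-targets in decreasing order of estimated density.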

  8. Analysis of Free Modeling Predictions by RBO Aleph in CASP11

    PubMed Central

    Mabrouk, Mahmoud; Werner, Tim; Schneider, Michael; Putz, Ines; Brock, Oliver

    2015-01-01

    The CASP experiment is a biennial benchmark for assessing protein structure prediction methods. In CASP11, RBO Aleph ranked as one of the top-performing automated servers in the free modeling category. This category consists of targets for which structural templates are not easily retrievable. We analyze the performance of RBO Aleph and show that its success in CASP was a result of its ab initio structure prediction protocol. A detailed analysis of this protocol demonstrates that two components unique to our method greatly contributed to prediction quality: residue–residue contact prediction by EPC-map and contact–guided conformational space search by model-based search (MBS). Interestingly, our analysis also points to a possible fundamental problem in evaluating the performance of protein structure prediction methods: Improvements in components of the method do not necessarily lead to improvements of the entire method. This points to the fact that these components interact in ways that are poorly understood. This problem, if indeed true, represents a significant obstacle to community-wide progress. PMID:26492194

  9. Systematic review of computational methods for identifying miRNA-mediated RNA-RNA crosstalk.

    PubMed

    Li, Yongsheng; Jin, Xiyun; Wang, Zishan; Li, Lili; Chen, Hong; Lin, Xiaoyu; Yi, Song; Zhang, Yunpeng; Xu, Juan

    2017-10-25

    Posttranscriptional crosstalk and communication between RNAs yield large regulatory competing endogenous RNA (ceRNA) networks via shared microRNAs (miRNAs), as well as miRNA synergistic networks. The ceRNA crosstalk represents a novel layer of gene regulation that controls both physiological and pathological processes such as development and complex diseases. The rapidly expanding catalogue of ceRNA regulation has provided evidence for exploitation as a general model to predict the ceRNAs in silico. In this article, we first reviewed the current progress of RNA-RNA crosstalk in human complex diseases. Then, the widely used computational methods for modeling ceRNA-ceRNA interaction networks are further summarized into five types: two types of global ceRNA regulation prediction methods and three types of context-specific prediction methods, which are based on miRNA-messenger RNA regulation alone, or by integrating heterogeneous data, respectively. To provide guidance in the computational prediction of ceRNA-ceRNA interactions, we finally performed a comparative study of different combinations of miRNA-target methods as well as five types of ceRNA identification methods by using literature-curated ceRNA regulation and gene perturbation. The results revealed that integration of different miRNA-target prediction methods and context-specific miRNA/gene expression profiles increased the performance for identifying ceRNA regulation. Moreover, different computational methods were complementary in identifying ceRNA regulation and captured different functional parts of similar pathways. We believe that the application of these computational techniques provides valuable functional insights into ceRNA regulation and is a crucial step for informing subsequent functional validation studies. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  10. Control, Filtering and Prediction for Phased Arrays in Directed Energy Systems

    DTIC Science & Technology

    2016-04-30

    adaptive optics. 15. SUBJECT TERMS control, filtering, prediction, system identification, adaptive optics, laser beam pointing, target tracking, phase... laser beam control; furthermore, wavefront sensors are plagued by the difficulty of maintaining the required alignment and focusing in dynamic mission...developed new methods for filtering, prediction and system identification in adaptive optics for high energy laser systems including phased arrays. The

  11. TMDIM: an improved algorithm for the structure prediction of transmembrane domains of bitopic dimers.

    PubMed

    Cao, Han; Ng, Marcus C K; Jusoh, Siti Azma; Tai, Hio Kuan; Siu, Shirley W I

    2017-09-01

    α-Helical transmembrane proteins are the most important drug targets in rational drug development. However, solving the experimental structures of these proteins remains difficult, therefore computational methods to accurately and efficiently predict the structures are in great demand. We present an improved structure prediction method TMDIM based on Park et al. (Proteins 57:577-585, 2004) for predicting bitopic transmembrane protein dimers. Three major algorithmic improvements are introduction of the packing type classification, the multiple-condition decoy filtering, and the cluster-based candidate selection. In a test of predicting nine known bitopic dimers, approximately 78% of our predictions achieved a successful fit (RMSD <2.0 Å) and 78% of the cases are better predicted than the two other methods compared. Our method provides an alternative for modeling TM bitopic dimers of unknown structures for further computational studies. TMDIM is freely available on the web at https://cbbio.cis.umac.mo/TMDIM. Website is implemented in PHP, MySQL and Apache, with all major browsers supported.

  12. TMDIM: an improved algorithm for the structure prediction of transmembrane domains of bitopic dimers

    NASA Astrophysics Data System (ADS)

    Cao, Han; Ng, Marcus C. K.; Jusoh, Siti Azma; Tai, Hio Kuan; Siu, Shirley W. I.

    2017-09-01

    α-Helical transmembrane proteins are the most important drug targets in rational drug development. However, solving the experimental structures of these proteins remains difficult, therefore computational methods to accurately and efficiently predict the structures are in great demand. We present an improved structure prediction method TMDIM based on Park et al. (Proteins 57:577-585, 2004) for predicting bitopic transmembrane protein dimers. Three major algorithmic improvements are introduction of the packing type classification, the multiple-condition decoy filtering, and the cluster-based candidate selection. In a test of predicting nine known bitopic dimers, approximately 78% of our predictions achieved a successful fit (RMSD <2.0 Å) and 78% of the cases are better predicted than the two other methods compared. Our method provides an alternative for modeling TM bitopic dimers of unknown structures for further computational studies. TMDIM is freely available on the web at https://cbbio.cis.umac.mo/TMDIM. Website is implemented in PHP, MySQL and Apache, with all major browsers supported.

  13. Knowledge-based fragment binding prediction.

    PubMed

    Tang, Grace W; Altman, Russ B

    2014-04-01

    Target-based drug discovery must assess many drug-like compounds for potential activity. Focusing on low-molecular-weight compounds (fragments) can dramatically reduce the chemical search space. However, approaches for determining protein-fragment interactions have limitations. Experimental assays are time-consuming, expensive, and not always applicable. At the same time, computational approaches using physics-based methods have limited accuracy. With increasing high-resolution structural data for protein-ligand complexes, there is now an opportunity for data-driven approaches to fragment binding prediction. We present FragFEATURE, a machine learning approach to predict small molecule fragments preferred by a target protein structure. We first create a knowledge base of protein structural environments annotated with the small molecule substructures they bind. These substructures have low-molecular weight and serve as a proxy for fragments. FragFEATURE then compares the structural environments within a target protein to those in the knowledge base to retrieve statistically preferred fragments. It merges information across diverse ligands with shared substructures to generate predictions. Our results demonstrate FragFEATURE's ability to rediscover fragments corresponding to the ligand bound with 74% precision and 82% recall on average. For many protein targets, it identifies high scoring fragments that are substructures of known inhibitors. FragFEATURE thus predicts fragments that can serve as inputs to fragment-based drug design or serve as refinement criteria for creating target-specific compound libraries for experimental or computational screening.

  14. Knowledge-based Fragment Binding Prediction

    PubMed Central

    Tang, Grace W.; Altman, Russ B.

    2014-01-01

    Target-based drug discovery must assess many drug-like compounds for potential activity. Focusing on low-molecular-weight compounds (fragments) can dramatically reduce the chemical search space. However, approaches for determining protein-fragment interactions have limitations. Experimental assays are time-consuming, expensive, and not always applicable. At the same time, computational approaches using physics-based methods have limited accuracy. With increasing high-resolution structural data for protein-ligand complexes, there is now an opportunity for data-driven approaches to fragment binding prediction. We present FragFEATURE, a machine learning approach to predict small molecule fragments preferred by a target protein structure. We first create a knowledge base of protein structural environments annotated with the small molecule substructures they bind. These substructures have low-molecular weight and serve as a proxy for fragments. FragFEATURE then compares the structural environments within a target protein to those in the knowledge base to retrieve statistically preferred fragments. It merges information across diverse ligands with shared substructures to generate predictions. Our results demonstrate FragFEATURE's ability to rediscover fragments corresponding to the ligand bound with 74% precision and 82% recall on average. For many protein targets, it identifies high scoring fragments that are substructures of known inhibitors. FragFEATURE thus predicts fragments that can serve as inputs to fragment-based drug design or serve as refinement criteria for creating target-specific compound libraries for experimental or computational screening. PMID:24762971

  15. Identification of HMX1 target genes: A predictive promoter model approach

    PubMed Central

    Boulling, Arnaud; Wicht, Linda

    2013-01-01

    Purpose: A homozygous mutation in the H6 family homeobox 1 (HMX1) gene is responsible for a new oculoauricular defect leading to eye and auricular developmental abnormalities as well as early retinal degeneration (MIM 612109). However, the HMX1 pathway remains poorly understood, and in the first approach to better understand the pathway’s function, we sought to identify the target genes. Methods: We developed a predictive promoter model (PPM) approach using a comparative transcriptomic analysis in the retina at P15 of a mouse model lacking functional Hmx1 (dmbo mouse) and its respective wild-type. This PPM was based on the hypothesis that HMX1 binding site (HMX1-BS) clusters should be more represented in promoters of HMX1 target genes. The most differentially expressed genes in the microarray experiment that contained HMX1-BS clusters were used to generate the PPM, which was then statistically validated. Finally, we developed two genome-wide target prediction methods: one that focused on conserving PPM features in human and mouse and one that was based on the co-occurrence of HMX1-BS pairs fitting the PPM, in human or in mouse, independently. Results: The PPM construction revealed that sarcoglycan, gamma (35kDa dystrophin-associated glycoprotein) (Sgcg), teashirt zinc finger homeobox 2 (Tshz2), and solute carrier family 6 (neurotransmitter transporter, glycine) (Slc6a9) genes represented Hmx1 targets in the mouse retina at P15. Moreover, the genome-wide target prediction revealed that mouse genes belonging to the retinal axon guidance pathway were targeted by Hmx1. Expression of these three genes was experimentally validated using a quantitative reverse transcription PCR approach. The inhibitory activity of Hmx1 on Sgcg, as well as protein tyrosine phosphatase, receptor type, O (Ptpro) and Sema3f, two targets identified by the PPM, were validated with luciferase assay. 
Conclusions: Gene expression analysis between wild-type and dmbo mice allowed us to develop a PPM that identified the first target genes of Hmx1. PMID:23946633

  16. Bio-AIMS Collection of Chemoinformatics Web Tools based on Molecular Graph Information and Artificial Intelligence Models.

    PubMed

    Munteanu, Cristian R; Gonzalez-Diaz, Humberto; Garcia, Rafael; Loza, Mabel; Pazos, Alejandro

    2015-01-01

    Encoding molecular information into molecular descriptors is the first step of in silico chemoinformatics methods in drug design. Machine learning methods offer a way to build prediction models for specific biological properties of molecules. These models connect the molecular structure information such as atom connectivity (molecular graphs) or physical-chemical properties of an atom/group of atoms to the molecular activity (Quantitative Structure - Activity Relationship, QSAR). Due to the complexity of proteins, the prediction of their activity is a complicated task and the interpretation of the models is more difficult. The current review presents a series of 11 prediction models for proteins, implemented as free Web tools on an Artificial Intelligence Model Server in Biosciences, Bio-AIMS (http://bio-aims.udc.es/TargetPred.php). Six tools predict protein activity, two models evaluate drug - protein target interactions and the other three calculate protein - protein interactions. The input information is based on the protein 3D structure for nine models, 1D peptide amino acid sequence for three tools and drug SMILES formulas for two servers. The molecular graph descriptor-based Machine Learning models could be useful tools for in silico screening of new peptides/proteins as future drug targets for specific treatments.

  17. Crysalis: an integrated server for computational analysis and design of protein crystallization.

    PubMed

    Wang, Huilin; Feng, Liubin; Zhang, Ziding; Webb, Geoffrey I; Lin, Donghai; Song, Jiangning

    2016-02-24

    The failure of multi-step experimental procedures to yield diffraction-quality crystals is a major bottleneck in protein structure determination. Accordingly, several bioinformatics methods have been successfully developed and employed to select crystallizable proteins. Unfortunately, the majority of existing in silico methods only allow the prediction of crystallization propensity, seldom enabling computational design of protein mutants that can be targeted for enhancing protein crystallizability. Here, we present Crysalis, an integrated crystallization analysis tool that builds on support-vector regression (SVR) models to facilitate computational protein crystallization prediction, analysis, and design. More specifically, the functionality of this new tool includes: (1) rapid selection of target crystallizable proteins at the proteome level, (2) identification of site non-optimality for protein crystallization and systematic analysis of all potential single-point mutations that might enhance protein crystallization propensity, and (3) annotation of target protein based on predicted structural properties. We applied the design mode of Crysalis to identify site non-optimality for protein crystallization on a proteome-scale, focusing on proteins currently classified as non-crystallizable. Our results revealed that site non-optimality is based on biases related to residues, predicted structures, physicochemical properties, and sequence loci, which provides in-depth understanding of the features influencing protein crystallization. Crysalis is freely available at http://nmrcen.xmu.edu.cn/crysalis/.

  18. Crysalis: an integrated server for computational analysis and design of protein crystallization

    PubMed Central

    Wang, Huilin; Feng, Liubin; Zhang, Ziding; Webb, Geoffrey I.; Lin, Donghai; Song, Jiangning

    2016-01-01

    The failure of multi-step experimental procedures to yield diffraction-quality crystals is a major bottleneck in protein structure determination. Accordingly, several bioinformatics methods have been successfully developed and employed to select crystallizable proteins. Unfortunately, the majority of existing in silico methods only allow the prediction of crystallization propensity, seldom enabling computational design of protein mutants that can be targeted for enhancing protein crystallizability. Here, we present Crysalis, an integrated crystallization analysis tool that builds on support-vector regression (SVR) models to facilitate computational protein crystallization prediction, analysis, and design. More specifically, the functionality of this new tool includes: (1) rapid selection of target crystallizable proteins at the proteome level, (2) identification of site non-optimality for protein crystallization and systematic analysis of all potential single-point mutations that might enhance protein crystallization propensity, and (3) annotation of target protein based on predicted structural properties. We applied the design mode of Crysalis to identify site non-optimality for protein crystallization on a proteome-scale, focusing on proteins currently classified as non-crystallizable. Our results revealed that site non-optimality is based on biases related to residues, predicted structures, physicochemical properties, and sequence loci, which provides in-depth understanding of the features influencing protein crystallization. Crysalis is freely available at http://nmrcen.xmu.edu.cn/crysalis/. PMID:26906024

  19. Using Chemoinformatics, Bioinformatics, and Bioassay to Predict and Explain the Antibacterial Activity of Nonantibiotic Food and Drug Administration Drugs.

    PubMed

    Kahlous, Nour Aldin; Bawarish, Muhammad Al Mohdi; Sarhan, Muhammad Arabi; Küpper, Manfred; Hasaba, Ali; Rajab, Mazen

    2017-04-01

    Discovering new and effective antibiotics is a major issue facing scientists today. Luckily, the development of computer science offers new methods to overcome this issue. In this study, a set of computer software was used to predict the antibacterial activity of nonantibiotic Food and Drug Administration (FDA)-approved drugs, and to explain their action by possible binding to well-known bacterial protein targets, along with testing their antibacterial activity against Gram-positive and Gram-negative bacteria. A three-dimensional virtual screening method that relies on chemical and shape similarity was applied using rapid overlay of chemical structures (ROCS) software to select candidate compounds from the FDA-approved drugs database that share similarity with 17 known antibiotics. Then, to check their antibacterial activity, a disk diffusion test was applied on Staphylococcus aureus and Escherichia coli. Finally, a protein docking method was applied using HYBRID software to predict the binding of the active candidate to the target receptor of its similar antibiotic. Of the 1,991 drugs that were screened, 34 were selected, and among them 10 drugs showed antibacterial activity; the activities of drotaverine and metoclopramide had not been previously reported. Furthermore, the docking process predicted that diclofenac, drotaverine, (S)-flurbiprofen, (S)-ibuprofen, and indomethacin could bind to the protein target of their similar antibiotics. Nevertheless, their antibacterial activities are weak compared with those of their similar antibiotics, which can be potentiated further by performing chemical modifications on their structure.

  20. Pioneering topological methods for network-based drug-target prediction by exploiting a brain-network self-organization theory.

    PubMed

    Durán, Claudio; Daminelli, Simone; Thomas, Josephine M; Haupt, V Joachim; Schroeder, Michael; Cannistraci, Carlo Vittorio

    2017-04-26

    The bipartite network representation of the drug-target interactions (DTIs) in a biosystem enhances understanding of the drugs' multifaceted action modes, suggests therapeutic switching for approved drugs and unveils possible side effects. As experimental testing of DTIs is costly and time-consuming, computational predictors are of great aid. Here, for the first time, state-of-the-art DTI supervised predictors custom-made in network biology were compared, using standard and innovative validation frameworks, with unsupervised pure topological-based models designed for general-purpose link prediction in bipartite networks. Surprisingly, our results show that the bipartite topology alone, if adequately exploited by means of the recently proposed local-community-paradigm (LCP) theory (initially detected in brain-network topological self-organization and afterwards generalized to any complex network), is able to suggest highly reliable predictions, with performance comparable to that of the state-of-the-art supervised methods that exploit additional (non-topological, for instance biochemical) DTI knowledge. Furthermore, a detailed analysis of the novel predictions revealed that each class of methods prioritizes distinct true interactions; hence, combining methodologies based on diverse principles represents a promising strategy to improve drug-target discovery. To conclude, this study promotes the power of bio-inspired computing, demonstrating that simple unsupervised rules inspired by principles of topological self-organization and adaptiveness arising during learning in living intelligent systems (like the brain) can perform as well as complicated algorithms based on advanced, supervised and knowledge-based engineering. © The Author 2017. Published by Oxford University Press.

  1. Recovery of known T-cell epitopes by computational scanning of a viral genome

    NASA Astrophysics Data System (ADS)

    Logean, Antoine; Rognan, Didier

    2002-04-01

    A new computational method (EpiDock) is proposed for predicting peptide binding to class I MHC proteins, from the amino acid sequence of any protein of immunological interest. Starting from the primary structure of the target protein, individual three-dimensional structures of all possible MHC-peptide (8-, 9- and 10-mers) complexes are obtained by homology modelling. A free energy scoring function (Fresno) is then used to predict the absolute binding free energy of all possible peptides to the class I MHC restriction protein. Assuming that immunodominant epitopes are usually found among the top MHC binders, the method can thus be applied to predict the location of immunogenic peptides on the sequence of the protein target. When applied to the prediction of HLA-A*0201-restricted T-cell epitopes from the Hepatitis B virus, EpiDock was able to recover 92% of known high affinity binders and 80% of known epitopes within a filtered subset of all possible nonapeptides corresponding to about one tenth of the full theoretical list. The proposed method is fully automated and fast enough to scan a viral genome in less than an hour on a parallel computing architecture. As it requires very few starting experimental data, EpiDock can be used (i) to predict potential T-cell epitopes from viral genomes and (ii) to roughly predict still unknown peptide binding motifs for novel class I MHC alleles.
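
    The scanning step described above, enumerating every 8-, 9- and 10-mer of a protein sequence and keeping the best-scoring fraction, can be sketched as below. This is an illustrative sketch only: the homology modelling and the Fresno free energy function are not reproduced, so `score` is a caller-supplied placeholder (lower meaning stronger predicted binding), and the function names and the toy sequence are assumptions.

```python
# Hypothetical sketch: sliding-window peptide enumeration and top-fraction filtering.

def enumerate_peptides(sequence, lengths=(8, 9, 10)):
    """Return (1-based start position, peptide) for every window of each length."""
    peptides = []
    for k in lengths:
        for start in range(len(sequence) - k + 1):
            peptides.append((start + 1, sequence[start:start + k]))
    return peptides


def top_fraction(peptides, score, fraction=0.1):
    """Keep the best-scoring fraction of peptides (lowest score first).
    `score` stands in for a binding free energy estimate."""
    ranked = sorted(peptides, key=lambda item: score(item[1]))
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]


# Toy 12-residue sequence: 5 octamers + 4 nonamers + 3 decamers = 12 windows.
toy_sequence = "ACDEFGHIKLMN"
toy_peptides = enumerate_peptides(toy_sequence)
# Placeholder score (peptide length), used only to exercise the filter.
toy_top = top_fraction(toy_peptides, score=len, fraction=0.25)
```

    With a real scoring function in place of the placeholder, the start positions of the retained peptides would indicate where candidate epitopes lie on the protein sequence.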

  2. Accurate and exact CNV identification from targeted high-throughput sequence data.

    PubMed

    Nord, Alex S; Lee, Ming; King, Mary-Claire; Walsh, Tom

    2011-04-12

    Massively parallel sequencing of barcoded DNA samples significantly increases screening efficiency for clinically important genes. Short read aligners are well suited to single nucleotide and indel detection. However, methods for CNV detection from targeted enrichment are lacking. We present a method combining coverage with map information for the identification of deletions and duplications in targeted sequence data. Sequencing data is first scanned for gains and losses using a comparison of normalized coverage data between samples. CNV calls are confirmed by testing for a signature of sequences that span the CNV breakpoint. With our method, CNVs can be identified regardless of whether breakpoints are within regions targeted for sequencing. For CNVs where at least one breakpoint is within targeted sequence, exact CNV breakpoints can be identified. In a test data set of 96 subjects sequenced across ~1 Mb genomic sequence using multiplexing technology, our method detected mutations as small as 31 bp, predicted quantitative copy count, and had a low false-positive rate. Application of this method allows for identification of gains and losses in targeted sequence data, providing comprehensive mutation screening when combined with a short read aligner.
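
    The first stage described above, scanning for gains and losses by comparing normalized coverage between samples, can be sketched as follows. This is a simplified illustration under stated assumptions: coverage is given as one read count per targeted region, each sample is normalized by its own total, the reference is the median of the other samples, and the ratio thresholds of 1.4 and 0.6 are arbitrary choices for the example. The breakpoint-spanning read confirmation described in the abstract is not covered.

```python
# Hypothetical sketch: per-region gain/loss calls from normalized coverage ratios.
from statistics import median


def normalize(coverage):
    """Scale per-region read counts so they sum to 1 (assumes a nonzero total)."""
    total = sum(coverage)
    return [count / total for count in coverage]


def call_cnvs(samples, test_id, gain=1.4, loss=0.6):
    """Compare one sample's normalized coverage with the median of the others
    and return (region index, 'gain' or 'loss') for regions beyond the thresholds."""
    norm = {sample_id: normalize(cov) for sample_id, cov in samples.items()}
    calls = []
    for i, value in enumerate(norm[test_id]):
        reference = median(norm[sid][i] for sid in norm if sid != test_id)
        if reference == 0:
            continue  # region uncovered in the reference set; no ratio available
        ratio = value / reference
        if ratio >= gain:
            calls.append((i, "gain"))
        elif ratio <= loss:
            calls.append((i, "loss"))
    return calls


# Made-up read counts for four targeted regions in three samples.
toy_samples = {
    "s1": [100, 100, 100, 100],
    "s2": [200, 200, 200, 200],
    "s3": [100, 50, 100, 100],
}
toy_calls = call_cnvs(toy_samples, "s3")
```

    Normalizing by each sample's total makes s1 and s2 comparable despite their twofold difference in depth, so only the halved region in s3 is flagged.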

  3. Analytical Quality by Design Approach in RP-HPLC Method Development for the Assay of Etofenamate in Dosage Forms

    PubMed Central

    Peraman, R.; Bhadraya, K.; Reddy, Y. Padmanabha; Reddy, C. Surayaprakash; Lokesh, T.

    2015-01-01

    In view of current regulatory requirements for analytical method development, a reversed-phase high performance liquid chromatographic method for routine analysis of etofenamate in dosage form has been optimized using an analytical quality by design approach. Unlike the routine approach, the present study was initiated with an understanding of the quality target product profile, analytical target profile and risk assessment for method variables that affect the method response. A liquid chromatography system equipped with a C18 column (250×4.6 mm, 5 μ), a binary pump and a photodiode array detector was used in this work. The experiments were planned using a central composite design, which could save time, reagents and other resources. Sigma Tech software was used to plan and analyze the experimental observations and obtain a quadratic process model. The process model was used to predict retention time. The retention times predicted from the contour diagram were verified experimentally and agreed with the actual experimental data. The optimized method used a flow rate of 1.2 ml/min and a mobile phase of methanol and 0.2% triethylamine in water at 85:15 % v/v, with pH adjusted to 6.5. The method was validated and verified for targeted method performances, robustness and system suitability during method transfer. PMID:26997704

  4. A New Method for the Production of Tetranitroglycoluril From Imidazo-[4,5-d]-Imidazoles With the Loss of Dinitrogen Oxide

    DTIC Science & Technology

    2014-02-01

    reactions over time. ............................................8 List of Tables Table 1. Performance predictions from Cheetah 7.0...making it a highly desirable target (table 1). 3 Table 1. Performance predictions from Cheetah 7.0 (4). Substance ρa ∆Hf (kJ/mol) Pcjd (GPa) Dv e (km...HMXc 1.90 75.02 37.19 9.246 11.00 –21.61 aDensity. bPredicted using the methods of Rice (10–14). c∆Hf and density numbers obtained from Cheetah 7.0

  5. Time Critical Targeting: Predictive Vs Reactionary Methods An Analysis For The Future

    DTIC Science & Technology

    2002-06-01

    critical targets. To conduct the analysis, a four-step process is used. First, research is conducted to determine which future aircraft, spacecraft , and...the most promising aircraft, spacecraft , and weapons are determined , they are categorized for use in either the reactive or preemptive method. For...no significant delays, 292; Alan Vick et al., 17. 33 Ibid. 12 sensors are Electro-optical (EO) sensors, thermal imagers , and signal intelligence

  6. Systems and Methods for Automated Vessel Navigation Using Sea State Prediction

    NASA Technical Reports Server (NTRS)

    Huntsberger, Terrance L. (Inventor); Howard, Andrew B. (Inventor); Reinhart, Rene Felix (Inventor); Aghazarian, Hrand (Inventor); Rankin, Arturo (Inventor)

    2017-01-01

    Systems and methods for sea state prediction and autonomous navigation in accordance with embodiments of the invention are disclosed. One embodiment of the invention includes a method of predicting a future sea state including generating a sequence of at least two 3D images of a sea surface using at least two image sensors, detecting peaks and troughs in the 3D images using a processor, identifying at least one wavefront in each 3D image based upon the detected peaks and troughs using the processor, characterizing at least one propagating wave based upon the propagation of wavefronts detected in the sequence of 3D images using the processor, and predicting a future sea state using at least one propagating wave characterizing the propagation of wavefronts in the sequence of 3D images using the processor. Another embodiment includes a method of autonomous vessel navigation based upon a predicted sea state and target location.

  7. Systems and Methods for Automated Vessel Navigation Using Sea State Prediction

    NASA Technical Reports Server (NTRS)

    Aghazarian, Hrand (Inventor); Reinhart, Rene Felix (Inventor); Huntsberger, Terrance L. (Inventor); Rankin, Arturo (Inventor); Howard, Andrew B. (Inventor)

    2015-01-01

    Systems and methods for sea state prediction and autonomous navigation in accordance with embodiments of the invention are disclosed. One embodiment of the invention includes a method of predicting a future sea state including generating a sequence of at least two 3D images of a sea surface using at least two image sensors, detecting peaks and troughs in the 3D images using a processor, identifying at least one wavefront in each 3D image based upon the detected peaks and troughs using the processor, characterizing at least one propagating wave based upon the propagation of wavefronts detected in the sequence of 3D images using the processor, and predicting a future sea state using at least one propagating wave characterizing the propagation of wavefronts in the sequence of 3D images using the processor. Another embodiment includes a method of autonomous vessel navigation based upon a predicted sea state and target location.

  8. Application of Discrete Huygens Method for Diffraction of Transient Ultrasonic Field

    NASA Astrophysics Data System (ADS)

    Alia, A.

    2018-01-01

    Several time-domain methods have been widely used to predict impulse response in acoustics. Despite its great potential, the Discrete Huygens Method (DHM) has not been as widely used in the domain of ultrasonic diffraction as in other fields. In fact, little can be found in the literature about the application of the DHM to diffraction phenomena that can be described in terms of direct and edge waves, a concept suggested by Young in 1802. In this paper, a simple axisymmetric DHM model has been used to simulate the transient ultrasonic field radiation of a baffled transducer and its diffraction by a target located on axis. The results are validated by impulse response based calculations. They indicate the capability of DHM to simulate diffraction occurring at transducer and target edges and to predict the complicated transient field in pulse mode.

  9. Computer-based prediction of mitochondria-targeting peptides.

    PubMed

    Martelli, Pier Luigi; Savojardo, Castrense; Fariselli, Piero; Tasco, Gianluca; Casadio, Rita

    2015-01-01

    Computational methods are invaluable when protein sequences, directly derived from genomic data, need functional and structural annotation. Subcellular localization is a feature necessary for understanding the protein role and the compartment where the mature protein is active, and it is very difficult to characterize experimentally. Mitochondrial proteins encoded on the cytosolic ribosomes carry specific patterns in the precursor sequence from which it is possible to recognize a peptide targeting the protein to its final destination. Here we discuss to what extent it is feasible to develop computational methods for detecting mitochondrial targeting peptides in the precursor sequences and benchmark our and other methods on the human mitochondrial proteins endowed with experimentally characterized targeting peptides. Furthermore, we illustrate our newly implemented web server and its usage on the whole human proteome in order to infer mitochondrial targeting peptides, their cleavage sites, and whether or not the targeting peptide regions contain arginine-rich recurrent motifs. By this, we add some 2,800 further human proteins to the 124 already experimentally annotated with a mitochondrial targeting peptide.

  10. Prediction of microRNA target genes using an efficient genetic algorithm-based decision tree.

    PubMed

    Rabiee-Ghahfarrokhi, Behzad; Rafiei, Fariba; Niknafs, Ali Akbar; Zamani, Behzad

    2015-01-01

    MicroRNAs (miRNAs) are small, non-coding RNA molecules that regulate gene expression in almost all plants and animals. They play an important role in key processes, such as proliferation, apoptosis, and pathogen-host interactions. Nevertheless, the mechanisms by which miRNAs act are not fully understood. The first step toward unraveling the function of a particular miRNA is the identification of its direct targets. This step has proven to be quite challenging in animals, primarily because of incomplete complementarity between miRNAs and target mRNAs. In recent years, the use of machine-learning techniques has greatly improved the prediction of miRNA targets, avoiding the need for costly and time-consuming experiments to identify miRNA targets. Among the most important machine-learning algorithms are decision trees, which classify data based on extracted rules. In the present work, we used a genetic algorithm in combination with a C4.5 decision tree for prediction of miRNA targets. We applied our proposed method to a validated human dataset. We achieved nearly 93.9% classification accuracy, which could be related to the selection of the best rules.

  11. Prediction of microRNA target genes using an efficient genetic algorithm-based decision tree

    PubMed Central

    Rabiee-Ghahfarrokhi, Behzad; Rafiei, Fariba; Niknafs, Ali Akbar; Zamani, Behzad

    2015-01-01

    MicroRNAs (miRNAs) are small, non-coding RNA molecules that regulate gene expression in almost all plants and animals. They play an important role in key processes, such as proliferation, apoptosis, and pathogen–host interactions. Nevertheless, the mechanisms by which miRNAs act are not fully understood. The first step toward unraveling the function of a particular miRNA is the identification of its direct targets. This step has proven to be quite challenging in animals, primarily because of incomplete complementarity between miRNAs and target mRNAs. In recent years, the use of machine-learning techniques has greatly improved the prediction of miRNA targets, avoiding the need for costly and time-consuming experiments to identify miRNA targets. Among the most important machine-learning algorithms are decision trees, which classify data based on extracted rules. In the present work, we used a genetic algorithm in combination with a C4.5 decision tree for prediction of miRNA targets. We applied our proposed method to a validated human dataset. We achieved nearly 93.9% classification accuracy, which could be related to the selection of the best rules. PMID:26649272

  12. Use of a machine learning framework to predict substance use disorder treatment success

    PubMed Central

    Kelmansky, Diana; van der Laan, Mark; Sahker, Ethan; Jones, DeShauna; Arndt, Stephan

    2017-01-01

    There are several methods for building prediction models. The wealth of currently available modeling techniques usually forces the researcher to judge, a priori, what will likely be the best method. Super learning (SL) is a methodology that facilitates this decision by combining all identified prediction algorithms pertinent for a particular prediction problem. SL generates a final model that is at least as good as any of the other models considered for predicting the outcome. The overarching aim of this work is to introduce SL to analysts and practitioners. This work compares the performance of logistic regression, penalized regression, random forests, deep learning neural networks, and SL to predict successful substance use disorders (SUD) treatment. A nationwide database including 99,013 SUD treatment patients was used. All algorithms were evaluated using the area under the receiver operating characteristic curve (AUC) in a test sample that was not included in the training sample used to fit the prediction models. AUC for the models ranged between 0.793 and 0.820. SL was superior to all but one of the algorithms compared. An explanation of SL steps is provided. SL is the first step in targeted learning, an analytic framework that yields double robust effect estimation and inference with fewer assumptions than the usual parametric methods. Different aspects of SL depending on the context, its function within the targeted learning framework, and the benefits of this methodology in the addiction field are discussed. PMID:28394905
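
    The evaluation metric named in this abstract, the area under the ROC curve, and the idea of combining candidate predictors can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions: the AUC is computed as the fraction of correctly ordered positive/negative pairs, and the "ensemble" is only a grid search over a convex blend of two hypothetical prediction vectors. Actual super learning fits its combination on cross-validated predictions and evaluates on held-out data, which is not done here.

```python
def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute an AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def best_convex_blend(labels, preds_a, preds_b, steps=10):
    """Grid-search the weight w in [0, 1] for w * a + (1 - w) * b that maximizes AUC.

    Because w = 0 and w = 1 are on the grid, the blend scores at least as
    well as either input on the data used for the search.
    """
    best_w, best_auc = 0.0, -1.0
    for i in range(steps + 1):
        w = i / steps
        blended = [w * a + (1 - w) * b for a, b in zip(preds_a, preds_b)]
        score = auc(labels, blended)
        if score > best_auc:
            best_w, best_auc = w, score
    return best_w, best_auc


# Hypothetical outcomes and predicted probabilities from two candidate models.
labels = [1, 1, 0, 0]
model_a = [0.9, 0.4, 0.5, 0.1]
model_b = [0.3, 0.8, 0.2, 0.6]
weight, blended_auc = best_convex_blend(labels, model_a, model_b)
```

    In this toy case each model alone misorders one pair, while a blend separates the classes completely, which mirrors the claim that the combined model is at least as good as any candidate considered.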

  13. Use of a machine learning framework to predict substance use disorder treatment success.

    PubMed

    Acion, Laura; Kelmansky, Diana; van der Laan, Mark; Sahker, Ethan; Jones, DeShauna; Arndt, Stephan

    2017-01-01

    There are several methods for building prediction models. The wealth of currently available modeling techniques usually forces the researcher to judge, a priori, what will likely be the best method. Super learning (SL) is a methodology that facilitates this decision by combining all identified prediction algorithms pertinent for a particular prediction problem. SL generates a final model that is at least as good as any of the other models considered for predicting the outcome. The overarching aim of this work is to introduce SL to analysts and practitioners. This work compares the performance of logistic regression, penalized regression, random forests, deep learning neural networks, and SL to predict successful substance use disorders (SUD) treatment. A nationwide database including 99,013 SUD treatment patients was used. All algorithms were evaluated using the area under the receiver operating characteristic curve (AUC) in a test sample that was not included in the training sample used to fit the prediction models. AUC for the models ranged between 0.793 and 0.820. SL was superior to all but one of the algorithms compared. An explanation of SL steps is provided. SL is the first step in targeted learning, an analytic framework that yields double robust effect estimation and inference with fewer assumptions than the usual parametric methods. Different aspects of SL depending on the context, its function within the targeted learning framework, and the benefits of this methodology in the addiction field are discussed.

  14. Predictive regulatory models in Drosophila melanogaster by integrative inference of transcriptional networks

    PubMed Central

    Marbach, Daniel; Roy, Sushmita; Ay, Ferhat; Meyer, Patrick E.; Candeias, Rogerio; Kahveci, Tamer; Bristow, Christopher A.; Kellis, Manolis

    2012-01-01

    Gaining insights on gene regulation from large-scale functional data sets is a grand challenge in systems biology. In this article, we develop and apply methods for transcriptional regulatory network inference from diverse functional genomics data sets and demonstrate their value for gene function and gene expression prediction. We formulate the network inference problem in a machine-learning framework and use both supervised and unsupervised methods to predict regulatory edges by integrating transcription factor (TF) binding, evolutionarily conserved sequence motifs, gene expression, and chromatin modification data sets as input features. Applying these methods to Drosophila melanogaster, we predict ∼300,000 regulatory edges in a network of ∼600 TFs and 12,000 target genes. We validate our predictions using known regulatory interactions, gene functional annotations, tissue-specific expression, protein–protein interactions, and three-dimensional maps of chromosome conformation. We use the inferred network to identify putative functions for hundreds of previously uncharacterized genes, including many in nervous system development, which are independently confirmed based on their tissue-specific expression patterns. Last, we use the regulatory network to predict target gene expression levels as a function of TF expression, and find significantly higher predictive power for integrative networks than for motif or ChIP-based networks. Our work reveals the complementarity between physical evidence of regulatory interactions (TF binding, motif conservation) and functional evidence (coordinated expression or chromatin patterns) and demonstrates the power of data integration for network inference and studies of gene regulation at the systems level. PMID:22456606

  15. Developing Predictive Toxicity Signatures Using In Vitro Data from the EPA ToxCast Program

    EPA Science Inventory

    A major focus in toxicology research is the development of in vitro methods to predict in vivo chemical toxicity. Numerous studies have evaluated the use of targeted biochemical, cell-based and genomic assay approaches. Each of these techniques is potentially helpful, but provide...

  16. Analytic Guided-Search Model of Human Performance Accuracy in Target- Localization Search Tasks

    NASA Technical Reports Server (NTRS)

    Eckstein, Miguel P.; Beutter, Brent R.; Stone, Leland S.

    2000-01-01

    Current models of human visual search have extended the traditional serial/parallel search dichotomy. Two successful models for predicting human visual search are the Guided Search model and the Signal Detection Theory model. Although these models are inherently different, it has been difficult to compare them because the Guided Search model is designed to predict response time, while Signal Detection Theory models are designed to predict performance accuracy. Moreover, current implementations of the Guided Search model require the use of Monte-Carlo simulations, a method that makes fitting the model's performance quantitatively to human data more computationally time consuming. We have extended the Guided Search model to predict human accuracy in target-localization search tasks. We have also developed analytic expressions that simplify simulation of the model to the evaluation of a small set of equations using only three free parameters. This new implementation and extension of the Guided Search model will enable direct quantitative comparisons with human performance in target-localization search experiments and with the predictions of Signal Detection Theory and other search accuracy models.

  17. Small target detection using bilateral filter and temporal cross product in infrared images

    NASA Astrophysics Data System (ADS)

    Bae, Tae-Wuk

    2011-09-01

    We introduce a spatial and temporal target detection method using a spatial bilateral filter (BF) and the temporal cross product (TCP) of temporal pixels in infrared (IR) image sequences. First, the TCP is presented to extract the characteristics of temporal pixels by using the temporal profile at the respective spatial coordinates of pixels. The TCP represents the cross product values formed from the gray level distance vector of a current temporal pixel and the adjacent temporal pixel, and the horizontal distance vector of the current temporal pixel and a temporal pixel corresponding to the potential target center. The summation of TCP values of temporal pixels in spatial coordinates makes the temporal target image (TTI), which represents the temporal target information of temporal pixels in spatial coordinates. The proposed BF is then used to extract the spatial target information. In order to predict the background without targets, the proposed BF uses standard deviations obtained by an exponential mapping of the TCP value corresponding to the coordinate of the pixel being processed spatially. The spatial target image (STI) is made by subtracting the predicted image from the original image. The spatial and temporal target image (STTI) is then obtained by multiplying the STI and the TTI, and targets are finally detected in the STTI. In the experiments, receiver operating characteristic (ROC) curves were computed to compare objective performance. The results show that the proposed algorithm discriminates targets from clutter better, and has lower false alarm rates, than existing target detection methods.

  18. Prediction of target genes for miR-140-5p in pulmonary arterial hypertension using bioinformatics methods.

    PubMed

    Li, Fangwei; Shi, Wenhua; Wan, Yixin; Wang, Qingting; Feng, Wei; Yan, Xin; Wang, Jian; Chai, Limin; Zhang, Qianqian; Li, Manxiang

    2017-12-01

    The expression of microRNA (miR)-140-5p is known to be reduced in both pulmonary arterial hypertension (PAH) patients and monocrotaline-induced PAH models in rats. Identification of target genes for miR-140-5p with bioinformatics analysis may reveal new pathways and connections in PAH. This study aimed to explore downstream target genes and relevant signaling pathways regulated by miR-140-5p to provide a theoretical basis for further research on the role of miR-140-5p in PAH. Multiple downstream target genes and upstream transcription factors (TFs) of miR-140-5p were predicted in the analysis. Gene ontology (GO) enrichment analysis indicated that downstream target genes of miR-140-5p were enriched in many biological processes, such as biological regulation, signal transduction, response to chemical stimulus, stem cell proliferation, and cell surface receptor signaling pathways. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis found that downstream target genes were mainly located in the Notch, TGF-beta, PI3K/Akt, and Hippo signaling pathways. According to the TF-miRNA-mRNA network, the important downstream target genes of miR-140-5p were PPI, TGF-betaR1, smad4, JAG1, ADAM10, FGF9, PDGFRA, VEGFA, LAMC1, TLR4, and CREB. After thoroughly reviewing published literature, we found that 23 target genes and seven signaling pathways were truly inhibited by miR-140-5p in various tissues or cells; most of these verified targets were in accordance with our present prediction. Other predicted targets still need further verification in vivo and in vitro.

  19. Sensitivity, Specificity, PPV, and NPV for Predictive Biomarkers

    PubMed Central

    2015-01-01

    Molecularly targeted cancer drugs are often developed with companion diagnostics that attempt to identify which patients will have a better outcome on the new drug than on the control regimen. Such predictive biomarkers are playing an increasingly important role in precision oncology. For diagnostic tests, sensitivity, specificity, positive predictive value, and negative predictive value are usually used as performance measures. This paper discusses these indices for predictive biomarkers, provides methods for their calculation with survival or response endpoints, and describes assumptions involved in their use. PMID:26109105
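
    For orientation, the four indices named above have standard definitions for an ordinary diagnostic test with a binary outcome, which the following Python sketch computes from a 2x2 table. The counts are hypothetical, and the sketch does not reproduce the paper's extensions to predictive biomarkers with survival or response endpoints.

```python
def diagnostic_indices(tp, fp, fn, tn):
    """Standard 2x2 indices from true/false positive and negative counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases that test positive
        "specificity": tn / (tn + fp),  # share of non-cases that test negative
        "ppv": tp / (tp + fp),          # share of positive tests that are true cases
        "npv": tn / (tn + fn),          # share of negative tests that are non-cases
    }


# Hypothetical counts for illustration only.
indices = diagnostic_indices(tp=40, fp=10, fn=20, tn=30)
```

    Sensitivity and specificity are properties of the test itself, whereas PPV and NPV also depend on how common the condition is in the population tested.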

  20. BeReTa: a systematic method for identifying target transcriptional regulators to enhance microbial production of chemicals.

    PubMed

    Kim, Minsuk; Sun, Gwanggyu; Lee, Dong-Yup; Kim, Byung-Gee

    2017-01-01

    Modulation of regulatory circuits governing the metabolic processes is a crucial step for developing microbial cell factories. Despite the prevalence of in silico strain design algorithms, most of them are not capable of predicting required modifications in regulatory networks. Although a few algorithms may predict relevant targets for transcriptional regulator (TR) manipulations, they have limited reliability and applicability due to their high dependency on the availability of integrated metabolic/regulatory models. We present BeReTa (Beneficial Regulator Targeting), a new algorithm for prioritization of TR manipulation targets, which makes use of unintegrated network models. BeReTa identifies TR manipulation targets by evaluating regulatory strengths of interactions and beneficial effects of reactions, and subsequently assigning beneficial scores for the TRs. We demonstrate that BeReTa can predict both known and novel TR manipulation targets for enhanced production of various chemicals in Escherichia coli. Furthermore, through a case study of antibiotics production in Streptomyces coelicolor, we successfully demonstrate its wide applicability to even less-studied organisms. To the best of our knowledge, BeReTa is the first strain design algorithm exclusively designed for predicting TR manipulation targets. MATLAB code is available at https://github.com/kms1041/BeReTa. Contact: byungkim@snu.ac.kr. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  1. Improvement of experimental testing and network training conditions with genome-wide microarrays for more accurate predictions of drug gene targets

    PubMed Central

    2014-01-01

    Background: Genome-wide microarrays have been useful for predicting chemical-genetic interactions at the gene level. However, interpreting genome-wide microarray results can be overwhelming due to the vast output of gene expression data combined with off-target transcriptional responses many times induced by a drug treatment. This study demonstrates how experimental and computational methods can interact with each other, to arrive at more accurate predictions of drug-induced perturbations. We present a two-stage strategy that links microarray experimental testing and network training conditions to predict gene perturbations for a drug with a known mechanism of action in a well-studied organism. Results: S. cerevisiae cells were treated with the antifungal, fluconazole, and expression profiling was conducted under different biological conditions using Affymetrix genome-wide microarrays. Transcripts were filtered with a formal network-based method, sparse simultaneous equation models and Lasso regression (SSEM-Lasso), under different network training conditions. Gene expression results were evaluated using both gene set and single gene target analyses, and the drug’s transcriptional effects were narrowed first by pathway and then by individual genes. Variables included: (i) Testing conditions – exposure time and concentration and (ii) Network training conditions – training compendium modifications. Two analyses of SSEM-Lasso output – gene set and single gene – were conducted to gain a better understanding of how SSEM-Lasso predicts perturbation targets. Conclusions: This study demonstrates that genome-wide microarrays can be optimized using a two-stage strategy for a more in-depth understanding of how a cell manifests biological reactions to a drug treatment at the transcription level. Additionally, a more detailed understanding of how the statistical model, SSEM-Lasso, propagates perturbations through a network of gene regulatory interactions is achieved. PMID:24444313

  2. Discriminative motif discovery via simulated evolution and random under-sampling.

    PubMed

    Song, Tao; Gu, Hong

    2014-01-01

    Conserved motifs in biological sequences are closely related to their structure and functions. Recently, discriminative motif discovery methods have attracted more and more attention. However, little attention has been devoted to the data imbalance problem, which is one of the main factors affecting the performance of discriminative models. In this article, a simulated evolution method is applied to solve the multi-class imbalance problem at the stage of data preprocessing, and at the stage of Hidden Markov Model (HMM) training, a random under-sampling method is introduced for the imbalance between the positive and negative datasets. It is shown that, in the task of discovering targeting motifs of nine subcellular compartments, the motifs found by our method are more conserved than those found by methods that do not consider the data imbalance problem, and recover the most known targeting motifs from Minimotif Miner and InterPro. Meanwhile, we use the found motifs to predict protein subcellular localization and achieve higher prediction precision and recall for the minority classes.
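
    Random under-sampling, mentioned above as the remedy for the imbalance between positive and negative datasets, can be sketched in a few lines of Python. This is a generic illustration rather than the authors' procedure: the larger class is sampled without replacement down to the size of the smaller one, and the names, the fixed seed and the toy sequences are assumptions.

```python
import random


def random_undersample(positives, negatives, seed=0):
    """Shrink the larger class to the size of the smaller one by random sampling."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = min(len(positives), len(negatives))
    pos = list(positives) if len(positives) == n else rng.sample(list(positives), n)
    neg = list(negatives) if len(negatives) == n else rng.sample(list(negatives), n)
    return pos, neg


# Hypothetical toy data: 3 positive sequences against 10 negative ones.
positive_set = ["MLSLRQ", "MLRTSS", "MALLRG"]
negative_set = ["SEQ%02d" % i for i in range(10)]
balanced_pos, balanced_neg = random_undersample(positive_set, negative_set)
```

    Under-sampling discards majority-class examples, so in practice it is often repeated with different seeds to reduce the information lost in any single draw.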

  3. Acoustics Discipline Overview

    NASA Technical Reports Server (NTRS)

    Envia, Edmane; Thomas, Russell

    2007-01-01

    As part of the Fundamental Aeronautics Program Annual Review, a summary of the progress made in 2007 in acoustics research under the Subsonic Fixed Wing project is given. The presentation describes highlights from in-house and external activities including partnerships and NRA-funded research with industry and academia. Brief progress reports from all acoustics Phase 1 NRAs are also included as are outlines of the planned activities for 2008 and all Phase 2 NRAs. N+1 and N+2 technology paths outlined for Subsonic Fixed Wing noise targets. NRA Round 1 progressing with focus on prediction method advancement. NRA Round 2 initiating work focused on N+2 technology, prediction methods, and validation. Excellent partnerships in progress supporting N+1 technology targets and providing key data sets.

  4. Drug Repositioning by Kernel-Based Integration of Molecular Structure, Molecular Activity, and Phenotype Data

    PubMed Central

    Wang, Yongcui; Chen, Shilong; Deng, Naiyang; Wang, Yong

    2013-01-01

    Computational inference of novel therapeutic values for existing drugs, i.e., drug repositioning, offers great prospects for faster and lower-risk drug development. Previous research has indicated that chemical structures, target proteins, and side-effects could provide rich information in drug similarity assessment and, further, disease similarity. However, each single data source is important in its own way, and data integration holds great promise for repositioning drugs more accurately. Here, we propose a new method for drug repositioning, PreDR (Predict Drug Repositioning), to integrate molecular structure, molecular activity, and phenotype data. Specifically, we characterize drugs by profiling in chemical structure, target protein, and side-effects space, and define a kernel function to correlate drugs with diseases. Then we train a support vector machine (SVM) to computationally predict novel drug-disease interactions. PreDR is validated on a well-established drug-disease network with 1,933 interactions among 593 drugs and 313 diseases. By cross-validation, we find that chemical structure, drug target, and side-effects information are all predictive for drug-disease relationships. More experimentally observed drug-disease interactions can be revealed by integrating these three data sources. Comparison with existing methods demonstrates that PreDR is competitive both in accuracy and coverage. Follow-up database search and pathway analysis indicate that our new predictions are worthy of further experimental validation. In particular, several novel predictions are supported by clinical trials databases, which shows the significant prospects of PreDR in future drug treatment. In conclusion, our new method, PreDR, can serve as a useful tool in drug discovery to efficiently identify novel drug-disease interactions. In addition, our heterogeneous data integration framework can be applied to other problems. PMID:24244318

  5. Effect of Normal Lung Definition on Lung Dosimetry and Lung Toxicity Prediction in Radiation Therapy Treatment Planning

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, Weili; Department of Radiation Oncology, the Fourth Affiliated Hospital, China Medical University, Shenyang; Xu, Yaping

    2013-08-01

    Purpose: This study aimed to compare lung dose–volume histogram (DVH) parameters such as mean lung dose (MLD) and the lung volume receiving ≥20 Gy (V20) of commonly used definitions of normal lung in terms of tumor/target subtraction and to determine to what extent they differ in predicting radiation pneumonitis (RP). Methods and Materials: One hundred lung cancer patients treated with definitive radiation therapy were assessed. The gross tumor volume (GTV) and clinical planning target volume (PTV_c) were defined by the treating physician and dosimetrist. For this study, the clinical target volume (CTV) was defined as GTV with an 8-mm uniform expansion, and the PTV was defined as CTV with an 8-mm uniform expansion. Lung DVHs were generated with exclusion of targets: (1) GTV (DVH_G); (2) CTV (DVH_C); (3) PTV (DVH_P); and (4) PTV_c (DVH_Pc). The lung DVHs, V20s, and MLDs from each of the 4 methods were compared, as was their significance in predicting radiation pneumonitis of grade 2 or greater (RP2). Results: There are significant differences in dosimetric parameters among the various definition methods (all Ps<.05). The mean and maximum differences in V20 are 4.4% and 12.6% (95% confidence interval 3.6%-5.1%), respectively. The mean and maximum differences in MLD are 3.3 Gy and 7.5 Gy (95% confidence interval, 1.7-4.8 Gy), respectively. MLDs of all methods are highly correlated with each other and significantly correlated with clinical RP2, although V20s are not. For RP2 prediction, on the receiver operating characteristic curve, MLD from DVH_G (MLD_G) has a greater area under the curve than MLD from DVH_C (MLD_C) or DVH_P (MLD_P). Limiting RP2 to 30%, the threshold is 22.4, 20.6, and 18.8 Gy, for MLD_G, MLD_C, and MLD_P, respectively. Conclusions: The differences in MLD and V20 from various lung definitions are significant. MLD from the GTV exclusion method may be more accurate in predicting clinically significant radiation pneumonitis.
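
    The two dosimetric parameters compared in this record can be illustrated with a minimal sketch. It assumes equal-volume voxels, a per-voxel dose list, and a boolean mask marking the excluded target; the function names and the toy data are hypothetical and are not taken from the study.

```python
def exclude_target(doses_gy, in_target):
    """Drop voxels flagged as inside the excluded target (e.g. GTV, CTV or PTV)."""
    return [d for d, t in zip(doses_gy, in_target) if not t]


def mean_lung_dose(doses_gy):
    """Mean dose over the remaining lung voxels (equal voxel volume assumed)."""
    return sum(doses_gy) / len(doses_gy)


def v_dose(doses_gy, threshold_gy=20.0):
    """Percentage of lung voxels receiving at least threshold_gy (V20 by default)."""
    hit = sum(1 for d in doses_gy if d >= threshold_gy)
    return 100.0 * hit / len(doses_gy)


# Hypothetical toy data: dose per voxel in Gy and a mask of GTV voxels.
doses = [5.0, 10.0, 25.0, 60.0, 62.0, 18.0]
gtv_mask = [False, False, False, True, True, False]

lung_minus_gtv = exclude_target(doses, gtv_mask)
mld_g = mean_lung_dose(lung_minus_gtv)  # MLD with the GTV excluded
v20_g = v_dose(lung_minus_gtv)          # V20 with the GTV excluded
```

    Swapping the mask for a larger target (CTV or PTV) removes more high-dose voxels, which is one way to see why the resulting MLD and V20 values differ between lung definitions.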

  6. Real-time prediction of respiratory motion based on a local dynamic model in an augmented space

    NASA Astrophysics Data System (ADS)

    Hong, S.-M.; Jung, B.-H.; Ruan, D.

    2011-03-01

    Motion-adaptive radiotherapy aims to deliver ablative radiation dose to the tumor target with minimal normal tissue exposure, by accounting for real-time target movement. In practice, prediction is usually necessary to compensate for system latency induced by measurement, communication and control. This work focuses on predicting respiratory motion, which is most dominant for thoracic and abdominal tumors. We develop and investigate the use of a local dynamic model in an augmented space, motivated by the observation that respiratory movement exhibits a locally circular pattern in a plane augmented with a delayed axis. By including the angular velocity as part of the system state, the proposed dynamic model effectively captures the natural evolution of respiratory motion. The first-order extended Kalman filter is used to propagate and update the state estimate. The target location is predicted by evaluating the local dynamic model equations at the required prediction length. This method is complementary to existing work in that (1) the local circular motion model characterizes 'turning', overcoming the limitation of linear motion models; (2) it uses a natural state representation including the local angular velocity and updates the state estimate systematically, offering explicit physical interpretations; (3) it relies on a parametric model and is much less data-satiate than the typical adaptive semiparametric or nonparametric method. We tested the performance of the proposed method with ten RPM traces, using the normalized root mean squared difference between the predicted value and the retrospective observation as the error metric. Its performance was compared with predictors based on the linear model, the interacting multiple linear models and the kernel density estimator for various combinations of prediction lengths and observation rates. 
The local dynamic model based approach provides the best performance for short to medium prediction lengths under relatively low observation rate. Sensitivity analysis indicates its robustness toward the choice of parameters. Its simplicity, robustness and low computation cost make the proposed local dynamic model an attractive tool for real-time prediction with system latencies below 0.4 s.
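
    The prediction step of a locally circular motion model can be sketched as a rotation in the augmented plane (current sample, delayed sample). This is only an illustration under stated assumptions: the rotation centre and angular velocity are taken as already known, the extended Kalman filter that estimates them is omitted, and the function name is hypothetical.

```python
import math


def predict_circular(point, center, omega, horizon):
    """Rotate `point` about `center` by omega * horizon radians.

    point   -- (x(t), x(t - delay)), the state in the augmented plane
    center  -- assumed centre of the local circular pattern
    omega   -- assumed local angular velocity in rad/s
    horizon -- prediction length in seconds
    """
    dx, dy = point[0] - center[0], point[1] - center[1]
    angle = omega * horizon
    c, s = math.cos(angle), math.sin(angle)
    return (center[0] + c * dx - s * dy, center[1] + s * dx + c * dy)


# Hypothetical example: a quarter turn per second, predicted one second ahead.
now = (1.0, 0.0)
pred = predict_circular(now, center=(0.0, 0.0), omega=math.pi / 2, horizon=1.0)
predicted_position = pred[0]  # first component is the predicted position
```

    Unlike a linear extrapolation, the rotated state can reverse direction, which corresponds to the 'turning' behaviour the abstract attributes to the circular model.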

  7. Real-time prediction of respiratory motion based on a local dynamic model in an augmented space.

    PubMed

    Hong, S-M; Jung, B-H; Ruan, D

    2011-03-21

    Motion-adaptive radiotherapy aims to deliver ablative radiation dose to the tumor target with minimal normal tissue exposure, by accounting for real-time target movement. In practice, prediction is usually necessary to compensate for system latency induced by measurement, communication and control. This work focuses on predicting respiratory motion, which is most dominant for thoracic and abdominal tumors. We develop and investigate the use of a local dynamic model in an augmented space, motivated by the observation that respiratory movement exhibits a locally circular pattern in a plane augmented with a delayed axis. By including the angular velocity as part of the system state, the proposed dynamic model effectively captures the natural evolution of respiratory motion. The first-order extended Kalman filter is used to propagate and update the state estimate. The target location is predicted by evaluating the local dynamic model equations at the required prediction length. This method is complementary to existing work in that (1) the local circular motion model characterizes 'turning', overcoming the limitation of linear motion models; (2) it uses a natural state representation including the local angular velocity and updates the state estimate systematically, offering explicit physical interpretations; (3) it relies on a parametric model and is much less data-satiate than the typical adaptive semiparametric or nonparametric method. We tested the performance of the proposed method with ten RPM traces, using the normalized root mean squared difference between the predicted value and the retrospective observation as the error metric. Its performance was compared with predictors based on the linear model, the interacting multiple linear models and the kernel density estimator for various combinations of prediction lengths and observation rates. 
The local dynamic model based approach provides the best performance for short to medium prediction lengths under relatively low observation rate. Sensitivity analysis indicates its robustness toward the choice of parameters. Its simplicity, robustness and low computation cost make the proposed local dynamic model an attractive tool for real-time prediction with system latencies below 0.4 s.

  8. Analytical research on impacting load of aircraft crashing upon moveable concrete target

    NASA Astrophysics Data System (ADS)

    Zhu, Tong; Ou, Zhuocheng; Duan, Zhuoping; Huang, Fenglei

    2018-03-01

    The impact load of an aircraft crashing upon a moveable concrete target was analyzed in this paper by both theoretical and numerical methods. The aircraft was simplified as a one-dimensional pole, and stress-wave theory was used to derive a new formula. Furthermore, for comparison with previous experimental data, a numerical calculation based on the new formula was carried out, which showed good agreement with the experimental data. The approach, a new formula with a particular numerical method, can predict not only the impact load but also the deviation between moveable and static concrete targets.

  9. Improving threading algorithms for remote homology modeling by combining fragment and template comparisons

    PubMed Central

    Zhou, Hongyi; Skolnick, Jeffrey

    2010-01-01

    In this work, we develop a method called FTCOM for assessing the global quality of protein structural models for targets of medium and hard difficulty (remote homology) produced by structure prediction approaches such as threading or ab initio structure prediction. FTCOM requires the Cα coordinates of full length models and assesses model quality based on fragment comparison and a score derived from comparison of the model to top threading templates. On a set of 361 medium/hard targets, FTCOM was assessed for its ability to improve upon the results from the SP3, SPARKS, PROSPECTOR_3, and PRO-SP3-TASSER threading algorithms. The average TM-score improves by 5%–10% for the first selected model by the new method over models obtained by the original selection procedure in the respective threading methods. Moreover, the number of foldable targets (TM-score ≥0.4) increases by 7.6% (the least, for SP3) to 54% (for SPARKS). Thus, FTCOM is a promising approach to template selection. PMID:20455261

  10. Expanding the Targeting Process into the Space Domain

    DTIC Science & Technology

    2008-06-01

    planning and operations. The process is a continuous method by which information is converted into intelligence and made available to users...Targeting personnel and organizations consume intelligence produced by various agencies and organizations. Actionable and predictive intelligence applies to... intelligence and operations communities (Figure 1). 1 United States Department of Defense Joint

  11. Prediction of hot spots in protein interfaces using a random forest model with hybrid features.

    PubMed

    Wang, Lin; Liu, Zhi-Ping; Zhang, Xiang-Sun; Chen, Luonan

    2012-03-01

    Prediction of hot spots in protein interfaces provides crucial information for the research on protein-protein interaction and drug design. Existing machine learning methods generally judge whether a given residue is likely to be a hot spot by extracting features only from the target residue. However, hot spots usually form a small cluster of residues which are tightly packed together at the center of protein interface. With this in mind, we present a novel method to extract hybrid features which incorporate a wide range of information of the target residue and its spatially neighboring residues, i.e. the nearest contact residue in the other face (mirror-contact residue) and the nearest contact residue in the same face (intra-contact residue). We provide a novel random forest (RF) model to effectively integrate these hybrid features for predicting hot spots in protein interfaces. Our method can achieve accuracy (ACC) of 82.4% and Matthews correlation coefficient (MCC) of 0.482 in Alanine Scanning Energetics Database, and ACC of 77.6% and MCC of 0.429 in Binding Interface Database. In a comparison study, performance of our RF model exceeds other existing methods, such as Robetta, FOLDEF, KFC, KFC2, MINERVA and HotPoint. Of our hybrid features, three physicochemical features of target residues (mass, polarizability and isoelectric point), the relative side-chain accessible surface area and the average depth index of mirror-contact residues are found to be the main discriminative features in hot spots prediction. We also confirm that hot spots tend to form large contact surface areas between two interacting proteins. Source data and code are available at: http://www.aporc.org/doc/wiki/HotSpot.
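
    The two evaluation metrics quoted in this record, ACC and MCC, follow directly from a binary confusion matrix. The sketch below uses the standard definitions; the counts are hypothetical and are not results from the paper.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical counts: 6 true hot spots found, 2 missed, 1 false alarm.
acc_example = accuracy(tp=6, tn=3, fp=1, fn=2)
mcc_example = mcc(tp=6, tn=3, fp=1, fn=2)
```

    MCC is often reported next to ACC for hot-spot data because hot spots are typically the minority class, and accuracy alone can look good for a predictor that favours the majority class.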

  12. In Silico target fishing: addressing a "Big Data" problem by ligand-based similarity rankings with data fusion.

    PubMed

    Liu, Xian; Xu, Yuan; Li, Shanshan; Wang, Yulan; Peng, Jianlong; Luo, Cheng; Luo, Xiaomin; Zheng, Mingyue; Chen, Kaixian; Jiang, Hualiang

    2014-01-01

    Ligand-based in silico target fishing can be used to identify the potential interacting target of bioactive ligands, which is useful for understanding the polypharmacology and safety profile of existing drugs. The underlying principle of the approach is that known bioactive ligands can be used as reference to predict the targets for a new compound. We tested a pipeline enabling large-scale target fishing and drug repositioning, based on simple fingerprint similarity rankings with data fusion. A large library containing 533 drug-relevant targets with 179,807 active ligands was compiled, where each target was defined by its ligand set. For a given query molecule, its target profile is generated by similarity searching against the ligand sets assigned to each target, for which individual searches utilizing multiple reference structures are then fused into a single ranking list representing the potential target interaction profile of the query compound. The proposed approach was validated by 10-fold cross validation and two external tests using data from DrugBank and Therapeutic Target Database (TTD). The use of the approach was further demonstrated with some examples concerning the drug repositioning and drug side-effects prediction. The promising results suggest that the proposed method is useful for not only finding promiscuous drugs for their new usages, but also predicting some important toxic liabilities. With the rapidly increasing volume and diversity of data concerning drug related targets and their ligands, the simple ligand-based target fishing approach would play an important role in assisting future drug design and discovery.
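
    The ranking idea described above can be sketched as follows. The abstract does not name the similarity measure or the fusion rule, so the Tanimoto coefficient and MAX fusion used here are assumptions chosen for illustration, as are the function names and the toy fingerprints (sets of on-bit indices).

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def rank_targets(query, ligand_sets):
    """Score each target by its best-matching reference ligand (MAX fusion)
    and return (target, score) pairs sorted from most to least similar."""
    scores = {
        target: max(tanimoto(query, fp) for fp in fps)
        for target, fps in ligand_sets.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical library: each target is defined by its set of ligand fingerprints.
library = {
    "T1": [{1, 2, 3, 5}, {7, 8}],
    "T2": [{1, 9}],
}
ranking = rank_targets({1, 2, 3, 4}, library)
```

    Other fusion rules, such as averaging the top-k similarities per target, would slot into `rank_targets` in place of `max`.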

  13. Spotting and designing promiscuous ligands for drug discovery.

    PubMed

    Schneider, P; Röthlisberger, M; Reker, D; Schneider, G

    2016-01-21

    The promiscuous binding behavior of bioactive compounds forms a mechanistic basis for understanding polypharmacological drug action. We present the development and prospective application of a computational tool for identifying potential promiscuous drug-like ligands. In combination with computational target prediction methods, the approach provides a working concept for rationally designing such molecular structures. We could confirm the multi-target binding of a de novo generated compound in a proof-of-concept study relying on the new method.

  14. Accurate and Reliable Prediction of the Binding Affinities of Macrocycles to Their Protein Targets.

    PubMed

    Yu, Haoyu S; Deng, Yuqing; Wu, Yujie; Sindhikara, Dan; Rask, Amy R; Kimura, Takayuki; Abel, Robert; Wang, Lingle

    2017-12-12

    Macrocycles have been emerging as a very important drug class in the past few decades largely due to their expanded chemical diversity benefiting from advances in synthetic methods. Macrocyclization has been recognized as an effective way to restrict the conformational space of acyclic small molecule inhibitors with the hope of improving potency, selectivity, and metabolic stability. Because of their relatively larger size as compared to typical small molecule drugs and the complexity of the structures, efficient sampling of the accessible macrocycle conformational space and accurate prediction of their binding affinities to their target protein receptors poses a great challenge of central importance in computational macrocycle drug design. In this article, we present a novel method for relative binding free energy calculations between macrocycles with different ring sizes and between the macrocycles and their corresponding acyclic counterparts. We have applied the method to seven pharmaceutically interesting data sets taken from recent drug discovery projects including 33 macrocyclic ligands covering a diverse chemical space. The predicted binding free energies are in good agreement with experimental data with an overall root-mean-square error (RMSE) of 0.94 kcal/mol. This is to our knowledge the first time where the free energy of the macrocyclization of linear molecules has been directly calculated with rigorous physics-based free energy calculation methods, and we anticipate the outstanding accuracy demonstrated here across a broad range of target classes may have significant implications for macrocycle drug discovery.

  15. Machine learning properties of binary wurtzite superlattices

    DOE PAGES

    Pilania, G.; Liu, X. -Y.

    2018-01-12

    The burgeoning paradigm of high-throughput computations and materials informatics brings new opportunities in terms of targeted materials design and discovery. The discovery process can be significantly accelerated and streamlined if one can learn effectively from available knowledge and past data to predict materials properties efficiently. Indeed, a very active area in materials science research is to develop machine learning based methods that can deliver automated and cross-validated predictive models using either already available materials data or new data generated in a targeted manner. In the present paper, we show that fast and accurate predictions of a wide range of properties of binary wurtzite superlattices, formed by a diverse set of chemistries, can be made by employing state-of-the-art statistical learning methods trained on quantum mechanical computations in combination with a judiciously chosen numerical representation to encode materials’ similarity. These surrogate learning models then allow for efficient screening of vast chemical spaces by providing instant predictions of the targeted properties. Moreover, the models can be systematically improved in an adaptive manner, incorporate properties computed at different levels of fidelities and are naturally amenable to inverse materials design strategies. Finally, while the learning approach to make predictions for a wide range of properties (including structural, elastic and electronic properties) is demonstrated here for a specific example set containing more than 1200 binary wurtzite superlattices, the adopted framework is equally applicable to other classes of materials as well.

  16. Machine learning properties of binary wurtzite superlattices

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pilania, G.; Liu, X. -Y.

    The burgeoning paradigm of high-throughput computations and materials informatics brings new opportunities in terms of targeted materials design and discovery. The discovery process can be significantly accelerated and streamlined if one can learn effectively from available knowledge and past data to predict materials properties efficiently. Indeed, a very active area in materials science research is to develop machine learning based methods that can deliver automated and cross-validated predictive models using either already available materials data or new data generated in a targeted manner. In the present paper, we show that fast and accurate predictions of a wide range of properties of binary wurtzite superlattices, formed by a diverse set of chemistries, can be made by employing state-of-the-art statistical learning methods trained on quantum mechanical computations in combination with a judiciously chosen numerical representation to encode materials’ similarity. These surrogate learning models then allow for efficient screening of vast chemical spaces by providing instant predictions of the targeted properties. Moreover, the models can be systematically improved in an adaptive manner, incorporate properties computed at different levels of fidelities and are naturally amenable to inverse materials design strategies. Finally, while the learning approach to make predictions for a wide range of properties (including structural, elastic and electronic properties) is demonstrated here for a specific example set containing more than 1200 binary wurtzite superlattices, the adopted framework is equally applicable to other classes of materials as well.

  17. Advanced Computational Methods for High-accuracy Refinement of Protein Low-quality Models

    NASA Astrophysics Data System (ADS)

    Zang, Tianwu

    Predicting the 3-dimensional structure of proteins has been a major interest in modern computational biology. While many successful methods can generate models with 3–5 Å root-mean-square deviation (RMSD) from the solution, progress in refining these models is quite slow. It is therefore urgently needed to develop effective methods to bring low-quality models to higher-accuracy ranges (e.g., less than 2 Å RMSD). In this thesis, I present several novel computational methods to address the high-accuracy refinement problem. First, an enhanced sampling method, named parallel continuous simulated tempering (PCST), is developed to accelerate the molecular dynamics (MD) simulation. Second, two energy biasing methods, Structure-Based Model (SBM) and Ensemble-Based Model (EBM), are introduced to perform targeted sampling around important conformations. Third, a three-step method is developed to blindly select high-quality models along the MD simulation. These methods work together to make significant refinement of low-quality models without any knowledge of the solution. The effectiveness of these methods is examined in different applications. Using the PCST-SBM method, models with higher global distance test scores (GDT_TS) are generated and selected in the MD simulation of 18 targets from the refinement category of the 10th Critical Assessment of Structure Prediction (CASP10). In addition, in the refinement test of two CASP10 targets using the PCST-EBM method, it is indicated that EBM may bring the initial model to even higher-quality levels. Furthermore, a multi-round refinement protocol of PCST-SBM improves the model quality of a protein to a level that is sufficiently high for molecular replacement in X-ray crystallography. Our results justify the crucial position of enhanced sampling in protein structure prediction and demonstrate that a considerable improvement of low-accuracy structures is still achievable with current force fields.

  18. Recommendations for evaluation of computational methods

    NASA Astrophysics Data System (ADS)

    Jain, Ajay N.; Nicholls, Anthony

    2008-03-01

    The field of computational chemistry, particularly as applied to drug design, has become increasingly important in terms of the practical application of predictive modeling to pharmaceutical research and development. Tools for exploiting protein structures or sets of ligands known to bind particular targets can be used for binding-mode prediction, virtual screening, and prediction of activity. A serious weakness within the field is a lack of standards with respect to quantitative evaluation of methods, data set preparation, and data set sharing. Our goal should be to report new methods or comparative evaluations of methods in a manner that supports decision making for practical applications. Here we propose a modest beginning, with recommendations for requirements on statistical reporting, requirements for data sharing, and best practices for benchmark preparation and usage.

  19. Method of identifying hairpin DNA probes by partial fold analysis

    DOEpatents

    Miller, Benjamin L [Penfield, NY]; Strohsahl, Christopher M [Saugerties, NY]

    2009-10-06

    Method of identifying molecular beacons in which a secondary structure prediction algorithm is employed to identify oligonucleotide sequences within a target gene having the requisite hairpin structure. Isolated oligonucleotides, molecular beacons prepared from those oligonucleotides, and their use are also disclosed.

  20. Method of identifying hairpin DNA probes by partial fold analysis

    DOEpatents

    Miller, Benjamin L.; Strohsahl, Christopher M.

    2008-10-28

    Methods of identifying molecular beacons in which a secondary structure prediction algorithm is employed to identify oligonucleotide sequences within a target gene having the requisite hairpin structure. Isolated oligonucleotides, molecular beacons prepared from those oligonucleotides, and their use are also disclosed.

  1. Reverse-engineering the genetic circuitry of a cancer cell with predicted intervention in chronic lymphocytic leukemia.

    PubMed

    Vallat, Laurent; Kemper, Corey A; Jung, Nicolas; Maumy-Bertrand, Myriam; Bertrand, Frédéric; Meyer, Nicolas; Pocheville, Arnaud; Fisher, John W; Gribben, John G; Bahram, Seiamak

    2013-01-08

    Cellular behavior is sustained by genetic programs that are progressively disrupted in pathological conditions--notably, cancer. High-throughput gene expression profiling has been used to infer statistical models describing these cellular programs, and development is now needed to guide orientated modulation of these systems. Here we develop a regression-based model to reverse-engineer a temporal genetic program, based on relevant patterns of gene expression after cell stimulation. This method integrates the temporal dimension of biological rewiring of genetic programs and enables the prediction of the effect of targeted gene disruption at the system level. We tested the performance accuracy of this model on synthetic data before reverse-engineering the response of primary cancer cells to a proliferative (protumorigenic) stimulation in a multistate leukemia biological model (i.e., chronic lymphocytic leukemia). To validate the ability of our method to predict the effects of gene modulation on the global program, we performed an intervention experiment on a targeted gene. Comparison of the predicted and observed gene expression changes demonstrates the possibility of predicting the effects of a perturbation in a gene regulatory network, a first step toward an orientated intervention in a cancer cell genetic program.

  2. Learning Instance-Specific Predictive Models

    PubMed Central

    Visweswaran, Shyam; Cooper, Gregory F.

    2013-01-01

    This paper introduces a Bayesian algorithm for constructing predictive models from data that are optimized to predict a target variable well for a particular instance. This algorithm learns Markov blanket models, carries out Bayesian model averaging over a set of models to predict a target variable of the instance at hand, and employs an instance-specific heuristic to locate a set of suitable models to average over. We call this method the instance-specific Markov blanket (ISMB) algorithm. The ISMB algorithm was evaluated on 21 UCI data sets using five different performance measures and its performance was compared to that of several commonly used predictive algorithms, including naïve Bayes, C4.5 decision tree, logistic regression, neural networks, k-Nearest Neighbor, Lazy Bayesian Rules, and AdaBoost. Over all the data sets, the ISMB algorithm performed better on average on all performance measures against all the comparison algorithms. PMID:25045325
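
    Bayesian model averaging, the step this record builds on, weights each model's prediction by that model's posterior probability. The sketch below shows only that generic averaging step; it is not the ISMB algorithm, and the function name, per-model predictions and weights are hypothetical.

```python
def bma_predict(model_probs, model_weights):
    """Average per-model predicted probabilities P(y | x, m), weighted by
    (possibly unnormalised) posterior model weights P(m | D)."""
    total = sum(model_weights)
    return sum(p * w for p, w in zip(model_probs, model_weights)) / total


# Hypothetical example: three models predict P(target = 1) for one instance.
probs = [0.9, 0.5, 0.1]
weights = [0.5, 0.3, 0.2]
p_target = bma_predict(probs, weights)
```

    An instance-specific method would choose which models enter `probs` and `weights` separately for each instance, rather than fixing one model set for all instances.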

  3. Chemical and protein structural basis for biological crosstalk between PPAR α and COX enzymes

    NASA Astrophysics Data System (ADS)

    Cleves, Ann E.; Jain, Ajay N.

    2015-02-01

    We have previously validated a probabilistic framework that combined computational approaches for predicting the biological activities of small molecule drugs. Molecule comparison methods included molecular structural similarity metrics and similarity computed from lexical analysis of text in drug package inserts. Here we present an analysis of novel drug/target predictions, focusing on those that were not obvious based on known pharmacological crosstalk. Considering those cases where the predicted target was an enzyme with known 3D structure allowed incorporation of information from molecular docking and protein binding pocket similarity in addition to ligand-based comparisons. Taken together, the combination of orthogonal information sources led to investigation of a surprising predicted relationship between a transcription factor and an enzyme, specifically, PPAR α and the cyclooxygenase enzymes. These predictions were confirmed by direct biochemical experiments which validate the approach and show for the first time that PPAR α agonists are cyclooxygenase inhibitors.

  4. miREE: miRNA recognition elements ensemble

    PubMed Central

    2011-01-01

    Background: Computational methods for microRNA target prediction are a fundamental step to understand the miRNA role in gene regulation, a key process in molecular biology. In this paper we present miREE, a novel microRNA target prediction tool. miREE is an ensemble of two parts entailing complementary but integrated roles in the prediction. The Ab-Initio module leverages upon a genetic algorithmic approach to generate a set of candidate sites on the basis of their microRNA-mRNA duplex stability properties. Then, a Support Vector Machine (SVM) learning module evaluates the impact of microRNA recognition elements on the target gene. As a result the prediction takes into account information regarding both miRNA-target structural stability and accessibility. Results: The proposed method significantly improves the state-of-the-art prediction tools in terms of accuracy with a better balance between specificity and sensitivity, as demonstrated by the experiments conducted on several large datasets across different species. miREE achieves this result by tackling two of the main challenges of current prediction tools: (1) the reduced number of false positives for the Ab-Initio part, thanks to the integration of a machine learning module; (2) the specificity of the machine learning part, obtained through an innovative technique for rich and representative negative records generation. The validation was conducted on experimental datasets where the miRNA:mRNA interactions had been obtained through (1) direct validation where even the binding site is provided, or through (2) indirect validation, based on gene expression variations obtained from high-throughput experiments where the specific interaction is not validated in detail and consequently the specific binding site is not provided. Conclusions: The coupling of two parts, a sensitive Ab-Initio module and a selective machine learning part capable of recognizing the false positives, leads to an improved balance between sensitivity and specificity. miREE obtains a reasonable trade-off between filtering false positives and identifying targets. The miREE tool is available online at http://didattica-online.polito.it/eda/miREE/ PMID:22115078

  5. Estimating the circuit delay of FPGA with a transfer learning method

    NASA Astrophysics Data System (ADS)

    Cui, Xiuhai; Liu, Datong; Peng, Yu; Peng, Xiyuan

    2017-10-01

    As the functionality of FPGAs (Field Programmable Gate Arrays) has increased, the FPGA has become an on-chip system platform. Because of this increasing complexity, estimating FPGA delay is very challenging. To address this problem, we propose a transfer learning estimation delay (TLED) method to simplify delay estimation across FPGAs of different speed grades. FPGAs of the same style but different speed grades come from the same process and layout, so their delays are correlated. Therefore, one speed grade of FPGA is chosen as the basic training sample in this paper. Training samples for other speed grades can be obtained from the basic training samples through transfer learning. At the same time, we also select a few target FPGA samples as training samples. A general predictive model is trained on these samples. Thus a single estimation model is used to estimate circuit delay for FPGAs of different speed grades. The TLED framework includes three phases: 1) Building a basic circuit delay library that includes multipliers, adders, shifters, and so on. These circuits are used to train and build the predictive model. 2) Based on comparative experiments among different algorithms, the random forest algorithm is selected to train the predictive model. 3) The target circuit delay is predicted by the predictive model. The Artix-7, Kintex-7, and Virtex-7 are selected for the experiments. Each of them includes the -1, -2, -2l, and -3 speed grades. The experiments show that the delay estimation accuracy score is more than 92% with the TLED method. This result shows that the TLED method is a feasible, efficient and effective delay assessment method, especially in the high-level synthesis stage of FPGA tools.

  6. Noninvasive Prenatal Detection of Trisomy 21 by Targeted Semiconductor Sequencing: A Technical Feasibility Study.

    PubMed

    Xi, Yanwei; Arbabi, Aryan; McNaughton, Amy J M; Hamilton, Alison; Hull, Danna; Perras, Helene; Chiu, Tillie; Morrison, Shawna; Goldsmith, Claire; Creede, Emilie; Anger, Gregory J; Honeywell, Christina; Cloutier, Mireille; Macchio, Natasha; Kiss, Courtney; Liu, Xudong; Crocker, Susan; Davies, Gregory A; Brudno, Michael; Armour, Christine M

    2017-01-01

    To develop an alternate noninvasive prenatal testing method for the assessment of trisomy 21 (T21) using a targeted semiconductor sequencing approach. A customized AmpliSeq panel was designed with 1,067 primer pairs targeting specific regions on chromosomes 21, 18, 13, and others. A total of 235 samples, including 30 affected with T21, were sequenced with an Ion Torrent Proton sequencer, and a method was developed for assessing the probability of fetal aneuploidy via derivation of a risk score. Application of the derived risk score yields a bimodal distribution, with the affected samples clustering near 1.0 and the unaffected near 0. For a risk score cutoff of 0.345, above which all would be considered at "high risk," all 30 T21-positive pregnancies were correctly predicted to be affected, and 199 of the 205 non-T21 samples were correctly predicted. The average hands-on time spent on library preparation and sequencing was 19 h in total, and the average number of reads of sequence obtained was 3.75 million per sample. With the described targeted sequencing approach on the semiconductor platform using a custom-designed library and a probabilistic statistical approach, we have demonstrated the feasibility of an alternate method of assessment for fetal T21. © 2017 S. Karger AG, Basel.
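
    The classification figures reported above translate into sensitivity and specificity as shown in this short sketch. The cutoff and the counts come from the abstract; the function names are hypothetical, and the risk-score derivation itself is not reproduced.

```python
CUTOFF = 0.345  # risk-score cutoff reported in the abstract


def is_high_risk(score, cutoff=CUTOFF):
    """A sample is called high risk when its score lies above the cutoff."""
    return score > cutoff


def sensitivity(tp, fn):
    """Fraction of affected samples that were correctly called."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of unaffected samples that were correctly called."""
    return tn / (tn + fp)


# Counts from the abstract: 30 of 30 T21 samples and 199 of 205 non-T21 samples.
sens = sensitivity(tp=30, fn=0)
spec = specificity(tn=199, fp=205 - 199)
```

    With these counts the sensitivity is 100% and the specificity is about 97.1%, that is, six unaffected samples fell above the cutoff.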

  7. Recovering actives in multi-antitarget and target design of analogs of the myosin II inhibitor blebbistatin

    NASA Astrophysics Data System (ADS)

    Roman, Bart I.; Guedes, Rita C.; Stevens, Christian V.; García-Sosa, Alfonso T.

    2018-05-01

    In multitarget drug design, it is critical to identify active and inactive compounds against a variety of targets and antitargets. Multitarget strategies thus test the limits of available technology, be that in screening large databases of compounds versus a large number of targets, or in using in silico methods for understanding and reliably predicting these pharmacological outcomes. In this paper, we have evaluated the potential of several in silico approaches to predict the target, antitarget and physicochemical profile of (S)-blebbistatin, the best-known myosin II ATPase inhibitor, and a series of analogs thereof. Standard and augmented structure-based design techniques could not recover the observed activity profiles. A ligand-based method using molecular fingerprints was, however, able to select actives for myosin II inhibition. Using further ligand- and structure-based methods, we also evaluated toxicity through androgen receptor binding, affinity for an array of antitargets and the ADME profile (including assay-interfering compounds) of the series. In conclusion, in the search for (S)-blebbistatin analogs, the dissimilarity distance of molecular fingerprints to known actives and the computed antitarget and physicochemical profile of the molecules can be used for compound design for molecules with potential as tools for modulating myosin II and motility-related diseases.
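
    The "dissimilarity distance of molecular fingerprints to known actives" can be illustrated with a small sketch. The abstract does not say which fingerprint or metric was used, so the Tanimoto coefficient on sets of "on" bits is an assumption here, and the toy fingerprints and function names are hypothetical.

```python
# Assumption: fingerprints are modelled as sets of "on" bit positions and
# compared with the Tanimoto coefficient; the abstract names neither choice.

def tanimoto(a: frozenset, b: frozenset) -> float:
    """Shared bits divided by all bits set in either fingerprint."""
    if not a and not b:
        return 1.0  # two empty fingerprints are treated as identical
    return len(a & b) / len(a | b)

def distance_to_actives(candidate: frozenset, actives: list) -> float:
    """Dissimilarity (1 - Tanimoto) to the nearest known active."""
    return min(1.0 - tanimoto(candidate, active) for active in actives)

# Hypothetical toy fingerprints, not real molecules.
known_actives = [frozenset({1, 2, 3, 4}), frozenset({5, 6})]
candidate = frozenset({1, 2, 3})

print(distance_to_actives(candidate, known_actives))
```

    A smaller distance means the candidate resembles at least one known active more closely; ranking a series of analogs by this value is one simple way to prioritize compounds.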

  8. Analysis of free modeling predictions by RBO aleph in CASP11.

    PubMed

    Mabrouk, Mahmoud; Werner, Tim; Schneider, Michael; Putz, Ines; Brock, Oliver

    2016-09-01

    The CASP experiment is a biennial benchmark for assessing protein structure prediction methods. In CASP11, RBO Aleph ranked as one of the top-performing automated servers in the free modeling category. This category consists of targets for which structural templates are not easily retrievable. We analyze the performance of RBO Aleph and show that its success in CASP was a result of its ab initio structure prediction protocol. A detailed analysis of this protocol demonstrates that two components unique to our method greatly contributed to prediction quality: residue-residue contact prediction by EPC-map and contact-guided conformational space search by model-based search (MBS). Interestingly, our analysis also points to a possible fundamental problem in evaluating the performance of protein structure prediction methods: Improvements in components of the method do not necessarily lead to improvements of the entire method. This points to the fact that these components interact in ways that are poorly understood. This problem, if indeed true, represents a significant obstacle to community-wide progress. Proteins 2016; 84(Suppl 1):87-104. © 2015 Wiley Periodicals, Inc.

  9. In Silico Analysis for the Study of Botulinum Toxin Structure

    NASA Astrophysics Data System (ADS)

    Suzuki, Tomonori; Miyazaki, Satoru

    2010-01-01

    Protein-protein interactions play many important roles in biological function. Knowledge of protein-protein complex structure is required for understanding that function. Because determining protein-protein complex structures experimentally remains difficult, computational prediction of protein structures by structure modeling and docking studies is a valuable method. In addition, MD simulation is one of the most popular methods for modeling protein structure and characteristics. Here, we attempt to predict protein-protein complex structure and properties using several bioinformatic methods, focusing on the botulinum toxin complex as the target structure.

  10. Identification of Viscum album L. miRNAs and prediction of their medicinal values

    PubMed Central

    Adolf, Jacob; Melzig, Matthias F.

    2017-01-01

    MicroRNAs (miRNAs) are a class of approximately 22 nucleotides single-stranded non-coding RNA molecules that play crucial roles in gene expression. It has been reported that the plant miRNAs might enter mammalian bloodstream and have a functional role in human metabolism, indicating that miRNAs might be one of the hidden bioactive ingredients in medicinal plants. Viscum album L. (Loranthaceae, European mistletoe) has been widely used for the treatment of cancer and cardiovascular diseases, but its functional compounds have not been well characterized. We considered that miRNAs might be involved in the pharmacological activities of V. album. High-throughput Illumina sequencing was performed to identify the novel and conserved miRNAs of V. album. The putative human targets were predicted. In total, 699 conserved miRNAs and 1373 novel miRNAs have been identified from V. album. Based on the combined use of TargetScan, miRanda, PITA, and RNAhybrid methods, the intersection of 30697 potential human genes have been predicted as putative targets of 29 novel miRNAs, while 14559 putative targets were highly enriched in 33 KEGG pathways. Interestingly, these highly enriched KEGG pathways were associated with some human diseases, especially cancer, cardiovascular diseases and neurological disorders, which might explain the clinical use as well as folk medicine use of mistletoe. However, further experimental validation is necessary to confirm these human targets of mistletoe miRNAs. Additionally, target genes involved in bioactive components synthesis in V. album were predicted as well. A total of 68 miRNAs were predicted to be involved in terpenoid biosynthesis, while two miRNAs including val-miR152 and miR9738 were predicted to target viscotoxins and lectins, respectively, which increased the knowledge regarding miRNA-based regulation of terpenoid biosynthesis, lectin and viscotoxin expressions in V. album. PMID:29112983
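
    Taking the intersection of several target-prediction tools, as described above, amounts to keeping only the genes that every tool reports. The sketch below shows that set operation on made-up data; the gene labels and the helper name `consensus_targets` are hypothetical, and no actual tool output format is implied.

```python
from functools import reduce

def consensus_targets(predictions: dict) -> set:
    """Return the genes predicted by every tool in `predictions`."""
    gene_sets = [set(genes) for genes in predictions.values()]
    if not gene_sets:
        return set()
    return reduce(set.intersection, gene_sets)

# Hypothetical predictions for one miRNA; gene labels are placeholders.
predictions = {
    "TargetScan": {"GENE_A", "GENE_B", "GENE_C"},
    "miRanda": {"GENE_B", "GENE_C", "GENE_D"},
    "PITA": {"GENE_B", "GENE_C"},
    "RNAhybrid": {"GENE_B", "GENE_E"},
}

consensus = consensus_targets(predictions)
print(sorted(consensus))
```

    Requiring agreement across all tools trades sensitivity for specificity: a gene missed by any one tool is dropped from the consensus.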

  11. Elucidation of the molecular mechanisms underlying adverse reactions associated with a kinase inhibitor using systems toxicology

    PubMed Central

    Amemiya, Takahiro; Honma, Masashi; Kariya, Yoshiaki; Ghosh, Samik; Kitano, Hiroaki; Kurachi, Yoshihisa; Fujita, Ken-ichi; Sasaki, Yasutsuna; Homma, Yukio; Abernethy, Darrel R; Kume, Haruki; Suzuki, Hiroshi

    2015-01-01

    Background/Objectives: Targeted kinase inhibitors are an important class of agents in anticancer therapeutics, but their limited tolerability hampers their clinical performance. Identification of the molecular mechanisms underlying the development of adverse reactions will be helpful in establishing a rational method for the management of clinically adverse reactions. Here, we selected sunitinib as a model and demonstrated that the molecular mechanisms underlying the adverse reactions associated with kinase inhibitors can efficiently be identified using a systems toxicological approach. Methods: First, toxicological target candidates were short-listed by comparing the human kinase occupancy profiles of sunitinib and sorafenib, and the molecular mechanisms underlying adverse reactions were predicted by sequential simulations using publicly available mathematical models. Next, to evaluate the probability of these predictions, a clinical observation study was conducted in six patients treated with sunitinib. Finally, mouse experiments were performed for detailed confirmation of the hypothesized molecular mechanisms and to evaluate the efficacy of a proposed countermeasure against adverse reactions to sunitinib. Results: In silico simulations indicated the possibility that sunitinib-mediated off-target inhibition of phosphorylase kinase leads to the generation of oxidative stress in various tissues. Clinical observations of patients and mouse experiments confirmed the validity of this prediction. The simulation further suggested that concomitant use of an antioxidant may prevent sunitinib-mediated adverse reactions, which was confirmed in mouse experiments. Conclusions: A systems toxicological approach successfully predicted the molecular mechanisms underlying clinically adverse reactions associated with sunitinib and was used to plan a rational method for the management of these adverse reactions. PMID:28725458

  12. Departure Queue Prediction for Strategic and Tactical Surface Scheduler Integration

    NASA Technical Reports Server (NTRS)

    Zelinski, Shannon; Windhorst, Robert

    2016-01-01

    A departure metering concept to be demonstrated at Charlotte Douglas International Airport (CLT) will integrate strategic and tactical surface scheduling components to enable the respective collaborative decision making and improved efficiency benefits these two methods of scheduling provide. This study analyzes the effect of tactical scheduling on strategic scheduler predictability. Strategic queue predictions and target gate pushback times to achieve a desired queue length are compared between fast time simulations of CLT surface operations with and without tactical scheduling. The use of variable departure rates as a strategic scheduler input was shown to substantially improve queue predictions over static departure rates. With target queue length calibration, the strategic scheduler can be tuned to produce average delays within one minute of the tactical scheduler. However, root mean square differences between strategic and tactical delays were between 12 and 15 minutes due to the different methods the strategic and tactical schedulers use to predict takeoff times and generate gate pushback clearances. This demonstrates how difficult it is for the strategic scheduler to predict tactical scheduler assigned gate delays on an individual flight basis as the tactical scheduler adjusts departure sequence to accommodate arrival interactions. Strategic/tactical scheduler compatibility may be improved by providing more arrival information to the strategic scheduler and stabilizing tactical scheduler changes to runway sequence in response to arrivals.
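
    The contrast between average delays that agree within one minute and root mean square differences of 12 to 15 minutes can be seen in a small worked example. The delay values below are invented for illustration and are not taken from the CLT simulations; the function names are likewise hypothetical.

```python
import math

def mean_difference(a, b):
    """Average of the per-flight differences a[i] - b[i]."""
    return sum(x - y for x, y in zip(a, b)) / len(a)

def rms_difference(a, b):
    """Root mean square of the per-flight differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Invented gate delays in minutes for four flights.
strategic_delays = [10.0, 25.0, 5.0, 30.0]
tactical_delays = [25.0, 10.0, 20.0, 15.0]

print(mean_difference(strategic_delays, tactical_delays))
print(rms_difference(strategic_delays, tactical_delays))
```

    Positive and negative per-flight errors cancel in the mean but not in the RMS, which is why two schedulers can agree on average delay while disagreeing substantially on individual flights.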

  13. Uncertainty Prediction in Passive Target Motion Analysis

    DTIC Science & Technology

    2016-05-12

    fundamental property of bearings-only target motion analysis (TMA) is that bearing B to the target 10 results...the measurements used to estimate them are often non-linear. This is true for the bearing observation: B = tan⁻¹(...) (3) ...Parameter Evaluation Plot (PEP) is one example of such a grid-based approach. U.S. Patent No. 7,020,046 discloses one version of this method and is

  14. Prediction of TF target sites based on atomistic models of protein-DNA complexes

    PubMed Central

    Angarica, Vladimir Espinosa; Pérez, Abel González; Vasconcelos, Ana T; Collado-Vides, Julio; Contreras-Moreira, Bruno

    2008-01-01

    Background: The specific recognition of genomic cis-regulatory elements by transcription factors (TFs) plays an essential role in the regulation of coordinated gene expression. Studying the mechanisms determining binding specificity in protein-DNA interactions is thus an important goal. Most current approaches for modeling TF specific recognition rely on the knowledge of large sets of cognate target sites and consider only the information contained in their primary sequence. Results: Here we describe a structure-based methodology for predicting sequence motifs starting from the coordinates of a TF-DNA complex. Our algorithm combines information regarding the direct and indirect readout of DNA into an atomistic statistical model, which is used to estimate the interaction potential. We first measure the ability of our method to correctly estimate the binding specificities of eight prokaryotic and eukaryotic TFs that belong to different structural superfamilies. Secondly, the method is applied to two homology models, finding that sampling of interface side-chain rotamers remarkably improves the results. Thirdly, the algorithm is compared with a reference structural method based on contact counts, obtaining comparable predictions for the experimental complexes and more accurate sequence motifs for the homology models. Conclusion: Our results demonstrate that atomic-detail structural information can be feasibly used to predict TF binding sites. The computational method presented here is universal and might be applied to other systems involving protein-DNA recognition. PMID:18922190

  15. Computational approach to analyze isolated ssDNA aptamers against angiotensin II.

    PubMed

    Heiat, Mohammad; Najafi, Ali; Ranjbar, Reza; Latifi, Ali Mohammad; Rasaee, Mohammad Javad

    2016-07-20

    Aptamers are highly structured oligonucleotides that can bind to their targets through a specific 3-D conformation. Commonly, not all nucleotides, such as the fixed primer-binding regions and some other sequences, are vital for aptamer folding and interaction. Eliminating unnecessary regions requires trustworthy prediction tools to reduce experimental effort and errors. Here we introduce a modified in-silico approach to predict the 3-D structure of aptamers and their target interactions. To design an approach for computational analysis of isolated ssDNA aptamers (FLC112, FLC125 and their truncated core regions, CRC112 and CRC125), their secondary and tertiary structures were modeled by Mfold and RNAComposer, respectively. Output PDB files were converted from RNA to DNA in the Discovery Studio Visualizer software. Using the ZDOCK server, the aptamer-target interactions were predicted. Finally, the interaction scores were compared with the experimental results. In-silico interaction scores and the experimental outcomes followed the same descending order, FLC112>CRC125>CRC112>FLC125, with similar intensity. The consistency of the in-silico results with the experimental outputs indicates that the present method may be a reliable approach. It also shows that accurate in-silico predictions can be used as a credible reference for estimating the binding potency of aptamer fragments. Copyright © 2016 Elsevier B.V. All rights reserved.

  16. Method for high-precision multi-layered thin film deposition for deep and extreme ultraviolet mirrors

    DOEpatents

    Ruffner, Judith Alison

    1999-01-01

    A method for coating (flat or non-flat) optical substrates with high-reflectivity multi-layer coatings for use at Deep Ultra-Violet ("DUV") and Extreme Ultra-Violet ("EUV") wavelengths. The method results in a product with minimum feature sizes of less than 0.10 μm for the shortest wavelength (13.4 nm). The present invention employs a computer-based modeling and deposition method to enable lateral and vertical thickness control by scanning the position of the substrate with respect to the sputter target during deposition. The thickness profile of the sputter targets is modeled before deposition and then an appropriate scanning algorithm is implemented to produce any desired, radially-symmetric thickness profile. The present invention offers the ability to predict and achieve a wide range of thickness profiles on flat or figured substrates, i.e., account for 1/R² factor in a model, and the ability to predict and accommodate changes in deposition rate as a result of plasma geometry, i.e., over figured substrates.

  17. Classifying transcription factor targets and discovering relevant biological features

    PubMed Central

    Holloway, Dustin T; Kon, Mark; DeLisi, Charles

    2008-01-01

    Background: An important goal in post-genomic research is discovering the network of interactions between transcription factors (TFs) and the genes they regulate. We have previously reported the development of a supervised-learning approach to TF target identification, and used it to predict targets of 104 transcription factors in yeast. We now include a new sequence conservation measure, expand our predictions to include 59 new TFs, introduce a web-server, and implement an improved ranking method to reveal the biological features contributing to regulation. The classifiers combine 8 genomic datasets covering a broad range of measurements including sequence conservation, sequence overrepresentation, gene expression, and DNA structural properties. Principal Findings: (1) Application of the method yields an amplification of information about yeast regulators. The ratio of total targets to previously known targets is greater than 2 for 11 TFs, with several having larger gains: Ash1(4), Ino2(2.6), Yaf1(2.4), and Yap6(2.4). (2) Many predicted targets for TFs match well with the known biology of their regulators. As a case study we discuss the regulator Swi6, presenting evidence that it may be important in the DNA damage response, and that the previously uncharacterized gene YMR279C plays a role in DNA damage response and perhaps in cell-cycle progression. (3) A procedure based on recursive-feature-elimination is able to uncover from the large initial data sets those features that best distinguish targets for any TF, providing clues relevant to its biology. An analysis of Swi6 suggests a possible role in lipid metabolism, and more specifically in metabolism of ceramide, a bioactive lipid currently being investigated for anti-cancer properties. (4) An analysis of global network properties highlights the transcriptional network hubs; the factors which control the most genes and the genes which are bound by the largest set of regulators. 
Cell-cycle and growth related regulators dominate the former; genes involved in carbon metabolism and energy generation dominate the latter. Conclusion: Postprocessing of regulatory-classifier results can provide high quality predictions, and feature ranking strategies can deliver insight into the regulatory functions of TFs. Predictions are available at an online web-server, including the full transcriptional network, which can be analyzed using the VisANT network analysis suite. Reviewers: This article was reviewed by Igor Jouline, Todd Mockler (nominated by Valerian Dolja), and Sandor Pongor. PMID:18513408

  18. Assessing the capability of numerical methods to predict earthquake ground motion: the Euroseistest verification and validation project

    NASA Astrophysics Data System (ADS)

    Chaljub, E. O.; Bard, P.; Tsuno, S.; Kristek, J.; Moczo, P.; Franek, P.; Hollender, F.; Manakou, M.; Raptakis, D.; Pitilakis, K.

    2009-12-01

    During the last decades, an important effort has been dedicated to develop accurate and computationally efficient numerical methods to predict earthquake ground motion in heterogeneous 3D media. The progress in methods and increasing capability of computers have made it technically feasible to calculate realistic seismograms for frequencies of interest in seismic design applications. In order to foster the use of numerical simulation in practical prediction, it is important to (1) evaluate the accuracy of current numerical methods when applied to realistic 3D applications where no reference solution exists (verification) and (2) quantify the agreement between recorded and numerically simulated earthquake ground motion (validation). Here we report the results of the Euroseistest verification and validation project - an ongoing international collaborative work organized jointly by the Aristotle University of Thessaloniki, Greece, the Cashima research project (supported by the French nuclear agency, CEA, and the Laue-Langevin institute, ILL, Grenoble), and the Joseph Fourier University, Grenoble, France. The project involves more than 10 international teams from Europe, Japan and USA. The teams employ the Finite Difference Method (FDM), the Finite Element Method (FEM), the Global Pseudospectral Method (GPSM), the Spectral Element Method (SEM) and the Discrete Element Method (DEM). The project makes use of a new detailed 3D model of the Mygdonian basin (about 5 km wide, 15 km long, sediments reach about 400 m depth, surface S-wave velocity is 200 m/s). The prime target is to simulate 8 local earthquakes with magnitude from 3 to 5. In the verification, numerical predictions for frequencies up to 4 Hz for a series of models with increasing structural and rheological complexity are analyzed and compared using quantitative time-frequency goodness-of-fit criteria. 
Predictions obtained by one FDM team and the SEM team are close and different from other predictions (consistent with the ESG2006 exercise which targeted the Grenoble Valley). Diffractions off the basin edges and induced surface-wave propagation mainly contribute to differences between predictions. The differences are particularly large in the elastic models but remain important also in models with attenuation. In the validation, predictions are compared with the recordings by a local array of 19 surface and borehole accelerometers. The level of agreement is found event-dependent. For the largest-magnitude event the agreement is surprisingly good even at high frequencies.

  19. NiaoDuQing granules relieve chronic kidney disease symptoms by decreasing renal fibrosis and anemia

    PubMed Central

    Wang, Xu; Yu, Suyun; Jia, Qi; Chen, Lichuan; Zhong, Jinqiu; Pan, Yanhong; Shen, Peiliang; Shen, Yin; Wang, Siliang; Wei, Zhonghong; Cao, Yuzhu; Lu, Yin

    2017-01-01

    NiaoDuQing (NDQ) granules, a traditional Chinese medicine, have been clinically used in China for over fourteen years to treat chronic kidney disease (CKD). To elucidate the mechanisms underlying the therapeutic benefits of NDQ, we designed an approach incorporating chemoinformatics, bioinformatics, network biology methods, and cellular and molecular biology experiments. A total of 182 active compounds were identified in NDQ granules, and 397 putative targets associated with different diseases were derived through ADME modelling and target prediction tools. Protein-protein interaction networks of CKD-related and putative NDQ targets were constructed, and 219 candidate targets were identified based on topological features. Pathway enrichment analysis showed that the candidate targets were mostly related to the TGF-β, the p38MAPK, and the erythropoietin (EPO) receptor signaling pathways, which are known contributors to renal fibrosis and/or renal anemia. A rat model of CKD was established to validate the drug-target mechanisms predicted by the systems pharmacology analysis. Experimental results confirmed that NDQ granules exerted therapeutic effects on CKD and its comorbidities, including renal anemia, mainly by modulating the TGF-β and EPO signaling pathways. Thus, the pharmacological actions of NDQ on CKD symptoms correlated well with in silico predictions. PMID:28915563

  20. The Search for Biosignatures on Mars: Using Predictive Geology to Optimize Exploration Targets

    NASA Technical Reports Server (NTRS)

    Oehler, Dorothy Z.; Allen, Carlton C.

    2011-01-01

    Predicting geologic context from satellite data is a method used on Earth for exploration in areas with limited ground truth. The method can be used to predict facies likely to contain organic-rich shales. Such shales concentrate and preserve organics and are major repositories of organic biosignatures on Earth [1]. Since current surface conditions on Mars are unfavorable for development of abundant life or for preservation of organic remains of past life, the chances are low of encountering organics in surface samples. Thus, focusing martian exploration on sites predicted to contain organic-rich shales would optimize the chances of discovering evidence of life, if it ever existed on that planet.

  1. PredPPCrys: Accurate Prediction of Sequence Cloning, Protein Production, Purification and Crystallization Propensity from Protein Sequences Using Multi-Step Heterogeneous Feature Fusion and Selection

    PubMed Central

    Wang, Huilin; Wang, Mingjun; Tan, Hao; Li, Yuan; Zhang, Ziding; Song, Jiangning

    2014-01-01

    X-ray crystallography is the primary approach to solve the three-dimensional structure of a protein. However, a major bottleneck of this method is the failure of multi-step experimental procedures to yield diffraction-quality crystals, including sequence cloning, protein material production, purification, crystallization and ultimately, structural determination. Accordingly, prediction of the propensity of a protein to successfully undergo these experimental procedures based on the protein sequence may help narrow down laborious experimental efforts and facilitate target selection. A number of bioinformatics methods based on protein sequence information have been developed for this purpose. However, our knowledge on the important determinants of propensity for a protein sequence to produce high diffraction-quality crystals remains largely incomplete. In practice, most of the existing methods display poorer performance when evaluated on larger and updated datasets. To address this problem, we constructed an up-to-date dataset as the benchmark, and subsequently developed a new approach termed ‘PredPPCrys’ using the support vector machine (SVM). Using a comprehensive set of multifaceted sequence-derived features in combination with a novel multi-step feature selection strategy, we identified and characterized the relative importance and contribution of each feature type to the prediction performance of five individual experimental steps required for successful crystallization. The resulting optimal candidate features were used as inputs to build the first-level SVM predictor (PredPPCrys I). Next, prediction outputs of PredPPCrys I were used as the input to build second-level SVM classifiers (PredPPCrys II), which led to significantly enhanced prediction performance. Benchmarking experiments indicated that our PredPPCrys method outperforms most existing procedures on both up-to-date and previous datasets. 
In addition, the predicted crystallization targets of currently non-crystallizable proteins were provided as compendium data, which are anticipated to facilitate target selection and design for the worldwide structural genomics consortium. PredPPCrys is freely available at http://www.structbioinfor.org/PredPPCrys. PMID:25148528

  2. Predicting treatment effect from surrogate endpoints and historical trials: an extrapolation involving probabilities of a binary outcome or survival to a specific time

    PubMed Central

    Sargent, Daniel J.; Buyse, Marc; Burzykowski, Tomasz

    2011-01-01

    SUMMARY Using multiple historical trials with surrogate and true endpoints, we consider various models to predict the effect of treatment on a true endpoint in a target trial in which only a surrogate endpoint is observed. This predicted result is computed using (1) a prediction model (mixture, linear, or principal stratification) estimated from historical trials and the surrogate endpoint of the target trial and (2) a random extrapolation error estimated from successively leaving out each trial among the historical trials. The method applies to either binary outcomes or survival to a particular time that is computed from censored survival data. We compute a 95% confidence interval for the predicted result and validate its coverage using simulation. To summarize the additional uncertainty from using a predicted instead of true result for the estimated treatment effect, we compute its multiplier of standard error. Software is available for download. PMID:21838732

  3. LiveBench-1: continuous benchmarking of protein structure prediction servers.

    PubMed

    Bujnicki, J M; Elofsson, A; Fischer, D; Rychlewski, L

    2001-02-01

    We present a novel, continuous approach aimed at the large-scale assessment of the performance of available fold-recognition servers. Six popular servers were investigated: PDB-Blast, FFAS, T98-lib, GenTHREADER, 3D-PSSM, and INBGU. The assessment was conducted using as prediction targets a large number of selected protein structures released from October 1999 to April 2000. A target was selected if its sequence showed no significant similarity to any of the proteins previously available in the structural database. Overall, the servers were able to produce structurally similar models for one-half of the targets, but significantly accurate sequence-structure alignments were produced for only one-third of the targets. We further classified the targets into two sets: easy and hard. We found that all servers were able to find the correct answer for the vast majority of the easy targets if a structurally similar fold was present in the server's fold libraries. However, among the hard targets--where standard methods such as PSI-BLAST fail--the most sensitive fold-recognition servers were able to produce similar models for only 40% of the cases, half of which had a significantly accurate sequence-structure alignment. Among the hard targets, the presence of updated libraries appeared to be less critical for the ranking. An "ideally combined consensus" prediction, where the results of all servers are considered, would increase the percentage of correct assignments by 50%. Each server had a number of cases with a correct assignment, where the assignments of all the other servers were wrong. This emphasizes the benefits of considering more than one server in difficult prediction tasks. The LiveBench program (http://BioInfo.PL/LiveBench) is being continued, and all interested developers are cordially invited to join.

  4. Multi-target QSPR modeling for simultaneous prediction of multiple gas-phase kinetic rate constants of diverse chemicals

    NASA Astrophysics Data System (ADS)

    Basant, Nikita; Gupta, Shikha

    2018-03-01

    The reactions of molecular ozone (O₃), hydroxyl (•OH) and nitrate (NO₃) radicals are among the major pathways of removal of volatile organic compounds (VOCs) in the atmospheric environment. The gas-phase kinetic rate constants (kO₃, kOH, kNO₃) are thus important in assessing the ultimate fate and exposure risk of atmospheric VOCs. Experimental data for rate constants are not available for many emerging VOCs, and the computational methods reported so far address only single-target modeling. In this study, we have developed a multi-target (mt) QSPR model for simultaneous prediction of multiple kinetic rate constants (kO₃, kOH, kNO₃) of diverse organic chemicals considering an experimental data set of VOCs for which values of all the three rate constants are available. The mt-QSPR model identified and used five descriptors related to the molecular size, degree of saturation and electron density in a molecule, which were mechanistically interpretable. These descriptors successfully predicted three rate constants simultaneously. The model yielded high correlations (R² = 0.874-0.924) between the experimental and simultaneously predicted endpoint rate constant (kO₃, kOH, kNO₃) values in test arrays for all the three systems. The model also passed all the stringent statistical validation tests for external predictivity. The proposed multi-target QSPR model can be successfully used for predicting reactivity of new VOCs simultaneously for their exposure risk assessment.

  5. Lessons from (co-)evolution in the docking of proteins and peptides for CAPRI Rounds 28-35.

    PubMed

    Yu, Jinchao; Andreani, Jessica; Ochsenbein, Françoise; Guerois, Raphaël

    2017-03-01

    Computational protein-protein docking is of great importance for understanding protein interactions at the structural level. Critical assessment of prediction of interactions (CAPRI) experiments provide the protein docking community with a unique opportunity to blindly test methods based on real-life cases and help accelerate methodology development. For CAPRI Rounds 28-35, we used an automatic docking pipeline integrating the coarse-grained co-evolution-based potential InterEvScore. This score was developed to exploit the information contained in the multiple sequence alignments of binding partners and selectively recognize co-evolved interfaces. Together with Zdock/Frodock for rigid-body docking, SOAP-PP for atomic potential and Rosetta applications for structural refinement, this pipeline reached high performance on a majority of targets. For protein-peptide docking and interfacial water position predictions, we also explored different means of taking evolutionary information into account. Overall, our group ranked 1st by correctly predicting 10 targets, composed of 1 High, 7 Medium and 2 Acceptable predictions. Excellent and Outstanding levels of accuracy were reached for each of the two water prediction targets, respectively. Altogether, in 15 out of 18 targets in total, evolutionary information, either through co-evolution or conservation analyses, could provide key constraints to guide modeling towards the most likely assemblies. These results open promising perspectives regarding the way evolutionary information can be valuable to improve docking prediction accuracy. Proteins 2017; 85:378-390. © 2016 Wiley Periodicals, Inc.

  6. Discovering causal pathways linking genomic events to transcriptional states using Tied Diffusion Through Interacting Events (TieDIE).

    PubMed

    Paull, Evan O; Carlin, Daniel E; Niepel, Mario; Sorger, Peter K; Haussler, David; Stuart, Joshua M

    2013-11-01

    Identifying the cellular wiring that connects genomic perturbations to transcriptional changes in cancer is essential to gain a mechanistic understanding of disease initiation, progression and ultimately to predict drug response. We have developed a method called Tied Diffusion Through Interacting Events (TieDIE) that uses a network diffusion approach to connect genomic perturbations to gene expression changes characteristic of cancer subtypes. The method computes a subnetwork of protein-protein interactions, predicted transcription factor-to-target connections and curated interactions from literature that connects genomic and transcriptomic perturbations. Application of TieDIE to The Cancer Genome Atlas and a breast cancer cell line dataset identified key signaling pathways, with examples impinging on MYC activity. Interlinking genes are predicted to correspond to essential components of cancer signaling and may provide a mechanistic explanation of tumor character and suggest subtype-specific drug targets. Software is available from the Stuart lab's wiki: https://sysbiowiki.soe.ucsc.edu/tiedie. Contact: jstuart@ucsc.edu. Supplementary data are available at Bioinformatics online.

  7. Can we use genetic and genomic approaches to identify candidate animals for targeted selective treatment.

    PubMed

    Laurenson, Yan C S M; Kyriazakis, Ilias; Bishop, Stephen C

    2013-10-18

    Estimated breeding values (EBV) for faecal egg count (FEC) and genetic markers for host resistance to nematodes may be used to identify resistant animals for selective breeding programmes. Similarly, targeted selective treatment (TST) requires the ability to identify the animals that will benefit most from anthelmintic treatment. A mathematical model was used to combine the concepts and evaluate the potential of using genetic-based methods to identify animals for a TST regime. EBVs obtained by genomic prediction were predicted to be the best determinant criterion for TST in terms of the impact on average empty body weight and average FEC, whereas pedigree-based EBVs for FEC were predicted to be marginally worse than using phenotypic FEC as a determinant criterion. Whilst each method has financial implications, if the identification of host resistance is incorporated into wider genomic selection indices or selective breeding programmes, then genetic or genomic information may plausibly be included in TST regimes. Copyright © 2013 Elsevier B.V. All rights reserved.

  8. Protein-protein docking with binding site patch prediction and network-based terms enhanced combinatorial scoring.

    PubMed

    Gong, Xinqi; Wang, Panwen; Yang, Feng; Chang, Shan; Liu, Bin; He, Hongqiu; Cao, Libin; Xu, Xianjin; Li, Chunhua; Chen, Weizu; Wang, Cunxin

    2010-11-15

    Protein-protein docking has made much progress in recent years, but challenges still exist. Here we present the application of our docking approach HoDock in CAPRI. In this approach, a binding site prediction is implemented to reduce docking sampling space and filter out unreasonable docked structures, and a network-based enhanced combinatorial scoring function HPNCscore is used to evaluate the decoys. The experimental information was combined with the predicted binding site to pick out the most likely key binding site residues. We applied the HoDock method in the recent rounds of the CAPRI experiments, and got good results as predictors on targets 39, 40, and 41. We also got good results as scorers on targets 35, 37, 40, and 41. This indicates that our docking approach can contribute to the progress of protein-protein docking methods and to the understanding of the mechanism of protein-protein interactions. © 2010 Wiley-Liss, Inc.

  9. Surflex-Dock: Docking benchmarks and real-world application

    NASA Astrophysics Data System (ADS)

    Spitzer, Russell; Jain, Ajay N.

    2012-06-01

    Benchmarks for molecular docking have historically focused on re-docking the cognate ligand of a well-determined protein-ligand complex to measure geometric pose prediction accuracy, and measurement of virtual screening performance has been focused on increasingly large and diverse sets of target protein structures, cognate ligands, and various types of decoy sets. Here, pose prediction is reported on the Astex Diverse set of 85 protein-ligand complexes, and virtual screening performance is reported on the DUD set of 40 protein targets. In both cases, prepared structures of targets and ligands were provided by symposium organizers. The re-prepared data sets yielded results not significantly different from previous reports of Surflex-Dock on the two benchmarks. Minor changes to protein coordinates resulting from complex pre-optimization had large effects on observed performance, highlighting the limitations of cognate ligand re-docking for pose prediction assessment. Docking protocols developed for cross-docking, which address protein flexibility and produce discrete families of predicted poses, produced substantially better performance for pose prediction. Virtual screening performance was shown to benefit from employing and combining multiple screening methods: docking, 2D molecular similarity, and 3D molecular similarity. In addition, use of multiple protein conformations significantly improved screening enrichment.

  10. Predicting appointment misses in hospitals using data analytics

    PubMed Central

    Karpagam, Sylvia; Ma, Nang Laik

    2017-01-01

    Background: There has been growing attention over the last few years to non-attendance in hospitals and its clinical and economic consequences. There have been several studies documenting the various aspects of non-attendance in hospitals. Project Predicting Appoint Misses (PAM) was started with the intention of being able to predict the type of patients that would not come for appointments after making bookings. Methods: Historic hospital appointment data merged with a “distance from hospital” variable was used to run Logistic Regression, Support Vector Machine and Recursive Partitioning to decide the contributing variables to missed appointments. Results: Variables that are “class”, “time”, “demographics” related have an effect on the target variable; however, prediction models may not perform effectively due to their very subtle influence on the target variable. Previously assumed major contributors like “age” and “distance” did not have a major effect on the target variable. Conclusions: With the given data it will be very difficult to make any moderate/strong prediction of the appointment misses. That being said, with the help of the cutoff we are able to capture all of the “appointment misses”, in addition to also capturing the actualized appointments. PMID:28567409
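
    The cutoff trade-off described in the Conclusions above can be illustrated with a small sketch. It is not the authors' model: the probabilities, labels, and the helper names `confusion_at_cutoff` and `recall_and_precision` are all hypothetical, chosen only to show how a low cutoff on a weakly discriminating score captures every miss while also flagging many attended appointments.

```python
# Hypothetical predicted probabilities of a missed appointment and the true
# outcomes (1 = missed, 0 = attended). All values are invented for illustration.
probs = [0.05, 0.10, 0.12, 0.15, 0.18, 0.20, 0.22, 0.30]
labels = [0, 0, 1, 0, 0, 1, 0, 1]


def confusion_at_cutoff(probs, labels, cutoff):
    """Return (true_pos, false_pos, false_neg, true_neg) for a given cutoff."""
    tp = fp = fn = tn = 0
    for p, y in zip(probs, labels):
        predicted_miss = p >= cutoff
        if predicted_miss and y == 1:
            tp += 1
        elif predicted_miss and y == 0:
            fp += 1
        elif not predicted_miss and y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def recall_and_precision(probs, labels, cutoff):
    """Recall = share of true misses flagged; precision = share of flags that are misses."""
    tp, fp, fn, _ = confusion_at_cutoff(probs, labels, cutoff)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return recall, precision


# A low cutoff flags every miss but also many attended appointments;
# a high cutoff flags fewer attended appointments but overlooks misses.
low = recall_and_precision(probs, labels, cutoff=0.12)
high = recall_and_precision(probs, labels, cutoff=0.25)
```

    On this toy data the low cutoff recovers every miss at the cost of flagging an equal number of attended appointments, which mirrors the qualitative behaviour the abstract reports.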

  11. Identification of regulatory targets of tissue-specific transcription factors: application to retina-specific gene regulation

    PubMed Central

    Qian, Jiang; Esumi, Noriko; Chen, Yangjian; Wang, Qingliang; Chowers, Itay; Zack, Donald J.

    2005-01-01

    Identification of tissue-specific gene regulatory networks can yield insights into the molecular basis of a tissue's development, function and pathology. Here, we present a computational approach designed to identify potential regulatory target genes of photoreceptor cell-specific transcription factors (TFs). The approach is based on the hypothesis that genes related to the retina in terms of expression, disease and/or function are more likely to be the targets of retina-specific TFs than other genes. A list of genes that are preferentially expressed in retina was obtained by integrating expressed sequence tag, SAGE and microarray datasets. The regulatory targets of retina-specific TFs are enriched in this set of retina-related genes. A Bayesian approach was employed to integrate information about binding site location relative to a gene's transcription start site. Our method was applied to three retina-specific TFs, CRX, NRL and NR2E3, and a number of potential targets were predicted. To experimentally assess the validity of the bioinformatic predictions, mobility shift, transient transfection and chromatin immunoprecipitation assays were performed with five predicted CRX targets, and the results were suggestive of CRX regulation in 5/5, 3/5 and 4/5 cases, respectively. Together, these experiments strongly suggest that RP1, GUCY2D, ABCA4 are novel targets of CRX. PMID:15967807

  12. A Computational Approach to Finding Novel Targets for Existing Drugs

    PubMed Central

    Li, Yvonne Y.; An, Jianghong; Jones, Steven J. M.

    2011-01-01

    Repositioning existing drugs for new therapeutic uses is an efficient approach to drug discovery. We have developed a computational drug repositioning pipeline to perform large-scale molecular docking of small molecule drugs against protein drug targets, in order to map the drug-target interaction space and find novel interactions. Our method emphasizes removing false positive interaction predictions using criteria from known interaction docking, consensus scoring, and specificity. In all, our database contains 252 human protein drug targets that we classify as reliable-for-docking as well as 4621 approved and experimental small molecule drugs from DrugBank. These were cross-docked, then filtered through stringent scoring criteria to select top drug-target interactions. In particular, we used MAPK14 and the kinase inhibitor BIM-8 as examples where our stringent thresholds enriched the predicted drug-target interactions with known interactions up to 20 times compared to standard score thresholds. We validated nilotinib as a potent MAPK14 inhibitor in vitro (IC50 40 nM), suggesting a potential use for this drug in treating inflammatory diseases. The published literature indicated experimental evidence for 31 of the top predicted interactions, highlighting the promising nature of our approach. Novel interactions discovered may lead to the drug being repositioned as a therapeutic treatment for its off-target's associated disease, added insight into the drug's mechanism of action, and added insight into the drug's side effects. PMID:21909252

  13. PreCisIon: PREdiction of CIS-regulatory elements improved by gene's positION.

    PubMed

    Elati, Mohamed; Nicolle, Rémy; Junier, Ivan; Fernández, David; Fekih, Rim; Font, Julio; Képès, François

    2013-02-01

    Conventional approaches to predict transcriptional regulatory interactions usually rely on the definition of a shared motif sequence on the target genes of a transcription factor (TF). These efforts have been frustrated by the limited availability and accuracy of TF binding site motifs, usually represented as position-specific scoring matrices, which may match large numbers of sites and produce an unreliable list of target genes. To improve the prediction of binding sites, we propose to additionally use the unrelated knowledge of the genome layout. Indeed, it has been shown that co-regulated genes tend to be either neighbors or periodically spaced along the whole chromosome. This study demonstrates that respective gene positioning carries significant information. This novel type of information is combined with traditional sequence information by a machine learning algorithm called PreCisIon. To optimize this combination, PreCisIon builds a strong gene target classifier by adaptively combining weak classifiers based on either local binding sequence or global gene position. This strategy generically paves the way to the optimized incorporation of any future advances in gene target prediction based on local sequence, genome layout or on novel criteria. With the current state of the art, PreCisIon consistently improves methods based on sequence information only. This is shown by implementing a cross-validation analysis of the 20 major TFs from two phylogenetically remote model organisms. For Bacillus subtilis and Escherichia coli, respectively, PreCisIon achieves on average an area under the receiver operating characteristic curve of 70 and 60%, a sensitivity of 80 and 70% and a specificity of 60 and 56%. The newly predicted gene targets are demonstrated to be functionally consistent with previously known targets, as assessed by analysis of Gene Ontology enrichment or of the relevant literature and databases.

  14. Integrating Transcriptomics with Metabolic Modeling Predicts Biomarkers and Drug Targets for Alzheimer's Disease

    PubMed Central

    Stempler, Shiri; Yizhak, Keren; Ruppin, Eytan

    2014-01-01

    Accumulating evidence links numerous abnormalities in cerebral metabolism with the progression of Alzheimer's disease (AD), beginning in its early stages. Here, we integrate transcriptomic data from AD patients with a genome-scale computational human metabolic model to characterize the altered metabolism in AD, and employ state-of-the-art metabolic modelling methods to predict metabolic biomarkers and drug targets in AD. The metabolic descriptions derived are first tested and validated on a large scale versus existing AD proteomics and metabolomics data. Our analysis shows a significant decrease in the activity of several key metabolic pathways, including the carnitine shuttle, folate metabolism and mitochondrial transport. We predict several metabolic biomarkers of AD progression in the blood and the CSF, including succinate and prostaglandin D2. Vitamin D and steroid metabolism pathways are enriched with predicted drug targets that could mitigate the metabolic alterations observed. Taken together, this study provides the first network wide view of the metabolic alterations associated with AD progression. Most importantly, it offers a cohort of new metabolic leads for the diagnosis of AD and its treatment. PMID:25127241

  15. Simulation studies of hydrodynamic aspects of magneto-inertial fusion and high order adaptive algorithms for Maxwell equations

    NASA Astrophysics Data System (ADS)

    Wu, Lingling

    Three-dimensional simulations of the formation and implosion of plasma liners for the Plasma Jet Induced Magneto Inertial Fusion (PJMIF) have been performed using multiscale simulation technique based on the FronTier code. In the PJMIF concept, a plasma liner, formed by merging of a large number of radial, highly supersonic plasma jets, implodes on the target in the form of two compact plasma toroids, and compresses it to conditions of the nuclear fusion ignition. The propagation of a single jet with Mach number 60 from the plasma gun to the merging point was studied using the FronTier code. The simulation result was used as input to the 3D jet merger problem. The merger of 144, 125, and 625 jets and the formation and heating of plasma liner by compression waves have been studied and compared with recent theoretical predictions. The main result of the study is the prediction of the average Mach number reduction and the description of the liner structure and properties. We have also compared the effect of different merging radii. Spherically symmetric simulations of the implosion of plasma liners and compression of plasma targets have also been performed using the method of front tracking. The cases of single deuterium and xenon liners and double layer deuterium - xenon liners compressing various deuterium-tritium targets have been investigated, optimized for maximum fusion energy gains, and compared with theoretical predictions and scaling laws of [P. Parks, On the efficacy of imploding plasma liners for magnetized fusion target compression, Phys. Plasmas 15, 062506 (2008)]. In agreement with the theory, the fusion gain was significantly below unity for deuterium - tritium targets compressed by Mach 60 deuterium liners. In the most optimal setup for a given chamber size that contained a target with the initial radius of 20 cm compressed by 10 cm thick, Mach 60 xenon liner, the target ignition and fusion energy gain of 10 was achieved. 
Simulations also showed that composite deuterium - xenon liners reduce the energy gain due to lower target compression rates. The effect of heating of targets by alpha particles on the fusion energy gain has also been investigated. The study of the dependence of the ram pressure amplification on radial compressibility showed good agreement with the theory. The study concludes that a liner with higher Mach number and lower adiabatic index gamma (the ratio of specific heats) will generate higher ram pressure amplification and higher fusion energy gain. We implemented a second order embedded boundary method for the Maxwell equations in geometrically complex domains. The numerical scheme is second order in both space and time. Compared to the first order stair-step approximation of complex geometries within the FDTD method, this method can avoid spurious solutions introduced by the stair-step approximation. Unlike the finite element method and the FE-FD hybrid method, no triangulation is needed for this scheme. This method preserves the simplicity of the embedded boundary method and it is easy to implement. We will also propose a conservative (symplectic) fourth order scheme for uniform geometry boundary.

  16. The Chemical Basis of Pharmacology

    PubMed Central

    2010-01-01

    Molecular biology now dominates pharmacology so thoroughly that it is difficult to recall that only a generation ago the field was very different. To understand drug action today, we characterize the targets through which they act and new drug leads are discovered on the basis of target structure and function. Until the mid-1980s the information often flowed in reverse: investigators began with organic molecules and sought targets, relating receptors not by sequence or structure but by their ligands. Recently, investigators have returned to this chemical view of biology, bringing to it systematic and quantitative methods of relating targets by their ligands. This has allowed the discovery of new targets for established drugs, suggested the bases for their side effects, and predicted the molecular targets underlying phenotypic screens. The bases for these new methods, some of their successes and liabilities, and new opportunities for their use are described. PMID:21058655

  17. Controlling the Shannon Entropy of Quantum Systems

    PubMed Central

    Xing, Yifan; Wu, Jun

    2013-01-01

    This paper proposes a new quantum control method which controls the Shannon entropy of quantum systems. For both discrete and continuous entropies, controller design methods are proposed based on probability density function control, which can drive the quantum state to any target state. To drive the entropy to any target at any prespecified time, another discretization method is proposed for the discrete entropy case, and the conditions under which the entropy can be increased or decreased are discussed. Simulations are done on both two- and three-dimensional quantum systems, where division and prediction are used to achieve more accurate tracking. PMID:23818819
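
    As a point of reference for the quantity being controlled, the sketch below computes a discrete Shannon entropy from the measurement probabilities of a state vector. The abstract does not give the paper's exact definition, so the choice of basis, the base-2 logarithm, and the function name `shannon_entropy` are assumptions made for illustration.

```python
import math


def shannon_entropy(amplitudes):
    """Shannon entropy (in bits) of the outcome probabilities |c_i|^2.

    Assumes the amplitudes are expressed in the measurement basis of
    interest; they are normalized here so unnormalized input is accepted.
    """
    probs = [abs(c) ** 2 for c in amplitudes]
    total = sum(probs)
    probs = [p / total for p in probs]
    # Terms with p == 0 contribute nothing (the limit of p*log p is 0).
    return -sum(p * math.log2(p) for p in probs if p > 0)


# A basis state is fully predictable; an equal superposition of two
# basis states has the maximum entropy for a two-dimensional system.
basis_state = shannon_entropy([1, 0])
equal_superposition = shannon_entropy([1, 1j])
three_level_uniform = shannon_entropy([1, 1, 1])
```

    Driving the entropy to a target value then amounts to steering the probabilities |c_i|^2 so that this function returns the desired number, which is consistent with the probability-density-based control the abstract mentions.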

  18. Controlling the shannon entropy of quantum systems.

    PubMed

    Xing, Yifan; Wu, Jun

    2013-01-01

    This paper proposes a new quantum control method which controls the Shannon entropy of quantum systems. For both discrete and continuous entropies, controller design methods are proposed based on probability density function control, which can drive the quantum state to any target state. To drive the entropy to any target at any prespecified time, another discretization method is proposed for the discrete entropy case, and the conditions under which the entropy can be increased or decreased are discussed. Simulations are done on both two- and three-dimensional quantum systems, where division and prediction are used to achieve more accurate tracking.

  19. Accuracy of Robotic Radiosurgical Liver Treatment Throughout the Respiratory Cycle

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Winter, Jeff D.; Wong, Raimond; Swaminath, Anand

    Purpose: To quantify random uncertainties in robotic radiosurgical treatment of liver lesions with real-time respiratory motion management. Methods and Materials: We conducted a retrospective analysis of 27 liver cancer patients treated with robotic radiosurgery over 118 fractions. The robotic radiosurgical system uses orthogonal x-ray images to determine internal target position and correlates this position with an external surrogate to provide robotic corrections of linear accelerator positioning. Verification and update of this internal–external correlation model was achieved using periodic x-ray images collected throughout treatment. To quantify random uncertainties in targeting, we analyzed logged tracking information and isolated x-ray images collected immediately before beam delivery. For translational correlation errors, we quantified the difference between correlation model–estimated target position and actual position determined by periodic x-ray imaging. To quantify prediction errors, we computed the mean absolute difference between the predicted coordinates and actual modeled position calculated 115 milliseconds later. We estimated overall random uncertainty by quadratically summing correlation, prediction, and end-to-end targeting errors. We also investigated relationships between tracking errors and motion amplitude using linear regression. Results: The 95th percentile absolute correlation errors in each direction were 2.1 mm left–right, 1.8 mm anterior–posterior, 3.3 mm cranio–caudal, and 3.9 mm 3-dimensional radial, whereas 95th percentile absolute radial prediction errors were 0.5 mm. Overall 95th percentile random uncertainty was 4 mm in the radial direction. Prediction errors were strongly correlated with modeled target amplitude (r=0.53-0.66, P<.001), whereas only weak correlations existed for correlation errors. 
Conclusions: Study results demonstrate that model correlation errors are the primary random source of uncertainty in Cyberknife liver treatment and, unlike prediction errors, are not strongly correlated with target motion amplitude. Aggregate 3-dimensional radial position errors presented here suggest the target will be within 4 mm of the target volume for 95% of the beam delivery.
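
    The quadratic summation mentioned in the Methods can be sketched as a root-sum-of-squares of independent error components. The correlation (3.9 mm) and prediction (0.5 mm) values below are the radial figures quoted in the abstract; the end-to-end targeting value is not given there, so `END_TO_END_MM` is a hypothetical placeholder, and the authors may have combined the components at a different stage of their analysis.

```python
import math


def quadrature_sum(*components_mm):
    """Combine independent error components (mm) as a root-sum-of-squares."""
    return math.sqrt(sum(c ** 2 for c in components_mm))


CORRELATION_MM = 3.9   # 95th percentile radial correlation error (from the abstract)
PREDICTION_MM = 0.5    # 95th percentile radial prediction error (from the abstract)
END_TO_END_MM = 0.7    # hypothetical placeholder; not reported in the abstract

overall_mm = quadrature_sum(CORRELATION_MM, PREDICTION_MM, END_TO_END_MM)
```

    Because the components are squared before summing, the largest term dominates: here the correlation error alone accounts for nearly all of the combined value, in line with the abstract's conclusion that correlation errors are the primary source of random uncertainty.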

  20. GeneSilico protein structure prediction meta-server.

    PubMed

    Kurowski, Michal A; Bujnicki, Janusz M

    2003-07-01

    Rigorous assessments of protein structure prediction have demonstrated that fold recognition methods can identify remote similarities between proteins when standard sequence search methods fail. It has been shown that the accuracy of predictions is improved when refined multiple sequence alignments are used instead of single sequences and if different methods are combined to generate a consensus model. There are several meta-servers available that integrate protein structure predictions performed by various methods, but they do not allow for submission of user-defined multiple sequence alignments and they seldom offer confidentiality of the results. We developed a novel WWW gateway for protein structure prediction, which combines the useful features of other meta-servers available, but with much greater flexibility of the input. The user may submit an amino acid sequence or a multiple sequence alignment to a set of methods for primary, secondary and tertiary structure prediction. Fold-recognition results (target-template alignments) are converted into full-atom 3D models and the quality of these models is uniformly assessed. A consensus between different FR methods is also inferred. The results are conveniently presented on-line on a single web page over a secure, password-protected connection. The GeneSilico protein structure prediction meta-server is freely available for academic users at http://genesilico.pl/meta.

  1. GeneSilico protein structure prediction meta-server

    PubMed Central

    Kurowski, Michal A.; Bujnicki, Janusz M.

    2003-01-01

    Rigorous assessments of protein structure prediction have demonstrated that fold recognition methods can identify remote similarities between proteins when standard sequence search methods fail. It has been shown that the accuracy of predictions is improved when refined multiple sequence alignments are used instead of single sequences and if different methods are combined to generate a consensus model. There are several meta-servers available that integrate protein structure predictions performed by various methods, but they do not allow for submission of user-defined multiple sequence alignments and they seldom offer confidentiality of the results. We developed a novel WWW gateway for protein structure prediction, which combines the useful features of other meta-servers available, but with much greater flexibility of the input. The user may submit an amino acid sequence or a multiple sequence alignment to a set of methods for primary, secondary and tertiary structure prediction. Fold-recognition results (target-template alignments) are converted into full-atom 3D models and the quality of these models is uniformly assessed. A consensus between different FR methods is also inferred. The results are conveniently presented on-line on a single web page over a secure, password-protected connection. The GeneSilico protein structure prediction meta-server is freely available for academic users at http://genesilico.pl/meta. PMID:12824313

  2. Classification of drug molecules considering their IC50 values using mixed-integer linear programming based hyper-boxes method.

    PubMed

    Armutlu, Pelin; Ozdemir, Muhittin E; Uney-Yuksektepe, Fadime; Kavakli, I Halil; Turkay, Metin

    2008-10-03

    A priori analysis of the activity of drugs on the target protein by computational approaches can be useful in narrowing down drug candidates for further experimental tests. Currently, there are a large number of computational methods that predict the activity of drugs on proteins. In this study, we approach the activity prediction problem as a classification problem, and we aim to improve the classification accuracy by introducing an algorithm that combines partial least squares regression with a mixed-integer programming based hyper-boxes classification method, where drug molecules are classified as low active or high active regarding their binding activity (IC50 values) on target proteins. We also aim to determine the most significant molecular descriptors for the drug molecules. We first apply our approach by analyzing the activities of widely known inhibitor datasets including Acetylcholinesterase (ACHE), Benzodiazepine Receptor (BZR), Dihydrofolate Reductase (DHFR), Cyclooxygenase-2 (COX-2) with known IC50 values. The results at this stage proved that our approach consistently gives better classification accuracies compared to 63 other reported classification methods such as SVM, Naïve Bayes, where we were able to predict the experimentally determined IC50 values with a worst case accuracy of 96%. To further test the applicability of this approach, we first created a dataset for Cytochrome P450 C17 inhibitors and then predicted their activities with 100% accuracy. Our results indicate that this approach can be utilized to predict the inhibitory effects of inhibitors based on their molecular descriptors. This approach will not only enhance the drug discovery process, but also save time and resources committed.

  3. Determination of optical absorption coefficient with focusing photoacoustic imaging.

    PubMed

    Li, Zhifang; Li, Hui; Zeng, Zhiping; Xie, Wenming; Chen, Wei R

    2012-06-01

    Absorption coefficient of biological tissue is an important factor for photothermal therapy and photoacoustic imaging. However, its determination remains a challenge. In this paper, we propose a method using focusing photoacoustic imaging technique to quantify the target optical absorption coefficient. It utilizes the ratio of the amplitude of the peak signal from the top boundary of the target to that from the bottom boundary based on wavelet transform. This method is self-calibrating. Factors, such as absolute optical fluence, ultrasound parameters, and Grüneisen parameter, can be canceled by dividing the amplitudes of the two peaks. To demonstrate this method, we quantified the optical absorption coefficient of a target with various concentrations of an absorbing dye. This method is particularly useful to provide accurate absorption coefficient for predicting the outcomes of photothermal interaction for cancer treatment with absorption enhancement.

  4. A network-based multi-target computational estimation scheme for anticoagulant activities of compounds.

    PubMed

    Li, Qian; Li, Xudong; Li, Canghai; Chen, Lirong; Song, Jun; Tang, Yalin; Xu, Xiaojie

    2011-03-22

    Traditional virtual screening pays more attention to the predicted binding affinity between a drug molecule and a target related to a certain disease than to phenotypic data of the drug molecule against the disease system, and is therefore often less effective in discovering drugs used to treat many types of complex diseases. Virtual screening against a complex disease by general network estimation has become feasible with the development of network biology and system biology. More effective methods of computational estimation for the whole efficacy of a compound in a complex disease system are needed, given the distinct weightiness of the different targets in a biological process and the standpoint that partial inhibition of several targets can be more efficient than the complete inhibition of a single target. We developed a novel approach by integrating the affinity predictions from multi-target docking studies with biological network efficiency analysis to estimate the anticoagulant activities of compounds. From results of network efficiency calculation for the human clotting cascade, factor Xa and thrombin were identified as the two most fragile enzymes, while the catalytic reaction mediated by complex IXa:VIIIa and the formation of the complex VIIIa:IXa were recognized as the two most fragile biological matter in the human clotting cascade system. Furthermore, the method which combined network efficiency with molecular docking scores was applied to estimate the anticoagulant activities of a series of argatroban intermediates and eight natural products respectively. The better correlation (r = 0.671) between the experimental data and the decrease of the network deficiency suggests that the approach could be a promising computational systems biology tool to aid identification of anticoagulant activities of compounds in drug discovery. 
This article proposes a network-based multi-target computational estimation method for anticoagulant activities of compounds by combining network efficiency analysis with scoring function from molecular docking.
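
    The idea of ranking network components by how much their inhibition lowers network efficiency can be sketched as follows. The abstract does not define its efficiency measure, so this example assumes the common global efficiency (mean inverse shortest-path length over all ordered node pairs) on an unweighted, undirected graph; the three-node graph and the function names are hypothetical and do not represent the clotting cascade.

```python
from collections import deque


def shortest_path_lengths(graph, source):
    """Breadth-first search distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def global_efficiency(graph):
    """Mean of 1/d(i, j) over ordered pairs; unreachable pairs contribute 0."""
    nodes = list(graph)
    n = len(nodes)
    if n < 2:
        return 0.0
    total = 0.0
    for source in nodes:
        dist = shortest_path_lengths(graph, source)
        total += sum(1.0 / d for target, d in dist.items() if target != source)
    return total / (n * (n - 1))


def inhibit_node(graph, node):
    """Model complete inhibition by deleting every edge touching the node."""
    return {u: [] if u == node else [v for v in nbrs if v != node]
            for u, nbrs in graph.items()}


def efficiency_drop(graph, node):
    """Loss of global efficiency when the node is fully inhibited."""
    return global_efficiency(graph) - global_efficiency(inhibit_node(graph, node))


# Hypothetical toy network: a simple chain A - B - C.
toy = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
drops = {node: efficiency_drop(toy, node) for node in toy}
most_fragile = max(drops, key=drops.get)
```

    In the chain, inhibiting the middle node disconnects everything, so it shows the largest efficiency drop. A partial-inhibition variant would scale edge weights by docking-derived affinities instead of deleting edges, which is closer in spirit to the combined scheme the abstract describes but is not reproduced here.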

  5. A Network-Based Multi-Target Computational Estimation Scheme for Anticoagulant Activities of Compounds

    PubMed Central

    Li, Canghai; Chen, Lirong; Song, Jun; Tang, Yalin; Xu, Xiaojie

    2011-01-01

    Background Traditional virtual screening pays more attention to the predicted binding affinity between a drug molecule and a target related to a certain disease than to phenotypic data of the drug molecule against the disease system, and is therefore often less effective in discovering drugs used to treat many types of complex diseases. Virtual screening against a complex disease by general network estimation has become feasible with the development of network biology and systems biology. More effective methods of computational estimation for the whole efficacy of a compound in a complex disease system are needed, given the distinct weight of the different targets in a biological process and the standpoint that partial inhibition of several targets can be more efficient than the complete inhibition of a single target. Methodology We developed a novel approach that integrates the affinity predictions from multi-target docking studies with biological network efficiency analysis to estimate the anticoagulant activities of compounds. From the results of the network efficiency calculation for the human clotting cascade, factor Xa and thrombin were identified as the two most fragile enzymes, while the catalytic reaction mediated by complex IXa:VIIIa and the formation of the complex VIIIa:IXa were recognized as the two most fragile biological matter in the human clotting cascade system. Furthermore, the method, which combines network efficiency with molecular docking scores, was applied to estimate the anticoagulant activities of a series of argatroban intermediates and eight natural products, respectively. The better correlation (r = 0.671) between the experimental data and the decrease of the network deficiency suggests that the approach could be a promising computational systems biology tool to aid identification of anticoagulant activities of compounds in drug discovery. Conclusions This article proposes a network-based multi-target computational estimation method for anticoagulant activities of compounds by combining network efficiency analysis with a scoring function from molecular docking. PMID:21445339
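
    The abstract does not define "network efficiency". As a rough illustration only, the sketch below assumes one common definition (the average inverse shortest-path length over all ordered node pairs) and models inhibition of a target by deleting its edges. The toy three-node graph and the function names are hypothetical and are not taken from the paper.

```python
from collections import deque


def global_efficiency(adjacency):
    """Average of 1/d(i, j) over ordered node pairs; unreachable pairs add 0."""
    nodes = list(adjacency)
    n = len(nodes)
    if n < 2:
        return 0.0
    total = 0.0
    for source in nodes:
        # Breadth-first search gives shortest path lengths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(1.0 / d for node, d in dist.items() if node != source)
    return total / (n * (n - 1))


def efficiency_drop(adjacency, inhibited):
    """Efficiency lost when every edge touching `inhibited` is removed."""
    reduced = {
        u: [] if u == inhibited else [v for v in nbrs if v != inhibited]
        for u, nbrs in adjacency.items()
    }
    return global_efficiency(adjacency) - global_efficiency(reduced)


# Hypothetical toy cascade: A - B - C (undirected).
toy = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
```

    Under this assumed definition, a node whose inhibition causes a larger drop would be the more "fragile" one; in the toy graph that is the middle node B.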

  6. Method for identifying type I diabetes mellitus in humans

    DOEpatents

    Metz, Thomas O [Kennewick, WA]; Qian, Weijun [Richland, WA]; Jacobs, Jon M [Pasco, WA]; Smith, Richard D [Richland, WA]

    2011-04-12

    A method and system for classifying subject populations utilizing predictive and diagnostic biomarkers for type I diabetes mellitus. The method includes determining the levels of a variety of markers within the serum or plasma of a target organism and correlating these levels to general populations as a screen for predisposition or progressive monitoring of disease presence or predisposition.

  7. Cis-regulatory element based targeted gene finding: genome-wide identification of abscisic acid- and abiotic stress-responsive genes in Arabidopsis thaliana.

    PubMed

    Zhang, Weixiong; Ruan, Jianhua; Ho, Tuan-Hua David; You, Youngsook; Yu, Taotao; Quatrano, Ralph S

    2005-07-15

    A fundamental problem of computational genomics is identifying the genes that respond to certain endogenous cues and environmental stimuli. This problem can be referred to as targeted gene finding. Since gene regulation is mainly determined by the binding of transcription factors and cis-regulatory DNA sequences, most existing gene annotation methods, which exploit the conservation of open reading frames, are not effective in finding target genes. A viable approach to targeted gene finding is to exploit the cis-regulatory elements that are known to be responsible for the transcription of target genes. Given such cis-elements, putative target genes whose promoters contain the elements can be identified. As a case study, we apply the above approach to predict the genes in model plant Arabidopsis thaliana which are inducible by a phytohormone, abscisic acid (ABA), and abiotic stress, such as drought, cold and salinity. We first construct and analyze two ABA specific cis-elements, ABA-responsive element (ABRE) and its coupling element (CE), in A.thaliana, based on their conservation in rice and other cereal plants. We then use the ABRE-CE module to identify putative ABA-responsive genes in A.thaliana. Based on RT-PCR verification and the results from literature, this method has an accuracy rate of 67.5% for the top 40 predictions. The cis-element based targeted gene finding approach is expected to be widely applicable since a large number of cis-elements in many species are available.

  8. Target Highlights in CASP9: Experimental Target Structures for the Critical Assessment of Techniques for Protein Structure Prediction

    PubMed Central

    Kryshtafovych, Andriy; Moult, John; Bartual, Sergio G.; Bazan, J. Fernando; Berman, Helen; Casteel, Darren E.; Christodoulou, Evangelos; Everett, John K.; Hausmann, Jens; Heidebrecht, Tatjana; Hills, Tanya; Hui, Raymond; Hunt, John F.; Jayaraman, Seetharaman; Joachimiak, Andrzej; Kennedy, Michael A.; Kim, Choel; Lingel, Andreas; Michalska, Karolina; Montelione, Gaetano T.; Otero, José M.; Perrakis, Anastassis; Pizarro, Juan C.; van Raaij, Mark J.; Ramelot, Theresa A.; Rousseau, Francois; Tong, Liang; Wernimont, Amy K.; Young, Jasmine; Schwede, Torsten

    2011-01-01

    One goal of the CASP Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction is to identify the current state of the art in protein structure prediction and modeling. A fundamental principle of CASP is blind prediction on a set of relevant protein targets, i.e. the participating computational methods are tested on a common set of experimental target proteins, for which the experimental structures are not known at the time of modeling. Therefore, the CASP experiment would not have been possible without broad support of the experimental protein structural biology community. In this manuscript, several experimental groups discuss the structures of the proteins which they provided as prediction targets for CASP9, highlighting structural and functional peculiarities of these structures: the long tail fibre protein gp37 from bacteriophage T4, the cyclic GMP-dependent protein kinase Iβ (PKGIβ) dimerization/docking domain, the ectodomain of the JTB (Jumping Translocation Breakpoint) transmembrane receptor, Autotaxin (ATX) in complex with an inhibitor, the DNA-Binding J-Binding Protein 1 (JBP1) domain essential for biosynthesis and maintenance of DNA base-J (β-D-glucosyl-hydroxymethyluracil) in Trypanosoma and Leishmania, a so far uncharacterized 73 residue domain from Ruminococcus gnavus with a fold typical for PDZ-like domains, a domain from the Phycobilisome (PBS) core-membrane linker (LCM) phycobiliprotein ApcE from Synechocystis, the Heat shock protein 90 (Hsp90) activators PFC0360w and PFC0270w from Plasmodium falciparum, and 2-oxo-3-deoxygalactonate kinase from Klebsiella pneumoniae. PMID:22020785

  9. Over-expression of the miRNA cluster at chromosome 14q32 in the alcoholic brain correlates with suppression of predicted target mRNA required for oligodendrocyte proliferation.

    PubMed

    Manzardo, A M; Gunewardena, S; Butler, M G

    2013-09-10

    We examined miRNA expression from RNA isolated from the frontal cortex (Brodmann area 9) of 9 alcoholics (6 males, 3 females, mean age 48 years) and 9 matched controls using both the Affymetrix GeneChip miRNA 2.0 and Human Exon 1.0 ST Arrays to further characterize genetic influences in alcoholism and the effects of alcohol consumption on predicted target mRNA expression. A total of 12 human miRNAs were significantly up-regulated in alcohol dependent subjects (fold change≥1.5, false discovery rate (FDR)≤0.3; p<0.05) compared with controls including a cluster of 4 miRNAs (e.g., miR-377, miR-379) from the maternally expressed 14q32 chromosome region. The status of the up-regulated miRNAs was supported using the high-throughput method of exon microarrays showing decreased predicted mRNA gene target expression as anticipated from the same RNA aliquot. Predicted mRNA targets were involved in cellular adhesion (e.g., THBS2), tissue differentiation (e.g., CHN2), neuronal migration (e.g., NDE1), myelination (e.g., UGT8, CNP) and oligodendrocyte proliferation (e.g., ENPP2, SEMA4D1). Our data support an association of alcoholism with up-regulation of a cluster of miRNAs located in the genomic imprinted domain on chromosome 14q32 with their predicted gene targets involved with oligodendrocyte growth, differentiation and signaling. Copyright © 2013 Elsevier B.V. All rights reserved.

  10. Low-Energy Elastic Electron Scattering by Atomic Oxygen

    NASA Technical Reports Server (NTRS)

    Zatsarinny, O.; Bartschat, K.; Tayal, S. S.

    2006-01-01

    The B-spline R-matrix method is employed to investigate the low-energy elastic electron scattering by atomic oxygen. Flexible non-orthogonal sets of radial functions are used to construct the target description and to represent the scattering functions. A detailed investigation regarding the dependence of the predicted partial and total cross sections on the scattering model and the accuracy of the target description is presented. The predicted angle-integrated elastic cross sections are in good agreement with experiment, whereas significant discrepancies are found in the angle-differential elastic cross sections near the forward direction. The near-threshold results are found to strongly depend on the treatment of inner-core short-range correlation effects in the target description, as well as on a proper account of the target polarizability. A sharp increase in the elastic cross sections below 1 eV found in some earlier calculations is judged to be an artifact of an unbalanced description of correlation in the N-electron target structure and the (N+1)-electron-collision problems.

  11. Deep learning methods for protein torsion angle prediction.

    PubMed

    Li, Haiou; Hou, Jie; Adhikari, Badri; Lyu, Qiang; Cheng, Jianlin

    2017-09-18

    Deep learning is one of the most powerful machine learning methods and has achieved state-of-the-art performance in many domains. Since deep learning was introduced to the field of bioinformatics in 2012, it has achieved success in a number of areas such as protein residue-residue contact prediction, secondary structure prediction, and fold recognition. In this work, we developed deep learning methods to improve the prediction of torsion (dihedral) angles of proteins. We design four different deep learning architectures to predict protein torsion angles. The architectures include a deep neural network (DNN), a deep restricted Boltzmann machine (DRBM), a deep recurrent neural network (DRNN) and a deep recurrent restricted Boltzmann machine (DReRBM), since protein torsion angle prediction is a sequence-related problem. In addition to existing protein features, two new features (predicted residue contact number and the error distribution of torsion angles extracted from sequence fragments) are used as input to each of the four deep learning architectures to predict phi and psi angles of protein backbone. The mean absolute error (MAE) of phi and psi angles predicted by DRNN, DReRBM, DRBM and DNN is about 20-21° and 29-30° on an independent dataset. The MAE of phi angle is comparable to the existing methods, but the MAE of psi angle is 29°, 2° lower than the existing methods. On the latest CASP12 targets, our methods also achieved performance better than or comparable to a state-of-the-art method. Our experiment demonstrates that deep learning is a valuable method for predicting protein torsion angles. The deep recurrent network architecture performs slightly better than the deep feed-forward architecture, and the predicted residue contact number and the error distribution of torsion angles extracted from sequence fragments are useful features for improving prediction accuracy.
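
    Because torsion angles are periodic, a mean absolute error such as the one reported above is usually computed with wrap-around, so that 179° and -179° count as 2° apart. The abstract does not say how the authors handled periodicity; the sketch below simply assumes angles in degrees and the shorter arc between each pair. The function name is hypothetical.

```python
def angular_mae(predicted, observed):
    """Mean absolute error between two angle lists (degrees), with wrap-around."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, o in zip(predicted, observed):
        diff = abs(p - o) % 360.0
        # Take the shorter way around the circle.
        total += min(diff, 360.0 - diff)
    return total / len(predicted)
```

    For example, `angular_mae([179.0, -170.0], [-179.0, 170.0])` treats the two errors as 2° and 20° rather than 358° and 340°.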

  12. DrugECs: An Ensemble System with Feature Subspaces for Accurate Drug-Target Interaction Prediction

    PubMed Central

    Jiang, Jinjian; Wang, Nian; Zhang, Jun

    2017-01-01

    Background Drug-target interaction is key in drug discovery, especially in the design of new lead compounds. However, the work to find a new lead compound for a specific target is complicated and hard, and it always leads to many mistakes. Therefore computational techniques are commonly adopted in drug design, which can save time and costs to a significant extent. Results To address the issue, a new prediction system is proposed in this work to identify drug-target interaction. First, drug-target pairs are encoded with a fragment technique and the software “PaDEL-Descriptor.” The fragment technique is for encoding target proteins, which divides each protein sequence into several fragments in order and encodes each fragment with several physicochemical properties of amino acids. The software “PaDEL-Descriptor” creates encoding vectors for drug molecules. Second, the dataset of drug-target pairs is resampled and several overlapped subsets are obtained, which are then input into a kNN (k-Nearest Neighbor) classifier to build an ensemble system. Conclusion Experimental results on the drug-target dataset showed that our method performs better and runs faster than the state-of-the-art predictors. PMID:28744468
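
    The resample-then-kNN idea described above can be sketched roughly as follows. This is not the DrugECs implementation: the abstract gives no details on subset size, number of members, distance measure or voting rule, so random overlapping subsets, squared Euclidean distance and majority voting are all assumptions, and the tiny 2-D dataset stands in for the real encoded drug-target pairs.

```python
import random
from collections import Counter


def knn_predict(train, query, k):
    """Label of `query` by majority vote among its k nearest training points."""
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def ensemble_predict(train, query, n_members=5, subset_size=8, k=3, seed=0):
    """Majority vote over kNN classifiers built on random overlapping subsets."""
    rng = random.Random(seed)
    votes = [
        knn_predict(rng.sample(train, subset_size), query, k)
        for _ in range(n_members)
    ]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical feature vectors: label 0 near the origin, label 1 near (10, 10).
train = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 0), ((0, 2), 0), ((2, 0), 0),
         ((10, 10), 1), ((10, 11), 1), ((11, 10), 1), ((11, 11), 1), ((10, 12), 1), ((12, 10), 1)]
```

    Each member sees a different subset, so the members disagree on borderline queries and the vote smooths that out; on this well-separated toy data every member agrees.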

  13. Transmembrane protein topology prediction using support vector machines.

    PubMed

    Nugent, Timothy; Jones, David T

    2009-05-26

    Alpha-helical transmembrane (TM) proteins are involved in a wide range of important biological processes such as cell signaling, transport of membrane-impermeable molecules, cell-cell communication, cell recognition and cell adhesion. Many are also prime drug targets, and it has been estimated that more than half of all drugs currently on the market target membrane proteins. However, due to the experimental difficulties involved in obtaining high quality crystals, this class of protein is severely under-represented in structural databases. In the absence of structural data, sequence-based prediction methods allow TM protein topology to be investigated. We present a support vector machine-based (SVM) TM protein topology predictor that integrates both signal peptide and re-entrant helix prediction, benchmarked with full cross-validation on a novel data set of 131 sequences with known crystal structures. The method achieves topology prediction accuracy of 89%, while signal peptides and re-entrant helices are predicted with 93% and 44% accuracy respectively. An additional SVM trained to discriminate between globular and TM proteins detected zero false positives, with a low false negative rate of 0.4%. We present the results of applying these tools to a number of complete genomes. Source code, data sets and a web server are freely available from http://bioinf.cs.ucl.ac.uk/psipred/. The high accuracy of TM topology prediction which includes detection of both signal peptides and re-entrant helices, combined with the ability to effectively discriminate between TM and globular proteins, make this method ideally suited to whole genome annotation of alpha-helical transmembrane proteins.

  14. Applications of information theory, genetic algorithms, and neural models to predict oil flow

    NASA Astrophysics Data System (ADS)

    Ludwig, Oswaldo; Nunes, Urbano; Araújo, Rui; Schnitman, Leizer; Lepikson, Herman Augusto

    2009-07-01

    This work introduces a new information-theoretic methodology for choosing variables and their time lags in a prediction setting, particularly when neural networks are used in non-linear modeling. The first contribution of this work is the Cross Entropy Function (XEF) proposed to select input variables and their lags in order to compose the input vector of black-box prediction models. The proposed XEF method is more appropriate than the usually applied Cross Correlation Function (XCF) when the relationship among the input and output signals comes from a non-linear dynamic system. The second contribution is a method that minimizes the Joint Conditional Entropy (JCE) between the input and output variables by means of a Genetic Algorithm (GA). The aim is to take into account the dependence among the input variables when selecting the most appropriate set of inputs for a prediction problem. In short, these methods can be used to assist the selection of input training data that have the necessary information to predict the target data. The proposed methods are applied to a petroleum engineering problem: predicting oil production. Experimental results obtained with a real-world dataset are presented demonstrating the feasibility and effectiveness of the method.

  15. Applicability of a gene expression based prediction method to SD and Wistar rats: an example of CARCINOscreen®.

    PubMed

    Matsumoto, Hiroshi; Saito, Fumiyo; Takeyoshi, Masahiro

    2015-12-01

    Recently, the development of several gene expression-based prediction methods has been attempted in the field of toxicology. CARCINOscreen® is a gene expression-based screening method to predict, with high accuracy, the carcinogenicity of chemicals which target the liver. In this study, we investigated the applicability of the gene expression-based screening method to SD and Wistar rats by using CARCINOscreen®, originally developed with F344 rats, with two carcinogens, 2,4-diaminotoluene and thioacetamide, and two non-carcinogens, 2,6-diaminotoluene and sodium benzoate. After the 28-day repeated dose test was conducted with each chemical in SD and Wistar rats, microarray analysis was performed using total RNA extracted from each liver. Obtained gene expression data were applied to CARCINOscreen®. Predictive scores obtained by the CARCINOscreen® for known carcinogens were > 2 in all strains of rats, while non-carcinogens gave prediction scores below 0.5. These results suggested that the gene expression based screening method, CARCINOscreen®, can be applied to SD and Wistar rats, widely used strains in toxicological studies, by setting an appropriate boundary line of prediction score to classify the chemicals into carcinogens and non-carcinogens.

  16. A Bayesian predictive two-stage design for phase II clinical trials.

    PubMed

    Sambucini, Valeria

    2008-04-15

    In this paper, we propose a Bayesian two-stage design for phase II clinical trials, which represents a predictive version of the single threshold design (STD) recently introduced by Tan and Machin. The STD two-stage sample sizes are determined specifying a minimum threshold for the posterior probability that the true response rate exceeds a pre-specified target value and assuming that the observed response rate is slightly higher than the target. Unlike the STD, we do not refer to a fixed experimental outcome, but take into account the uncertainty about future data. In both stages, the design aims to control the probability of getting a large posterior probability that the true response rate exceeds the target value. Such a probability is expressed in terms of prior predictive distributions of the data. The performance of the design is based on the distinction between analysis and design priors, recently introduced in the literature. The properties of the method are studied when all the design parameters vary.

  17. Ensemble method for dengue prediction

    PubMed Central

    Baugher, Benjamin; Moniz, Linda J.; Bagley, Thomas; Babin, Steven M.; Guven, Erhan

    2018-01-01

    Background In the 2015 NOAA Dengue Challenge, participants made three dengue target predictions for two locations (Iquitos, Peru, and San Juan, Puerto Rico) during four dengue seasons: 1) peak height (i.e., maximum weekly number of cases during a transmission season); 2) peak week (i.e., week in which the maximum weekly number of cases occurred); and 3) total number of cases reported during a transmission season. A dengue transmission season is the 12-month period commencing with the location-specific, historical week with the lowest number of cases. At the beginning of the Dengue Challenge, participants were provided with the same input data for developing the models, with the prediction testing data provided at a later date. Methods Our approach used ensemble models created by combining three disparate types of component models: 1) two-dimensional Method of Analogues models incorporating both dengue and climate data; 2) additive seasonal Holt-Winters models with and without wavelet smoothing; and 3) simple historical models. Of the individual component models created, those with the best performance on the prior four years of data were incorporated into the ensemble models. There were separate ensembles for predicting each of the three targets at each of the two locations. Principal findings Our ensemble models scored higher for peak height and total dengue case counts reported in a transmission season for Iquitos than all other models submitted to the Dengue Challenge. However, the ensemble models did not do nearly as well when predicting the peak week. Conclusions The Dengue Challenge organizers scored the dengue predictions of the Challenge participant groups. Our ensemble approach was the best in predicting the total number of dengue cases reported for transmission season and peak height for Iquitos, Peru. PMID:29298320

  18. Earthquake prediction analysis based on empirical seismic rate: the M8 algorithm

    NASA Astrophysics Data System (ADS)

    Molchan, G.; Romashkova, L.

    2010-12-01

    The quality of space-time earthquake prediction is usually characterized by a 2-D error diagram (n, τ), where n is the fraction of failures-to-predict and τ is the local rate of alarm averaged in space. The most reasonable averaging measure for analysis of a prediction strategy is the normalized rate of target events λ(dg) in a subarea dg. In that case the quantity H = 1 - (n + τ) determines the prediction capability of the strategy. The uncertainty of λ(dg) causes difficulties in estimating H and the statistical significance, α, of prediction results. We investigate this problem theoretically and show how the uncertainty of the measure can be taken into account in two situations, viz., the estimation of α and the construction of a confidence zone for the (n, τ)-parameters of the random strategies. We use our approach to analyse the results from prediction of M >= 8.0 events by the M8 method for the period 1985-2009 (the M8.0+ test). The model of λ(dg) based on the events Mw >= 5.5, 1977-2004, and the magnitude range of target events 8.0 <= M < 8.5 are considered as basic to this M8 analysis. We find the point and upper estimates of α and show that they are still unstable because the number of target events in the experiment is small. However, our results argue in favour of non-triviality of the M8 prediction algorithm.
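
    The quantities in the error diagram above can be illustrated with a small calculation. The sketch assumes that τ is a weighted average of local alarm rates, with weights proportional to the target-event rate λ(dg) in each subarea, and that n is the fraction of target events missed; the function names and the example numbers are hypothetical.

```python
def averaged_alarm_rate(local_alarm_rates, target_rates):
    """tau: local alarm rates averaged with weights proportional to lambda(dg)."""
    total = sum(target_rates)
    if total <= 0:
        raise ValueError("target rates must sum to a positive value")
    return sum(a * r for a, r in zip(local_alarm_rates, target_rates)) / total


def prediction_capability(n_targets, n_missed, tau):
    """H = 1 - (n + tau), where n is the fraction of failures-to-predict."""
    n = n_missed / n_targets
    return 1.0 - (n + tau)


# Two subareas: alarms on 50% and 10% of the time, target rates in ratio 3:1.
tau = averaged_alarm_rate([0.5, 0.1], [3.0, 1.0])
H = prediction_capability(n_targets=5, n_missed=1, tau=tau)
```

    A strategy that declares alarms at random lies on the diagonal n + τ = 1 of the error diagram, so H = 0 there; positive H indicates some predictive capability.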

  19. Inference of Expanded Lrp-Like Feast/Famine Transcription Factor Targets in a Non-Model Organism Using Protein Structure-Based Prediction

    PubMed Central

    Ashworth, Justin; Plaisier, Christopher L.; Lo, Fang Yin; Reiss, David J.; Baliga, Nitin S.

    2014-01-01

    Widespread microbial genome sequencing presents an opportunity to understand the gene regulatory networks of non-model organisms. This requires knowledge of the binding sites for transcription factors whose DNA-binding properties are unknown or difficult to infer. We adapted a protein structure-based method to predict the specificities and putative regulons of homologous transcription factors across diverse species. As a proof-of-concept we predicted the specificities and transcriptional target genes of divergent archaeal feast/famine regulatory proteins, several of which are encoded in the genome of Halobacterium salinarum. This was validated by comparison to experimentally determined specificities for transcription factors in distantly related extremophiles, chromatin immunoprecipitation experiments, and cis-regulatory sequence conservation across eighteen related species of halobacteria. Through this analysis we were able to infer that Halobacterium salinarum employs a divergent local trans-regulatory strategy to regulate genes (carA and carB) involved in arginine and pyrimidine metabolism, whereas Escherichia coli employs an operon. The prediction of gene regulatory binding sites using structure-based methods is useful for the inference of gene regulatory relationships in new species that are otherwise difficult to infer. PMID:25255272

  20. Inference of expanded Lrp-like feast/famine transcription factor targets in a non-model organism using protein structure-based prediction.

    PubMed

    Ashworth, Justin; Plaisier, Christopher L; Lo, Fang Yin; Reiss, David J; Baliga, Nitin S

    2014-01-01

    Widespread microbial genome sequencing presents an opportunity to understand the gene regulatory networks of non-model organisms. This requires knowledge of the binding sites for transcription factors whose DNA-binding properties are unknown or difficult to infer. We adapted a protein structure-based method to predict the specificities and putative regulons of homologous transcription factors across diverse species. As a proof-of-concept we predicted the specificities and transcriptional target genes of divergent archaeal feast/famine regulatory proteins, several of which are encoded in the genome of Halobacterium salinarum. This was validated by comparison to experimentally determined specificities for transcription factors in distantly related extremophiles, chromatin immunoprecipitation experiments, and cis-regulatory sequence conservation across eighteen related species of halobacteria. Through this analysis we were able to infer that Halobacterium salinarum employs a divergent local trans-regulatory strategy to regulate genes (carA and carB) involved in arginine and pyrimidine metabolism, whereas Escherichia coli employs an operon. The prediction of gene regulatory binding sites using structure-based methods is useful for the inference of gene regulatory relationships in new species that are otherwise difficult to infer.

  1. In silico re-identification of properties of drug target proteins.

    PubMed

    Kim, Baeksoo; Jo, Jihoon; Han, Jonghyun; Park, Chungoo; Lee, Hyunju

    2017-05-31

    Computational approaches in the identification of drug targets are expected to reduce time and effort in drug development. Advances in genomics and proteomics provide the opportunity to uncover properties of druggable genomes. Although several studies have been conducted for distinguishing drug targets from non-drug targets, they mainly focus on the sequences and functional roles of proteins. Many other properties of proteins have not been fully investigated. Using the DrugBank (version 3.0) database containing nearly 6,816 drug entries including 760 FDA-approved drugs and 1822 of their targets and human UniProt/Swiss-Prot databases, we defined 1578 non-redundant drug target and 17,575 non-drug target proteins. To select these non-redundant protein datasets, we built four datasets (A, B, C, and D) by considering clustering of paralogous proteins. We first reassessed the widely used properties of drug target proteins. We confirmed and extended that drug target proteins (1) are likely to have more hydrophobic, less polar, less PEST sequences, and more signal peptide sequences and (2) are more involved in enzyme catalysis, oxidation and reduction in cellular respiration, and operational genes. In this study, we proposed new properties (essentiality, expression pattern, PTMs, and solvent accessibility) for effectively identifying drug target proteins. We found that (1) drug targetability and protein essentiality are decoupled, (2) druggable proteins have high expression levels and tissue specificity, and (3) functional post-translational modification residues are enriched in drug target proteins. In addition, to predict the drug targetability of proteins, we exploited two machine learning methods (Support Vector Machine and Random Forest). When we predicted drug targets by combining previously known protein properties and proposed new properties, an F-score of 0.8307 was obtained. When the newly proposed properties are integrated, the prediction performance is improved and these properties are related to drug targets. We believe that our study will provide a new aspect in inferring drug-target interactions.
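
    For reference, the F-score quoted above is normally the harmonic mean of precision and recall. The abstract does not say which variant was used, so the sketch below assumes the balanced F1 form (beta = 1) and takes hypothetical confusion-matrix counts as input.

```python
def f_score(tp, fp, fn, beta=1.0):
    """F-beta score from true-positive, false-positive and false-negative counts."""
    if tp == 0:
        return 0.0  # no correct positive predictions, so precision and recall are 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)
```

    With 8 true positives, 2 false positives and 2 false negatives, precision and recall are both 0.8 and so is F1.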

  2. Analysis of deep learning methods for blind protein contact prediction in CASP12.

    PubMed

    Wang, Sheng; Sun, Siqi; Xu, Jinbo

    2018-03-01

    Here we present the results of protein contact prediction achieved in CASP12 by our RaptorX-Contact server, which is an early implementation of our deep learning method for contact prediction. On a set of 38 free-modeling target domains with a median family size of around 58 effective sequences, our server obtained an average top L/5 long- and medium-range contact accuracy of 47% and 44%, respectively (L = length). A complete implementation has an average accuracy of 59% and 57%, respectively. Our deep learning method formulates contact prediction as a pixel-level image labeling problem and simultaneously predicts all residue pairs of a protein using a combination of two deep residual neural networks, taking as input the residue conservation information, predicted secondary structure and solvent accessibility, contact potential, and coevolution information. Our approach differs from existing methods mainly in (1) formulating contact prediction as a pixel-level image labeling problem instead of an image-level classification problem; (2) simultaneously predicting all contacts of an individual protein to make effective use of contact occurrence patterns; and (3) integrating both one-dimensional and two-dimensional deep convolutional neural networks to effectively learn complex sequence-structure relationship including high-order residue correlation. This paper discusses the RaptorX-Contact pipeline, both contact prediction and contact-based folding results, and finally the strength and weakness of our method. © 2017 Wiley Periodicals, Inc.
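
    The "top L/5" accuracy reported above can be computed along the following lines. The separation thresholds are an assumption based on the usual CASP convention (long-range: sequence separation of at least 24 residues; medium-range: 12 to 23), which the abstract does not spell out, and the function and argument names are hypothetical.

```python
def top_contact_accuracy(scores, true_contacts, length, min_sep, max_sep=None, fraction=5):
    """Fraction of the top length//fraction predicted pairs that are true contacts.

    scores: dict mapping (i, j) with i < j to a predicted contact probability.
    true_contacts: set of (i, j) pairs that are contacts in the native structure.
    """
    in_range = [
        (pair, score)
        for pair, score in scores.items()
        if pair[1] - pair[0] >= min_sep
        and (max_sep is None or pair[1] - pair[0] <= max_sep)
    ]
    # Rank candidate pairs by predicted probability, highest first.
    in_range.sort(key=lambda item: item[1], reverse=True)
    top = in_range[: max(1, length // fraction)]
    if not top:
        return 0.0
    hits = sum(1 for pair, _ in top if pair in true_contacts)
    return hits / len(top)


# Hypothetical predictions for a protein of length L = 10 (so L/5 = 2 pairs).
scores = {(0, 30): 0.9, (1, 40): 0.8, (2, 50): 0.1, (3, 10): 0.95, (5, 20): 0.7}
true_contacts = {(0, 30), (2, 50), (5, 20)}
long_range = top_contact_accuracy(scores, true_contacts, length=10, min_sep=24)
medium_range = top_contact_accuracy(scores, true_contacts, length=10, min_sep=12, max_sep=23)
```

    This version divides by the number of pairs actually evaluated; some evaluations divide by L/5 even when fewer candidate pairs exist, which gives a lower value in that case.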

  3. Miscellaneous Topics in Computer-Aided Drug Design: Synthetic Accessibility and GPU Computing, and Other Topics.

    PubMed

    Fukunishi, Yoshifumi; Mashimo, Tadaaki; Misoo, Kiyotaka; Wakabayashi, Yoshinori; Miyaki, Toshiaki; Ohta, Seiji; Nakamura, Mayu; Ikeda, Kazuyoshi

    2016-01-01

    Computer-aided drug design is still a state-of-the-art process in medicinal chemistry, and the main topics in this field have been extensively studied and well reviewed. These topics include compound databases, ligand-binding pocket prediction, protein-compound docking, virtual screening, target/off-target prediction, physical property prediction, molecular simulation and pharmacokinetics/pharmacodynamics (PK/PD) prediction. Message and Conclusion: However, there are also a number of secondary or miscellaneous topics that have been less well covered. For example, methods for synthesizing and predicting the synthetic accessibility (SA) of designed compounds are important in practical drug development, and hardware/software resources for performing the computations in computer-aided drug design are crucial. Cloud computing and general purpose graphics processing unit (GPGPU) computing have been used in virtual screening and molecular dynamics simulations. Not surprisingly, there is a growing demand for computer systems which combine these resources. In the present review, we summarize and discuss these various topics of drug design.

  4. Miscellaneous Topics in Computer-Aided Drug Design: Synthetic Accessibility and GPU Computing, and Other Topics

    PubMed Central

    Fukunishi, Yoshifumi; Mashimo, Tadaaki; Misoo, Kiyotaka; Wakabayashi, Yoshinori; Miyaki, Toshiaki; Ohta, Seiji; Nakamura, Mayu; Ikeda, Kazuyoshi

    2016-01-01

    Abstract: Background Computer-aided drug design is still a state-of-the-art process in medicinal chemistry, and the main topics in this field have been extensively studied and well reviewed. These topics include compound databases, ligand-binding pocket prediction, protein-compound docking, virtual screening, target/off-target prediction, physical property prediction, molecular simulation and pharmacokinetics/pharmacodynamics (PK/PD) prediction. Message and Conclusion: However, there are also a number of secondary or miscellaneous topics that have been less well covered. For example, methods for synthesizing and predicting the synthetic accessibility (SA) of designed compounds are important in practical drug development, and hardware/software resources for performing the computations in computer-aided drug design are crucial. Cloud computing and general purpose graphics processing unit (GPGPU) computing have been used in virtual screening and molecular dynamics simulations. Not surprisingly, there is a growing demand for computer systems which combine these resources. In the present review, we summarize and discuss these various topics of drug design. PMID:27075578

  5. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, P; Kuo, L; Yorke, E

    Purpose: To develop a biological modeling strategy which incorporates the response observed on the mid-treatment PET/CT into a dose escalation design for adaptive radiotherapy of non-small-cell lung cancer. Method: FDG-PET/CT was acquired midway through standard fractionated treatment and registered to pre-treatment planning PET/CT to evaluate radiation response of lung cancer. Each mid-treatment PET voxel was assigned the median SUV inside a concentric 1cm-diameter sphere to account for registration and imaging uncertainties. For each voxel, the planned radiation dose, pre- and mid-treatment SUVs were used to parameterize the linear-quadratic model, which was then utilized to predict the SUV distribution after the full prescribed dose. Voxels with predicted post-treatment SUV≥2 were identified as the resistant target (response arm). An adaptive simultaneous integrated boost was designed to escalate dose to the resistant target as high as possible, while keeping prescription dose to the original target and lung toxicity intact. In contrast, an adaptive target volume was delineated based only on the intensity of mid-treatment PET/CT (intensity arm), and a similar adaptive boost plan was optimized. The dose escalation capability of the two approaches was compared. Result: Images of three patients were used in this planning study. For one patient, SUV prediction indicated complete response and no necessary dose escalation. For the other two, resistant targets defined in the response arm were multifocal, and on average accounted for 25% of the pre-treatment target, compared to 67% in the intensity arm. The smaller response arm targets led to a 6Gy higher mean target dose in the adaptive escalation design. Conclusion: This pilot study suggests that adaptive dose escalation to a biologically resistant target predicted from a pre- and mid-treatment PET/CT may be more effective than escalation based on the mid-treatment PET/CT alone. More plans and ultimately clinical protocols are needed to validate this approach. MSKCC has a research agreement with Varian Medical System.

  6. Lessons learned from participating in D3R 2016 Grand Challenge 2: compounds targeting the farnesoid X receptor

    NASA Astrophysics Data System (ADS)

    Duan, Rui; Xu, Xianjin; Zou, Xiaoqin

    2018-01-01

    D3R 2016 Grand Challenge 2 focused on predictions of binding modes and affinities for 102 compounds against the farnesoid X receptor (FXR). In this challenge, two distinct methods, a docking-based method and a template-based method, were employed by our team for the binding mode prediction. For the new template-based method, 3D ligand similarities were calculated for each query compound against the ligands in the co-crystal structures of FXR available in Protein Data Bank. The binding mode was predicted based on the co-crystal protein structure containing the ligand with the best ligand similarity score against the query compound. For the FXR dataset, the template-based method achieved a better performance than the docking-based method on the binding mode prediction. For the binding affinity prediction, an in-house knowledge-based scoring function ITScore2 and MM/PBSA approach were employed. Good performance was achieved for MM/PBSA, whereas the performance of ITScore2 was sensitive to ligand composition, e.g. the percentage of carbon atoms in the compounds. The sensitivity to ligand composition could be a clue for the further improvement of our knowledge-based scoring function.

  7. Ligand Binding Site Detection by Local Structure Alignment and Its Performance Complementarity

    PubMed Central

    Lee, Hui Sun; Im, Wonpil

    2013-01-01

    Accurate determination of potential ligand binding sites (BS) is a key step for protein function characterization and structure-based drug design. Despite promising results of template-based BS prediction methods using global structure alignment (GSA), there is room to improve the performance by properly incorporating local structure alignment (LSA) because BS are local structures and often similar for proteins with dissimilar global folds. We present a template-based ligand BS prediction method using G-LoSA, our LSA tool. A large benchmark set validation shows that G-LoSA predicts drug-like ligands’ positions in single-chain protein targets more precisely than TM-align, a GSA-based method, while the overall success rate of TM-align is better. G-LoSA is particularly efficient for accurate detection of local structures conserved across proteins with diverse global topologies. Recognizing the performance complementarity of G-LoSA to TM-align and a non-template geometry-based method, fpocket, a robust consensus scoring method, CMCS-BSP (Complementary Methods and Consensus Scoring for ligand Binding Site Prediction), is developed and shows improvement on prediction accuracy. The G-LoSA source code is freely available at http://im.bioinformatics.ku.edu/GLoSA. PMID:23957286

  8. Autonomous Motion Planning Using a Predictive Temporal Method

    DTIC Science & Technology

    2009-01-01

    interception test. ......150 5-20 Target and solution path heading angles for target interception test. ..............................151 10 LIST...environment as a series of distances and angles . Regardless of the technique, this knowledge of the surrounding area is crucial for the issue of...to, the rather simplistic vector driver algorithms which compute the angle between the current vehicle heading and the heading to the goal and

  9. MicroRNA-Target Network Inference and Local Network Enrichment Analysis Identify Two microRNA Clusters with Distinct Functions in Head and Neck Squamous Cell Carcinoma

    PubMed Central

    Sass, Steffen; Pitea, Adriana; Unger, Kristian; Hess, Julia; Mueller, Nikola S.; Theis, Fabian J.

    2015-01-01

    MicroRNAs represent ~22 nt long endogenous small RNA molecules that have been experimentally shown to regulate gene expression post-transcriptionally. One main interest in miRNA research is the investigation of their functional roles, which can typically be accomplished by identification of mi-/mRNA interactions and functional annotation of target gene sets. We here present a novel method “miRlastic”, which infers miRNA-target interactions using transcriptomic data as well as prior knowledge and performs functional annotation of target genes by exploiting the local structure of the inferred network. For the network inference, we applied linear regression modeling with elastic net regularization on matched microRNA and messenger RNA expression profiling data to perform feature selection on prior knowledge from sequence-based target prediction resources. The novelty of miRlastic inference originates in predicting data-driven intra-transcriptome regulatory relationships through feature selection. With synthetic data, we showed that miRlastic outperformed commonly used methods and was suitable even for low sample sizes. To gain insight into the functional role of miRNAs and to determine joint functional properties of miRNA clusters, we introduced a local enrichment analysis procedure. The principle of this procedure lies in identifying regions of high functional similarity by evaluating the shortest paths between genes in the network. We can finally assign functional roles to the miRNAs by taking their regulatory relationships into account. We thoroughly evaluated miRlastic on a cohort of head and neck cancer (HNSCC) patients provided by The Cancer Genome Atlas. We inferred an mi-/mRNA regulatory network for human papilloma virus (HPV)-associated miRNAs in HNSCC. The resulting network best enriched for experimentally validated miRNA-target interaction, when compared to common methods. 
Finally, the local enrichment step identified two functional clusters of miRNAs that were predicted to mediate HPV-associated dysregulation in HNSCC. Our novel approach was able to characterize distinct pathway regulations from matched miRNA and mRNA data. An R package of miRlastic was made available through: http://icb.helmholtz-muenchen.de/mirlastic. PMID:26694379

  10. MicroRNA-Target Network Inference and Local Network Enrichment Analysis Identify Two microRNA Clusters with Distinct Functions in Head and Neck Squamous Cell Carcinoma.

    PubMed

    Sass, Steffen; Pitea, Adriana; Unger, Kristian; Hess, Julia; Mueller, Nikola S; Theis, Fabian J

    2015-12-18

    MicroRNAs represent ~22 nt long endogenous small RNA molecules that have been experimentally shown to regulate gene expression post-transcriptionally. One main interest in miRNA research is the investigation of their functional roles, which can typically be accomplished by identification of mi-/mRNA interactions and functional annotation of target gene sets. We here present a novel method "miRlastic", which infers miRNA-target interactions using transcriptomic data as well as prior knowledge and performs functional annotation of target genes by exploiting the local structure of the inferred network. For the network inference, we applied linear regression modeling with elastic net regularization on matched microRNA and messenger RNA expression profiling data to perform feature selection on prior knowledge from sequence-based target prediction resources. The novelty of miRlastic inference originates in predicting data-driven intra-transcriptome regulatory relationships through feature selection. With synthetic data, we showed that miRlastic outperformed commonly used methods and was suitable even for low sample sizes. To gain insight into the functional role of miRNAs and to determine joint functional properties of miRNA clusters, we introduced a local enrichment analysis procedure. The principle of this procedure lies in identifying regions of high functional similarity by evaluating the shortest paths between genes in the network. We can finally assign functional roles to the miRNAs by taking their regulatory relationships into account. We thoroughly evaluated miRlastic on a cohort of head and neck cancer (HNSCC) patients provided by The Cancer Genome Atlas. We inferred an mi-/mRNA regulatory network for human papilloma virus (HPV)-associated miRNAs in HNSCC. The resulting network best enriched for experimentally validated miRNA-target interaction, when compared to common methods. 
Finally, the local enrichment step identified two functional clusters of miRNAs that were predicted to mediate HPV-associated dysregulation in HNSCC. Our novel approach was able to characterize distinct pathway regulations from matched miRNA and mRNA data. An R package of miRlastic was made available through: http://icb.helmholtz-muenchen.de/mirlastic.

  11. A network analysis of the Chinese medicine Lianhua-Qingwen formula to identify its main effective components.

    PubMed

    Wang, Chun-Hua; Zhong, Yi; Zhang, Yan; Liu, Jin-Ping; Wang, Yue-Fei; Jia, Wei-Na; Wang, Guo-Cai; Li, Zheng; Zhu, Yan; Gao, Xiu-Mei

    2016-02-01

    Chinese medicine is known to treat complex diseases with multiple components and multiple targets. However, the main effective components and their related key targets and functions remain to be identified. Herein, a network analysis method was developed to identify the main effective components and key targets of a Chinese medicine, Lianhua-Qingwen Formula (LQF). The LQF is commonly used for the prevention and treatment of viral influenza in China. It is composed of 11 herbs, gypsum and menthol with 61 compounds being identified in our previous work. In this paper, these 61 candidate compounds were used to find their related targets and construct the predicted-target (PT) network. An influenza-related protein-protein interaction (PPI) network was constructed and integrated with the PT network. Then the compound-effective target (CET) network and compound-ineffective target network (CIT) were extracted, respectively. A novel approach was developed to identify effective components by comparing CET and CIT networks. As a result, 15 main effective components were identified along with 61 corresponding targets. 7 of these main effective components were further experimentally validated to have antivirus efficacy in vitro. The main effective component-target (MECT) network was further constructed with main effective components and their key targets. Gene Ontology (GO) analysis of the MECT network predicted key functions such as NO production being modulated by the LQF. Interestingly, five effective components were experimentally tested and exhibited inhibitory effects on NO production in the LPS induced RAW 264.7 cell. In summary, we have developed a novel approach to identify the main effective components in a Chinese medicine LQF and experimentally validated some of the predictions.

  12. Prediction of thermal coagulation from the instantaneous strain distribution induced by high-intensity focused ultrasound

    NASA Astrophysics Data System (ADS)

    Iwasaki, Ryosuke; Takagi, Ryo; Tomiyasu, Kentaro; Yoshizawa, Shin; Umemura, Shin-ichiro

    2017-07-01

    The targeting of the ultrasound beam and the prediction of thermal lesion formation in advance are the requirements for monitoring high-intensity focused ultrasound (HIFU) treatment with safety and reproducibility. To visualize the HIFU focal zone, we utilized an acoustic radiation force impulse (ARFI) imaging-based method. After inducing displacements inside tissues with pulsed HIFU called the push pulse exposure, the distribution of axial displacements started expanding and moving. To acquire RF data immediately after and during the HIFU push pulse exposure to improve prediction accuracy, we attempted methods using extrapolation estimation and applying HIFU noise elimination. The distributions going back in the time domain from the end of push pulse exposure are in good agreement with tissue coagulation at the center. The results suggest that the proposed focal zone visualization employing pulsed HIFU entailing the high-speed ARFI imaging method is useful for the prediction of thermal coagulation in advance.

  13. Prediction of missing common genes for disease pairs using network based module separation on incomplete human interactome.

    PubMed

    Akram, Pakeeza; Liao, Li

    2017-12-06

    Identification of common genes associated with comorbid diseases can be critical in understanding their pathobiological mechanism. This work presents a novel method to predict missing common genes associated with a disease pair. Searching for missing common genes is formulated as an optimization problem to minimize the network-based module separation between two subgraphs produced by mapping the genes associated with each disease onto the interactome. Using cross-validation on more than 600 disease pairs, our method achieves a significantly higher average receiver operating characteristic (ROC) score of 0.95, compared to a baseline ROC score of 0.60 using randomized data. Missing common gene prediction aims to complete the gene set associated with comorbid diseases for better understanding of biological intervention. It will also be useful for gene-targeted therapeutics related to comorbid diseases. This method can be further considered for prediction of missing edges to complete the subgraph associated with a disease pair.
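
    The abstract does not say which definition of module separation is used. As a rough illustration only, the sketch below assumes one commonly used network-based form, s_AB = <d_AB> - (<d_AA> + <d_BB>)/2, where each term is a mean of nearest-neighbour shortest-path distances. The toy graph, the function names, and the choice of formula are all assumptions and are not taken from the record above.

```python
from collections import deque


def bfs_distances(graph, source):
    """Shortest-path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in graph.get(node, ()):
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist


def nearest_distances(graph, sources, targets, exclude_self):
    """For each source node, the distance to its nearest reachable target node."""
    result = []
    for s in sources:
        dist = bfs_distances(graph, s)
        cands = [dist[t] for t in targets
                 if t in dist and not (exclude_self and t == s)]
        if cands:
            result.append(min(cands))
    return result


def mean(values):
    if not values:
        raise ValueError("no reachable node pairs")
    return sum(values) / len(values)


def module_separation(graph, genes_a, genes_b):
    """Assumed separation measure: s_AB = <d_AB> - (<d_AA> + <d_BB>) / 2."""
    d_aa = mean(nearest_distances(graph, genes_a, genes_a, exclude_self=True))
    d_bb = mean(nearest_distances(graph, genes_b, genes_b, exclude_self=True))
    # Between-module term: nodes of A look into B and nodes of B look into A;
    # a gene shared by both sets contributes a distance of 0.
    d_ab = mean(nearest_distances(graph, genes_a, genes_b, exclude_self=False)
                + nearest_distances(graph, genes_b, genes_a, exclude_self=False))
    return d_ab - (d_aa + d_bb) / 2


# Hypothetical toy interactome: a path 1 - 2 - 3 - 4.
toy_graph = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
separation = module_separation(toy_graph, {1, 2}, {3, 4})
```

    Under this assumed definition, adding a shared gene to both sets pulls the between-module term down, which is why searching for missing common genes can be posed as minimizing the separation.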

  14. TARGET Research Goals

    Cancer.gov

    TARGET researchers use various sequencing and array-based methods to examine the genomes, transcriptomes, and for some diseases epigenomes of select childhood cancers. This “multi-omic” approach generates a comprehensive profile of molecular alterations for each cancer type. Alterations are changes in DNA or RNA, such as rearrangements in chromosome structure or variations in gene expression, respectively. Through computational analyses and assays to validate biological function, TARGET researchers predict which alterations disrupt the function of a gene or pathway and promote cancer growth, progression, and/or survival. Researchers identify candidate therapeutic targets and/or prognostic markers from the cancer-associated alterations.

  15. Biochemical interpretation of quantitative structure-activity relationships (QSAR) for biodegradation of N-heterocycles: a complementary approach to predict biodegradability.

    PubMed

    Philipp, Bodo; Hoff, Malte; Germa, Florence; Schink, Bernhard; Beimborn, Dieter; Mersch-Sundermann, Volker

    2007-02-15

    Prediction of the biodegradability of organic compounds is an ecologically desirable and economically feasible tool for estimating the environmental fate of chemicals. We combined quantitative structure-activity relationships (QSAR) with the systematic collection of biochemical knowledge to establish rules for the prediction of aerobic biodegradation of N-heterocycles. Validated biodegradation data of 194 N-heterocyclic compounds were analyzed using the MULTICASE-method which delivered two QSAR models based on 17 activating (QSAR 1) and on 16 inactivating molecular fragments (QSAR 2), which were statistically significantly linked to efficient or poor biodegradability, respectively. The percentages of correct classifications were over 99% for both models, and cross-validation resulted in 67.9% (QSAR 1) and 70.4% (QSAR 2) correct predictions. Biochemical interpretation of the activating and inactivating characteristics of the molecular fragments delivered plausible mechanistic interpretations and enabled us to establish the following biodegradation rules: (1) Target sites for amidohydrolases and for cytochrome P450 monooxygenases enhance biodegradation of nonaromatic N-heterocycles. (2) Target sites for molybdenum hydroxylases enhance biodegradation of aromatic N-heterocycles. (3) Target sites for hydration by an urocanase-like mechanism enhance biodegradation of imidazoles. Our complementary approach represents a feasible strategy for generating concrete rules for the prediction of biodegradability of organic compounds.

  16. Study of moving object detecting and tracking algorithm for video surveillance system

    NASA Astrophysics Data System (ADS)

    Wang, Tao; Zhang, Rongfu

    2010-10-01

    This paper describes a process for detecting and tracking moving targets in video surveillance. Obtaining a high-quality background is the key to differential target detection in video surveillance. The paper uses a block segmentation method to build a clear background and the background difference method to detect the moving target. After a series of processing steps, a more complete object can be extracted from the original image, and the smallest bounding rectangle is then used to locate the object. In a video surveillance system, camera delay and other factors lead to tracking lag, so a Kalman filter model based on template matching is proposed. Using the deduction and estimation capacity of the Kalman filter, with the center of the smallest bounding rectangle as the quantity to be predicted, the position where the target may appear at the next moment is predicted. Template matching is then performed in the region centered on this position, and the best matching center is determined by calculating the cross-correlation similarity of the current image and the reference image. Because the search scope is narrowed, the search time is reduced and fast tracking is achieved.
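
    The Kalman prediction step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a constant-velocity model applied to one coordinate of the bounding-rectangle center, and the time step `dt`, process noise `q`, measurement noise `r`, and the initial covariance are hypothetical values chosen for the example.

```python
def predict(state, cov, dt, q):
    """Propagate (position, velocity) and its 2x2 covariance one step ahead."""
    p, v = state
    (p00, p01), (p10, p11) = cov
    new_state = (p + v * dt, v)
    # F P F^T + Q with F = [[1, dt], [0, 1]] and Q = q * I (assumed form).
    new_cov = (
        (p00 + dt * (p01 + p10) + dt * dt * p11 + q, p01 + dt * p11),
        (p10 + dt * p11, p11 + q),
    )
    return new_state, new_cov


def update(state, cov, z, r):
    """Correct the state with a measured position z (measurement matrix H = [1, 0])."""
    p, v = state
    (p00, p01), (p10, p11) = cov
    s = p00 + r                    # innovation variance
    k0, k1 = p00 / s, p10 / s      # Kalman gain
    y = z - p                      # innovation
    new_state = (p + k0 * y, v + k1 * y)
    new_cov = (
        ((1 - k0) * p00, (1 - k0) * p01),
        (p10 - k1 * p00, p11 - k1 * p01),
    )
    return new_state, new_cov


def predict_next_center(centers, dt=1.0, q=0.01, r=1.0):
    """Predict the next coordinate of the rectangle center from past observations."""
    state = (centers[0], 0.0)
    cov = ((100.0, 0.0), (0.0, 100.0))   # large initial uncertainty (assumption)
    for z in centers[1:]:
        state, cov = predict(state, cov, dt, q)
        state, cov = update(state, cov, z, r)
    state, _ = predict(state, cov, dt, q)
    return state[0]


# Hypothetical track: the center moves one pixel per frame along x.
next_x = predict_next_center([float(i) for i in range(21)])
```

    For a two-dimensional center the same filter can be run independently on the x and y coordinates, and the predicted point then serves as the center of the reduced template-matching search window.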

  17. Method for high-precision multi-layered thin film deposition for deep and extreme ultraviolet mirrors

    DOEpatents

    Ruffner, J.A.

    1999-06-15

    A method for coating (flat or non-flat) optical substrates with high-reflectivity multi-layer coatings for use at Deep Ultra-Violet (DUV) and Extreme Ultra-Violet (EUV) wavelengths. The method results in a product with minimum feature sizes of less than 0.10 µm for the shortest wavelength (13.4 nm). The present invention employs a computer-based modeling and deposition method to enable lateral and vertical thickness control by scanning the position of the substrate with respect to the sputter target during deposition. The thickness profile of the sputter targets is modeled before deposition and then an appropriate scanning algorithm is implemented to produce any desired, radially-symmetric thickness profile. The present invention offers the ability to predict and achieve a wide range of thickness profiles on flat or figured substrates, i.e., account for 1/R² factor in a model, and the ability to predict and accommodate changes in deposition rate as a result of plasma geometry, i.e., over figured substrates. 15 figs.

  18. Evaluating resective surgery targets in epilepsy patients: A comparison of quantitative EEG methods.

    PubMed

    Müller, Michael; Schindler, Kaspar; Goodfellow, Marc; Pollo, Claudio; Rummel, Christian; Steimer, Andreas

    2018-07-15

    Quantitative analysis of intracranial EEG is a promising tool to assist clinicians in the planning of resective brain surgery in patients suffering from pharmacoresistant epilepsies. Quantifying the accuracy of such tools, however, is nontrivial as a ground truth to verify predictions about hypothetical resections is missing. As one possibility to address this, we use customized hypotheses tests to examine the agreement of the methods on a common set of patients. One method uses machine learning techniques to enable the predictive modeling of EEG time series. The other estimates nonlinear interrelation between EEG channels. Both methods were independently shown to distinguish patients with excellent post-surgical outcome (Engel class I) from those without improvement (Engel class IV) when assessing the electrodes associated with the tissue that was actually resected during brain surgery. Using the AND and OR conjunction of both methods we evaluate the performance gain that can be expected when combining them. Both methods' assessments correlate strongly positively with the similarity between a hypothetical resection and the corresponding actual resection in class I patients. Moreover, the Spearman rank correlation between the methods' patient rankings is significantly positive. To our best knowledge, this is the first study comparing surgery target assessments from fundamentally differing techniques. Although conceptually completely independent, there is a relation between the predictions obtained from both methods. Their broad consensus supports their application in clinical practice to provide physicians additional information in the process of presurgical evaluation. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.
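
    The Spearman rank correlation between two methods' patient rankings, mentioned above, can be computed as in the sketch below. It is a generic textbook implementation (average ranks for ties, then Pearson correlation of the ranks), not the authors' code, and the per-patient scores are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-patient scores from two assessment methods.
method_a = [0.9, 0.4, 0.7, 0.1, 0.5]
method_b = [0.8, 0.3, 0.9, 0.2, 0.4]
rho = spearman(method_a, method_b)
```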

  19. Differential Expression of MicroRNA and Predicted Targets in Pulmonary Sarcoidosis

    PubMed Central

    Crouser, Elliott D.; Julian, Mark W.; Crawford, Melissa; Shao, Guohong; Yu, Lianbo; Planck, Stephen R.; Rosenbaum, James T.; Nana-Sinkam, S. Patrick

    2014-01-01

    Background: Recent studies show that various inflammatory diseases are regulated at the level of RNA translation by small non-coding RNAs, termed microRNAs (miRNAs). We sought to determine whether sarcoidosis tissues harbor a distinct pattern of miRNA expression and then considered their potential molecular targets. Methods and Results: Genome-wide microarray analysis of miRNA expression in lung tissue and peripheral blood mononuclear cells (PBMCs) was performed and differentially expressed (DE)-miRNAs were then validated by real-time PCR. A distinct pattern of DE-miRNA expression was identified in both lung tissue and PBMCs of sarcoidosis patients. A subgroup of DE-miRNAs common to lung and lymph node tissues were predicted to target transforming growth factor (TGFβ)-regulated pathways. Likewise, the DE-miRNAs identified in PBMCs of sarcoidosis patients were predicted to target the TGFβ-regulated “wingless and integrase-1” (WNT) pathway. Conclusions: This study is the first to profile miRNAs in sarcoidosis tissues and to consider their possible roles in disease pathogenesis. Our results suggest that miRNAs regulate TGFβ and related WNT pathways in sarcoidosis tissues, pathways previously incriminated in the pathogenesis of sarcoidosis. PMID:22209793

  20. Charge-to-mass dispersion methods for abrasion-ablation fragmentation models

    NASA Technical Reports Server (NTRS)

    Townsend, L. W.; Norbury, J. W.

    1985-01-01

    Methods to describe the charge-to-mass dispersion distributions of projectile prefragments are presented and used to determine individual isotope cross-sections for various elements produced in the fragmentation of relativistic argon nuclei by carbon targets. Although slight improvements in predicted cross-sections are obtained for the quantum mechanical giant dipole resonance (GDR) distribution when compared with the predictions of the geometric GDR model, the closest agreement between theory and experiment continues to be obtained with the simple hypergeometric distribution, which treats the nucleons in the nucleus as completely uncorrelated.
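
    To illustrate what a hypergeometric charge dispersion means, the sketch below computes the probability that z of the nucleons removed from a projectile are protons when nucleons are drawn without regard to type. It is only the standard hypergeometric formula, not the paper's full cross-section calculation; the choice of argon-40 (Z = 18, N = 22) and the number of removed nucleons are assumptions made for the example.

```python
from math import comb


def proton_removal_distribution(n_protons, n_neutrons, removed):
    """P(z protons among `removed` nucleons), nucleons treated as uncorrelated."""
    total = comb(n_protons + n_neutrons, removed)
    z_min = max(0, removed - n_neutrons)
    z_max = min(n_protons, removed)
    return {
        z: comb(n_protons, z) * comb(n_neutrons, removed - z) / total
        for z in range(z_min, z_max + 1)
    }


# Assumed projectile: argon-40 (18 protons, 22 neutrons), 5 nucleons abraded.
dispersion = proton_removal_distribution(18, 22, 5)
```

    Each key of `dispersion` is a possible number of removed protons, so the prefragment charge is 18 minus that key and its mass number is 40 minus the number of removed nucleons.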

  1. Local Geometry and Evolutionary Conservation of Protein Surfaces Reveal the Multiple Recognition Patches in Protein-Protein Interactions

    PubMed Central

    Laine, Elodie; Carbone, Alessandra

    2015-01-01

    Protein-protein interactions (PPIs) are essential to all biological processes and they represent increasingly important therapeutic targets. Here, we present a new method for accurately predicting protein-protein interfaces, understanding their properties, origins and binding to multiple partners. Contrary to machine learning approaches, our method combines in a rational and very straightforward way three sequence- and structure-based descriptors of protein residues: evolutionary conservation, physico-chemical properties and local geometry. The implemented strategy yields very precise predictions for a wide range of protein-protein interfaces and discriminates them from small-molecule binding sites. Beyond its predictive power, the approach permits to dissect interaction surfaces and unravel their complexity. We show how the analysis of the predicted patches can foster new strategies for PPIs modulation and interaction surface redesign. The approach is implemented in JET2, an automated tool based on the Joint Evolutionary Trees (JET) method for sequence-based protein interface prediction. JET2 is freely available at www.lcqb.upmc.fr/JET2. PMID:26690684

  2. Quantitative prediction of drug side effects based on drug-related features.

    PubMed

    Niu, Yanqing; Zhang, Wen

    2017-09-01

    Unexpected side effects of drugs are a great concern in drug development, and the identification of side effects is an important task. Recently, machine learning methods have been proposed to predict the presence or absence of side effects of interest for drugs, but it is difficult to make accurate predictions for all of them. In this paper, we transform the side effect profiles of drugs into quantitative scores by summing up their side effects with weights. The quantitative scores may measure the dangers of drugs, and thus help to compare the risks of different drugs. Here, we attempt to predict the quantitative scores of drugs, namely the quantitative prediction. Specifically, we explore a variety of drug-related features and evaluate their discriminative power for the quantitative prediction. Then, we consider several feature combination strategies (direct combination, average scoring ensemble combination) to integrate three informative features: chemical substructures, targets, and treatment indications. Finally, the average scoring ensemble model, which produces the better performance, is used as the final quantitative prediction model. Since the weights for side effects are empirical values, we randomly generate different weights in the simulation experiments. The experimental results show that the quantitative method is robust to different weights and produces satisfactory results. Although other state-of-the-art methods cannot make the quantitative prediction directly, their prediction results can be transformed into quantitative scores. By indirect comparison, the proposed method produces much better results than benchmark methods in the quantitative prediction. In conclusion, the proposed method is promising for the quantitative prediction of side effects, and may work cooperatively with existing state-of-the-art methods to reveal the dangers of drugs.
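
    The two ideas named above, a weighted sum over a drug's side effect profile and an average scoring ensemble over feature-specific predictions, can be sketched as follows. The weights, profiles, and per-feature predictions are hypothetical, and the individual feature-based predictors are not modelled here; only the scoring and averaging steps are shown.

```python
def quantitative_score(profile, weights):
    """Weighted sum over a binary side effect profile (1 = side effect present)."""
    if len(profile) != len(weights):
        raise ValueError("profile and weights must have the same length")
    return sum(w for present, w in zip(profile, weights) if present)


def average_scoring_ensemble(predictions_by_feature):
    """Average, drug by drug, the scores predicted from each feature type."""
    columns = list(predictions_by_feature.values())
    n_models = len(columns)
    return [sum(scores) / n_models for scores in zip(*columns)]


# Hypothetical empirical weights for three side effects.
weights = [0.5, 2.0, 1.5]
score = quantitative_score([1, 0, 1], weights)

# Hypothetical predicted scores for two drugs from three feature types.
predictions = {
    "substructures": [1.0, 3.0],
    "targets": [2.0, 5.0],
    "indications": [3.0, 4.0],
}
ensemble = average_scoring_ensemble(predictions)
```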

  3. The MicroRNA Interaction Network of Lipid Diseases

    PubMed Central

    Kandhro, Abdul H.; Shoombuatong, Watshara; Nantasenamat, Chanin; Prachayasittikul, Virapong; Nuchnoi, Pornlada

    2017-01-01

    Background: Dyslipidemia is one of the major forms of lipid disorder, characterized by increased triglycerides (TGs), increased low-density lipoprotein-cholesterol (LDL-C), and decreased high-density lipoprotein-cholesterol (HDL-C) levels in blood. Recently, microRNAs (miRNAs) have been reported to be involved in various biological processes, with potential use as biomarkers and in the diagnosis of various diseases. Computational approaches including text mining have been used recently to analyze abstracts from public databases to observe the relationships/associations between biological molecules, miRNAs, and disease phenotypes. Materials and Methods: In the present study, the significance of text-mined pair associations (miRNA-lipid disease) was estimated by one-sided Fisher's exact test. The top 20 significant miRNA-disease associations were visualized in Cytoscape. The CyTargetLinker plug-in for Cytoscape was used to extend the network and predict new miRNA target genes. The Biological Networks Gene Ontology (BiNGO) plug-in for Cytoscape was used to retrieve gene ontology (GO) annotations for the targeted genes. Results: We retrieved 227 miRNA-lipid disease associations including 148 miRNAs. Analysis of the top 20 significant miRNAs with CyTargetLinker provided defined, predicted and validated gene targets; further analysis of the targeted genes with BiNGO showed that they were significantly associated with lipid, cholesterol, apolipoprotein, and fatty acid GO terms. Conclusion: We are the first to provide a reliable miRNA-lipid disease association network based on text mining. This could help future experimental studies that aim to validate predicted gene targets. PMID:29018475
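
    A one-sided Fisher's exact test on a 2x2 co-occurrence table can be sketched as below. This is the standard hypergeometric tail sum, shown only for illustration; the way the table is built from abstract counts and the example counts themselves are assumptions, since the record does not give them.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided p-value P(X >= a) for the 2x2 table [[a, b], [c, d]].

    Assumed layout for a miRNA-disease pair:
      a = abstracts mentioning both, b = miRNA only,
      c = disease only,              d = neither.
    """
    row1 = a + b          # abstracts mentioning the miRNA
    col1 = a + c          # abstracts mentioning the disease
    n = a + b + c + d     # all abstracts considered
    denom = comb(n, row1)
    upper = min(row1, col1)
    # Sum the hypergeometric probabilities of tables at least as extreme as observed.
    tail = sum(comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, upper + 1))
    return tail / denom


# Hypothetical counts for one miRNA-lipid disease pair.
p_value = fisher_exact_greater(3, 1, 1, 3)
```

    A small p-value indicates that the miRNA and the disease co-occur in abstracts more often than expected by chance, which is how associations could then be ranked to select the most significant ones.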

  4. Progress in sensor performance testing, modeling and range prediction using the TOD method: an overview

    NASA Astrophysics Data System (ADS)

    Bijl, Piet; Hogervorst, Maarten A.; Toet, Alexander

    2017-05-01

    The Triangle Orientation Discrimination (TOD) methodology includes i) a widely applicable, accurate end-to-end EO/IR sensor test, ii) an image-based sensor system model and iii) a Target Acquisition (TA) range model. The method has been extensively validated against TA field performance for a wide variety of well- and under-sampled imagers, systems with advanced image processing techniques such as dynamic super resolution and local adaptive contrast enhancement, and sensors showing smear or noise drift, for both static and dynamic test stimuli and as a function of target contrast. Recently, significant progress has been made in various directions. Dedicated visual and NIR test charts for lab and field testing are available and thermal test benches are on the market. Automated sensor testing using an objective synthetic human observer is within reach. Both an analytical and an image-based TOD model have recently been developed and are being implemented in the European Target Acquisition model ECOMOS and in the EOSTAR TDA. Further, the methodology is being applied for design optimization of high-end security camera systems. Finally, results from a recent perception study suggest that DRI ranges for real targets can be predicted by replacing the relevant distinctive target features by TOD test patterns of the same characteristic size and contrast, enabling a new TA modeling approach. This paper provides an overview.

  5. Docking and scoring protein interactions: CAPRI 2009.

    PubMed

    Lensink, Marc F; Wodak, Shoshana J

    2010-11-15

    Protein docking algorithms are assessed by evaluating blind predictions performed during 2007-2009 in Rounds 13-19 of the community-wide experiment on critical assessment of predicted interactions (CAPRI). We evaluated the ability of these algorithms to sample docking poses and to single out specific association modes in 14 targets, representing 11 distinct protein complexes. These complexes play important biological roles in RNA maturation, G-protein signal processing, and enzyme inhibition and function. One target involved protein-RNA interactions not previously considered in CAPRI, several others were hetero-oligomers, or featured multiple interfaces between the same protein pair. For most targets, predictions started from the experimentally determined structures of the free (unbound) components, or from models built from known structures of related or similar proteins. To succeed they therefore needed to account for conformational changes and model inaccuracies. In total, 64 groups and 12 web-servers submitted docking predictions of which 4420 were evaluated. Overall our assessment reveals that 67% of the groups, more than ever before, produced acceptable models or better for at least one target, with many groups submitting multiple high- and medium-accuracy models for two to six targets. Forty-one groups including four web-servers participated in the scoring experiment with 1296 evaluated models. Scoring predictions also show signs of progress evidenced from the large proportion of correct models submitted. But singling out the best models remains a challenge, which also adversely affects the ability to correctly rank docking models. With the increased interest in translating abstract protein interaction networks into realistic models of protein assemblies, the growing CAPRI community is actively developing more efficient and reliable docking and scoring methods for everyone to use. © 2010 Wiley-Liss, Inc.

  6. Genome-wide prediction of vaccine targets for human herpes simplex viruses using Vaxign reverse vaccinology

    PubMed Central

    2013-01-01

    Herpes simplex virus (HSV) types 1 and 2 (HSV-1 and HSV-2) are the most common infectious agents of humans. No safe and effective HSV vaccines have been licensed. Reverse vaccinology is an emerging and revolutionary vaccine development strategy that starts with the prediction of vaccine targets by informatics analysis of genome sequences. Vaxign (http://www.violinet.org/vaxign) is the first web-based vaccine design program based on reverse vaccinology. In this study, we used Vaxign to analyze 52 herpesvirus genomes, including 3 HSV-1 genomes, one HSV-2 genome, 8 other human herpesvirus genomes, and 40 non-human herpesvirus genomes. The HSV-1 strain 17 genome that contains 77 proteins was used as the seed genome. These 77 proteins are conserved in two other HSV-1 strains (strain F and strain H129). Two envelope glycoproteins gJ and gG do not have orthologs in HSV-2 or 8 other human herpesviruses. Seven HSV-1 proteins (including gJ and gG) do not have orthologs in all 40 non-human herpesviruses. Nineteen proteins are conserved in all human herpesviruses, including capsid scaffold protein UL26.5 (NP_044628.1). As the only HSV-1 protein predicted to be an adhesin, UL26.5 is a promising vaccine target. The MHC Class I and II epitopes were predicted by the Vaxign Vaxitop prediction program and IEDB prediction programs recently installed and incorporated in Vaxign. Our comparative analysis found that the two programs identified largely the same top epitopes but also some positive results predicted from one program might not be positive from another program. Overall, our Vaxign computational prediction provides many promising candidates for rational HSV vaccine development. The method is generic and can also be used to predict other viral vaccine targets. PMID:23514126

  7. Comprehensive human transcription factor binding site map for combinatory binding motifs discovery.

    PubMed

    Müller-Molina, Arnoldo J; Schöler, Hans R; Araúzo-Bravo, Marcos J

    2012-01-01

    To know the map between transcription factors (TFs) and their binding sites is essential to reverse engineer the regulation process. Only about 10%-20% of the transcription factor binding motifs (TFBMs) have been reported. This lack of data hinders understanding gene regulation. To address this drawback, we propose a computational method that exploits never used TF properties to discover the missing TFBMs and their sites in all human gene promoters. The method starts by predicting a dictionary of regulatory "DNA words." From this dictionary, it distills 4098 novel predictions. To disclose the crosstalk between motifs, an additional algorithm extracts TF combinatorial binding patterns creating a collection of TF regulatory syntactic rules. Using these rules, we narrowed down a list of 504 novel motifs that appear frequently in syntax patterns. We tested the predictions against 509 known motifs confirming that our system can reliably predict ab initio motifs with an accuracy of 81%, far higher than previous approaches. We found that on average, 90% of the discovered combinatorial binding patterns target at least 10 genes, suggesting that to control in an independent manner smaller gene sets, supplementary regulatory mechanisms are required. Additionally, we discovered that the new TFBMs and their combinatorial patterns convey biological meaning, targeting TFs and genes related to developmental functions. Thus, among all the possible available targets in the genome, the TFs tend to regulate other TFs and genes involved in developmental functions. We provide a comprehensive resource for regulation analysis that includes a dictionary of "DNA words," newly predicted motifs and their corresponding combinatorial patterns. Combinatorial patterns are a useful filter to discover TFBMs that play a major role in orchestrating other factors and thus, are likely to lock/unlock cellular functional clusters.

  8. Comprehensive Human Transcription Factor Binding Site Map for Combinatory Binding Motifs Discovery

    PubMed Central

    Müller-Molina, Arnoldo J.; Schöler, Hans R.; Araúzo-Bravo, Marcos J.

    2012-01-01

    To know the map between transcription factors (TFs) and their binding sites is essential to reverse engineer the regulation process. Only about 10%–20% of the transcription factor binding motifs (TFBMs) have been reported. This lack of data hinders understanding gene regulation. To address this drawback, we propose a computational method that exploits never used TF properties to discover the missing TFBMs and their sites in all human gene promoters. The method starts by predicting a dictionary of regulatory “DNA words.” From this dictionary, it distills 4098 novel predictions. To disclose the crosstalk between motifs, an additional algorithm extracts TF combinatorial binding patterns creating a collection of TF regulatory syntactic rules. Using these rules, we narrowed down a list of 504 novel motifs that appear frequently in syntax patterns. We tested the predictions against 509 known motifs confirming that our system can reliably predict ab initio motifs with an accuracy of 81%—far higher than previous approaches. We found that on average, 90% of the discovered combinatorial binding patterns target at least 10 genes, suggesting that to control in an independent manner smaller gene sets, supplementary regulatory mechanisms are required. Additionally, we discovered that the new TFBMs and their combinatorial patterns convey biological meaning, targeting TFs and genes related to developmental functions. Thus, among all the possible available targets in the genome, the TFs tend to regulate other TFs and genes involved in developmental functions. We provide a comprehensive resource for regulation analysis that includes a dictionary of “DNA words,” newly predicted motifs and their corresponding combinatorial patterns. Combinatorial patterns are a useful filter to discover TFBMs that play a major role in orchestrating other factors and thus, are likely to lock/unlock cellular functional clusters. PMID:23209563

  9. Enhancing emotional-based target prediction

    NASA Astrophysics Data System (ADS)

    Gosnell, Michael; Woodley, Robert

    2008-04-01

    This work extends existing agent-based target movement prediction to include key ideas of behavioral inertia, steady states, and catastrophic change from existing psychological, sociological, and mathematical work. Existing target prediction work inherently assumes a single steady state for target behavior, and attempts to classify behavior based on a single emotional state set. The enhanced, emotional-based target prediction maintains up to three distinct steady states, or typical behaviors, based on a target's operating conditions and observed behaviors. Each steady state has an associated behavioral inertia, similar to the standard deviation of behaviors within that state. The enhanced prediction framework also allows steady state transitions through catastrophic change and individual steady states could be used in an offline analysis with additional modeling efforts to better predict anticipated target reactions.

  10. A scoring function based on solvation thermodynamics for protein structure prediction

    PubMed Central

    Du, Shiqiao; Harano, Yuichi; Kinoshita, Masahiro; Sakurai, Minoru

    2012-01-01

    We predict protein structure using our recently developed free energy function for describing protein stability, which is focused on solvation thermodynamics. The function is combined with the current most reliable sampling methods, i.e., fragment assembly (FA) and comparative modeling (CM). The prediction is tested using 11 small proteins for which high-resolution crystal structures are available. For 8 of these proteins, sequence similarities are found in the database, and the prediction is performed with CM. Fairly accurate models with average Cα root mean square deviation (RMSD) ∼ 2.0 Å are successfully obtained for all cases. For the rest of the target proteins, we perform the prediction following FA protocols. For 2 cases, we obtain predicted models with an RMSD ∼ 3.0 Å as the best-scored structures. For the other case, the RMSD remains larger than 7 Å. For all the 11 target proteins, our scoring function identifies the experimentally determined native structure as the best structure. Starting from the predicted structure, replica exchange molecular dynamics is performed to further refine the structures. However, we are unable to improve its RMSD toward the experimental structure. The exhaustive sampling by coarse-grained normal mode analysis around the native structures reveals that our function has a linear correlation with RMSDs < 3.0 Å. These results suggest that the function is quite reliable for the protein structure prediction while the sampling method remains one of the major limiting factors in it. The aspects through which the methodology could further be improved are discussed. PMID:27493529
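
The Cα RMSD values quoted in this abstract measure how far a predicted model lies from the experimental structure. As a minimal sketch, assuming the two coordinate sets are already superimposed and list the same atoms in the same order (structure comparisons normally apply an optimal superposition first, which is omitted here), the quantity could be computed as follows. The function name `rmsd` and the toy coordinates are illustrative and do not come from the paper.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of (x, y, z) points.

    Assumes the structures are already superimposed; no alignment is performed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))

# Toy example: every atom is displaced by 2 Å along z.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
native = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
print(rmsd(model, native))
```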

  11. Explaining the disease phenotype of intergenic SNP through predicted long range regulation

    PubMed Central

    Chen, Jingqi; Tian, Weidong

    2016-01-01

    Thousands of disease-associated SNPs (daSNPs) are located in intergenic regions (IGR), making it difficult to understand their association with disease phenotypes. Recent analysis found that non-coding daSNPs were frequently located in or approximate to regulatory elements, inspiring us to try to explain the disease phenotypes of IGR daSNPs through nearby regulatory sequences. Hence, after locating the nearest distal regulatory element (DRE) to a given IGR daSNP, we applied a computational method named INTREPID to predict the target genes regulated by the DRE, and then investigated their functional relevance to the IGR daSNP's disease phenotypes. 36.8% of all IGR daSNP-disease phenotype associations investigated were possibly explainable through the predicted target genes, which were enriched with, were functionally relevant to, or consisted of the corresponding disease genes. This proportion could be further increased to 60.5% if the LD SNPs of daSNPs were also considered. Furthermore, the predicted SNP-target gene pairs were enriched with known eQTL/mQTL SNP-gene relationships. Overall, it's likely that IGR daSNPs may contribute to disease phenotypes by interfering with the regulatory function of their nearby DREs and causing abnormal expression of disease genes. PMID:27280978

  12. Combining self- and cross-docking as benchmark tools: the performance of DockBench in the D3R Grand Challenge 2

    NASA Astrophysics Data System (ADS)

    Salmaso, Veronica; Sturlese, Mattia; Cuzzolin, Alberto; Moro, Stefano

    2018-01-01

    Molecular docking is a powerful tool in the field of computer-aided molecular design. In particular, it is the technique of choice for the prediction of a ligand pose within its target binding site. A multitude of docking methods is available nowadays, whose performance may vary depending on the data set. Therefore, some non-trivial choices should be made before starting a docking simulation. In the same framework, the selection of the target structure to use could be challenging, since the number of available experimental structures is increasing. Both issues have been explored within this work. The pose prediction of a pool of 36 compounds provided by D3R Grand Challenge 2 organizers was preceded by a pipeline to choose the best protein/docking-method couple for each blind ligand. An integrated benchmark approach including ligand shape comparison and cross-docking evaluations was implemented inside our DockBench software. The results are encouraging and show that bringing attention to the choice of the docking simulation fundamental components improves the results of the binding mode predictions.

  13. Parametric bicubic spline and CAD tools for complex targets shape modelling in physical optics radar cross section prediction

    NASA Astrophysics Data System (ADS)

    Delogu, A.; Furini, F.

    1991-09-01

    Increasing interest in radar cross section (RCS) reduction is placing new demands on theoretical, computation, and graphic techniques for calculating scattering properties of complex targets. In particular, computer codes capable of predicting the RCS of an entire aircraft at high frequency and of achieving RCS control with modest structural changes, are becoming of paramount importance in stealth design. A computer code, evaluating the RCS of arbitrary shaped metallic objects that are computer aided design (CAD) generated, and its validation with measurements carried out using ALENIA RCS test facilities are presented. The code, based on the physical optics method, is characterized by an efficient integration algorithm with error control, in order to contain the computer time within acceptable limits, and by an accurate parametric representation of the target surface in terms of bicubic splines.

  14. Maximum a posteriori Bayesian estimation of mycophenolic Acid area under the concentration-time curve: is this clinically useful for dosage prediction yet?

    PubMed

    Staatz, Christine E; Tett, Susan E

    2011-12-01

    This review seeks to summarize the available data about Bayesian estimation of area under the plasma concentration-time curve (AUC) and dosage prediction for mycophenolic acid (MPA) and evaluate whether sufficient evidence is available for routine use of Bayesian dosage prediction in clinical practice. A literature search identified 14 studies that assessed the predictive performance of maximum a posteriori Bayesian estimation of MPA AUC and one report that retrospectively evaluated how closely dosage recommendations based on Bayesian forecasting achieved targeted MPA exposure. Studies to date have mostly been undertaken in renal transplant recipients, with limited investigation in patients treated with MPA for autoimmune disease or haematopoietic stem cell transplantation. All of these studies have involved use of the mycophenolate mofetil (MMF) formulation of MPA, rather than the enteric-coated mycophenolate sodium (EC-MPS) formulation. Bias associated with estimation of MPA AUC using Bayesian forecasting was generally less than 10%. However, some difficulties with imprecision were evident, with values ranging from 4% to 34% (based on estimation involving two or more concentration measurements). Evaluation of whether MPA dosing decisions based on Bayesian forecasting (by the free website service https://pharmaco.chu-limoges.fr) achieved target drug exposure has only been undertaken once. When MMF dosage recommendations were applied by clinicians, a higher proportion (72-80%) of subsequent estimated MPA AUC values were within the 30-60 mg · h/L target range, compared with when dosage recommendations were not followed (only 39-57% within target range). Such findings provide evidence that Bayesian dosage prediction is clinically useful for achieving target MPA AUC. This study, however, was retrospective and focussed only on adult renal transplant recipients. Furthermore, in this study, Bayesian-generated AUC estimations and dosage predictions were not compared with a later full measured AUC but rather with a further AUC estimate based on a second Bayesian analysis. This study also provided some evidence that a useful monitoring schedule for MPA AUC following adult renal transplant would be every 2 weeks during the first month post-transplant, every 1-3 months between months 1 and 12, and each year thereafter. It will be interesting to see further validations in different patient groups using the free website service. In summary, the predictive performance of Bayesian estimation of MPA, comparing estimated with measured AUC values, has been reported in several studies. However, the next step of predicting dosages based on these Bayesian-estimated AUCs, and prospectively determining how closely these predicted dosages give drug exposure matching targeted AUCs, remains largely unaddressed. Further prospective studies are required, particularly in non-renal transplant patients and with the EC-MPS formulation. Other important questions remain to be answered, such as: do Bayesian forecasting methods devised to date use the best population pharmacokinetic models or most accurate algorithms; are the methods simple to use for routine clinical practice; do the algorithms actually improve dosage estimations beyond empirical recommendations in all groups that receive MPA therapy; and, importantly, do the dosage predictions, when followed, improve patient health outcomes?

  15. Predictive Modeling of Developmental Toxicity

    EPA Science Inventory

    The use of alternative methods in conjunction with traditional in vivo developmental toxicity testing has the potential to (1) reduce cost and increase throughput of testing the chemical universe, (2) prioritize chemicals for further targeted toxicity testing and risk assessment,...

  16. Finding lesion correspondences in different views of automated 3D breast ultrasound

    NASA Astrophysics Data System (ADS)

    Tan, Tao; Platel, Bram; Hicks, Michael; Mann, Ritse M.; Karssemeijer, Nico

    2013-02-01

    Screening with automated 3D breast ultrasound (ABUS) is gaining popularity. However, the acquisition of multiple views required to cover an entire breast makes radiologic reading time-consuming. Linking lesions across views can facilitate the reading process. In this paper, we propose a method to automatically predict the position of a lesion in the target ABUS views, given the location of the lesion in a source ABUS view. We combine features describing the lesion location with respect to the nipple, the transducer and the chestwall, with features describing lesion properties such as intensity, spiculation, blobness, contrast and lesion likelihood. By using a grid search strategy, the location of the lesion was predicted in the target view. Our method achieved an error of 15.64 mm ± 16.13 mm. The error is small enough to help locate the lesion with minor additional interaction.

  17. Three-Dimensional Dynamic Deformation Measurements Using Stereoscopic Imaging and Digital Speckle Photography

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Prentice, H. J.; Proud, W. G.

    2006-07-28

    A technique has been developed to determine experimentally the three-dimensional displacement field on the rear surface of a dynamically deforming plate. The technique combines speckle analysis with stereoscopy, using a modified angular-lens method: this incorporates split-frame photography and a simple method by which the effective lens separation can be adjusted and calibrated in situ. Whilst several analytical models exist to predict deformation in extended or semi-infinite targets, the non-trivial nature of the wave interactions complicates the generation and development of analytical models for targets of finite depth. By interrogating specimens experimentally to acquire three-dimensional strain data points, both analytical and numerical model predictions can be verified more rigorously. The technique is applied to the quasi-static deformation of a rubber sheet and dynamically to Mild Steel sheets of various thicknesses.

  18. Self Organizing Map-Based Classification of Cathepsin k and S Inhibitors with Different Selectivity Profiles Using Different Structural Molecular Fingerprints: Design and Application for Discovery of Novel Hits.

    PubMed

    Ihmaid, Saleh K; Ahmed, Hany E A; Zayed, Mohamed F; Abadleh, Mohammed M

    2016-01-30

    The main step in a successful drug discovery pipeline is the identification of small potent compounds that selectively bind to the target of interest with high affinity. However, there is still a shortage of efficient and accurate computational methods with powerful capability to study and hence predict compound selectivity properties. In this work, we propose an affordable machine learning method to perform compound selectivity classification and prediction. For this purpose, we have collected compounds with reported activity and built a selectivity database formed of 153 cathepsin K and S inhibitors that are considered of medicinal interest. This database has three compound sets, two K/S and S/K selective ones and one non-selective KS one. We have subjected this database to the selectivity classification tool 'Emergent Self-Organizing Maps' for exploring its capability to differentiate selective cathepsin inhibitors for one target over the other. The method exhibited good clustering performance for selective ligands with high accuracy (up to 100%). Among the possibilities, BAPs and MACCS molecular structural fingerprints were used for such a classification. The results exhibited the ability of the method for structure-selectivity relationship interpretation and selectivity markers were identified for the design of further novel inhibitors with high activity and target selectivity.

  19. Inter-kingdom prediction certainty evaluation of protein subcellular localization tools: microbial pathogenesis approach for deciphering host microbe interaction.

    PubMed

    Khan, Abdul Arif; Khan, Zakir; Kalam, Mohd Abul; Khan, Azmat Ali

    2018-01-01

    Microbial pathogenesis involves several aspects of host-pathogen interactions, including microbial proteins targeting host subcellular compartments and subsequent effects on host physiology. Such studies are supported by experimental data, but recent detection of bacterial proteins localization through computational eukaryotic subcellular protein targeting prediction tools has also come into practice. We evaluated inter-kingdom prediction certainty of these tools. The bacterial proteins experimentally known to target host subcellular compartments were predicted with eukaryotic subcellular targeting prediction tools, and prediction certainty was assessed. The results indicate that these tools alone are not sufficient for inter-kingdom protein targeting prediction. The correct prediction of pathogen's protein subcellular targeting depends on several factors, including presence of localization signal, transmembrane domain and molecular weight, etc., in addition to approach for subcellular targeting prediction. The detection of protein targeting in endomembrane system is comparatively difficult, as the proteins in this location are channelized to different compartments. In addition, the high specificity of training data set also creates low inter-kingdom prediction accuracy. Current data can help to suggest strategy for correct prediction of bacterial protein's subcellular localization in host cell. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  20. Application of frequency-domain linearized Euler solutions to the prediction of aft fan tones and comparison with experimental measurements on model scale turbofan exhaust nozzles

    NASA Astrophysics Data System (ADS)

    Özyörük, Y.; Tester, B. J.

    2011-08-01

    Although it is widely accepted that aircraft noise needs to be further reduced, there is an equally important, on-going requirement to accurately predict the strengths of all the different aircraft noise sources, not only to ensure that a new aircraft is certifiable and can meet the ever more stringent local airport noise rules but also to prioritize and apply appropriate noise source reduction technologies at the design stage. As the bypass ratio of aircraft engines is increased - in order to reduce fuel consumption, emissions and jet mixing noise - the fan noise that radiates from the bypass exhaust nozzle is becoming one of the loudest engine sources, despite the large areas of acoustically absorptive treatment in the bypass duct. This paper addresses this 'aft fan' noise source, in particular the prediction of the propagation of fan noise through the bypass exhaust nozzle/jet exhaust flow and radiation out to the far-field observer. The proposed prediction method is equally applicable to fan tone and fan broadband noise (and also turbine and core noise) but here the method is validated with measured test data using simulated fan tones. The measured data had been previously acquired on two model scale turbofan engine exhausts with bypass and heated core flows typical of those found in a modern high bypass engine, but under static conditions (i.e. no flight simulation). The prediction method is based on frequency-domain solutions of the linearized Euler equations in conjunction with perfectly matched layer equations at the inlet and far-field boundaries using high-order finite differences. The discrete system of equations is inverted by the parallel sparse solver MUMPS. Far-field predictions are carried out by integrating Kirchhoff's formula in the frequency domain. In addition to the acoustic modes excited and radiated, some non-acoustic waves within the cold stream-ambient shear layer are also captured by the computations at some flow and excitation frequencies. By extracting phase speed information from the near-field pressure solution, these non-acoustic waves are shown to be convective Kelvin-Helmholtz instability waves. Strouhal numbers computed along the shear layer, based on the local momentum thickness, also confirm this in accordance with Michalke's instability criterion for incompressible round jets with a similar shear layer profile. Comparisons of the computed far-field results with the measured acoustic data reveal that, in general, the solver predicts the peak sound levels well when the far field is dominated by the in-duct target mode (the target mode being the one specified to the in-duct mode generator). Calculations also show that the agreement can be considerably improved when the non-target modes are also included, despite their low in-duct levels. This is due to the fact that each duct mode has its own distinct directionality and a non-target low level mode may become dominant at angles where the higher-level target mode is directionally weak. The overall agreement between the computations and experiment strongly suggests that, at least for the range of mean flows and acoustic conditions considered, the physical aeroacoustic radiation processes are fully captured through the frequency-domain solutions to the linearized Euler equations and hence this could form the basis of a reliable aircraft noise prediction method.

  1. Protein-Protein Interface Predictions by Data-Driven Methods: A Review

    PubMed Central

    Xue, Li C; Dobbs, Drena; Bonvin, Alexandre M.J.J.; Honavar, Vasant

    2015-01-01

    Reliably pinpointing which specific amino acid residues form the interface(s) between a protein and its binding partner(s) is critical for understanding the structural and physicochemical determinants of protein recognition and binding affinity, and has wide applications in modeling and validating protein interactions predicted by high-throughput methods, in engineering proteins, and in prioritizing drug targets. Here, we review the basic concepts, principles and recent advances in computational approaches to the analysis and prediction of protein-protein interfaces. We point out caveats for objectively evaluating interface predictors, and discuss various applications of data-driven interface predictors for improving energy model-driven protein-protein docking. Finally, we stress the importance of exploiting binding partner information in reliably predicting interfaces and highlight recent advances in this emerging direction. PMID:26460190

  2. Data-Driven prioritisation of antibody-drug conjugate targets in head and neck squamous cell carcinoma.

    PubMed

    Hanemaaijer, Saskia H; van Gijn, Stephanie E; Oosting, Sjoukje F; Plaat, Boudewijn E C; Moek, Kirsten L; Schuuring, Ed M; van der Laan, Bernard F A M; Roodenburg, Jan L N; van Vugt, Marcel A T M; van der Vegt, Bert; Fehrmann, Rudolf S N

    2018-05-01

    For patients with recurrent or metastatic head and neck squamous cell carcinoma (HNSCC) palliative treatment options that improve overall survival are limited. The prognosis in this group remains poor and there is an unmet need for new therapeutic options. An emerging class of therapeutics, targeting tumor-specific antigens, are antibodies bound to a cytotoxic agent, known as antibody-drug conjugates (ADCs). The aim of this study was to prioritize ADC targets in HNSCC. With a systematic search, we identified 55 different ADC targets currently targeted by registered ADCs and ADCs under clinical evaluation. For these 55 ADC targets, protein overexpression was predicted in a dataset containing 344 HNSCC mRNA expression profiles by using a method called functional genomic mRNA profiling. The ADC target with the highest predicted overexpression was validated by performing immunohistochemistry (IHC) on an independent tissue microarray containing 414 HNSCC tumors. The predicted top 5 overexpressed ADC targets in HNSCC were: glycoprotein nmb (GPNMB), SLIT and NTRK-like family member 6, epidermal growth factor receptor, CD74 and CD44. IHC validation showed combined cytoplasmic and membranous GPNMB protein expression in 92.0% of the cases. Strong expression was seen in 65.9% of the cases. In addition, 86.5% and 67.7% of cases showed ≥5% and >25% GPNMB positive tumor cells, respectively. This study provides a data-driven prioritization of ADCs targets that will facilitate clinicians and drug developers in deciding which ADC should be taken for further clinical evaluation in HNSCC. This might help to improve disease outcome of HNSCC patients. Copyright © 2018 Elsevier Ltd. All rights reserved.

  3. A Performance Weighted Collaborative Filtering algorithm for personalized radiology education.

    PubMed

    Lin, Hongli; Yang, Xuedong; Wang, Weisheng; Luo, Jiawei

    2014-10-01

    Devising an accurate prediction algorithm that can predict the difficulty level of cases for individuals and then selects suitable cases for them is essential to the development of a personalized training system. In this paper, we propose a novel approach, called Performance Weighted Collaborative Filtering (PWCF), to predict the difficulty level of each case for individuals. The main idea of PWCF is to assign an optimal weight to each rating used for predicting the difficulty level of a target case for a trainee, rather than using an equal weight for all ratings as in traditional collaborative filtering methods. The assigned weight is a function of the performance level of the trainee at which the rating was made. The PWCF method and the traditional method are compared using two datasets. The experimental data are then evaluated by means of the MAE metric. Our experimental results show that PWCF outperforms the traditional methods by 8.12% and 17.05%, respectively, over the two datasets, in terms of prediction precision. This suggests that PWCF is a viable method for the development of personalized training systems in radiology education. Copyright © 2014. Published by Elsevier Inc.
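
The core idea described in this abstract, weighting each rating by a function of the rater's performance level instead of weighting all ratings equally, can be sketched as below. This is an illustration under stated assumptions and not the authors' implementation: the abstract does not give the weight function, so `weight_fn` is a hypothetical placeholder, and the tuple layout `(similarity, rating, performance)` and the function names are also assumptions. The MAE helper matches the evaluation metric the abstract names.

```python
def predict_difficulty(neighbor_ratings, weight_fn):
    """Predict a case's difficulty for a trainee from neighbours' ratings.

    neighbor_ratings: list of (similarity, rating, performance) tuples, where
    performance is the rater's performance level when the rating was made.
    weight_fn: maps a performance level to a weight (hypothetical; the paper's
    optimal weight function is not given in the abstract).
    """
    numerator = 0.0
    denominator = 0.0
    for similarity, rating, performance in neighbor_ratings:
        weight = weight_fn(performance)
        numerator += weight * similarity * rating
        denominator += weight * abs(similarity)
    if denominator == 0.0:
        return None  # no usable neighbours
    return numerator / denominator

def mean_absolute_error(predicted, actual):
    """MAE between predicted and observed difficulty levels."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)

ratings = [(1.0, 2.0, 0.5), (1.0, 4.0, 1.0)]
equal = predict_difficulty(ratings, lambda perf: 1.0)      # traditional equal weights
weighted = predict_difficulty(ratings, lambda perf: perf)  # assumed linear weight
print(equal, weighted)
```

With equal weights the prediction is the plain similarity-weighted mean; with the assumed linear weight, the rating made at the higher performance level pulls the prediction toward itself.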

  4. Tools4miRs – one place to gather all the tools for miRNA analysis

    PubMed Central

    Lukasik, Anna; Wójcikowski, Maciej; Zielenkiewicz, Piotr

    2016-01-01

    Summary: MiRNAs are short, non-coding molecules that negatively regulate gene expression and thereby play several important roles in living organisms. Dozens of computational methods for miRNA-related research have been developed, which greatly differ in various aspects. The substantial availability of difficult-to-compare approaches makes it challenging for the user to select a proper tool and prompts the need for a solution that will collect and categorize all the methods. Here, we present tools4miRs, the first platform that gathers currently more than 160 methods for broadly defined miRNA analysis. The collected tools are classified into several general and more detailed categories in which the users can additionally filter the available methods according to their specific research needs, capabilities and preferences. Tools4miRs is also a web-based target prediction meta-server that incorporates user-designated target prediction methods into the analysis of user-provided data. Availability and Implementation: Tools4miRs is implemented in Python using Django and is freely available at tools4mirs.org. Contact: piotr@ibb.waw.pl Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27153626

  5. Tools4miRs - one place to gather all the tools for miRNA analysis.

    PubMed

    Lukasik, Anna; Wójcikowski, Maciej; Zielenkiewicz, Piotr

    2016-09-01

    MiRNAs are short, non-coding molecules that negatively regulate gene expression and thereby play several important roles in living organisms. Dozens of computational methods for miRNA-related research have been developed, which greatly differ in various aspects. The substantial availability of difficult-to-compare approaches makes it challenging for the user to select a proper tool and prompts the need for a solution that will collect and categorize all the methods. Here, we present tools4miRs, the first platform that gathers currently more than 160 methods for broadly defined miRNA analysis. The collected tools are classified into several general and more detailed categories in which the users can additionally filter the available methods according to their specific research needs, capabilities and preferences. Tools4miRs is also a web-based target prediction meta-server that incorporates user-designated target prediction methods into the analysis of user-provided data. Tools4miRs is implemented in Python using Django and is freely available at tools4mirs.org. piotr@ibb.waw.pl Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  6. High-Frequency Jet Ventilation for Complete Target Immobilization and Reduction of Planning Target Volume in Stereotactic High Single-Dose Irradiation of Stage I Non-Small Cell Lung Cancer and Lung Metastases

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fritz, Peter, E-mail: p.h.fritz@t-online.d; Kraus, Hans-Joerg; Muehlnickel, Werner

    2010-09-01

    Purpose: To demonstrate the feasibility of complete target immobilization by means of high-frequency jet ventilation (HFJV); and to show that the saving of planning target volume (PTV) on the stereotactic body radiation therapy (SBRT) under HFJV, compared with SBRT with respiratory motion, can be predicted with reliable accuracy by computed tomography (CT) scans at peak inspiration phase. Methods and Materials: A comparison regarding different methods for defining the PTV was carried out in 22 patients with tumors that clearly moved with respiration. A movement span of the gross tumor volume (GTV) was defined by fusing respiration-correlated CT scans. The PTV enclosed the GTV positions with a safety margin throughout the breathing cycle. To create a PTV from CT scans acquired under HFJV, the same margins were drawn around the immobilized target. In addition, peak inspiration phase CT images (PIP-CTs) were used to approximate a target immobilized by HFJV. Results: The resulting HFJV-PTVs were between 11.6% and 45.4% smaller than the baseline values calculated as respiration-correlated CT-PTVs (median volume reduction, 25.4%). Tentative planning by means of PIP-CT PTVs predicted that in 19 of 22 patients, use of HFJV would lead to a reduction in volume of ≥20%. Using this threshold yielded a positive predictive value of 0.89, as well as a sensitivity of 0.94 and a specificity of 0.5. Conclusions: In all patients, SBRT under HFJV provided a reliable immobilization of the GTVs and achieved a reduction in PTVs, regardless of patient compliance. Tentative planning facilitated the selection of patients who could better undergo radiation in respiratory standstill, both with greater accuracy and lung protection.
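
    The reported predictive value, sensitivity and specificity can be checked with a short Python sketch. The abstract does not list the underlying counts; the confusion-matrix counts below are an assumption, chosen as one assignment consistent with 22 patients, 19 positive predictions and the three reported values.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard definitions from a 2x2 confusion matrix."""
    return {
        "ppv": tp / (tp + fp),          # correct among predicted positives
        "sensitivity": tp / (tp + fn),  # detected among actual positives
        "specificity": tn / (tn + fp),  # rejected among actual negatives
    }


# Assumed counts (not stated in the abstract): 19 predicted positives,
# of which 17 are true; 1 missed positive; 2 true negatives; total 22.
metrics = diagnostic_metrics(tp=17, fp=2, fn=1, tn=2)
rounded = {name: round(value, 2) for name, value in metrics.items()}
```

    Under these assumed counts, only four patients fall below the threshold, which is why the specificity estimate rests on very few cases.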

  7. Free energy landscape for the binding process of Huperzine A to acetylcholinesterase

    PubMed Central

    Bai, Fang; Xu, Yechun; Chen, Jing; Liu, Qiufeng; Gu, Junfeng; Wang, Xicheng; Ma, Jianpeng; Li, Honglin; Onuchic, José N.; Jiang, Hualiang

    2013-01-01

    Drug-target residence time (t = 1/koff, where koff is the dissociation rate constant) has become an important index in discovering better- or best-in-class drugs. However, little effort has been dedicated to developing computational methods that can accurately predict this kinetic parameter or related parameters, koff and activation free energy of dissociation (ΔG≠off). In this paper, energy landscape theory that has been developed to understand protein folding and function is extended to develop a generally applicable computational framework that is able to construct a complete ligand-target binding free energy landscape. This enables both the binding affinity and the binding kinetics to be accurately estimated. We applied this method to simulate the binding event of the anti-Alzheimer’s disease drug (−)−Huperzine A to its target acetylcholinesterase (AChE). The computational results are in excellent agreement with our concurrent experimental measurements. All of the predicted values of binding free energy and activation free energies of association and dissociation deviate from the experimental data only by less than 1 kcal/mol. The method also provides atomic resolution information for the (−)−Huperzine A binding pathway, which may be useful in designing more potent AChE inhibitors. We expect this methodology to be widely applicable to drug discovery and development. PMID:23440190

  8. Free energy landscape for the binding process of Huperzine A to acetylcholinesterase.

    PubMed

    Bai, Fang; Xu, Yechun; Chen, Jing; Liu, Qiufeng; Gu, Junfeng; Wang, Xicheng; Ma, Jianpeng; Li, Honglin; Onuchic, José N; Jiang, Hualiang

    2013-03-12

    Drug-target residence time (t = 1/k(off), where k(off) is the dissociation rate constant) has become an important index in discovering better- or best-in-class drugs. However, little effort has been dedicated to developing computational methods that can accurately predict this kinetic parameter or related parameters, k(off) and activation free energy of dissociation (ΔG(off)≠). In this paper, energy landscape theory that has been developed to understand protein folding and function is extended to develop a generally applicable computational framework that is able to construct a complete ligand-target binding free energy landscape. This enables both the binding affinity and the binding kinetics to be accurately estimated. We applied this method to simulate the binding event of the anti-Alzheimer's disease drug (-)-Huperzine A to its target acetylcholinesterase (AChE). The computational results are in excellent agreement with our concurrent experimental measurements. All of the predicted values of binding free energy and activation free energies of association and dissociation deviate from the experimental data only by less than 1 kcal/mol. The method also provides atomic resolution information for the (-)-Huperzine A binding pathway, which may be useful in designing more potent AChE inhibitors. We expect this methodology to be widely applicable to drug discovery and development.
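
    The relation t = 1/k(off) and the meaning of a 1 kcal/mol deviation can be sketched in Python. The abstract does not say how rate constants are derived from activation free energies; the Eyring equation with a transmission coefficient of 1 is used here purely as an assumption, and the function names are illustrative.

```python
import math

GAS_CONSTANT_KCAL = 1.98720425864083e-3  # kcal/(mol*K)
BOLTZMANN = 1.380649e-23                 # J/K
PLANCK = 6.62607015e-34                  # J*s


def residence_time(k_off: float) -> float:
    """Residence time in seconds from a dissociation rate constant in 1/s."""
    if k_off <= 0:
        raise ValueError("k_off must be positive")
    return 1.0 / k_off


def eyring_rate(delta_g_kcal: float, temperature: float = 298.15) -> float:
    """Rate constant (1/s) from an activation free energy (kcal/mol).

    Assumes the Eyring equation with transmission coefficient 1.
    """
    prefactor = BOLTZMANN * temperature / PLANCK
    return prefactor * math.exp(-delta_g_kcal / (GAS_CONSTANT_KCAL * temperature))


def rate_error_factor(error_kcal: float, temperature: float = 298.15) -> float:
    """Multiplicative change in rate caused by an error in the barrier height."""
    return math.exp(error_kcal / (GAS_CONSTANT_KCAL * temperature))


factor = rate_error_factor(1.0)  # roughly 5.4 at 298.15 K
```

    Under this assumption, a 1 kcal/mol error in the dissociation barrier changes the predicted k(off), and hence the residence time, by a factor of about five at room temperature.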

  9. Sensitivity, Specificity, PPV, and NPV for Predictive Biomarkers.

    PubMed

    Simon, Richard

    2015-08-01

    Molecularly targeted cancer drugs are often developed with companion diagnostics that attempt to identify which patients will have better outcome on the new drug than the control regimen. Such predictive biomarkers are playing an increasingly important role in precision oncology. For diagnostic tests, sensitivity, specificity, positive predictive value, and negative predictive value are usually used as performance measures. This paper discusses these indices for predictive biomarkers, provides methods for their calculation with survival or response endpoints, and describes assumptions involved in their use. Published by Oxford University Press 2015. This work is written by (a) US Government employee(s) and is in the public domain in the US.
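
    For an ordinary diagnostic test, the four indices are linked through prevalence by Bayes' rule, as the following Python sketch shows. It covers only the generic diagnostic-test definitions, not the paper's methods for predictive biomarkers with survival or response endpoints; the function name and example numbers are illustrative.

```python
from typing import Tuple


def predictive_values(
    sensitivity: float, specificity: float, prevalence: float
) -> Tuple[float, float]:
    """Return (PPV, NPV) for a binary test at a given prevalence."""
    for value in (sensitivity, specificity, prevalence):
        if not 0.0 <= value <= 1.0:
            raise ValueError("inputs must lie in [0, 1]")
    # Expected population fractions in each cell of the 2x2 table.
    tp = sensitivity * prevalence
    fn = (1.0 - sensitivity) * prevalence
    tn = specificity * (1.0 - prevalence)
    fp = (1.0 - specificity) * (1.0 - prevalence)
    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("PPV or NPV is undefined for these inputs")
    return tp / (tp + fp), tn / (tn + fn)


# Same test, two prevalences: PPV drops sharply when positives are rare.
ppv_common, npv_common = predictive_values(0.9, 0.9, 0.5)
ppv_rare, npv_rare = predictive_values(0.9, 0.9, 0.1)
```

    The comparison illustrates why sensitivity and specificity alone do not determine the predictive values of a test.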

  10. D3R grand challenge 2015: Evaluation of protein-ligand pose and affinity predictions

    NASA Astrophysics Data System (ADS)

    Gathiaka, Symon; Liu, Shuai; Chiu, Michael; Yang, Huanwang; Stuckey, Jeanne A.; Kang, You Na; Delproposto, Jim; Kubish, Ginger; Dunbar, James B.; Carlson, Heather A.; Burley, Stephen K.; Walters, W. Patrick; Amaro, Rommie E.; Feher, Victoria A.; Gilson, Michael K.

    2016-09-01

    The Drug Design Data Resource (D3R) ran Grand Challenge 2015 between September 2015 and February 2016. Two targets served as the framework to test community docking and scoring methods: (1) HSP90, donated by AbbVie and the Community Structure Activity Resource (CSAR), and (2) MAP4K4, donated by Genentech. The challenges for both target datasets were conducted in two stages, with the first stage testing pose predictions and the capacity to rank compounds by affinity with minimal structural data; and the second stage testing methods for ranking compounds with knowledge of at least a subset of the ligand-protein poses. An additional sub-challenge provided small groups of chemically similar HSP90 compounds amenable to alchemical calculations of relative binding free energy. Unlike previous blinded Challenges, we did not provide cognate receptors or receptors prepared with hydrogens and likewise did not require a specified crystal structure to be used for pose or affinity prediction in Stage 1. Given the freedom to select from over 200 crystal structures of HSP90 in the PDB, participants employed workflows that tested not only core docking and scoring technologies, but also methods for addressing water-mediated ligand-protein interactions, binding pocket flexibility, and the optimal selection of protein structures for use in docking calculations. Nearly 40 participating groups submitted over 350 prediction sets for Grand Challenge 2015. This overview describes the datasets and the organization of the challenge components, summarizes the results across all submitted predictions, and considers broad conclusions that may be drawn from this collaborative community endeavor.

  11. D3R Grand Challenge 2015: Evaluation of Protein-Ligand Pose and Affinity Predictions

    PubMed Central

    Gathiaka, Symon; Liu, Shuai; Chiu, Michael; Yang, Huanwang; Stuckey, Jeanne A; Kang, You Na; Delproposto, Jim; Kubish, Ginger; Dunbar, James B.; Carlson, Heather A.; Burley, Stephen K.; Walters, W. Patrick; Amaro, Rommie E.; Feher, Victoria A.; Gilson, Michael K.

    2017-01-01

    The Drug Design Data Resource (D3R) ran Grand Challenge 2015 between September 2015 and February 2016. Two targets served as the framework to test community docking and scoring methods: (i) HSP90, donated by AbbVie and the Community Structure Activity Resource (CSAR), and (ii) MAP4K4, donated by Genentech. The challenges for both target datasets were conducted in two stages, with the first stage testing pose predictions and the capacity to rank compounds by affinity with minimal structural data; and the second stage testing methods for ranking compounds with knowledge of at least a subset of the ligand-protein poses. An additional sub-challenge provided small groups of chemically similar HSP90 compounds amenable to alchemical calculations of relative binding free energy. Unlike previous blinded Challenges, we did not provide cognate receptors or receptors prepared with hydrogens and likewise did not require a specified crystal structure to be used for pose or affinity prediction in Stage 1. Given the freedom to select from over 200 crystal structures of HSP90 in the PDB, participants employed workflows that tested not only core docking and scoring technologies, but also methods for addressing water-mediated ligand-protein interactions, binding pocket flexibility, and the optimal selection of protein structures for use in docking calculations. Nearly 40 participating groups submitted over 350 prediction sets for Grand Challenge 2015. This overview describes the datasets and the organization of the challenge components, summarizes the results across all submitted predictions, and considers broad conclusions that may be drawn from this collaborative community endeavor. PMID:27696240

  12. Physics-based protein-structure prediction using a hierarchical protocol based on the UNRES force field: assessment in two blind tests.

    PubMed

    Ołdziej, S; Czaplewski, C; Liwo, A; Chinchio, M; Nanias, M; Vila, J A; Khalili, M; Arnautova, Y A; Jagielska, A; Makowski, M; Schafroth, H D; Kaźmierkiewicz, R; Ripoll, D R; Pillardy, J; Saunders, J A; Kang, Y K; Gibson, K D; Scheraga, H A

    2005-05-24

    Recent improvements in the protein-structure prediction method developed in our laboratory, based on the thermodynamic hypothesis, are described. The conformational space is searched extensively at the united-residue level by using our physics-based UNRES energy function and the conformational space annealing method of global optimization. The lowest-energy coarse-grained structures are then converted to an all-atom representation and energy-minimized with the ECEPP/3 force field. The procedure was assessed in two recent blind tests of protein-structure prediction. During the first blind test, we predicted large fragments of alpha and alpha+beta proteins [60-70 residues with C(alpha) rms deviation (rmsd) <6 Å]. However, for alpha+beta proteins, significant topological errors occurred despite low rmsd values. In the second exercise, we predicted whole structures of five proteins (two alpha and three alpha+beta, with sizes of 53-235 residues) with remarkably good accuracy. In particular, for the genomic target TM0487 (a 102-residue alpha+beta protein from Thermotoga maritima), we predicted the complete, topologically correct structure with 7.3-Å C(alpha) rmsd. So far this protein is the largest alpha+beta protein predicted based solely on the amino acid sequence and a physics-based potential-energy function and search procedure. For target T0198, a phosphate transport system regulator PhoU from T. maritima (a 235-residue mainly alpha-helical protein), we predicted the topology of the whole six-helix bundle correctly within 8 Å rmsd, except the 32 C-terminal residues, most of which form a beta-hairpin. These and other examples described in this work demonstrate significant progress in physics-based protein-structure prediction.

  13. Reverse screening methods to search for the protein targets of chemopreventive compounds

    NASA Astrophysics Data System (ADS)

    Huang, Hongbin; Zhang, Guigui; Zhou, Yuquan; Lin, Chenru; Chen, Suling; Lin, Yutong; Mai, Shangkang; Huang, Zunnan

    2018-05-01

    This article is a systematic review of reverse screening methods used to search for the protein targets of chemopreventive compounds or drugs. Typical chemopreventive compounds include components of traditional Chinese medicine, natural compounds and Food and Drug Administration (FDA)-approved drugs. Such compounds are somewhat selective but are predisposed to bind multiple protein targets distributed throughout diverse signaling pathways in human cells. In contrast to conventional virtual screening, which identifies the ligands of a targeted protein from a compound database, reverse screening is used to identify the potential targets or unintended targets of a given compound from a large number of receptors by examining their known ligands or crystal structures. This method, also known as in silico or computational target fishing, is highly valuable for discovering the target receptors of query molecules from terrestrial or marine natural products, exploring the molecular mechanisms of chemopreventive compounds, finding alternative indications of existing drugs by drug repositioning, and detecting adverse drug reactions and drug toxicity. Reverse screening can be divided into three major groups: shape screening, pharmacophore screening and reverse docking. Several large software packages, such as Schrödinger and Discovery Studio; typical software/network services such as ChemMapper, PharmMapper, idTarget and INVDOCK; and practical databases of known target ligands and receptor crystal structures, such as ChEMBL, BindingDB and the Protein Data Bank (PDB), are available for use in these computational methods. Different programs, online services and databases have different applications and constraints. 
Here, we conducted a systematic analysis and multilevel classification of the computational programs, online services and compound libraries available for shape screening, pharmacophore screening and reverse docking to enable non-specialist users to quickly learn and grasp the types of calculations used in protein target fishing. In addition, we review the main features of these methods, programs and databases and provide a variety of examples illustrating the application of one or a combination of reverse screening methods for accurate target prediction.

  14. Reverse Screening Methods to Search for the Protein Targets of Chemopreventive Compounds.

    PubMed

    Huang, Hongbin; Zhang, Guigui; Zhou, Yuquan; Lin, Chenru; Chen, Suling; Lin, Yutong; Mai, Shangkang; Huang, Zunnan

    2018-01-01

    This article is a systematic review of reverse screening methods used to search for the protein targets of chemopreventive compounds or drugs. Typical chemopreventive compounds include components of traditional Chinese medicine, natural compounds and Food and Drug Administration (FDA)-approved drugs. Such compounds are somewhat selective but are predisposed to bind multiple protein targets distributed throughout diverse signaling pathways in human cells. In contrast to conventional virtual screening, which identifies the ligands of a targeted protein from a compound database, reverse screening is used to identify the potential targets or unintended targets of a given compound from a large number of receptors by examining their known ligands or crystal structures. This method, also known as in silico or computational target fishing, is highly valuable for discovering the target receptors of query molecules from terrestrial or marine natural products, exploring the molecular mechanisms of chemopreventive compounds, finding alternative indications of existing drugs by drug repositioning, and detecting adverse drug reactions and drug toxicity. Reverse screening can be divided into three major groups: shape screening, pharmacophore screening and reverse docking. Several large software packages, such as Schrödinger and Discovery Studio; typical software/network services such as ChemMapper, PharmMapper, idTarget, and INVDOCK; and practical databases of known target ligands and receptor crystal structures, such as ChEMBL, BindingDB, and the Protein Data Bank (PDB), are available for use in these computational methods. Different programs, online services and databases have different applications and constraints. 
Here, we conducted a systematic analysis and multilevel classification of the computational programs, online services and compound libraries available for shape screening, pharmacophore screening and reverse docking to enable non-specialist users to quickly learn and grasp the types of calculations used in protein target fishing. In addition, we review the main features of these methods, programs and databases and provide a variety of examples illustrating the application of one or a combination of reverse screening methods for accurate target prediction.

  15. Reverse Screening Methods to Search for the Protein Targets of Chemopreventive Compounds

    PubMed Central

    Huang, Hongbin; Zhang, Guigui; Zhou, Yuquan; Lin, Chenru; Chen, Suling; Lin, Yutong; Mai, Shangkang; Huang, Zunnan

    2018-01-01

    This article is a systematic review of reverse screening methods used to search for the protein targets of chemopreventive compounds or drugs. Typical chemopreventive compounds include components of traditional Chinese medicine, natural compounds and Food and Drug Administration (FDA)-approved drugs. Such compounds are somewhat selective but are predisposed to bind multiple protein targets distributed throughout diverse signaling pathways in human cells. In contrast to conventional virtual screening, which identifies the ligands of a targeted protein from a compound database, reverse screening is used to identify the potential targets or unintended targets of a given compound from a large number of receptors by examining their known ligands or crystal structures. This method, also known as in silico or computational target fishing, is highly valuable for discovering the target receptors of query molecules from terrestrial or marine natural products, exploring the molecular mechanisms of chemopreventive compounds, finding alternative indications of existing drugs by drug repositioning, and detecting adverse drug reactions and drug toxicity. Reverse screening can be divided into three major groups: shape screening, pharmacophore screening and reverse docking. Several large software packages, such as Schrödinger and Discovery Studio; typical software/network services such as ChemMapper, PharmMapper, idTarget, and INVDOCK; and practical databases of known target ligands and receptor crystal structures, such as ChEMBL, BindingDB, and the Protein Data Bank (PDB), are available for use in these computational methods. Different programs, online services and databases have different applications and constraints. 
Here, we conducted a systematic analysis and multilevel classification of the computational programs, online services and compound libraries available for shape screening, pharmacophore screening and reverse docking to enable non-specialist users to quickly learn and grasp the types of calculations used in protein target fishing. In addition, we review the main features of these methods, programs and databases and provide a variety of examples illustrating the application of one or a combination of reverse screening methods for accurate target prediction. PMID:29868550

  16. Baseline and Target Values for PV Forecasts: Toward Improved Solar Power Forecasting: Preprint

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Jie; Hodge, Bri-Mathias; Lu, Siyuan

    2015-08-05

    Accurate solar power forecasting allows utilities to get the most out of the solar resources on their systems. To truly measure the improvements that any new solar forecasting methods can provide, it is important to first develop (or determine) baseline and target solar forecasting at different spatial and temporal scales. This paper aims to develop baseline and target values for solar forecasting metrics. These were informed by close collaboration with utility and independent system operator partners. The baseline values are established based on state-of-the-art numerical weather prediction models and persistence models. The target values are determined based on the reduction in the amount of reserves that must be held to accommodate the uncertainty of solar power output.

  17. SAMPL4 & DOCK3.7: lessons for automated docking procedures

    NASA Astrophysics Data System (ADS)

    Coleman, Ryan G.; Sterling, Teague; Weiss, Dahlia R.

    2014-03-01

    The SAMPL4 challenges were used to test current automated methods for solvation energy, virtual screening, pose and affinity prediction of the molecular docking pipeline DOCK 3.7. Additionally, first-order models of binding affinity were proposed as milestones for any method predicting binding affinity. Several important discoveries about the molecular docking software were made during the challenge: (1) Solvation energies of ligands were five-fold worse than any other method used in SAMPL4, including methods that were similarly fast; (2) HIV Integrase is a challenging target, but automated docking on the correct allosteric site performed well in terms of virtual screening and pose prediction (compared to other methods), whereas affinity prediction, as expected, was very poor; (3) Molecular docking grid sizes can be very important: serious errors were discovered with default settings, which have been adjusted for all future work. Overall, lessons from SAMPL4 suggest many changes to molecular docking tools, not just DOCK 3.7, that could improve the state of the art. Future difficulties and projects will be discussed.

  18. ProTSAV: A protein tertiary structure analysis and validation server.

    PubMed

    Singh, Ankita; Kaushik, Rahul; Mishra, Avinash; Shanker, Asheesh; Jayaram, B

    2016-01-01

    Quality assessment of predicted model structures of proteins is as important as the protein tertiary structure prediction. A highly efficient quality assessment of predicted model structures directs further research on function. Here we present a new server ProTSAV, capable of evaluating predicted model structures based on some popular online servers and standalone tools. ProTSAV furnishes the user with a single quality score in case of individual protein structure along with a graphical representation and ranking in case of multiple protein structure assessment. The server is validated on ~64,446 protein structures including experimental structures from RCSB and predicted model structures for CASP targets and from public decoy sets. ProTSAV succeeds in predicting quality of protein structures with a specificity of 100% and a sensitivity of 98% on experimentally solved structures and achieves a specificity of 88% and a sensitivity of 91% on predicted protein structures of CASP11 targets under 2 Å. The server overcomes the limitations of any single server/method and is seen to be robust in helping in quality assessment. ProTSAV is freely available at http://www.scfbio-iitd.res.in/software/proteomics/protsav.jsp. Copyright © 2015 Elsevier B.V. All rights reserved.

  19. Modeling of video compression effects on target acquisition performance

    NASA Astrophysics Data System (ADS)

    Cha, Jae H.; Preece, Bradley; Espinola, Richard L.

    2009-05-01

    The effect of video compression on image quality was investigated from the perspective of target acquisition performance modeling. Human perception tests were conducted recently at the U.S. Army RDECOM CERDEC NVESD, measuring identification (ID) performance on simulated military vehicle targets at various ranges. These videos were compressed with different quality and/or quantization levels utilizing motion JPEG, motion JPEG2000, and MPEG-4 encoding. To model the degradation on task performance, the loss in image quality is fit to an equivalent Gaussian MTF scaled by the Structural Similarity Image Metric (SSIM). Residual compression artifacts are treated as 3-D spatio-temporal noise. This 3-D noise is found by taking the difference of the uncompressed frame, with the estimated equivalent blur applied, and the corresponding compressed frame. Results show good agreement between the experimental data and the model prediction. This method has led to a predictive performance model for video compression by correlating various compression levels to particular blur and noise input parameters for NVESD target acquisition performance model suite.

  20. Developing a Risk Model to Target High-risk Preventive Interventions for Sexual Assault Victimization among Female U.S. Army Soldiers

    PubMed Central

    Street, Amy E.; Rosellini, Anthony J.; Ursano, Robert J.; Heeringa, Steven G.; Hill, Eric D.; Monahan, John; Naifeh, James A.; Petukhova, Maria V.; Reis, Ben Y.; Sampson, Nancy A.; Bliese, Paul D.; Stein, Murray B.; Zaslavsky, Alan M.; Kessler, Ronald C.

    2016-01-01

    Sexual violence victimization is a significant problem among female U.S. military personnel. Preventive interventions for high-risk individuals might reduce prevalence, but would require accurate targeting. We attempted to develop a targeting model for female Regular U.S. Army soldiers based on theoretically-guided predictors abstracted from administrative data records. As administrative reports of sexual assault victimization are known to be incomplete, parallel machine learning models were developed to predict administratively-recorded (in the population) and self-reported (in a representative survey) victimization. Capture-recapture methods were used to combine predictions across models. Key predictors included low status, crime involvement, and treated mental disorders. Area under the Receiver Operating Characteristic curve was .83−.88. 33.7-63.2% of victimizations occurred among soldiers in the highest-risk ventile (5%). This high concentration of risk suggests that the models could be useful in targeting preventive interventions, although final determination would require careful weighing of intervention costs, effectiveness, and competing risks. PMID:28154788
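
    The "concentration of risk" statistic (the share of all events that fall in the highest-risk 5% of the population) can be computed as in the Python sketch below. The scores and outcomes are synthetic, and the function name is illustrative; nothing here reproduces the study's data or models.

```python
import math
from typing import Sequence


def top_fraction_event_share(
    scores: Sequence[float], outcomes: Sequence[int], fraction: float = 0.05
) -> float:
    """Share of all events occurring among the top `fraction` by predicted risk."""
    if len(scores) != len(outcomes) or not scores:
        raise ValueError("inputs must be non-empty and of equal length")
    total_events = sum(outcomes)
    if total_events == 0:
        raise ValueError("no events to distribute")
    # Size of the top group, e.g. one ventile (5%) of the population.
    n_top = max(1, math.ceil(fraction * len(scores)))
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    top_events = sum(outcome for _, outcome in ranked[:n_top])
    return top_events / total_events


# Synthetic example: 40 people with increasing risk scores and 4 events.
scores = [i / 40 for i in range(40)]
outcomes = [0] * 40
for index in (5, 10, 38, 39):
    outcomes[index] = 1

share = top_fraction_event_share(scores, outcomes)
```

    In this toy example the top ventile contains 2 of the 40 people and 2 of the 4 events, so half of all events are concentrated there.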

  1. Receptor-Mediated Uptake and Intracellular Sorting of Multivalent Lipid Nanoparticles Against the Epidermal Growth Factor Receptor (EGFR) and the Human EGFR 2 (HER2)

    NASA Astrophysics Data System (ADS)

    Tran, David Tu

    In the area of receptor-targeted lipid nanoparticles for drug delivery, efficiency has been mainly focused on cell-specificity, endocytosis, and subsequent effects on bioactivity such as cell growth inhibition. Aspects of targeted liposomal uptake and intracellular sorting are not well defined. This dissertation assessed a series of ligands as targeted functional groups against HER2 and EGFR for liposomal drug delivery. Receptor-mediated uptake, both mono-targeted and dual-targeted to multiple receptors of different ligand valence, and the intracellular sorting of lipid nanoparticles were investigated to improve the delivery of drugs to cancer cells. Lipid nanoparticles were functionalized through a new sequential micelle transfer-conjugation method, while the micelle transfer method was extended to growth factors. Through a combination of both techniques, anti-HER2 and anti-EGFR dual-targeted immunoliposomes with different combinations of ligand valence were developed for comparative studies. With the array of lipid nanoparticles, the uptake and cytotoxicity of lipid nanoparticles in relationship to ligand valence, both mono-targeting and dual-targeting, were evaluated on a small panel of breast cancer cell lines that express HER2 and EGFR of varying levels. Comparable uptake ratios of ligand to expressed receptor and apparent cooperativity were observed. For cell lines that express both receptors, additive dose-uptake effects were also observed with dual-targeted immunoliposomes, which translated to marginal improvements in cell growth inhibition with doxorubicin delivery. Colocalization analysis revealed that ligand-conjugated lipid nanoparticles settle to endosomal compartments similar to their attached ligands. Pathway transregulation and pathway saturation were also observed to affect trafficking. In the end, liposomes routed to the recycling endosomes were never observed to traffic beyond the endosomes nor to be exocytosed like recycled ligands.
Based on the experimental data, models were developed to help interpret and predict the binding and trafficking of lipid nanoparticles. The crosslink multivalent binding model of lipid nanoparticles to monovalent receptors was able to predict ligand valence for optimum binding, cell association concentrations, offer explanations to the antagonistic effects observed from high ligand valence, and predict the binding limitations of both ligand valence and ligand affinity. Hopefully, the models will serve as valuable tools for future optimizations in targeted liposomal drug delivery.

  2. Analysis of A Drug Target-based Classification System using Molecular Descriptors.

    PubMed

    Lu, Jing; Zhang, Pin; Bi, Yi; Luo, Xiaomin

    2016-01-01

    Drug-target interaction is an important topic in drug discovery and drug repositioning. The KEGG database offers drug annotation and classification using a target-based classification system. In this study, we investigated five target-based classes: (I) G protein-coupled receptors; (II) Nuclear receptors; (III) Ion channels; (IV) Enzymes; (V) Pathogens, using molecular descriptors to represent each drug compound. Two popular feature selection methods, maximum relevance minimum redundancy and incremental feature selection, were adopted to extract the important descriptors. Meanwhile, an optimal prediction model based on the nearest neighbor algorithm was constructed, which achieved the best result in identifying drug target-based classes. Finally, some key descriptors were discussed to uncover their important roles in the identification of drug-target classes.

  3. Permeation fill-tube design for inertial confinement fusion target capsules

    DOE PAGES

    Rice, B. S.; Ulreich, J.; Fella, C.; ...

    2017-03-22

    A unique approach for permeation filling of nonpermeable inertial confinement fusion target capsules with deuterium–tritium (DT) is presented. This process uses a permeable capsule coupled into the final target capsule with a 0.03-mm-diameter fill tube. Leak-free permeation filling of glow-discharge polymerization (GDP) targets using this method has been successfully demonstrated, as well as ice layering of the target, yielding an inner ice surface roughness of 1-μm rms (root mean square). Finally, the measured DT ice-thickness profile for this experiment was used to validate a thermal model’s prediction of the same thickness profile.

  4. Domain Regeneration for Cross-Database Micro-Expression Recognition

    NASA Astrophysics Data System (ADS)

    Zong, Yuan; Zheng, Wenming; Huang, Xiaohua; Shi, Jingang; Cui, Zhen; Zhao, Guoying

    2018-05-01

    In this paper, we investigate the cross-database micro-expression recognition problem, where the training and testing samples are from two different micro-expression databases. Under this setting, the training and testing samples would have different feature distributions and hence the performance of most existing micro-expression recognition methods may decrease greatly. To solve this problem, we propose a simple yet effective method called Target Sample Re-Generator (TSRG) in this paper. By using TSRG, we are able to re-generate the samples from target micro-expression database and the re-generated target samples would share same or similar feature distributions with the original source samples. For this reason, we can then use the classifier learned based on the labeled source samples to accurately predict the micro-expression categories of the unlabeled target samples. To evaluate the performance of the proposed TSRG method, extensive cross-database micro-expression recognition experiments designed based on SMIC and CASME II databases are conducted. Compared with recent state-of-the-art cross-database emotion recognition methods, the proposed TSRG achieves more promising results.

  5. Gambling score in earthquake prediction analysis

    NASA Astrophysics Data System (ADS)

    Molchan, G.; Romashkova, L.

    2011-03-01

    The number of successes and the space-time alarm rate are commonly used to characterize the strength of an earthquake prediction method and the significance of prediction results. It has been recently suggested to use a new characteristic to evaluate the forecaster's skill, the gambling score (GS), which incorporates the difficulty of guessing each target event by using different weights for different alarms. We expand parametrization of the GS and use the M8 prediction algorithm to illustrate difficulties of the new approach in the analysis of the prediction significance. We show that the level of significance strongly depends (1) on the choice of alarm weights, (2) on the partitioning of the entire alarm volume into component parts and (3) on the accuracy of the spatial rate measure of target events. These tools are at the disposal of the researcher and can affect the significance estimate. Formally, all reasonable GSs discussed here corroborate that the M8 method is non-trivial in the prediction of 8.0 ≤M < 8.5 events because the point estimates of the significance are in the range 0.5-5 per cent. However, the conservative estimate 3.7 per cent based on the number of successes seems preferable owing to two circumstances: (1) it is based on relative values of the spatial rate and hence is more stable and (2) the statistic of successes enables us to construct analytically an upper estimate of the significance taking into account the uncertainty of the spatial rate measure.
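
    One way to read "different weights for different alarms" is as fair-odds betting against a reference model: a success in an alarm that the reference model considers unlikely earns more than a success in an easy one. The sketch below assumes that formulation (a one-point stake per alarm and a reward of (1 - p)/p on success); it is not necessarily the parametrization examined in the paper, and the function name `gambling_score` and the input values are hypothetical.

```python
def gambling_score(alarms):
    """Sum fair-odds payoffs over alarms.

    alarms: iterable of (p_ref, success) pairs, where p_ref is the
    reference-model probability of a target event inside the alarm
    (an assumed input) and success says whether an event occurred.
    """
    score = 0.0
    for p_ref, success in alarms:
        if not 0.0 < p_ref < 1.0:
            raise ValueError("p_ref must lie strictly between 0 and 1")
        # Fair odds: the expected gain is zero if the reference model is right.
        score += (1.0 - p_ref) / p_ref if success else -1.0
    return score


# Hypothetical alarms: one hard success, one easy success, one failure.
example = [(0.25, True), (0.5, True), (0.5, False)]
print(gambling_score(example))  # 3.0 + 1.0 - 1.0 = 3.0
```

    Under this assumed scheme the score depends directly on the reference probabilities, which is consistent with the abstract's point that the significance estimate is sensitive to the spatial rate measure of target events.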

  6. Dose-volume histogram prediction using density estimation.

    PubMed

    Skarpman Munter, Johanna; Sjölund, Jens

    2015-09-07

    Knowledge of what dose-volume histograms can be expected for a previously unseen patient could increase consistency and quality in radiotherapy treatment planning. We propose a machine learning method that uses previous treatment plans to predict such dose-volume histograms. The key to the approach is the framing of dose-volume histograms in a probabilistic setting. The training consists of estimating, from the patients in the training set, the joint probability distribution of some predictive features and the dose. The joint distribution immediately provides an estimate of the conditional probability of the dose given the values of the predictive features. The prediction consists of estimating, from the new patient, the distribution of the predictive features and marginalizing the conditional probability from the training over this. Integrating the resulting probability distribution for the dose yields an estimate of the dose-volume histogram. To illustrate how the proposed method relates to previously proposed methods, we use the signed distance to the target boundary as a single predictive feature. As a proof-of-concept, we predicted dose-volume histograms for the brainstems of 22 acoustic schwannoma patients treated with stereotactic radiosurgery, and for the lungs of 9 lung cancer patients treated with stereotactic body radiation therapy. Comparing with two previous attempts at dose-volume histogram prediction we find that, given the same input data, the predictions are similar. In summary, we propose a method for dose-volume histogram prediction that exploits the intrinsic probabilistic properties of dose-volume histograms. We argue that the proposed method makes up for some deficiencies in previously proposed methods, thereby potentially increasing ease of use, flexibility and ability to perform well with small amounts of training data.
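
    The training and prediction steps described above can be sketched with discrete bins standing in for a smooth density estimator. This is a minimal illustration under stated assumptions, not the authors' implementation: the binning, the function names `fit_conditional` and `predict_dvh`, and the toy voxel data are all hypothetical.

```python
from collections import Counter, defaultdict


def fit_conditional(samples):
    """Training: estimate P(dose_bin | feature_bin) from voxel samples.

    samples: list of (feature_bin, dose_bin) pairs pooled over the
    training patients (the joint distribution, as counts).
    """
    joint = Counter(samples)
    feature_totals = Counter(f for f, _ in samples)
    cond = defaultdict(dict)
    for (f, d), n in joint.items():
        cond[f][d] = n / feature_totals[f]
    return cond


def predict_dvh(cond, new_features, dose_bins):
    """Prediction: marginalize P(dose | feature) over the new patient's
    feature distribution, then integrate into a cumulative DVH, i.e. the
    fraction of volume receiving at least each dose level."""
    feat_dist = Counter(new_features)
    total = sum(feat_dist.values())
    p_dose = {d: 0.0 for d in dose_bins}
    for f, n in feat_dist.items():
        # Feature bins never seen in training contribute no mass here.
        for d, p in cond.get(f, {}).items():
            p_dose[d] += (n / total) * p
    return {d: sum(p for dd, p in p_dose.items() if dd >= d) for d in dose_bins}


# Toy data: feature bin 0 = near the target boundary, 1 = far from it.
train = [(0, 2), (0, 2), (0, 1), (0, 1), (1, 1), (1, 0), (1, 0), (1, 0)]
cond = fit_conditional(train)
print(predict_dvh(cond, new_features=[0, 1, 1, 1], dose_bins=[0, 1, 2]))
# {0: 1.0, 1: 0.4375, 2: 0.125}
```

    Using the signed distance to the target boundary as the single feature, as in the abstract, would correspond to binning that distance into `feature_bin`.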

  7. Common features of microRNA target prediction tools

    PubMed Central

    Peterson, Sarah M.; Thompson, Jeffrey A.; Ufkin, Melanie L.; Sathyanarayana, Pradeep; Liaw, Lucy; Congdon, Clare Bates

    2014-01-01

    The human genome encodes for over 1800 microRNAs (miRNAs), which are short non-coding RNA molecules that function to regulate gene expression post-transcriptionally. Due to the potential for one miRNA to target multiple gene transcripts, miRNAs are recognized as a major mechanism to regulate gene expression and mRNA translation. Computational prediction of miRNA targets is a critical initial step in identifying miRNA:mRNA target interactions for experimental validation. The available tools for miRNA target prediction encompass a range of different computational approaches, from the modeling of physical interactions to the incorporation of machine learning. This review provides an overview of the major computational approaches to miRNA target prediction. Our discussion highlights three tools for their ease of use, reliance on relatively updated versions of miRBase, and range of capabilities, and these are DIANA-microT-CDS, miRanda-mirSVR, and TargetScan. In comparison across all miRNA target prediction tools, four main aspects of the miRNA:mRNA target interaction emerge as common features on which most target prediction is based: seed match, conservation, free energy, and site accessibility. This review explains these features and identifies how they are incorporated into currently available target prediction tools. MiRNA target prediction is a dynamic field with increasing attention on development of new analysis tools. This review attempts to provide a comprehensive assessment of these tools in a manner that is accessible across disciplines. Understanding the basis of these prediction methodologies will aid in user selection of the appropriate tools and interpretation of the tool output. PMID:24600468
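
    Of the four common features, the seed match is the simplest to illustrate. The sketch below assumes the widely used definition of the seed as miRNA nucleotides 2-8 (counted from the 5' end) and looks for perfectly complementary sites in a target sequence; the abstract itself does not fix these positions, and the function names and the toy UTR are hypothetical. It does not reproduce the scoring of DIANA-microT-CDS, miRanda-mirSVR, or TargetScan.

```python
# Watson-Crick complement for RNA bases.
COMPLEMENT = str.maketrans("AUGC", "UACG")


def seed_site(mirna, start=2, end=8):
    """Return the mRNA site (5'->3') that pairs with miRNA positions
    start..end (1-based, inclusive): the reverse complement of the seed."""
    seed = mirna[start - 1:end]
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_matches(mirna, utr):
    """Return 0-based start positions of perfect seed matches in utr."""
    site = seed_site(mirna)
    hits = []
    pos = utr.find(site)
    while pos != -1:
        hits.append(pos)
        pos = utr.find(site, pos + 1)
    return hits


let7a = "UGAGGUAGUAGGUUGUAUAGUU"        # hsa-let-7a-5p
toy_utr = "AAACUACCUCAGGCUACCUCA"       # made-up sequence with two sites
print(seed_site(let7a))                  # CUACCUC
print(find_seed_matches(let7a, toy_utr)) # [3, 13]
```

    A real tool would combine such matches with conservation, free energy, and site accessibility rather than report them alone.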

  8. Common features of microRNA target prediction tools.

    PubMed

    Peterson, Sarah M; Thompson, Jeffrey A; Ufkin, Melanie L; Sathyanarayana, Pradeep; Liaw, Lucy; Congdon, Clare Bates

    2014-01-01

    The human genome encodes for over 1800 microRNAs (miRNAs), which are short non-coding RNA molecules that function to regulate gene expression post-transcriptionally. Due to the potential for one miRNA to target multiple gene transcripts, miRNAs are recognized as a major mechanism to regulate gene expression and mRNA translation. Computational prediction of miRNA targets is a critical initial step in identifying miRNA:mRNA target interactions for experimental validation. The available tools for miRNA target prediction encompass a range of different computational approaches, from the modeling of physical interactions to the incorporation of machine learning. This review provides an overview of the major computational approaches to miRNA target prediction. Our discussion highlights three tools for their ease of use, reliance on relatively updated versions of miRBase, and range of capabilities, and these are DIANA-microT-CDS, miRanda-mirSVR, and TargetScan. In comparison across all miRNA target prediction tools, four main aspects of the miRNA:mRNA target interaction emerge as common features on which most target prediction is based: seed match, conservation, free energy, and site accessibility. This review explains these features and identifies how they are incorporated into currently available target prediction tools. MiRNA target prediction is a dynamic field with increasing attention on development of new analysis tools. This review attempts to provide a comprehensive assessment of these tools in a manner that is accessible across disciplines. Understanding the basis of these prediction methodologies will aid in user selection of the appropriate tools and interpretation of the tool output.

  9. Sequence similarity is more relevant than species specificity in probabilistic backtranslation.

    PubMed

    Ferro, Alfredo; Giugno, Rosalba; Pigola, Giuseppe; Pulvirenti, Alfredo; Di Pietro, Cinzia; Purrello, Michele; Ragusa, Marco

    2007-02-21

    Backtranslation is the process of decoding a sequence of amino acids into the corresponding codons. All synthetic gene design systems include a backtranslation module. The degeneracy of the genetic code makes backtranslation potentially ambiguous since most amino acids are encoded by multiple codons. The common approach to overcome this difficulty is based on imitation of codon usage within the target species. This paper describes EasyBack, a new parameter-free, fully-automated software for backtranslation using Hidden Markov Models. EasyBack is not based on imitation of codon usage within the target species, but instead uses a sequence-similarity criterion. The model is trained with a set of proteins with known cDNA coding sequences, constructed from the input protein by querying the NCBI databases with BLAST. Unlike existing software, the proposed method allows the quality of prediction to be estimated. When tested on a group of proteins that show different degrees of sequence conservation, EasyBack outperforms other published methods in terms of precision. The prediction quality of a protein backtranslation method is markedly increased by replacing the criterion of most used codon in the same species with a Hidden Markov Model trained with a set of most similar sequences from all species. Moreover, the proposed method allows the quality of prediction to be estimated probabilistically.
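
    For contrast with EasyBack, the "most used codon" baseline that the abstract refers to can be sketched in a few lines. This is an illustration of the baseline only, not of EasyBack's Hidden Markov Model; the function names and the toy reference sequence are hypothetical, and in practice the usage table would be counted from many coding sequences of the target species.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_usage(coding_sequences):
    """Count, per amino acid, how often each codon occurs in the references."""
    usage = defaultdict(Counter)
    for seq in coding_sequences:
        for i in range(0, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            usage[CODON_TABLE[codon]][codon] += 1
    return usage


def backtranslate(protein, usage):
    """Pick the most frequently observed codon for every residue."""
    codons = []
    for aa in protein:
        if not usage.get(aa):
            raise KeyError(f"no codon observed for residue {aa!r}")
        codons.append(usage[aa].most_common(1)[0][0])
    return "".join(codons)


# Toy reference: ATG GCT GCT GCC AAA AAG AAA TAA
usage = codon_usage(["ATGGCTGCTGCCAAAAAGAAATAA"])
print(backtranslate("MAK", usage))  # ATGGCTAAA
```

    The baseline always returns the same codon for a given residue, which is the limitation a sequence-similarity criterion is meant to address.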

  10. A computational method for predicting regulation of human microRNAs on the influenza virus genome

    PubMed Central

    2013-01-01

    Background While it has been suggested that host microRNAs (miRNAs) may downregulate viral gene expression as an antiviral defense mechanism, such a mechanism has not been explored in the influenza virus for human flu studies. As it is difficult to conduct related experiments on humans, computational studies can provide some insight. Although many computational tools have been designed for miRNA target prediction, there is a need for cross-species prediction, especially for predicting viral targets of human miRNAs. However, finding putative human miRNAs targeting influenza virus genome is still challenging. Results We developed machine-learning features and conducted comprehensive data training for predicting interactions between H1N1 genome segments and host miRNA. We defined our seed region as the first ten nucleotides from the 5' end of the miRNA to the 3' end of the miRNA and integrated various features including the number of consecutive matching bases in the seed region of 10 bases, a triplet feature in seed regions, thermodynamic energy, penalty of bulges and wobbles at binding sites, and the secondary structure of viral RNA for the prediction. Conclusions Compared to general predictive models, our model fully takes into account the conservation patterns and features of viral RNA secondary structures, and greatly improves the prediction accuracy. Our model identified some key miRNAs including hsa-miR-489, hsa-miR-325, hsa-miR-876-3p and hsa-miR-2117, which target HA, PB2, MP and NS of H1N1, respectively. Our study provided an interesting hypothesis concerning the miRNA-based antiviral defense mechanism against influenza virus in human, i.e., the binding between human miRNA and viral RNAs may not result in gene silencing but rather may block the viral RNA replication. PMID:24565017

  11. Rosetta Structure Prediction as a Tool for Solving Difficult Molecular Replacement Problems.

    PubMed

    DiMaio, Frank

    2017-01-01

    Molecular replacement (MR), a method for solving the crystallographic phase problem using phases derived from a model of the target structure, has proven extremely valuable, accounting for the vast majority of structures solved by X-ray crystallography. However, when the resolution of data is low, or the starting model is very dissimilar to the target protein, solving structures via molecular replacement may be very challenging. In recent years, protein structure prediction methodology has emerged as a powerful tool in model building and model refinement for difficult molecular replacement problems. This chapter describes some of the tools available in Rosetta for model building and model refinement specifically geared toward difficult molecular replacement cases.

  12. The search for drug-targetable diagnostic, prognostic and predictive biomarkers in chronic graft-versus-host disease.

    PubMed

    Ren, Hong-Gang; Adom, Djamilatou; Paczesny, Sophie

    2018-05-01

    Chronic graft-versus-host disease (cGVHD) continues to be the leading cause of late morbidity and mortality after allogeneic hematopoietic stem cell transplantation (allo-HSCT), which is an increasingly applied curative method for both benign and malignant hematologic disorders. Biomarker identification is crucial for the development of noninvasive and cost-effective cGVHD diagnostic, prognostic, and predictive test for use in clinic. Furthermore, biomarkers may help to gain a better insight on ongoing pathophysiological processes. The recent widespread application of omics technologies including genomics, transcriptomics, proteomics and cytomics provided opportunities to discover novel biomarkers. Areas covered: This review focuses on biomarkers identified through omics that play a critical role in target identification for drug development, and that were verified in at least two independent cohorts. It also summarizes the current status on omics tools used to identify these useful cGVHD targets. We briefly list the biomarkers identified and verified so far. We further address challenges associated to their exploitation and application in the management of cGVHD patients. Finally, insights on biomarkers that are drug targetable and represent potential therapeutic targets are discussed. Expert commentary: We focus on biomarkers that play an essential role in target identification.

  13. RISC RNA sequencing for context-specific identification of in vivo miR targets

    PubMed Central

    Matkovich, Scot J; Van Booven, Derek J; Eschenbacher, William H; Dorn, Gerald W

    2010-01-01

    Rationale MicroRNAs (miRs) are expanding our understanding of cardiac disease and have the potential to transform cardiovascular therapeutics. One miR can target hundreds of individual mRNAs, but existing methodologies are not sufficient to accurately and comprehensively identify these mRNA targets in vivo. Objective To develop methods permitting identification of in vivo miR targets in an unbiased manner, using massively parallel sequencing of mouse cardiac transcriptomes in combination with sequencing of mRNA associated with mouse cardiac RNA-induced silencing complexes (RISCs). Methods and Results We optimized techniques for expression profiling small amounts of RNA without introducing amplification bias, and applied this to anti-Argonaute 2 immunoprecipitated RISCs (RISC-Seq) from mouse hearts. By comparing RNA-sequencing results of cardiac RISC and transcriptome from the same individual hearts, we defined 1,645 mRNAs consistently targeted to mouse cardiac RISCs. We employed this approach in hearts overexpressing miRs from Myh6 promoter-driven precursors (programmed RISC-Seq) to identify 209 in vivo targets of miR-133a and 81 in vivo targets of miR-499. Consistent with the fact that miR-133a and miR-499 have widely differing ‘seed’ sequences and belong to different miR families, only 6 targets were common to miR-133a- and miR-499-programmed hearts. Conclusions RISC-sequencing is a highly sensitive method for general RISC profiling and individual miR target identification in biological context, and is applicable to any tissue and any disease state. Summary MicroRNAs (miRs) are key regulators of mRNA translation in health and disease. While bioinformatic predictions suggest that a single miR may target hundreds of mRNAs, the number of experimentally verified targets of miRs is low. 
To enable comprehensive, unbiased examination of miR targets, we have performed deep RNA sequencing of cardiac transcriptomes in parallel with cardiac RNA-induced silencing complex (RISC)-associated RNAs (the RISCome), called RISC sequencing. We developed methods that did not require cross-linking of RNAs to RISCs or amplification of mRNA prior to sequencing, making it possible to rapidly perform RISC sequencing from intact tissue while avoiding amplification bias. Comparison of RISCome with transcriptome expression defined the degree of RISC enrichment for each mRNA. The majority of the mRNAs enriched in wild-type cardiac RISComes compared to transcriptomes were bioinformatically predicted to be targets of at least 1 of 139 cardiac-expressed miRs. Programming cardiomyocyte RISCs via transgenic overexpression in adult hearts of miR-133a or miR-499, two miRs that contain entirely different ‘seed’ sequences, elicited differing profiles of RISC-targeted mRNAs. Thus, RISC sequencing represents a highly sensitive method for general RISC profiling and individual miR target identification in biological context. PMID:21030712

  14. Hsa-miR-195 targets PCMT1 in hepatocellular carcinoma that increases tumor life span.

    PubMed

    Amer, Marwa; Elhefnawi, M; El-Ahwany, Eman; Awad, A F; Gawad, Nermen Abdel; Zada, Suher; Tawab, F M Abdel

    2014-11-01

    MicroRNAs are small RNAs of 19-25 nucleotides that have been shown to play important roles in the regulation of gene expression in many organisms. Downregulation or accumulation of miRNAs implies either tumor suppression or oncogenic activation. In this study, differentially expressed hsa-miR-195 in hepatocellular carcinoma (HCC) was identified and analyzed. The prediction was done using a consensus approach of tools. The validation steps were done at two different levels, in silico and in vitro. FGF7, GHR, PCMT1, CITED2, PEX5, PEX13, NOVA1, AXIN2, and TSPYL2 were detected with high significance (P < 0.005). These genes are involved in important pathways in cancer like MAPK signaling pathway, Jak-STAT signaling pathways, regulation of actin cytoskeleton, angiogenesis, Wnt signaling pathway, and TGF-beta signaling pathway. In vitro target validation was done for protein-L-isoaspartate (D-aspartate) O-methyltransferase (PCMT1). The co-transfection of pmirGLO-PCMT1 and pEGP-miR-195 showed highly significant results. Firefly luciferase was detected using Lumiscensor and t test analysis was done. Firefly luciferase expression was significantly decreased (P < 0.001) in comparison to the control. The low expression of firefly luciferase validates the method of target prediction that we used in this work by working on PCMT1 as a target for miR-195. Furthermore, the rest of the predicted genes are suspected to be real targets for hsa-miR-195. These target genes control almost all the hallmarks of liver cancer which can be used as therapeutic targets in cancer treatment.

  15. Binding Site and Affinity Prediction of General Anesthetics to Protein Targets Using Docking

    PubMed Central

    Liu, Renyu; Perez-Aguilar, Jose Manuel; Liang, David; Saven, Jeffery G.

    2012-01-01

    Background The protein targets for general anesthetics remain unclear. A tool to predict anesthetic binding for potential binding targets is needed. In this study, we explore whether a computational method, AutoDock, could serve as such a tool. Methods High-resolution crystal data of water soluble proteins (cytochrome C, apoferritin and human serum albumin), and a membrane protein (a pentameric ligand-gated ion channel from Gloeobacter violaceus, GLIC) were used. Isothermal titration calorimetry (ITC) experiments were performed to determine anesthetic affinity in solution conditions for apoferritin. Docking calculations were performed using DockingServer with the Lamarckian genetic algorithm and the Solis and Wets local search method (https://www.dockingserver.com/web). Twenty general anesthetics were docked into apoferritin. The predicted binding constants are compared with those obtained from ITC experiments for potential correlations. In the case of apoferritin, details of the binding site and their interactions were compared with recent co-crystallization data. Docking calculations for six general anesthetics currently used in clinical settings (isoflurane, sevoflurane, desflurane, halothane, propofol, and etomidate) with known EC50 were also performed in all tested proteins. The binding constants derived from docking experiments were compared with known EC50s and octanol/water partition coefficients for the six general anesthetics. Results All 20 general anesthetics docked unambiguously into the anesthetic binding site identified in the crystal structure of apoferritin. The binding constants for 20 anesthetics obtained from the docking calculations correlate significantly with those obtained from ITC experiments (p=0.04). In the case of GLIC, the identified anesthetic binding sites in the crystal structure are among the docking predicted binding sites, but not the top ranked site. 
Docking calculations suggest a most probable binding site located in the extracellular domain of GLIC. The predicted affinities correlated significantly with the known EC50s for the six commonly used anesthetics in GLIC for the site identified in the experimental crystal data (p=0.006). However, predicted affinities in apoferritin, human serum albumin, and cytochrome C did not correlate with these six anesthetics’ known experimental EC50s. A weak correlation between the predicted affinities and the octanol/water partition coefficients was observed for the sites in GLIC. Conclusion We demonstrated that anesthetic binding sites and relative affinities can be predicted using docking calculations in an automatic docking server (Autodock) for both water soluble and membrane proteins. Correlation of predicted affinity and EC50 for six commonly used general anesthetics was only observed in GLIC, a member of a protein family relevant to anesthetic mechanism. PMID:22392968

  16. Modeling side-chains using molecular dynamics improve recognition of binding region in CAPRI targets.

    PubMed

    Camacho, Carlos J

    2005-08-01

    The CAPRI-II experiment added an extra level of complexity to the problem of predicting protein-protein interactions by including 5 targets for which participants had to build or complete the 3-dimensional (3D) structure of either the receptor or ligand based on the structure of a close homolog. In this article, we describe how modeling key side-chains using molecular dynamics (MD) in explicit solvent improved the recognition of the binding region of a free energy-based computational docking method. In particular, we show that MD is able to predict with relatively high accuracy the rotamer conformation of the anchor side-chains important for molecular recognition as suggested by Rajamani et al. (Proc Natl Acad Sci USA 2004;101:11287-11292). As expected, the conformations are some of the most common rotamers for the given residue, while latch side-chains that undergo induced fit upon binding are forced into less common conformations. Using these models as starting conformations in conjunction with the rigid-body docking server ClusPro and the flexible docking algorithm SmoothDock, we produced valuable predictions for 6 of the 9 targets in CAPRI-II, missing only the 3 targets that underwent significant structural rearrangements upon binding. We also show that our free energy-based scoring function, consisting of the sum of van der Waals, Coulombic electrostatic with a distance-dependent dielectric, and desolvation free energy successfully discriminates the nativelike conformation of our submitted predictions. The latter emphasizes the critical role that thermodynamics plays on our methodology, and validates the generality of the algorithm to predict protein interactions.

  17. VisitSense: Sensing Place Visit Patterns from Ambient Radio on Smartphones for Targeted Mobile Ads in Shopping Malls.

    PubMed

    Kim, Byoungjip; Kang, Seungwoo; Ha, Jin-Young; Song, Junehwa

    2015-07-16

    In this paper, we introduce a novel smartphone framework called VisitSense that automatically detects and predicts a smartphone user's place visits from ambient radio to enable behavioral targeting for mobile ads in large shopping malls. VisitSense enables mobile app developers to adopt visit-pattern-aware mobile advertising for shopping mall visitors in their apps. It also benefits mobile users by allowing them to receive highly relevant mobile ads that are aware of their place visit patterns in shopping malls. To achieve the goal, VisitSense employs accurate visit detection and prediction methods. For accurate visit detection, we develop a change-based detection method to take into consideration the stability change of ambient radio and the mobility change of users. It performs well in large shopping malls where ambient radio is quite noisy and causes existing algorithms to easily fail. In addition, we proposed a causality-based visit prediction model to capture the causality in the sequential visit patterns for effective prediction. We have developed a VisitSense prototype system, and a visit-pattern-aware mobile advertising application that is based on it. Furthermore, we deploy the system in the COEX Mall, one of the largest shopping malls in Korea, and conduct diverse experiments to show the effectiveness of VisitSense.

  18. Likelihood of achieving air quality targets under model uncertainties.

    PubMed

    Digar, Antara; Cohan, Daniel S; Cox, Dennis D; Kim, Byeong-Uk; Boylan, James W

    2011-01-01

    Regulatory attainment demonstrations in the United States typically apply a bright-line test to predict whether a control strategy is sufficient to attain an air quality standard. Photochemical models are the best tools available to project future pollutant levels and are a critical part of regulatory attainment demonstrations. However, because photochemical models are uncertain and future meteorology is unknowable, future pollutant levels cannot be predicted perfectly and attainment cannot be guaranteed. This paper introduces a computationally efficient methodology for estimating the likelihood that an emission control strategy will achieve an air quality objective in light of uncertainties in photochemical model input parameters (e.g., uncertain emission and reaction rates, deposition velocities, and boundary conditions). The method incorporates Monte Carlo simulations of a reduced form model representing pollutant-precursor response under parametric uncertainty to probabilistically predict the improvement in air quality due to emission control. The method is applied to recent 8-h ozone attainment modeling for Atlanta, Georgia, to assess the likelihood that additional controls would achieve fixed (well-defined) or flexible (due to meteorological variability and uncertain emission trends) targets of air pollution reduction. The results show that in certain instances ranking of the predicted effectiveness of control strategies may differ between probabilistic and deterministic analyses.
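
    The contrast between a bright-line test and a likelihood estimate can be sketched with a one-parameter reduced form model. Everything specific here is an assumption for illustration: the linear response `projected = base - sensitivity * cut`, the Gaussian uncertainty on the sensitivity, the function name `attainment_likelihood`, and all numeric values. The paper's reduced form model covers several uncertain inputs and is not reproduced.

```python
import random


def attainment_likelihood(base_ppb, target_ppb, cut_fraction,
                          sens_mean, sens_sd, n_draws=10000, seed=0):
    """Monte Carlo estimate of P(projected ozone <= target).

    sens_mean, sens_sd: assumed mean and standard deviation of the ozone
    response (ppb per unit fractional emission cut).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        sensitivity = rng.gauss(sens_mean, sens_sd)
        projected = base_ppb - sensitivity * cut_fraction
        if projected <= target_ppb:
            hits += 1
    return hits / n_draws


# Deterministic (bright-line) view: no parametric uncertainty, so the
# answer is all-or-nothing.
print(attainment_likelihood(90.0, 85.0, 0.3, sens_mean=17.0, sens_sd=0.0))  # 1.0

# Probabilistic view: same mean response, uncertain sensitivity.
print(attainment_likelihood(90.0, 85.0, 0.3, sens_mean=17.0, sens_sd=5.0))
```

    With these hypothetical numbers the deterministic test reports attainment, while the Monte Carlo estimate comes out close to an even chance, which is the kind of disagreement between deterministic and probabilistic analyses that the abstract describes.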

  19. The Role of Predictability in Saccadic Eye Responses in the Suppression Head Impulse Test of Horizontal Semicircular Canal Function.

    PubMed

    Rey-Martinez, Jorge; Yanes, Joaquin; Esteban, Jonathan; Sanz, Ricardo; Martin-Sanz, Eduardo

    2017-01-01

    In the suppression head impulse paradigm (SHIMP) vHIT protocol, the participant is instructed to follow with his gaze a mobile target generated by a laser placed on the participant's head. Recent studies have reported that the refixation saccade latencies are in relation with the time evolution of the vestibular dysfunction in both (standard and SHIMP) procedures. We hypothesized that some central mechanisms like head impulse prediction could be one of the causes for the differences in the saccadic eye responses. A prospective cohort non-randomized study was designed. For the SHIMP protocol, recorded with the ICS Impulse ver. 4.0 ® (Otometrics A/S, Taastrup, Denmark) vHIT device, three different algorithms were performed: "predictable," "less predictable," and "unpredictable" depending on the target's predictability. A mathematical method was developed to analyze the SHIMP responses. The method was implemented as an additional tool to the MATLAB open source script for the extended analysis of the vHIT responses named HITCal. In cohort 1, 52 participants were included in "predictable" SHIMP protocol. In cohort 2, 60 patients were included for the "less predictable" and 35 patients for the "unpredictable" SHIMP protocol. The participants made more early saccades when instructed to perform the "predictable" paradigm compared with the "less predictable" paradigm (p < 0.001). The less predictable protocol did not reveal any significant difference when compared with the unpredictable protocol (p = 0.189). For the latency of the first saccade, there was statistical difference between the "unpredictable" and "predictable" protocols (p < 0.001) and between the "less predictable" and "predictable" protocols (p < 0.001). Finally, we did not find any relationship between the horizontal vestibulo-ocular reflex (hVOR) gain and the latency of the saccades. We developed a specific method to analyze and detect early SHIMP saccades.
Our findings offer evidence regarding the influence of predictability on the latency of the SHIMP saccadic responses, suggesting that early saccades are probably caused by a conditioned response of the participant. The lack of relationship between the hVOR gain and the latency of the saccades suggests that the predictive behavior that caused the early eye saccades is independent of the vestibular function.

  20. Comparison of integrated clustering methods for accurate and stable prediction of building energy consumption data

    DOE PAGES

    Hsu, David

    2015-09-27

    Clustering methods are often used to model energy consumption for two reasons. First, clustering is often used to process data and to improve the predictive accuracy of subsequent energy models. Second, stable clusters that are reproducible with respect to non-essential changes can be used to group, target, and interpret observed subjects. However, it is well known that clustering methods are highly sensitive to the choice of algorithms and variables. This can lead to misleading assessments of predictive accuracy and mis-interpretation of clusters in policymaking. This paper therefore introduces two methods to the modeling of energy consumption in buildings: clusterwise regression, also known as latent class regression, which integrates clustering and regression simultaneously; and cluster validation methods to measure stability. Using a large dataset of multifamily buildings in New York City, clusterwise regression is compared to common two-stage algorithms that use K-means and model-based clustering with linear regression. Predictive accuracy is evaluated using 20-fold cross validation, and the stability of the perturbed clusters is measured using the Jaccard coefficient. These results show that there seems to be an inherent tradeoff between prediction accuracy and cluster stability. This paper concludes by discussing which clustering methods may be appropriate for different analytical purposes.
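    The stability measure named in this abstract can be illustrated with a minimal sketch. It assumes one common usage, in which each original cluster is matched to the perturbed cluster with which it shares the highest Jaccard coefficient |A ∩ B| / |A ∪ B|. The function names and toy clusters are hypothetical and are not taken from the paper.

```python
def jaccard(a, b):
    """Jaccard coefficient |A & B| / |A | B| of two collections of item ids."""
    a, b = set(a), set(b)
    union = a | b
    # Two empty sets are treated as identical (an assumption of this sketch).
    return len(a & b) / len(union) if union else 1.0


def cluster_stability(original, perturbed):
    """For each original cluster, return its best Jaccard match among the perturbed clusters."""
    return [max(jaccard(c, p) for p in perturbed) for c in original]


# Toy example: building ids grouped before and after a perturbation of the data.
original = [{1, 2, 3, 4}, {5, 6, 7}]
perturbed = [{1, 2, 3}, {4, 5, 6, 7}]
scores = cluster_stability(original, perturbed)
print(scores)
```

    Values near 1 would indicate clusters that survive the perturbation largely intact; values near 0 would indicate clusters that dissolve.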

  1. Utilizing random Forest QSAR models with optimized parameters for target identification and its application to target-fishing server.

    PubMed

    Lee, Kyoungyeul; Lee, Minho; Kim, Dongsup

    2017-12-28

    The identification of target molecules is important for understanding the mechanism of "target deconvolution" in phenotypic screening and the "polypharmacology" of drugs. Because conventional methods of identifying targets are time-consuming and costly, in-silico target identification has been considered an alternative solution. One of the well-known in-silico methods of identifying targets involves structure activity relationships (SARs). SARs have advantages such as low computational cost and high feasibility; however, the data dependency in the SAR approach causes imbalance of active data and ambiguity of inactive data across targets. We developed a ligand-based virtual screening model comprising 1121 target SAR models built using a random forest algorithm. The performance of each target model was tested by employing the ROC curve and the mean score using an internal five-fold cross validation. Moreover, recall rates for top-k targets were calculated to assess the performance of target ranking. A benchmark model using an optimized sampling method and parameters was examined via an external validation set. The results show recall rates of 67.6% and 73.9% for top-11 (1% of the total targets) and top-33, respectively. We provide a publicly available website for users to search the top-k targets for query ligands at http://rfqsar.kaist.ac.kr. The target models that we built can be used both for predicting the activity of ligands toward each target and for ranking candidate targets for a query ligand using a unified scoring scheme. The scores are additionally fitted to probabilities so that users can estimate how likely a ligand-target interaction is to be active. The user interface of our website is user friendly and intuitive, offering useful information and cross references.
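    The top-k recall reported in this abstract can be sketched as follows. The sketch assumes that recall at k is the fraction of query ligands whose known target appears among the k highest-ranked targets; the exact definition used by the authors is not given here, and the function name, target labels, and rankings are hypothetical.

```python
def top_k_recall(ranked_targets, true_targets, k):
    """Fraction of queries whose true target is among the k highest-ranked targets."""
    hits = sum(
        1 for ranking, truth in zip(ranked_targets, true_targets) if truth in ranking[:k]
    )
    return hits / len(true_targets)


# Toy rankings for three query ligands, best-scoring target first.
rankings = [["T3", "T1", "T2"], ["T2", "T3", "T1"], ["T1", "T2", "T3"]]
truths = ["T1", "T1", "T1"]

recall_at_1 = top_k_recall(rankings, truths, 1)
recall_at_2 = top_k_recall(rankings, truths, 2)

# The abstract's "top-11" corresponds to 1% of the 1121 target models.
k_one_percent = round(0.01 * 1121)
print(recall_at_1, recall_at_2, k_one_percent)
```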

  2. Comparative analysis of machine learning methods in ligand-based virtual screening of large compound libraries.

    PubMed

    Ma, Xiao H; Jia, Jia; Zhu, Feng; Xue, Ying; Li, Ze R; Chen, Yu Z

    2009-05-01

    Machine learning methods have been explored as ligand-based virtual screening tools for facilitating drug lead discovery. These methods predict compounds with specific pharmacodynamic, pharmacokinetic or toxicological properties from structural and physicochemical properties derived from their structures. Increasing attention has been directed at these methods because of their capability in predicting compounds of diverse structures and complex structure-activity relationships without requiring knowledge of the target 3D structure. This article reviews current progress in using machine learning methods for virtual screening of pharmacodynamically active compounds from large compound libraries, and analyzes and compares the reported performances of machine learning tools with those of structure-based and other ligand-based (such as pharmacophore and clustering) virtual screening methods. The feasibility of improving the performance of machine learning methods in screening large libraries is discussed.

  3. Comprehensive Prediction of Large-height Swell-like Waves in East Coast of Korea

    NASA Astrophysics Data System (ADS)

    Kwon, S. J.; Lee, C.; Ahn, S. J.; Kim, H. K.

    2014-12-01

    There has been growing interest in the large-height swell-like wave (LSW) on the east coast of Korea because such large waves have caused casualties as well as damage to coastal facilities such as breakwaters. The LSW was found to be generated by an atmospherically great valley in the northern area of the East Sea and then to propagate a long distance to the east coast of Korea in a predominantly southwest direction (Oh et al., 2010). In this study, we will apply two methods, real-time data based and numerical-model based prediction, in order to predict the LSW on the east coast of Korea. First, the real-time data based prediction method uses information collected by the directional wave gauge installed near Sokcho. Using the wave model SWAN (Booij et al., 1999) and the wave ray method (Munk and Arthur, 1952), we will estimate wave data in the open sea from the real-time data and predict the travel time of the LSW from the measurement site (near Sokcho) to several target points on the east coast of Korea. Second, the numerical-model based method uses three different numerical models: WW3 in deep water, SWAN in shallow water, and CADMAS-SURF for wave run-up (CDIT). The surface winds from the 72-hour prediction system of the NCEP (National Centers for Environmental Prediction) GFS (Global Forecast System) will be input on finer grids after interpolation in certain domains of the WW3 and SWAN models. The significant wave heights and peak wave directions predicted by the two methods will be compared to the measured data of the LSW at several target points near the coast. Further, the prediction method will be improved using more measurement sites, which will be installed in the future. References: Booij, N., Ris, R.C., and Holthuijsen, L.H. (1999). A third-generation wave model for coastal regions 1. Model description and validation. J. of Geophysical Research, 103(C4), 7649-7666. Munk, W.H. and Arthur, R.S. (1952). Gravity Waves. 13. Wave Intensity along a Refracted Ray. National Bureau of Standards Circular 521, Washington D.C., 95-108. Oh, S.-H., Jeong, W.-M., Lee, D.Y. and Kim, S.I. (2010). Analysis of the reason for occurrence of large-height swell-like waves in the east coast of Korea. J. of Korean Society of Coastal and Ocean Engineers, 22(2), 101-111 (in Korean).

  4. Prospective evaluation of shape similarity based pose prediction method in D3R Grand Challenge 2015

    NASA Astrophysics Data System (ADS)

    Kumar, Ashutosh; Zhang, Kam Y. J.

    2016-09-01

    Evaluation of ligand three-dimensional (3D) shape similarity is one of the commonly used approaches to identify ligands similar to one or more known active compounds from a library of small molecules. Apart from using ligand shape similarity as a virtual screening tool, its role in pose prediction and pose scoring has also been reported. We have recently developed a method that utilizes ligand 3D shape similarity with known crystallographic ligands to predict binding poses of query ligands. Here, we report the prospective evaluation of our pose prediction method through participation in the Drug Design Data Resource (D3R) Grand Challenge 2015. Our pose prediction method was used to predict binding poses of heat shock protein 90 (HSP90) and mitogen activated protein kinase kinase kinase kinase (MAP4K4) ligands, and it was able to predict the pose within 2 Å root mean square deviation (RMSD) either as the top pose or among the best of five poses in a majority of cases. Specifically, for the HSP90 protein, median RMSDs of 0.73 and 0.68 Å were obtained for the top and the best of five predictions, respectively. For the MAP4K4 target, although the median RMSD for our top prediction was only 2.87 Å, the median RMSD of 1.67 Å for the best of five predictions was well within the limit for successful prediction. Furthermore, the performance of our pose prediction method for HSP90 and MAP4K4 ligands was always among the top five groups. In particular, for the MAP4K4 protein our pose prediction method was ranked number one in terms of both mean and median RMSD when the best of five predictions were considered. Overall, our D3R Grand Challenge 2015 results demonstrated that ligand 3D shape similarity with the crystal ligand is sufficient to predict binding poses of new ligands with acceptable accuracy.
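    The RMSD criterion used in this abstract can be sketched as below. The sketch assumes that the predicted and crystallographic poses list the same atoms in the same order and already share the receptor's coordinate frame, so no superposition is performed. The function names and coordinates are hypothetical, and the 2 Å threshold is the one quoted in the abstract.

```python
import math


def rmsd(pose_a, pose_b):
    """Root mean square deviation between two poses with identical atom ordering."""
    if len(pose_a) != len(pose_b) or not pose_a:
        raise ValueError("poses must be non-empty and have the same number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(pose_a, pose_b)
    )
    return math.sqrt(squared / len(pose_a))


def best_of(predicted_poses, crystal_pose):
    """Lowest RMSD among a list of predicted poses (e.g. the best of five)."""
    return min(rmsd(pose, crystal_pose) for pose in predicted_poses)


# Toy two-atom ligand; coordinates in Å.
crystal = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
top_pose = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]     # every atom shifted by 3 Å
second_pose = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]  # every atom shifted by 1 Å

top_rmsd = rmsd(top_pose, crystal)
best_rmsd = best_of([top_pose, second_pose], crystal)
print(top_rmsd <= 2.0, best_rmsd <= 2.0)
```

    In this toy case the top pose misses the 2 Å threshold while the best of the submitted poses meets it, mirroring the distinction the abstract draws between "top" and "best of five" results.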

  5. Comparison of three methods to identify the anaerobic threshold during maximal exercise testing in patients with chronic heart failure.

    PubMed

    Beckers, Paul J; Possemiers, Nadine M; Van Craenenbroeck, Emeline M; Van Berendoncks, An M; Wuyts, Kurt; Vrints, Christiaan J; Conraads, Viviane M

    2012-02-01

    Exercise training efficiently improves peak oxygen uptake (V˙O2peak) in patients with chronic heart failure. To optimize training-derived benefit, higher exercise intensities are being explored. The correct identification of anaerobic threshold is important to allow safe and effective exercise prescription. During 48 cardiopulmonary exercise tests obtained in patients with chronic heart failure (59.6 ± 11 yrs; left ventricular ejection fraction, 27.9% ± 9%), ventilatory gas analysis findings and lactate measurements were collected. Three technicians independently determined the respiratory compensation point (RCP), the heart rate turning point (HRTP) and the second lactate turning point (LTP2). Thereafter, exercise intensity (target heart rate and workload) was calculated and compared between the three methods applied. Patients had significantly reduced maximal exercise capacity (68% ± 21% of predicted V˙O2peak) and chronotropic incompetence (74% ± 7% of predicted peak heart rate). Heart rate, workload, and V˙O2 at HRTP and at RCP were not different, but at LTP2, these parameters were significantly (P < 0.0001) higher. Mean target heart rate and target workload calculated using the LTP2 were 5% and 12% higher compared with those calculated using HRTP and RCP, respectively. The calculation of target heart rate based on LTP2 was 5% and 10% higher in 12 of 48 (25%) and 6 of 48 (12.5%) patients, respectively, compared with the other two methods. In patients with chronic heart failure, RCP and HRTP, determined during cardiopulmonary exercise tests, precede the occurrence of LTP2. Target heart rates and workloads used to prescribe tailored exercise training in patients with chronic heart failure based on LTP2 are significantly higher than those derived from HRTP and RCP.

  6. Rational Design of an Ultrasensitive Quorum-Sensing Switch.

    PubMed

    Zeng, Weiqian; Du, Pei; Lou, Qiuli; Wu, Lili; Zhang, Haoqian M; Lou, Chunbo; Wang, Hongli; Ouyang, Qi

    2017-08-18

    One of the purposes of synthetic biology is to develop rational methods that accelerate the design of genetic circuits, saving time and effort spent on experiments and providing reliably predictable circuit performance. We applied a reverse engineering approach to design an ultrasensitive transcriptional quorum-sensing switch. We want to explore how systems biology can guide synthetic biology in the choice of specific DNA sequences and their regulatory relations to achieve a targeted function. The workflow comprises network enumeration that achieves the target function robustly, experimental restriction of the obtained candidate networks, global parameter optimization via mathematical analysis, selection and engineering of parts based on these calculations, and finally, circuit construction based on the principles of standardization and modularization. The performance of realized quorum-sensing switches was in good qualitative agreement with the computational predictions. This study provides practical principles for the rational design of genetic circuits with targeted functions.

  7. A Rat α-Fetoprotein Binding Activity Prediction Model to Facilitate Assessment of the Endocrine Disruption Potential of Environmental Chemicals.

    PubMed

    Hong, Huixiao; Shen, Jie; Ng, Hui Wen; Sakkiah, Sugunadevi; Ye, Hao; Ge, Weigong; Gong, Ping; Xiao, Wenming; Tong, Weida

    2016-03-25

    Endocrine disruptors such as polychlorinated biphenyls (PCBs), diethylstilbestrol (DES) and dichlorodiphenyltrichloroethane (DDT) are agents that interfere with the endocrine system and cause adverse health effects. Endocrine disruptors have raised considerable public health concern. One of the mechanisms of endocrine disruption is the binding of endocrine disruptors to the hormone receptors in the target cells. Entrance of endocrine disruptors into target cells is the precondition of endocrine disruption. The binding capability of a chemical with proteins in the blood affects its entrance into the target cells and, thus, is very informative for the assessment of potential endocrine disruption of chemicals. α-fetoprotein is one of the major serum proteins that binds to a variety of chemicals such as estrogens. To better facilitate assessment of endocrine disruption of environmental chemicals, we developed a model for α-fetoprotein binding activity prediction using a novel pattern recognition method (Decision Forest) and the molecular descriptors calculated from two-dimensional structures by Mold² software. The predictive capability of the model has been evaluated through internal validation using 125 training chemicals (average balanced accuracy of 69%) and external validation using 22 chemicals (balanced accuracy of 71%). Prediction confidence analysis revealed that the model performed much better at high prediction confidence. Our results indicate that the model is useful (when predictions are made with high confidence) in endocrine disruption risk assessment of environmental chemicals, although improvement by increasing the number of training chemicals is needed.
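    Balanced accuracy, the metric quoted in this abstract, is conventionally the mean of sensitivity and specificity. A minimal sketch under that standard definition follows; the function name and the toy labels (1 = binder, 0 = non-binder) are hypothetical and unrelated to the study's data.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = active, 0 = inactive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # fraction of actives recovered
    specificity = tn / (tn + fp)  # fraction of inactives recovered
    return (sensitivity + specificity) / 2


# Toy, imbalanced example: four binders and two non-binders.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
score = balanced_accuracy(y_true, y_pred)
print(score)
```

    Plain accuracy on the same toy labels would be 4/6, whereas balanced accuracy weights the two classes equally, which matters when actives and inactives are unevenly represented.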

  8. Meeting the Healthy People 2020 Objectives to Reduce Cancer Mortality

    PubMed Central

    Thompson, Trevor D.; Soman, Ashwini; Møller, Bjorn; Leadbetter, Steven; White, Mary C.

    2015-01-01

    Introduction: Healthy People 2020 (HP2020) calls for a 10% to 15% reduction in death rates from 2007 to 2020 for selected cancers. Trends in death rates can be used to predict progress toward meeting HP2020 targets. Methods: We used mortality data from 1975 through 2009 and population estimates and projections to predict deaths for all cancers and the top 23 cancers among men and women by race. We apportioned changes in deaths from population risk and population growth and aging. Results: From 1975 to 2009, the number of cancer deaths increased among white and black Americans primarily because of an aging white population and a growing black population. Overall, age-standardized cancer death rates (risk) declined in all groups. From 2007 to 2020, rates are predicted to continue to decrease while counts of deaths are predicted to increase among men (15%) and stabilize among women (increase <10%). Declining death rates are predicted to meet HP2020 targets for cancers of the female breast, lung and bronchus, cervix and uterus, colon and rectum, oral cavity and pharynx, and prostate, but not for melanoma. Conclusion: Cancer deaths among women overall are predicted to increase by less than 10%, in part because of declines in breast, cervical, and colorectal cancer deaths among white women. Increased efforts to promote cancer prevention and improve survival are needed to counter the impact of a growing and aging population on the cancer burden and to meet melanoma target death rates. PMID:26133647

  9. Semi-supervised protein subcellular localization.

    PubMed

    Xu, Qian; Hu, Derek Hao; Xue, Hong; Yu, Weichuan; Yang, Qiang

    2009-01-30

    Protein subcellular localization is concerned with predicting the location of a protein within a cell using computational methods. The location information can indicate key functionalities of proteins. Accurate predictions of the subcellular localization of proteins can aid the prediction of protein function and genome annotation, as well as the identification of drug targets. Computational methods based on machine learning, such as support vector machine approaches, have already been widely used in the prediction of protein subcellular localization. However, a major drawback of these machine learning-based approaches is that a large amount of data must be labeled in order for the prediction system to learn a classifier with good generalization ability. In real world cases, it is laborious, expensive and time-consuming to experimentally determine the subcellular localization of a protein and prepare instances of labeled data. In this paper, we present an approach based on a new learning framework, semi-supervised learning, which can use far fewer labeled instances to construct a high quality prediction model. We first construct an initial classifier using a small set of labeled examples, and then use unlabeled instances to refine the classifier for future predictions. Experimental results show that our methods can effectively reduce the workload for labeling data by using the unlabeled data. Our method is shown to enhance the state-of-the-art prediction results of SVM classifiers by more than 10%.

  10. Compound analysis via graph kernels incorporating chirality.

    PubMed

    Brown, J B; Urata, Takashi; Tamura, Takeyuki; Arai, Midori A; Kawabata, Takeo; Akutsu, Tatsuya

    2010-12-01

    High accuracy is paramount when predicting biochemical characteristics using Quantitative Structural-Property Relationships (QSPRs). Although existing graph-theoretic kernel methods combined with machine learning techniques are efficient for QSPR model construction, they cannot distinguish topologically identical chiral compounds which often exhibit different biological characteristics. In this paper, we propose a new method that extends the recently developed tree pattern graph kernel to accommodate stereoisomers. We show that Support Vector Regression (SVR) with a chiral graph kernel is useful for target property prediction by demonstrating its application to a set of human vitamin D receptor ligands currently under consideration for their potential anti-cancer effects.

  11. Accommodating subject and instrument variations in spectroscopic determinations

    DOEpatents

    Haas, Michael J [Albuquerque, NM]; Rowe, Robert K [Corrales, NM]; Thomas, Edward V [Albuquerque, NM]

    2006-08-29

    A method and apparatus for measuring a biological attribute, such as the concentration of an analyte, particularly a blood analyte in tissue such as glucose. The method utilizes spectrographic techniques in conjunction with an improved instrument-tailored or subject-tailored calibration model. In a calibration phase, calibration model data is modified to reduce or eliminate instrument-specific attributes, resulting in a calibration data set modeling intra-instrument or intra-subject variation. In a prediction phase, the prediction process is tailored for each target instrument separately using a minimal number of spectral measurements from each instrument or subject.

  12. Improved GGIW-PHD filter for maneuvering non-ellipsoidal extended targets or group targets tracking based on sub-random matrices.

    PubMed

    Liang, Zhibing; Liu, Fuxian; Gao, Jiale

    2018-01-01

    For non-ellipsoidal extended targets and group targets tracking (NETT and NGTT), using an ellipsoid to approximate the target extension may not be accurate enough because of the lack of shape and orientation information. In consideration of this, we model a non-ellipsoidal extended target or target group as a combination of multiple ellipsoidal sub-objects, each represented by a random matrix. Based on these models, an improved gamma Gaussian inverse Wishart probability hypothesis density (GGIW-PHD) filter is proposed to estimate the measurement rates, kinematic states, and extension states of the sub-objects for each extended target or target group. For maneuvering NETT and NGTT, a multi-model (MM) approach based GGIW-PHD (MM-GGIW-PHD) filter is proposed. The common and the individual dynamics of the sub-objects belonging to the same extended target or target group are described by means of the combination between the overall maneuver model and the sub-object models. For the merging of updating components, an improved merging criterion and a new merging method are derived. A specific implementation of prediction partition with pseudo-likelihood method is presented. Two scenarios for non-maneuvering and maneuvering NETT and NGTT are simulated. The results demonstrate the effectiveness of the proposed algorithms.

  13. Improved GGIW-PHD filter for maneuvering non-ellipsoidal extended targets or group targets tracking based on sub-random matrices

    PubMed Central

    Liu, Fuxian; Gao, Jiale

    2018-01-01

    For non-ellipsoidal extended targets and group targets tracking (NETT and NGTT), using an ellipsoid to approximate the target extension may not be accurate enough because of the lack of shape and orientation information. In consideration of this, we model a non-ellipsoidal extended target or target group as a combination of multiple ellipsoidal sub-objects, each represented by a random matrix. Based on these models, an improved gamma Gaussian inverse Wishart probability hypothesis density (GGIW-PHD) filter is proposed to estimate the measurement rates, kinematic states, and extension states of the sub-objects for each extended target or target group. For maneuvering NETT and NGTT, a multi-model (MM) approach based GGIW-PHD (MM-GGIW-PHD) filter is proposed. The common and the individual dynamics of the sub-objects belonging to the same extended target or target group are described by means of the combination between the overall maneuver model and the sub-object models. For the merging of updating components, an improved merging criterion and a new merging method are derived. A specific implementation of prediction partition with pseudo-likelihood method is presented. Two scenarios for non-maneuvering and maneuvering NETT and NGTT are simulated. The results demonstrate the effectiveness of the proposed algorithms. PMID:29444144

  14. An assessment of spacecraft target mode selection methods

    NASA Astrophysics Data System (ADS)

    Mercer, J. F.; Aglietti, G. S.; Remedia, M.; Kiley, A.

    2017-11-01

    Coupled Loads Analyses (CLAs), using finite element models (FEMs) of the spacecraft and launch vehicle to simulate critical flight events, are performed in order to determine the dynamic loadings that will be experienced by spacecraft during launch. A validation process is carried out on the spacecraft FEM beforehand to ensure that the dynamics of the analytical model sufficiently represent the behavior of the physical hardware. One aspect of concern is the containment of the FEM correlation and update effort to focus on the vibration modes which are most likely to be excited under test and CLA conditions. This study therefore provides new insight into the prioritization of spacecraft FEM modes for correlation to base-shake vibration test data. The work involved example application to large, unique, scientific spacecraft, with modern FEMs comprising over a million degrees of freedom. This comprehensive investigation explores: the modes inherently important to the spacecraft structures, irrespective of excitation; the particular 'critical modes' which produce peak responses to CLA level excitation; an assessment of several traditional target mode selection methods in terms of ability to predict these 'critical modes'; and an indication of the level of correlation these FEM modes achieve compared to corresponding test data. Findings indicate that, although the traditional methods of target mode selection have merit and are able to identify many of the modes of significance to the spacecraft, there are 'critical modes' which may be missed by conventional application of these methods. The use of different thresholds to select potential target modes from these parameters would enable identification of many of these missed modes. Ultimately, some consideration of the expected excitations is required to predict all modes likely to contribute to the response of the spacecraft in operation.

  15. The Application of Gene Expression Profiling in Predictions of Occult Lymph Node Metastasis in Colorectal Cancer Patients

    PubMed Central

    Peyravian, Noshad; Larki, Pegah; Gharib, Ehsan; Nazemalhosseini-Mojarad, Ehsan; Anaraki, Fakhrosadate; Young, Chris; McClellan, James; Ashrafian Bonab, Maziar; Asadzadeh-Aghdaei, Hamid; Zali, Mohammad Reza

    2018-01-01

    A key factor in determining the likely outcome for a patient with colorectal cancer is whether or not the tumour has metastasised to the lymph nodes—information which is also important in assessing any possibilities of lymph node resection so as to improve survival. In this review we perform a wide-range assessment of literature relating to recent developments in gene expression profiling (GEP) of the primary tumour, to determine their utility in assessing node status. A set of characteristic genes seems to be involved in the prediction of lymph node metastasis (LNM) in colorectal patients. Hence, GEP is applicable in personalised/individualised/tailored therapies and provides insights into developing novel therapeutic targets. Not only is GEP useful in prediction of LNM, but it also allows classification based on differences such as sample size, target gene expression, and examination method. PMID:29498671

  16. Predicting Pharmacodynamic Drug-Drug Interactions through Signaling Propagation Interference on Protein-Protein Interaction Networks.

    PubMed

    Park, Kyunghyun; Kim, Docyong; Ha, Suhyun; Lee, Doheon

    2015-01-01

    As pharmacodynamic drug-drug interactions (PD DDIs) could lead to severe adverse effects in patients, it is important to identify potential PD DDIs in drug development. The signaling starting from drug targets is propagated through protein-protein interaction (PPI) networks. PD DDIs could occur by close interference on the same targets or within the same pathways as well as distant interference through cross-talking pathways. However, most of the previous approaches have considered only close interference by measuring distances between drug targets or comparing target neighbors. We have applied a random walk with restart algorithm to simulate signaling propagation from drug targets in order to capture the possibility of their distant interference. Cross validation with DrugBank and Kyoto Encyclopedia of Genes and Genomes DRUG shows that the proposed method outperforms the previous methods significantly. We also provide a web service with which PD DDIs for drug pairs can be analyzed at http://biosoft.kaist.ac.kr/targetrw.
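    The random walk with restart named in this abstract can be sketched with the usual iteration p(t+1) = (1 - r) W p(t) + r p(0), where W spreads each node's probability evenly over its neighbors and p(0) is concentrated on the seed nodes (here, a drug's targets). This is a generic sketch, not the authors' implementation: the toy network, the restart probability of 0.3, and all names are assumptions, and the graph is assumed to be undirected with no isolated nodes.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts at the seed nodes."""
    nodes = sorted(adjacency)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart term: a fraction of the probability mass returns to the seeds.
        nxt = {n: restart * p0[n] for n in nodes}
        # Propagation term: the rest is shared equally among each node's neighbors.
        for n in nodes:
            share = (1.0 - restart) * p[n] / len(adjacency[n])
            for m in adjacency[n]:
                nxt[m] += share
        if sum(abs(nxt[n] - p[n]) for n in nodes) < tol:
            return nxt
        p = nxt
    return p


# Toy protein-protein interaction chain A - B - C - D, with A as the drug target.
ppi = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
scores = random_walk_with_restart(ppi, seeds={"A"})
print(scores)
```

    Proteins far from the seed still receive a non-zero score, which is how such a walk can capture the distant interference that distance-based or neighbor-based comparisons would miss.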

  17. MQAPRank: improved global protein model quality assessment by learning-to-rank.

    PubMed

    Jing, Xiaoyang; Dong, Qiwen

    2017-05-25

    Protein structure prediction has made considerable progress during the last few decades, and a greater number of models can now be predicted for a given sequence. Consequently, assessing the quality of predicted protein models is one of the key components of successful protein structure prediction. Over the past years, a number of methods have been developed to address this issue, which can be roughly divided into three categories: single methods, quasi-single methods and clustering (or consensus) methods. Although these methods achieve much success at different levels, accurate protein model quality assessment is still an open problem. Here, we present MQAPRank, a global protein model quality assessment program based on learning-to-rank. MQAPRank first sorts the decoy models using a single method based on a learning-to-rank algorithm to indicate their relative qualities for the target protein. It then takes the first five models as references to predict the qualities of the other models by using average GDT_TS scores between the reference models and the other models. Benchmarked on the CASP11 and 3DRobot datasets, MQAPRank achieved better performance than other leading protein model quality assessment methods. Recently, MQAPRank participated in CASP12 under the group name FDUBio and achieved state-of-the-art performance. MQAPRank provides a convenient and powerful tool for protein model quality assessment with state-of-the-art performance; it is useful for both protein structure prediction and model quality assessment.
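    The second, reference-based step described in this abstract can be sketched as follows, assuming the models are already sorted by the learning-to-rank step. The sketch scores each remaining model by its average similarity to the top-ranked reference models. The `similarity` callable is a hypothetical stand-in for a GDT_TS computation, the lookup table holds made-up values, and how the reference models themselves are scored is not stated in the abstract and is left out.

```python
def reference_based_scores(ranked_models, similarity, n_refs=5):
    """Score each non-reference model by its mean similarity to the top n_refs models."""
    references = ranked_models[:n_refs]
    others = ranked_models[n_refs:]
    return {
        model: sum(similarity(model, ref) for ref in references) / len(references)
        for model in others
    }


# Toy example with two references instead of five; similarities are invented values.
toy_similarity = {
    ("m3", "m1"): 0.8, ("m3", "m2"): 0.6,
    ("m4", "m1"): 0.4, ("m4", "m2"): 0.2,
}
ranked = ["m1", "m2", "m3", "m4"]  # assumed output order of the ranking step
quality = reference_based_scores(
    ranked, lambda a, b: toy_similarity[(a, b)], n_refs=2
)
print(quality)
```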

  18. De novo protein structure prediction by dynamic fragment assembly and conformational space annealing.

    PubMed

    Lee, Juyong; Lee, Jinhyuk; Sasaki, Takeshi N; Sasai, Masaki; Seok, Chaok; Lee, Jooyoung

    2011-08-01

    Ab initio protein structure prediction is a challenging problem that requires both an accurate energetic representation of a protein structure and an efficient conformational sampling method for successful protein modeling. In this article, we present an ab initio structure prediction method which combines a recently suggested novel way of fragment assembly, dynamic fragment assembly (DFA) and conformational space annealing (CSA) algorithm. In DFA, model structures are scored by continuous functions constructed based on short- and long-range structural restraint information from a fragment library. Here, DFA is represented by the full-atom model by CHARMM with the addition of the empirical potential of DFIRE. The relative contributions between various energy terms are optimized using linear programming. The conformational sampling was carried out with CSA algorithm, which can find low energy conformations more efficiently than simulated annealing used in the existing DFA study. The newly introduced DFA energy function and CSA sampling algorithm are implemented into CHARMM. Test results on 30 small single-domain proteins and 13 template-free modeling targets of the 8th Critical Assessment of protein Structure Prediction show that the current method provides comparable and complementary prediction results to existing top methods. Copyright © 2011 Wiley-Liss, Inc.

  19. Prediction of binding poses to FXR using multi-targeted docking combined with molecular dynamics and enhanced sampling

    NASA Astrophysics Data System (ADS)

    Bhakat, Soumendranath; Åberg, Emil; Söderhjelm, Pär

    2018-01-01

    Advanced molecular docking methods often aim at capturing the flexibility of the protein upon binding to the ligand. In this study, we investigate whether instead a simple rigid docking method can be applied, if combined with multiple target structures to model the backbone flexibility and molecular dynamics simulations to model the sidechain and ligand flexibility. The methods are tested for the binding of 35 ligands to FXR as part of the first stage of the Drug Design Data Resource (D3R) Grand Challenge 2 blind challenge. The results show that the multiple-target docking protocol performs surprisingly well, with correct poses found for 21 of the ligands. MD simulations started on the docked structures are remarkably stable, but show almost no tendency of refining the structure closer to the experimentally found binding pose. Reconnaissance metadynamics enhances the exploration of new binding poses, but additional collective variables involving the protein are needed to exploit the full potential of the method.

  20. Prediction of binding poses to FXR using multi-targeted docking combined with molecular dynamics and enhanced sampling.

    PubMed

    Bhakat, Soumendranath; Åberg, Emil; Söderhjelm, Pär

    2018-01-01

    Advanced molecular docking methods often aim at capturing the flexibility of the protein upon binding to the ligand. In this study, we investigate whether instead a simple rigid docking method can be applied, if combined with multiple target structures to model the backbone flexibility and molecular dynamics simulations to model the sidechain and ligand flexibility. The methods are tested for the binding of 35 ligands to FXR as part of the first stage of the Drug Design Data Resource (D3R) Grand Challenge 2 blind challenge. The results show that the multiple-target docking protocol performs surprisingly well, with correct poses found for 21 of the ligands. MD simulations started on the docked structures are remarkably stable, but show almost no tendency of refining the structure closer to the experimentally found binding pose. Reconnaissance metadynamics enhances the exploration of new binding poses, but additional collective variables involving the protein are needed to exploit the full potential of the method.

  1. A Machine Learning Approach to Predict Gene Regulatory Networks in Seed Development in Arabidopsis

    PubMed Central

    Ni, Ying; Aghamirzaie, Delasa; Elmarakeby, Haitham; Collakova, Eva; Li, Song; Grene, Ruth; Heath, Lenwood S.

    2016-01-01

    Gene regulatory networks (GRNs) provide a representation of relationships between regulators and their target genes. Several methods for GRN inference, both unsupervised and supervised, have been developed to date. Because regulatory relationships consistently reprogram in diverse tissues or under different conditions, GRNs inferred without specific biological contexts are of limited applicability. In this report, a machine learning approach is presented to predict GRNs specific to developing Arabidopsis thaliana embryos. We developed the Beacon GRN inference tool to predict GRNs occurring during seed development in Arabidopsis based on a support vector machine (SVM) model. We developed both global and local inference models and compared their performance, demonstrating that local models are generally superior for our application. Using both the expression levels of the genes expressed in developing embryos and prior known regulatory relationships, GRNs were predicted for specific embryonic developmental stages. The targets that are strongly positively correlated with their regulators are mostly expressed at the beginning of seed development. Potential direct targets were identified based on a match between the promoter regions of these inferred targets and the cis elements recognized by specific regulators. Our analysis also provides evidence for previously unknown inhibitory effects of three positive regulators of gene expression. The Beacon GRN inference tool provides a valuable model system for context-specific GRN inference and is freely available at https://github.com/BeaconProjectAtVirginiaTech/beacon_network_inference.git. PMID:28066488

  2. Past speculations of future health technologies: a description of technologies predicted in 15 forecasting studies published between 1986 and 2010

    PubMed Central

    Doos, Lucy; Packer, Claire; Ward, Derek; Simpson, Sue; Stevens, Andrew

    2017-01-01

    Objective To describe and classify health technologies predicted in forecasting studies. Design and methods A portrait describing health technologies predicted in 15 forecasting studies published between 1986 and 2010 that were identified in a previous systematic review. Health technologies are classified according to their type, purpose and clinical use; relating these to the original purpose and timing of the forecasting studies. Data sources All health-related technologies predicted in 15 forecasting studies identified in a previously published systematic review. Main outcome measure Outcomes related to (1) each forecasting study including country, year, intention and forecasting methods used and (2) the predicted technologies including technology type, purpose, targeted clinical area and forecast timeframe. Results Of the 896 identified health-related technologies, 685 (76.5%) were health technologies with an explicit or implied health application and included in our study. Of these, 19.1% were diagnostic or imaging tests, 14.3% devices or biomaterials, 12.6% information technology systems, eHealth or mHealth and 12% drugs. The majority of the technologies were intended to treat or manage disease (38.1%) or diagnose or monitor disease (26.1%). The most frequent targeted clinical areas were infectious diseases followed by cancer, circulatory and nervous system disorders. The most frequent technology types were for: infectious diseases—prophylactic vaccines (45.8%), cancer—drugs (40%), circulatory disease—devices and biomaterials (26.3%), and diseases of the nervous system—equally devices and biomaterials (25%) and regenerative medicine (25%). The mean timeframe for forecasting was 11.6 years (range 0–33 years, median=10, SD=6.6). The forecasting timeframe significantly differed by technology type (p=0.002), the intent of the forecasting group (p<0.001) and the methods used (p<001). 
    Conclusion While description and classification of predicted health-related technologies is crucial in preparing healthcare systems for adopting new innovations, further work is needed to test the accuracy of predictions made. PMID:28760796

  3. Repopulation of calibrations with samples from the target site: effect of the size of the calibration.

    NASA Astrophysics Data System (ADS)

    Guerrero, C.; Zornoza, R.; Gómez, I.; Mataix-Solera, J.; Navarro-Pedreño, J.; Mataix-Beneyto, J.; García-Orenes, F.

    2009-04-01

    Near infrared (NIR) reflectance spectroscopy offers important advantages because it is a non-destructive technique, the pre-treatments needed for samples are minimal, and the spectrum of the sample is obtained in less than 1 minute without the need for chemical reagents. For these reasons, NIR is a fast and cost-effective method. Moreover, NIR allows the analysis of several constituents or parameters simultaneously from the same spectrum once it is obtained. For this, a necessary step is the development of soil spectral libraries (sets of samples analysed and scanned) and calibrations (using multivariate techniques). The calibrations should contain the variability of the target site soils in which the calibration is to be used. This premise is often not easy to fulfil, especially in recently developed libraries. A classical way to solve this problem is through the repopulation of libraries and the subsequent recalibration of the models. In this work we studied the changes in the accuracy of the predictions as a consequence of the successive addition of samples during repopulation. In general, calibrations with a high number of samples and high diversity are desired. But we hypothesized that calibrations with fewer samples (smaller size) will absorb the spectral characteristics of the target site more easily. Thus, we suspect that the size of the calibration (model) that will be repopulated could be important. For this reason we also studied the effect of calibration size on the accuracy of predictions of the repopulated models. In this study we used those spectra of our library which contained data on soil Kjeldahl Nitrogen (NKj) content (nearly 1500 samples). First, those spectra from the target site were removed from the spectral library. Then, different quantities of samples of the library were selected (representing 5, 10, 25, 50, 75 and 100% of the total library). These samples were used to develop calibrations with different sizes (%) of samples.
    We used partial least squares regression and leave-one-out cross validation as methods of calibration. Two methods were used to select the different quantities (size of models) of samples: (1) Based on Characteristics of Spectra (BCS), and (2) Based on NKj Values of Samples (BVS). Both methods tried to select representative samples. Each of the calibrations (containing 5, 10, 25, 50, 75 or 100% of the total samples of the library) was repopulated with samples from the target site and then recalibrated (by leave-one-out cross validation). This procedure was sequential. In each step, 2 samples from the target site were added to the models, which were then recalibrated. This process was repeated successively 10 times, so that 20 samples were added in total. A local model was also created with the 20 samples used for repopulation. The repopulated, non-repopulated and local calibrations were used to predict the NKj content in those samples from the target site not included in repopulations. To measure the accuracy of the predictions, the r2, RMSEP and slopes were calculated by comparing predicted with analysed NKj values. This scheme was repeated for each of the four target sites studied. In general, few differences were found between results obtained with BCS and BVS models. We observed that the repopulation of models increased the r2 of the predictions in sites 1 and 3. The repopulation caused little change in the r2 of the predictions in sites 2 and 4, maybe due to the high initial values (using non-repopulated models r2 >0.90). As a consequence of repopulation, the RMSEP decreased in all the sites except site 2, where a very low RMSEP was obtained before the repopulation (0.4 g kg⁻¹). The slopes tended to approach 1, but this value was reached only in site 4 and after the repopulation with 20 samples. In sites 3 and 4, accurate predictions were obtained using the local models.
    Predictions obtained with models of similar size (similar % of samples) were averaged in order to describe the main patterns. The r2 of predictions obtained with larger models was not higher than that obtained with smaller models. After repopulation, the RMSEP of predictions using smaller models (5, 10 and 25% of samples of the library) was lower than the RMSEP obtained with larger models (75 and 100%), indicating that small models can easily integrate the variability of the soils from the target site. The results suggest that calibrations of small size could be repopulated and "converted" into local calibrations. Accordingly, we can focus most of the effort on obtaining highly accurate analytical values for a reduced set of samples (including some samples from the target sites). The patterns observed here are in opposition to the idea of global models. These results could encourage the expansion of this technique, because very large databases seem not to be needed. Future studies with very different samples will help to confirm the robustness of the patterns observed. The authors acknowledge "Bancaja-UMH" for the financial support of the project "NIRPROS".
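
    The three accuracy measures named in this abstract (r2, RMSEP and slope) can be illustrated with a minimal sketch. It assumes that r2 is the squared Pearson correlation between analysed and predicted values and that the slope comes from regressing predicted on analysed values; the abstract does not state either convention, and the function name `prediction_metrics` is hypothetical.

```python
import math


def prediction_metrics(analysed, predicted):
    """Return (r2, rmsep, slope) for paired analysed/predicted values.

    Assumptions (not stated in the abstract): r2 is the squared Pearson
    correlation, and the slope is that of predicted regressed on analysed.
    """
    n = len(analysed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long sequences with >= 2 values")

    mean_a = sum(analysed) / n
    mean_p = sum(predicted) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((a - mean_a) ** 2 for a in analysed)
    syy = sum((p - mean_p) ** 2 for p in predicted)
    sxy = sum((a - mean_a) * (p - mean_p) for a, p in zip(analysed, predicted))
    if sxx == 0 or syy == 0:
        raise ValueError("values must not be constant")

    r2 = sxy ** 2 / (sxx * syy)
    # Root mean square error of prediction.
    rmsep = math.sqrt(sum((p - a) ** 2 for a, p in zip(analysed, predicted)) / n)
    slope = sxy / sxx
    return r2, rmsep, slope
```

    With made-up values in which every prediction is offset by the same amount, r2 and slope stay at 1 while RMSEP reports the offset, which is one reason to look at all three measures together.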

  4. Emerging Computational Methods for the Rational Discovery of Allosteric Drugs

    PubMed Central

    2016-01-01

    Allosteric drug development holds promise for delivering medicines that are more selective and less toxic than those that target orthosteric sites. To date, the discovery of allosteric binding sites and lead compounds has been mostly serendipitous, achieved through high-throughput screening. Over the past decade, structural data has become more readily available for larger protein systems and more membrane protein classes (e.g., GPCRs and ion channels), which are common allosteric drug targets. In parallel, improved simulation methods now provide better atomistic understanding of the protein dynamics and cooperative motions that are critical to allosteric mechanisms. As a result of these advances, the field of predictive allosteric drug development is now on the cusp of a new era of rational structure-based computational methods. Here, we review algorithms that predict allosteric sites based on sequence data and molecular dynamics simulations, describe tools that assess the druggability of these pockets, and discuss how Markov state models and topology analyses provide insight into the relationship between protein dynamics and allosteric drug binding. In each section, we first provide an overview of the various method classes before describing relevant algorithms and software packages. PMID:27074285

  5. Emerging Computational Methods for the Rational Discovery of Allosteric Drugs.

    PubMed

    Wagner, Jeffrey R; Lee, Christopher T; Durrant, Jacob D; Malmstrom, Robert D; Feher, Victoria A; Amaro, Rommie E

    2016-06-08

    Allosteric drug development holds promise for delivering medicines that are more selective and less toxic than those that target orthosteric sites. To date, the discovery of allosteric binding sites and lead compounds has been mostly serendipitous, achieved through high-throughput screening. Over the past decade, structural data has become more readily available for larger protein systems and more membrane protein classes (e.g., GPCRs and ion channels), which are common allosteric drug targets. In parallel, improved simulation methods now provide better atomistic understanding of the protein dynamics and cooperative motions that are critical to allosteric mechanisms. As a result of these advances, the field of predictive allosteric drug development is now on the cusp of a new era of rational structure-based computational methods. Here, we review algorithms that predict allosteric sites based on sequence data and molecular dynamics simulations, describe tools that assess the druggability of these pockets, and discuss how Markov state models and topology analyses provide insight into the relationship between protein dynamics and allosteric drug binding. In each section, we first provide an overview of the various method classes before describing relevant algorithms and software packages.

  6. A Robust Bayesian Random Effects Model for Nonlinear Calibration Problems

    PubMed Central

    Fong, Y.; Wakefield, J.; De Rosa, S.; Frahm, N.

    2013-01-01

    Summary In the context of a bioassay or an immunoassay, calibration means fitting a curve, usually nonlinear, through the observations collected on a set of samples containing known concentrations of a target substance, and then using the fitted curve and observations collected on samples of interest to predict the concentrations of the target substance in these samples. Recent technological advances have greatly improved our ability to quantify minute amounts of substance from a tiny volume of biological sample. This has in turn led to a need to improve statistical methods for calibration. In this paper, we focus on developing calibration methods robust to dependent outliers. We introduce a novel normal mixture model with dependent error terms to model the experimental noise. In addition, we propose a re-parameterization of the five parameter logistic nonlinear regression model that allows us to better incorporate prior information. We examine the performance of our methods with simulation studies and show that they lead to a substantial increase in performance measured in terms of mean squared error of estimation and a measure of the average prediction accuracy. A real data example from the HIV Vaccine Trials Network Laboratory is used to illustrate the methods. PMID:22551415
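
    As a rough illustration of the calibration idea described above, the sketch below evaluates a five parameter logistic (5PL) curve and inverts it to recover a concentration from a response. It uses the conventional 5PL parameterization, not the re-parameterization proposed in the paper, and it does not attempt the Bayesian mixture model; the function names and all parameter values are hypothetical.

```python
def five_pl(x, a, d, c, b, g):
    """Conventional 5PL curve: response at concentration x.

    a: response at zero concentration, d: response at infinite
    concentration, c: scale (c > 0), b: slope (b > 0), g: asymmetry (g > 0).
    """
    if x < 0:
        raise ValueError("concentration must be non-negative")
    return d + (a - d) / (1.0 + (x / c) ** b) ** g


def inverse_five_pl(y, a, d, c, b, g):
    """Concentration whose 5PL response equals y (the prediction step)."""
    low, high = min(a, d), max(a, d)
    if not low < y < high:
        raise ValueError("response lies outside the range of the curve")
    ratio = (a - d) / (y - d)  # greater than 1 inside the curve's range
    return c * (ratio ** (1.0 / g) - 1.0) ** (1.0 / b)


# Hypothetical fitted parameters, for illustration only.
EXAMPLE_PARAMS = {"a": 0.1, "d": 2.0, "c": 50.0, "b": 1.5, "g": 0.8}
```

    In practice the parameters would first be fitted to the standards with known concentrations; `inverse_five_pl` then corresponds to reading the concentration of a sample of interest off the fitted curve.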

  7. Increasing organizational energy conservation behaviors: Comparing the theory of planned behavior and reasons theory for identifying specific motivational factors to target for change

    NASA Astrophysics Data System (ADS)

    Finlinson, Scott Michael

    Social scientists frequently assess factors thought to underlie behavior for the purpose of designing behavioral change interventions. Researchers commonly identify these factors by examining relationships between specific variables and the focal behaviors being investigated. Variables with the strongest relationships to the focal behavior are then assumed to be the most influential determinants of that behavior, and therefore often become the targets for change in a behavioral change intervention. In the current proposal, multiple methods are used to compare the effectiveness of two theoretical frameworks for identifying influential motivational factors. Assessing the relative influence of all factors and sets of factors for driving behavior should clarify which framework and methodology is the most promising for identifying effective change targets. Results indicated each methodology adequately predicted the three focal behaviors examined. However, the reasons theory approach was superior for predicting factor influence ratings compared to the TpB approach. While common method variance contamination had minimal impact on the results or conclusions derived from the present study's findings, there were substantial differences in conclusions depending on the questionnaire design used to collect the data. Examples of applied uses of the present study are discussed.

  8. Multi-Source Multi-Target Dictionary Learning for Prediction of Cognitive Decline.

    PubMed

    Zhang, Jie; Li, Qingyang; Caselli, Richard J; Thompson, Paul M; Ye, Jieping; Wang, Yalin

    2017-06-01

    Alzheimer's Disease (AD) is the most common type of dementia. Identifying correct biomarkers may help to identify pre-symptomatic AD subjects and enable early intervention. Recently, multi-task sparse feature learning has been successfully applied to many problems in computer vision and biomedical informatics. It aims to improve the generalization performance by exploiting the shared features among different tasks. However, most of the existing algorithms are formulated as supervised learning schemes, whose drawback is either insufficient feature numbers or missing label information. To address these challenges, we formulate an unsupervised framework for multi-task sparse feature learning based on a novel dictionary learning algorithm. To solve the unsupervised learning problem, we propose a two-stage Multi-Source Multi-Target Dictionary Learning (MMDL) algorithm. In stage 1, we propose a multi-source dictionary learning method to utilize the common and individual sparse features in different time slots. In stage 2, supported by a rigorous theoretical analysis, we develop a multi-task learning method to solve the missing label problem. Empirical studies on an N = 3970 longitudinal brain image data set, which involves 2 sources and 5 targets, demonstrate the improved prediction accuracy and speed efficiency of MMDL in comparison with other state-of-the-art algorithms.

  9. Explaining the disease phenotype of intergenic SNP through predicted long range regulation.

    PubMed

    Chen, Jingqi; Tian, Weidong

    2016-10-14

    Thousands of disease-associated SNPs (daSNPs) are located in intergenic regions (IGR), making it difficult to understand their association with disease phenotypes. Recent analysis found that non-coding daSNPs were frequently located in or close to regulatory elements, inspiring us to try to explain the disease phenotypes of IGR daSNPs through nearby regulatory sequences. Hence, after locating the nearest distal regulatory element (DRE) to a given IGR daSNP, we applied a computational method named INTREPID to predict the target genes regulated by the DRE, and then investigated their functional relevance to the IGR daSNP's disease phenotypes. 36.8% of all IGR daSNP-disease phenotype associations investigated were possibly explainable through the predicted target genes, which were enriched with, were functionally relevant to, or consisted of the corresponding disease genes. This proportion could be further increased to 60.5% if the LD SNPs of daSNPs were also considered. Furthermore, the predicted SNP-target gene pairs were enriched with known eQTL/mQTL SNP-gene relationships. Overall, it is likely that IGR daSNPs contribute to disease phenotypes by interfering with the regulatory function of their nearby DREs and causing abnormal expression of disease genes. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  10. The prediction of candidate genes for cervix related cancer through gene ontology and graph theoretical approach.

    PubMed

    Hindumathi, V; Kranthi, T; Rao, S B; Manimaran, P

    2014-06-01

    With rapidly changing technology, the prediction of candidate genes has become an indispensable task in recent years, mainly in the field of biological research. The empirical methods for candidate gene prioritization that help to explore the potential pathway between genetic determinants and complex diseases are highly cumbersome and labor intensive. In such a scenario, predicting potential targets for a disease state through in silico approaches is of interest to researchers. The abundant availability of protein interaction data coupled with gene annotation eases the accurate determination of disease-specific candidate genes. In our work we have prioritized cervix-related cancer candidate genes by employing the approach of Csaba Ortutay and co-workers, which identifies candidate genes through graph theoretical centrality measures and gene ontology. Using human protein interaction data, cervical cancer gene sets and the ontological terms, we were able to predict 15 novel candidates for cervical carcinogenesis. The disease relevance of the predicted candidate genes was corroborated through a literature survey. In addition, the presence of drugs for these candidates was detected through the Therapeutic Target Database (TTD) and DrugMap Central (DMC), which affirms that they may serve as potential drug targets for cervical cancer.
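
    To make "graph theoretical centrality measures" concrete, here is a minimal sketch of two common ones, degree and closeness centrality, on a toy undirected network. The abstract does not say which measures were used, so this choice is an assumption; the node labels G1 to G5 and the edge list are invented and do not represent real protein interactions.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def degree_centrality(adjacency):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(adjacency)
    if n < 2:
        raise ValueError("need at least two nodes")
    return {node: len(nbrs) / (n - 1) for node, nbrs in adjacency.items()}


def closeness_centrality(adjacency):
    """Reachable-node count divided by the summed shortest-path distances."""
    result = {}
    for source in adjacency:
        # Breadth-first search gives shortest path lengths in hops.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in adjacency[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


# Invented toy network; labels are placeholders, not real genes.
TOY_EDGES = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G4", "G5")]
```

    Ranking nodes by such scores is one simple way to shortlist candidates, which a study like this would then combine with gene ontology information.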

  11. Electron impact ionization of cycloalkanes, aldehydes, and ketones

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gupta, Dhanoj; Antony, Bobby, E-mail: bka.ism@gmail.com

    The theoretical calculations of electron impact total ionization cross section for cycloalkane, aldehyde, and ketone group molecules are undertaken from ionization threshold to 2 keV. The present calculations are based on the spherical complex optical potential formalism and complex scattering potential ionization contribution method. The results of most of the targets studied compare fairly well with the recent measurements, wherever available and the cross sections for many targets are predicted for the first time. The correlation between the peak of ionization cross sections with number of target electrons and target parameters is also reported. It was found that the cross sections at their maximum depend linearly with the number of target electrons and with other target parameters, confirming the consistency of the values reported here.

  12. Shaping up the protein folding funnel by local interaction: lesson from a structure prediction study.

    PubMed

    Chikenji, George; Fujitsuka, Yoshimi; Takada, Shoji

    2006-02-28

    Predicting protein tertiary structure by folding-like simulations is one of the most stringent tests of how much we understand the principle of protein folding. Currently, the most successful method for folding-based structure prediction is the fragment assembly (FA) method. Here, we address why the FA method is so successful and its lesson for the folding problem. To do so, using the FA method, we designed a structure prediction test of "chimera proteins." In the chimera proteins, local structural preference is specific to the target sequences, whereas nonlocal interactions are only sequence-independent compaction forces. We find that these chimera proteins can find the native folds of the intact sequences with high probability indicating dominant roles of the local interactions. We further explore roles of local structural preference by exact calculation of the HP lattice model of proteins. From these results, we suggest principles of protein folding: For small proteins, compact structures that are fully compatible with local structural preference are few, one of which is the native fold. These local biases shape up the funnel-like energy landscape.
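
    The "exact calculation of the HP lattice model" mentioned above can be illustrated by brute-force enumeration. The sketch below assumes a 2D square lattice and the usual HP energy of -1 per contact between H residues that are lattice neighbours but not chain neighbours; the abstract specifies neither the lattice nor the energy function, and the function name `enumerate_hp` is hypothetical.

```python
def enumerate_hp(sequence):
    """Count self-avoiding conformations of an HP chain by energy.

    Returns a dict mapping energy -> number of conformations. The first
    bond is fixed along +x to remove rotational copies (mirror images
    are still counted separately).
    """
    n = len(sequence)
    if n < 2 or set(sequence) - {"H", "P"}:
        raise ValueError("sequence must have >= 2 residues, each H or P")

    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    energies = {}

    def energy(path):
        index = {pos: i for i, pos in enumerate(path)}
        contacts = 0
        for i, (x, y) in enumerate(path):
            if sequence[i] != "H":
                continue
            # Look only right and up so each lattice pair is counted once.
            for dx, dy in ((1, 0), (0, 1)):
                j = index.get((x + dx, y + dy))
                if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                    contacts += 1
        return -contacts

    def extend(path, occupied):
        if len(path) == n:
            e = energy(path)
            energies[e] = energies.get(e, 0) + 1
            return
        x, y = path[-1]
        for dx, dy in moves:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:  # enforce self-avoidance
                path.append(nxt)
                occupied.add(nxt)
                extend(path, occupied)
                path.pop()
                occupied.remove(nxt)

    start = [(0, 0), (1, 0)]
    extend(start, set(start))
    return energies
```

    The run time grows exponentially with chain length, so this kind of exhaustive calculation is only practical for short chains. The number of conformations at the lowest energy gives a simple measure of how unique the "native" state is for a given sequence.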

  13. Shaping up the protein folding funnel by local interaction: Lesson from a structure prediction study

    PubMed Central

    Chikenji, George; Fujitsuka, Yoshimi; Takada, Shoji

    2006-01-01

    Predicting protein tertiary structure by folding-like simulations is one of the most stringent tests of how much we understand the principle of protein folding. Currently, the most successful method for folding-based structure prediction is the fragment assembly (FA) method. Here, we address why the FA method is so successful and its lesson for the folding problem. To do so, using the FA method, we designed a structure prediction test of “chimera proteins.” In the chimera proteins, local structural preference is specific to the target sequences, whereas nonlocal interactions are only sequence-independent compaction forces. We find that these chimera proteins can find the native folds of the intact sequences with high probability indicating dominant roles of the local interactions. We further explore roles of local structural preference by exact calculation of the HP lattice model of proteins. From these results, we suggest principles of protein folding: For small proteins, compact structures that are fully compatible with local structural preference are few, one of which is the native fold. These local biases shape up the funnel-like energy landscape. PMID:16488978

  14. PockDrug: A Model for Predicting Pocket Druggability That Overcomes Pocket Estimation Uncertainties.

    PubMed

    Borrel, Alexandre; Regad, Leslie; Xhaard, Henri; Petitjean, Michel; Camproux, Anne-Claude

    2015-04-27

    Predicting protein druggability is a key interest in the target identification phase of drug discovery. Here, we assess the pocket estimation methods' influence on druggability predictions by comparing statistical models constructed from pockets estimated using different pocket estimation methods: a proximity of either 4 or 5.5 Å to a cocrystallized ligand or DoGSite and fpocket estimation methods. We developed PockDrug, a robust pocket druggability model that copes with uncertainties in pocket boundaries. It is based on a linear discriminant analysis from a pool of 52 descriptors combined with a selection of the most stable and efficient models using different pocket estimation methods. PockDrug retains the best combinations of three pocket properties which impact druggability: geometry, hydrophobicity, and aromaticity. It results in an average accuracy of 87.9% ± 4.7% using a test set and exhibits higher accuracy (∼5-10%) than previous studies that used an identical apo set. In conclusion, this study confirms the influence of pocket estimation on pocket druggability prediction and proposes PockDrug as a new model that overcomes pocket estimation variability.

  15. Insights into an original pocket-ligand pair classification: a promising tool for ligand profile prediction.

    PubMed

    Pérot, Stéphanie; Regad, Leslie; Reynès, Christelle; Spérandio, Olivier; Miteva, Maria A; Villoutreix, Bruno O; Camproux, Anne-Claude

    2013-01-01

    Pockets are today at the cornerstones of modern drug discovery projects and at the crossroad of several research fields, from structural biology to mathematical modeling. Being able to predict if a small molecule could bind to one or more protein targets or if a protein could bind to some given ligands is very useful for drug discovery endeavors, anticipation of binding to off- and anti-targets. To date, several studies explore such questions from chemogenomic approach to reverse docking methods. Most of these studies have been performed either from the viewpoint of ligands or targets. However it seems valuable to use information from both ligands and target binding pockets. Hence, we present a multivariate approach relating ligand properties with protein pocket properties from the analysis of known ligand-protein interactions. We explored and optimized the pocket-ligand pair space by combining pocket and ligand descriptors using Principal Component Analysis and developed a classification engine on this paired space, revealing five main clusters of pocket-ligand pairs sharing specific and similar structural or physico-chemical properties. These pocket-ligand pair clusters highlight correspondences between pocket and ligand topological and physico-chemical properties and capture relevant information with respect to protein-ligand interactions. Based on these pocket-ligand correspondences, a protocol of prediction of clusters sharing similarity in terms of recognition characteristics is developed for a given pocket-ligand complex and gives high performances. It is then extended to cluster prediction for a given pocket in order to acquire knowledge about its expected ligand profile or to cluster prediction for a given ligand in order to acquire knowledge about its expected pocket profile. 
    This prediction approach shows promising results and could contribute to predict some ligand properties critical for binding to a given pocket, and conversely, some key pocket properties for ligand binding.

  16. Insights into an Original Pocket-Ligand Pair Classification: A Promising Tool for Ligand Profile Prediction

    PubMed Central

    Reynès, Christelle; Spérandio, Olivier; Miteva, Maria A.; Villoutreix, Bruno O.; Camproux, Anne-Claude

    2013-01-01

    Pockets are today at the cornerstones of modern drug discovery projects and at the crossroad of several research fields, from structural biology to mathematical modeling. Being able to predict if a small molecule could bind to one or more protein targets or if a protein could bind to some given ligands is very useful for drug discovery endeavors, anticipation of binding to off- and anti-targets. To date, several studies explore such questions from chemogenomic approach to reverse docking methods. Most of these studies have been performed either from the viewpoint of ligands or targets. However it seems valuable to use information from both ligands and target binding pockets. Hence, we present a multivariate approach relating ligand properties with protein pocket properties from the analysis of known ligand-protein interactions. We explored and optimized the pocket-ligand pair space by combining pocket and ligand descriptors using Principal Component Analysis and developed a classification engine on this paired space, revealing five main clusters of pocket-ligand pairs sharing specific and similar structural or physico-chemical properties. These pocket-ligand pair clusters highlight correspondences between pocket and ligand topological and physico-chemical properties and capture relevant information with respect to protein-ligand interactions. Based on these pocket-ligand correspondences, a protocol of prediction of clusters sharing similarity in terms of recognition characteristics is developed for a given pocket-ligand complex and gives high performances. It is then extended to cluster prediction for a given pocket in order to acquire knowledge about its expected ligand profile or to cluster prediction for a given ligand in order to acquire knowledge about its expected pocket profile. 
    This prediction approach shows promising results and could contribute to predict some ligand properties critical for binding to a given pocket, and conversely, some key pocket properties for ligand binding. PMID:23840299

  17. Changes in predictive cuing modulate the hemispheric distribution of the P1 inhibitory response to attentional targets.

    PubMed

    Lasaponara, Stefano; D' Onofrio, Marianna; Dragone, Alessio; Pinto, Mario; Caratelli, Ludovica; Doricchi, Fabrizio

    2017-05-01

    Brain activity related to orienting of attention with spatial cues and brain responses to attentional targets are influenced by the probabilistic contingency between cues and targets. Compared to predictive cues, cues that predict the location of targets at chance level reduce the filtering out of uncued locations and the costs of reorienting attention to targets presented at these locations. Slagter et al. (2016) have recently suggested that the larger target-related P1 component that is found in the hemisphere ipsilateral to validly cued targets reflects stimulus-driven inhibition in the processing of the unstimulated side of space contralateral to the same hemisphere. Here we verified whether the strength of this inhibition and the amplitude of the corresponding P1 wave are modulated by the probabilistic link between cues and targets. Healthy participants performed a task of endogenous orienting once with predictive and once with non-predictive directional cues. In the non-predictive condition we observed a drop in the amplitude of the P1 ipsilateral to the target and in the costs of reorienting. No change in the inter-hemispheric latencies of the P1 was found between the two predictive conditions. The N1 facilitatory component was unaffected by predictive cuing. These results show that the predictive context modulates the strength of the inhibitory P1 response and that this modulation is not matched with changes in the inter-hemispheric interaction between the P1 generators of the two hemispheres. Copyright © 2017. Published by Elsevier Ltd.

  18. A Volterra series-based method for extracting target echoes in the seafloor mining environment.

    PubMed

    Zhao, Haiming; Ji, Yaqian; Hong, Yujiu; Hao, Qi; Ma, Liyong

    2016-09-01

    The purpose of this research was to evaluate the applicability of the Volterra adaptive method to predict the target echo of an ultrasonic signal in an underwater seafloor mining environment. There is growing interest in mining of seafloor minerals because they offer an alternative source of rare metals. Mining the minerals causes the seafloor sediments to be stirred up and suspended in sea water. In such an environment, the target signals used for seafloor mapping cannot be detected because of the unavoidable presence of volume reverberation induced by the suspended sediments. The detection of target signals in reverberation is currently performed using a stochastic model (for example, the autoregressive (AR) model) based on the statistical characterisation of reverberation. However, we examined a new method of signal detection in volume reverberation based on the Volterra series by confirming that the reverberation is a chaotic signal generated by a deterministic process. The advantage of this method over the stochastic model is that attributes of the specific physical process are considered in the signal detection problem. To test the Volterra series based method and its applicability to target signal detection in the volume reverberation environment derived from the seafloor mining process, we simulated the real-life conditions of seafloor mining in a water-filled tank measuring 5×3×1.8 m. The bottom of the tank was covered with a 10 cm irregular sand layer, under which a 5 cm irregular layer of cobalt-rich crust was placed. The bottom was interrogated by an acoustic wave generated as 16 μs pulses at a frequency of 500 kHz. This frequency is demonstrated to ensure a resolution on the order of one centimetre, which is adequate in exploration practice. Echo signals were collected with a data acquisition card (PCI 1714 UL, 12-bit). 
Detection of the target echo in these signals was performed by both the Volterra series based model and the AR model. The results obtained confirm that the Volterra series based method is more efficient in the detection of the signal in reverberation than the conventional AR model (the accuracy is 80% for the PIM-Volterra prediction model versus 40% for the AR model). Copyright © 2016 Elsevier B.V. All rights reserved.
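
    To make the AR baseline mentioned in this abstract concrete, the sketch below fits a first-order autoregressive model by least squares and flags the sample with the largest one-step prediction error. It is only an illustration under stated assumptions: the model order, the residual-based detection rule, the function names and the synthetic signal are all hypothetical and are not taken from the paper, and the Volterra series method itself is not shown.

```python
def fit_ar1(signal):
    """Least-squares estimate of phi in the AR(1) model x[t] ~ phi * x[t-1]."""
    num = sum(signal[t] * signal[t - 1] for t in range(1, len(signal)))
    den = sum(signal[t - 1] ** 2 for t in range(1, len(signal)))
    return num / den


def ar1_residuals(signal, phi):
    """One-step prediction errors; entry i belongs to sample i + 1."""
    return [signal[t] - phi * signal[t - 1] for t in range(1, len(signal))]


def detect_echo(signal):
    """Return the sample index with the largest absolute prediction error.

    Assumption (not from the paper): an echo shows up as the sample that
    the fitted background model predicts worst.
    """
    phi = fit_ar1(signal)
    residuals = ar1_residuals(signal, phi)
    worst = max(range(len(residuals)), key=lambda i: abs(residuals[i]))
    return worst + 1


# Synthetic data: a smoothly decaying background with an echo added at sample 10.
background = [0.9 ** t for t in range(20)]
signal = list(background)
signal[10] += 1.0

echo_index = detect_echo(signal)
print(echo_index)
```

    A residual-based rule like this treats the background as a purely statistical process, which is the kind of assumption the abstract contrasts with the deterministic, chaotic view behind the Volterra approach.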

  19. Manufacturing of Proteins and Antibodies: Chapter Downstream Processing Technologies : Harvest Operations.

    PubMed

    Turner, Richard; Joseph, Adrian; Titchener-Hooker, Nigel; Bender, Jean

    2017-08-04

    Cell harvesting is the separation or retention of cells and cellular debris from the supernatant containing the target molecule. Selection of a harvest method strongly depends on the type of cells, mode of bioreactor operation, process scale, and characteristics of the product and cell culture fluid. Most traditional harvesting methods use some form of filtration, centrifugation, or a combination of both for cell separation and/or retention. Filtration methods include normal flow depth filtration and tangential flow microfiltration. The ability to scale down the selected harvest method predictably helps to ensure successful production and is critical for conducting small-scale characterization studies for confirming parameter targets and ranges. In this chapter we describe centrifugation and depth filtration harvesting methods, share strategies for harvest optimization, present recent developments in centrifugation scale-down models, and review alternative harvesting technologies.

  20. Evolution of egg target size: an analysis of selection on correlated characters.

    PubMed

    Podolsky, R D

    2001-12-01

    In broadcast-spawning marine organisms, chronic sperm limitation should select for traits that improve chances of sperm-egg contact. One mechanism may involve increasing the size of the physical or chemical target for sperm. However, models of fertilization kinetics predict that increasing egg size can reduce net zygote production due to an associated decline in fecundity. An alternate method for increasing physical target size is through addition of energetically inexpensive external structures, such as the jelly coats typical of eggs in species from several phyla. In selection experiments on eggs of the echinoid Dendraster excentricus, in which sperm was used as the agent of selection, eggs with larger overall targets were favored in fertilization. Actual shifts in target size following selection matched quantitative predictions of a model that assumed fertilization was proportional to target size. Jelly volume and ovum volume, two characters that contribute to target size, were correlated both within and among females. A cross-sectional analysis of selection partitioned the independent effects of these characters on fertilization success and showed that they experience similar direct selection pressures. Coupled with data on relative organic costs of the two materials, these results suggest that, under conditions where fertilization is limited by egg target size, selection should favor investment in low-cost accessory structures and may have a relatively weak effect on the evolution of ovum size.
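
    The model assumption quoted in this abstract, fertilization proportional to target size, implies a simple expected shift in mean target size: if an egg of size s is fertilized with probability proportional to s, the mean size among fertilized eggs is E[s^2]/E[s], which equals the original mean plus variance/mean. The sketch below works this through for a few made-up sizes; the function name and the numbers are hypothetical and do not come from the study.

```python
def mean_after_selection(target_sizes):
    """Expected mean target size among fertilized eggs.

    Assumes the probability that an egg is fertilized is proportional to
    its target size, so each egg is weighted by its own size.
    """
    total = sum(target_sizes)
    return sum(s * s for s in target_sizes) / total


# Hypothetical target sizes in arbitrary units.
sizes = [1.0, 2.0, 3.0]
mean_before = sum(sizes) / len(sizes)
mean_after = mean_after_selection(sizes)

print(mean_before, mean_after)
```

    Because the shift equals variance/mean, it vanishes when all eggs have the same target size and grows with the spread of sizes in the population.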

  1. A bio-inspired swarm robot coordination algorithm for multiple target searching

    NASA Astrophysics Data System (ADS)

    Meng, Yan; Gan, Jing; Desai, Sachi

    2008-04-01

    The coordination of a multi-robot system searching for multiple targets is challenging in a dynamic environment, since the multi-robot system demands group coherence (agents need to have the incentive to work together faithfully) and group competence (agents need to know how to work together well). In our previously proposed bio-inspired coordination method, Local Interaction through Virtual Stigmergy (LIVS), one problem is the considerable randomness of the robot movement during coordination, which may lead to more power consumption and longer searching time. To address these issues, an adaptive LIVS (ALIVS) method is proposed in this paper, which not only considers the travel cost and target weight, but also predicts the target/robot ratio and potential robot redundancy with respect to the detected targets. Furthermore, a dynamic weight adjustment is also applied to improve the searching performance. This new method is a truly distributed method in which each robot makes its own decision based on its local sensing information and the information from its neighbors. Basically, each robot only communicates with its neighbors through a virtual stigmergy mechanism and makes its local movement decision based on a Particle Swarm Optimization (PSO) algorithm. The proposed ALIVS algorithm has been implemented on the embodied robot simulator, Player/Stage, in a target-searching scenario. The simulation results demonstrate its efficiency and robustness, in a power-efficient manner, under real-world constraints.
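
    For readers unfamiliar with Particle Swarm Optimization, the sketch below shows the textbook global-best PSO update on a toy two-dimensional search problem. It is not the ALIVS algorithm: there is no virtual stigmergy, neighbor-only communication, travel cost or target weighting, and every name and parameter value (inertia, acceleration coefficients, the target position) is an assumption chosen for illustration.

```python
import random


def pso_minimize(cost, dim, n_particles=20, n_steps=100,
                 inertia=0.7, c_personal=1.5, c_social=1.5, seed=0):
    """Textbook global-best PSO; returns (best_position, best_cost)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5.0, 5.0) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]           # personal best positions
    best_val = [cost(p) for p in pos]        # personal best costs
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best

    for _ in range(n_steps):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia + pull to personal best + pull to swarm best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_social * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = cost(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def squared_distance_to_target(p):
    """Toy cost: squared distance to a hypothetical target at (2, -1)."""
    return (p[0] - 2.0) ** 2 + (p[1] + 1.0) ** 2


position, value = pso_minimize(squared_distance_to_target, dim=2)
print(position, value)
```

    In a distributed setting such as the one the abstract describes, the swarm-wide best would presumably be replaced by the best position known among a robot's neighbors, since no robot has global information.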

  2. A sampling-based method for ranking protein structural models by integrating multiple scores and features.

    PubMed

    Shi, Xiaohu; Zhang, Jingfen; He, Zhiquan; Shang, Yi; Xu, Dong

    2011-09-01

    One of the major challenges in protein tertiary structure prediction is structure quality assessment. In many cases, protein structure prediction tools generate good structural models, but fail to select the best models from a huge number of candidates as the final output. In this study, we developed a sampling-based machine-learning method to rank protein structural models by integrating multiple scores and features. First, features such as predicted secondary structure, solvent accessibility and residue-residue contact information are integrated by two Radial Basis Function (RBF) models trained on different datasets. Then, the two RBF scores and five selected scoring functions developed by others, i.e., Opus-CA, Opus-PSP, DFIRE, RAPDF, and Cheng Score, are synthesized by a sampling method. Finally, another integrated RBF model ranks the structural models according to the features of the sampling distribution. We tested the proposed method using two different datasets: the CASP server prediction models of all CASP8 targets and a set of models generated by our in-house software MUFOLD. The test results show that our method outperforms any individual scoring function in both best-model selection and overall correlation between the predicted ranking and the actual ranking of structural quality.

  3. Imaging Surrogates of Infiltration Obtained Via Multiparametric Imaging Pattern Analysis Predict Subsequent Location of Recurrence of Glioblastoma

    PubMed Central

    Akbari, Hamed; Macyszyn, Luke; Da, Xiao; Bilello, Michel; Wolf, Ronald L.; Martinez-Lage, Maria; Biros, George; Alonso-Basanta, Michelle; O’Rourke, Donald M.; Davatzikos, Christos

    2016-01-01

    Background Glioblastoma is an aggressive and highly infiltrative brain cancer. Standard surgical resection is guided by enhancement on postcontrast T1-weighted (T1) magnetic resonance imaging (MRI), which is insufficient for delineating surrounding infiltrating tumor. Objective To develop imaging biomarkers that delineate areas of tumor infiltration and predict early recurrence in peritumoral tissue. Such markers would enable intensive, yet targeted, surgery and radiotherapy, thereby potentially delaying recurrence and prolonging survival. Methods Preoperative multiparametric MRIs (T1, T1-Gad, T2-weighted [T2], T2-fluid-attenuated inversion recovery [FLAIR], diffusion tensor imaging (DTI), and dynamic susceptibility contrast-enhanced [DSC]-MRI) from 31 patients were combined using machine learning methods, thereby creating predictive spatial maps of infiltrated peritumoral tissue. Cross validation was used in the retrospective cohort to achieve generalizable biomarkers. Subsequently, the imaging signatures learned from the retrospective study were used in a replication cohort of 34 new patients. Spatial maps representing likelihood of tumor infiltration and future early recurrence were compared to regions of recurrence on postresection follow-up studies with pathology confirmation. Results This technique produced predictions of early recurrence with a mean area under the curve (AUC) of 0.84, sensitivity of 91%, specificity of 93%, and odds ratio estimates of 9.29 (99% CI, 8.95–9.65) for tissue predicted to be heavily infiltrated in the replication study. Regions of tumor recurrence were found to have subtle, yet fairly distinctive multiparametric imaging signatures when analyzed quantitatively by pattern analysis and machine learning. 
Conclusion Visually imperceptible imaging patterns discovered via multiparametric pattern analysis methods were found to estimate the extent of infiltration and location of future tumor recurrence, paving the way for improved targeted treatment. PMID:26813856

  4. WORMHOLE: Novel Least Diverged Ortholog Prediction through Machine Learning

    PubMed Central

    Sutphin, George L.; Mahoney, J. Matthew; Sheppard, Keith; Walton, David O.; Korstanje, Ron

    2016-01-01

    The rapid advancement of technology in genomics and targeted genetic manipulation has made comparative biology an increasingly prominent strategy to model human disease processes. Predicting orthology relationships between species is a vital component of comparative biology. Dozens of strategies for predicting orthologs have been developed using combinations of gene and protein sequence, phylogenetic history, and functional interaction with progressively increasing accuracy. A relatively new class of orthology prediction strategies combines aspects of multiple methods into meta-tools, resulting in improved prediction performance. Here we present WORMHOLE, a novel ortholog prediction meta-tool that applies machine learning to integrate 17 distinct ortholog prediction algorithms to identify novel least diverged orthologs (LDOs) between 6 eukaryotic species—humans, mice, zebrafish, fruit flies, nematodes, and budding yeast. Machine learning allows WORMHOLE to intelligently incorporate predictions from a wide-spectrum of strategies in order to form aggregate predictions of LDOs with high confidence. In this study we demonstrate the performance of WORMHOLE across each combination of query and target species. We show that WORMHOLE is particularly adept at improving LDO prediction performance between distantly related species, expanding the pool of LDOs while maintaining low evolutionary distance and a high level of functional relatedness between genes in LDO pairs. We present extensive validation, including cross-validated prediction of PANTHER LDOs and evaluation of evolutionary divergence and functional similarity, and discuss future applications of machine learning in ortholog prediction. A WORMHOLE web tool has been developed and is available at http://wormhole.jax.org/. PMID:27812085

  5. WORMHOLE: Novel Least Diverged Ortholog Prediction through Machine Learning.

    PubMed

    Sutphin, George L; Mahoney, J Matthew; Sheppard, Keith; Walton, David O; Korstanje, Ron

    2016-11-01

    The rapid advancement of technology in genomics and targeted genetic manipulation has made comparative biology an increasingly prominent strategy to model human disease processes. Predicting orthology relationships between species is a vital component of comparative biology. Dozens of strategies for predicting orthologs have been developed using combinations of gene and protein sequence, phylogenetic history, and functional interaction with progressively increasing accuracy. A relatively new class of orthology prediction strategies combines aspects of multiple methods into meta-tools, resulting in improved prediction performance. Here we present WORMHOLE, a novel ortholog prediction meta-tool that applies machine learning to integrate 17 distinct ortholog prediction algorithms to identify novel least diverged orthologs (LDOs) between 6 eukaryotic species-humans, mice, zebrafish, fruit flies, nematodes, and budding yeast. Machine learning allows WORMHOLE to intelligently incorporate predictions from a wide-spectrum of strategies in order to form aggregate predictions of LDOs with high confidence. In this study we demonstrate the performance of WORMHOLE across each combination of query and target species. We show that WORMHOLE is particularly adept at improving LDO prediction performance between distantly related species, expanding the pool of LDOs while maintaining low evolutionary distance and a high level of functional relatedness between genes in LDO pairs. We present extensive validation, including cross-validated prediction of PANTHER LDOs and evaluation of evolutionary divergence and functional similarity, and discuss future applications of machine learning in ortholog prediction. A WORMHOLE web tool has been developed and is available at http://wormhole.jax.org/.

  6. How good are publicly available web services that predict bioactivity profiles for drug repurposing?

    PubMed

    Murtazalieva, K A; Druzhilovskiy, D S; Goel, R K; Sastry, G N; Poroikov, V V

    2017-10-01

    Drug repurposing provides a non-laborious and less expensive way of finding new human medicines. Computational assessment of bioactivity profiles sheds light on the hidden pharmacological potential of the launched drugs. Currently, several computational tools that predict multitarget profiles of drug-like compounds are freely available via the Internet. They are based on chemical similarity assessment (ChemProt, SuperPred, SEA, SwissTargetPrediction and TargetHunter) or machine learning methods (ChemProt and PASS). To compare their performance, this study has created two evaluation sets, consisting of (1) 50 well-known repositioned drugs and (2) 12 drugs recently patented for new indications. In the first set, sensitivity values varied from 0.64 (TarPred) to 1.00 (PASS Online) for the initial indications and from 0.64 (TarPred) to 0.98 (PASS Online) for the repurposed indications. In the second set, sensitivity values varied from 0.08 (SuperPred) to 1.00 (PASS Online) for the initial indications and from 0.00 (SuperPred) to 1.00 (PASS Online) for the repurposed indications. Thus, this analysis demonstrated that the performance of machine learning methods surpassed that of chemical similarity assessments, particularly in the case of novel repurposed indications.
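
    The sensitivity values quoted in this abstract can be read as the fraction of drugs in an evaluation set whose known indication is recovered by a tool. The sketch below computes that fraction under this reading, which is an assumption about the study's exact scoring; the drug names, activities and the function name are invented placeholders and not data from the paper.

```python
def sensitivity(known_indications, predicted_profiles):
    """Fraction of drugs whose known indication appears in the prediction.

    known_indications: dict mapping drug name -> known activity label.
    predicted_profiles: dict mapping drug name -> set of predicted labels.
    A drug missing from predicted_profiles counts as a miss.
    """
    hits = sum(
        1
        for drug, indication in known_indications.items()
        if indication in predicted_profiles.get(drug, set())
    )
    return hits / len(known_indications)


# Hypothetical evaluation set of four drugs (placeholder names and labels).
known = {
    "drug_a": "antihypertensive",
    "drug_b": "antidepressant",
    "drug_c": "antineoplastic",
    "drug_d": "analgesic",
}
predicted = {
    "drug_a": {"antihypertensive", "vasodilator"},
    "drug_b": {"antidepressant"},
    "drug_c": {"antiviral"},            # known indication not recovered
    "drug_d": {"analgesic", "antipyretic"},
}

print(sensitivity(known, predicted))
```

    Sensitivity alone says nothing about how many incorrect activities a tool predicts, so a tool that predicts very broad profiles can score highly on this measure.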

  7. Hyperspectral face recognition using improved inter-channel alignment based on qualitative prediction models.

    PubMed

    Cho, Woon; Jang, Jinbeum; Koschan, Andreas; Abidi, Mongi A; Paik, Joonki

    2016-11-28

    A fundamental limitation of hyperspectral imaging is the inter-band misalignment correlated with subject motion during data acquisition. One way of resolving this problem is to assess the alignment quality of hyperspectral image cubes derived from the state-of-the-art alignment methods. In this paper, we present an automatic selection framework for the optimal alignment method to improve the performance of face recognition. Specifically, we develop two qualitative prediction models based on: 1) a principal curvature map for evaluating the similarity index between sequential target bands and a reference band in the hyperspectral image cube as a full-reference metric; and 2) the cumulative probability of target colors in the HSV color space for evaluating the alignment index of a single sRGB image rendered using all of the bands of the hyperspectral image cube as a no-reference metric. We verify the efficacy of the proposed metrics on a new large-scale database, demonstrating a higher prediction accuracy in determining improved alignment compared to two full-reference and five no-reference image quality metrics. We also validate the ability of the proposed framework to improve hyperspectral face recognition.

  8. Protein subcellular localization prediction using artificial intelligence technology.

    PubMed

    Nair, Rajesh; Rost, Burkhard

    2008-01-01

    Proteins perform many important tasks in living organisms, such as catalysis of biochemical reactions, transport of nutrients, and recognition and transmission of signals. The plethora of aspects of the role of any particular protein is referred to as its "function." One aspect of protein function that has been the target of intensive research by computational biologists is its subcellular localization. Proteins must be localized in the same subcellular compartment to cooperate toward a common physiological function. Aberrant subcellular localization of proteins can result in several diseases, including kidney stones, cancer, and Alzheimer's disease. To date, sequence homology remains the most widely used method for inferring the function of a protein. However, the application of advanced artificial intelligence (AI)-based techniques in recent years has resulted in significant improvements in our ability to predict the subcellular localization of a protein. The prediction accuracy has risen steadily over the years, in large part due to the application of AI-based methods such as hidden Markov models (HMMs), neural networks (NNs), and support vector machines (SVMs), although the availability of larger experimental datasets has also played a role. Automatic methods that mine textual information from the biological literature and molecular biology databases have considerably sped up the process of annotation for proteins for which some information regarding function is available in the literature. State-of-the-art methods based on NNs and HMMs can predict the presence of N-terminal sorting signals extremely accurately. Ab initio methods that predict subcellular localization for any protein sequence using only the native amino acid sequence and features predicted from the native sequence have shown the most remarkable improvements. The prediction accuracy of these methods has increased by over 30% in the past decade. 
The accuracy of these methods is now on par with high-throughput methods for predicting localization, and they are beginning to play an important role in directing experimental research. In this chapter, we review some of the most important methods for the prediction of subcellular localization.

  9. In vivo brain electrophoresis - a novel method for chemotherapy of CNS diseases.

    PubMed

    Ammirati, Mario; Lamki, Tariq; Chitnis, Girish; Yang, Xiangyu; Russell, Duncan; Coble, Dondrae; Kaur, Balveen; Knopp, Michael; Moore, Sarah; Ziaie, Babak

    2015-05-01

    The blood-brain barrier (BBB) is a protective mechanism that does its job superbly, so much so that brain chemotherapy has hitherto been limited by it. In fact, very few agents are effective against brain disease due to the inherent difficulties of penetrating the BBB. We describe a novel, extremely focused method for delivering drugs to specific diseased areas. This innovative method directly delivers putative substances to the pathological area, bypassing the BBB. Treatment of brain diseases could be improved by targeted, controlled delivery of therapeutic substances to diseased cerebral areas. Our described novel method - in vivo electrophoresis - achieves this. This technique was evaluated in beagles after craniotomy was performed and a custom-designed plate with electrodes inserted. The delivery of charged substances to selected areas with predictably guided movement was achieved via a created electrical field. Gadolinium, a compound unable to cross the BBB, was injected intracerebrally while an electrical field was created using the implanted electrodes surrounding the injection area. The electrical field-guided Gadolinium movement was evaluated using MRI. Gadolinium was moved predictably using the created electrical field without complications. The experiment successfully demonstrated controlled movement of the substance. This technique can significantly change treatment of brain diseases because substances: i) may be moved in a controlled, predictable way - exponentially increasing therapeutic interactions with the target; and ii) no longer need to conform to constraints dictated by the BBB (molecular mass < 500 d; lipophilic), thereby increasing the potential number of usable substances.

  10. A chemogenomic analysis of the human proteome: application to enzyme families.

    PubMed

    Bernasconi, Paul; Chen, Min; Galasinski, Scott; Popa-Burke, Ioana; Bobasheva, Anna; Coudurier, Louis; Birkos, Steve; Hallam, Rhonda; Janzen, William P

    2007-10-01

    Sequence-based phylogenies (SBP) are well-established tools for describing relationships between proteins. They have been used extensively to predict the behavior and sensitivity toward inhibitors of enzymes within a family. The utility of this approach diminishes when comparing proteins with little sequence homology. Even within an enzyme family, SBPs must be complemented by an orthogonal method that is independent of sequence to better predict enzymatic behavior. A chemogenomic approach is demonstrated here that uses the inhibition profile of a 130,000 diverse molecule library to uncover relationships within a set of enzymes. The profile is used to construct a semimetric additive distance matrix. This matrix, in turn, defines a sequence-independent phylogeny (SIP). The method was applied to 97 enzymes (kinases, proteases, and phosphatases). SIP does not use structural information from the molecules used for establishing the profile, thus providing a more heuristic method than the current approaches, which require knowledge of the specific inhibitor's structure. Within enzyme families, SIP shows a good overall correlation with SBP. More interestingly, SIP uncovers distances within families that are not recognizable by sequence-based methods. In addition, SIP allows the determination of distance between enzymes with no sequence homology, thus uncovering novel relationships not predicted by SBP. This chemogenomic approach, used in conjunction with SBP, should prove to be a powerful tool for choosing target combinations for drug discovery programs as well as for guiding the selection of profiling and liability targets.

  11. PL-PatchSurfer: a novel molecular local surface-based method for exploring protein-ligand interactions.

    PubMed

    Hu, Bingjie; Zhu, Xiaolei; Monroe, Lyman; Bures, Mark G; Kihara, Daisuke

    2014-08-27

    Structure-based computational methods have been widely used in exploring protein-ligand interactions, including predicting the binding ligands of a given protein based on their structural complementarity. Compared to other protein and ligand representations, the advantages of a surface representation include reduced sensitivity to subtle changes in the pocket and ligand conformation and fast search speed. Here we developed a novel method named PL-PatchSurfer (Protein-Ligand PatchSurfer). PL-PatchSurfer represents the protein binding pocket and the ligand molecular surface as a combination of segmented surface patches. Each patch is characterized by its geometrical shape and the electrostatic potential, which are represented using the 3D Zernike descriptor (3DZD). We first tested PL-PatchSurfer on binding ligand prediction and found it outperformed the pocket-similarity based ligand prediction program. We then optimized the search algorithm of PL-PatchSurfer using the PDBbind dataset. Finally, we explored the utility of applying PL-PatchSurfer to a larger and more diverse dataset and showed that PL-PatchSurfer was able to provide a high early enrichment for most of the targets. To the best of our knowledge, PL-PatchSurfer is the first surface patch-based method that treats ligand complementarity at protein binding sites. We believe that using a surface patch approach to better understand protein-ligand interactions has the potential to significantly enhance the design of new ligands for a wide array of drug-targets.

  12. PL-PatchSurfer: A Novel Molecular Local Surface-Based Method for Exploring Protein-Ligand Interactions

    PubMed Central

    Hu, Bingjie; Zhu, Xiaolei; Monroe, Lyman; Bures, Mark G.; Kihara, Daisuke

    2014-01-01

    Structure-based computational methods have been widely used in exploring protein-ligand interactions, including predicting the binding ligands of a given protein based on their structural complementarity. Compared to other protein and ligand representations, the advantages of a surface representation include reduced sensitivity to subtle changes in the pocket and ligand conformation and fast search speed. Here we developed a novel method named PL-PatchSurfer (Protein-Ligand PatchSurfer). PL-PatchSurfer represents the protein binding pocket and the ligand molecular surface as a combination of segmented surface patches. Each patch is characterized by its geometrical shape and the electrostatic potential, which are represented using the 3D Zernike descriptor (3DZD). We first tested PL-PatchSurfer on binding ligand prediction and found it outperformed the pocket-similarity based ligand prediction program. We then optimized the search algorithm of PL-PatchSurfer using the PDBbind dataset. Finally, we explored the utility of applying PL-PatchSurfer to a larger and more diverse dataset and showed that PL-PatchSurfer was able to provide a high early enrichment for most of the targets. To the best of our knowledge, PL-PatchSurfer is the first surface patch-based method that treats ligand complementarity at protein binding sites. We believe that using a surface patch approach to better understand protein-ligand interactions has the potential to significantly enhance the design of new ligands for a wide array of drug-targets. PMID:25167137

  13. Dual CRISPR-Cas9 Cleavage Mediated Gene Excision and Targeted Integration in Yarrowia lipolytica.

    PubMed

    Gao, Difeng; Smith, Spencer; Spagnuolo, Michael; Rodriguez, Gabriel; Blenner, Mark

    2018-05-29

    CRISPR-Cas9 technology has been successfully applied in Yarrowia lipolytica for targeted genomic editing including gene disruption and integration; however, disruptions by existing methods typically result from small frameshift mutations caused by indels within the coding region, which usually result in unnatural proteins. In this study, a dual cleavage strategy directed by paired sgRNAs is developed for gene knockout. This method allows fast and robust gene excision, demonstrated on six genes of interest. The targeted regions for excision vary in length from 0.3 kb up to 3.5 kb and contain both non-coding and coding regions. The majority of the gene excisions are repaired by perfect nonhomologous end-joining without indels. Based on this dual cleavage system, two targeted markerless integration methods are developed by providing repair templates. While both strategies are effective, the homology mediated end joining (HMEJ) based method is twice as efficient as the homology recombination (HR) based method. In both cases, dual cleavage leads to similar or improved gene integration efficiencies compared to gene excision without integration. This dual cleavage strategy will be useful not only for generating more predictable and robust gene knockouts, but also for efficient targeted markerless integration, and simultaneous knockout and integration in Y. lipolytica. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  14. A stereotaxic method of recording from single neurons in the intact in vivo eye of the cat.

    PubMed

    Molenaar, J; Van de Grind, W A

    1980-04-01

    A method is described for recording stereotaxically from single retinal neurons in the optically intact in vivo eye of the cat. The method is implemented with the help of a new type of stereotaxic instrument and a specially developed stereotaxic atlas of the cat's eye and retina. The instrument is extremely stable and facilitates intracellular recording from retinal neurons. The microelectrode can be rotated about two mutually perpendicular axes, which intersect in the freely positionable pivot point of the electrode manipulation system. When the pivot point is made to coincide with a small electrode-entrance hole in the sclera of the eye, a large retinal region can be reached through this fixed hole in the immobilized eye. The stereotaxic method makes it possible to choose a target point on the presented eye atlas and predict the settings of the instrument necessary to reach this target. This method also includes the prediction of the corresponding light stimulus position on a tangent screen and the calculation of the projection of the recording electrode on this screen. The sources of error in the method were studied experimentally and a numerical perturbation analysis was carried out to study the influence of each of the sources of error on the final result. The overall accuracy of the method is of the order of 5 degrees of visual angle, which will be sufficient for most purposes.

  15. Using genetic algorithms to optimize the analogue method for precipitation prediction in the Swiss Alps

    NASA Astrophysics Data System (ADS)

    Horton, Pascal; Jaboyedoff, Michel; Obled, Charles

    2018-01-01

    Analogue methods provide a statistical precipitation prediction based on synoptic predictors supplied by general circulation models or numerical weather prediction models. The method samples a selection of days in the archives that are similar to the target day to be predicted, and considers their set of corresponding observed precipitation (the predictand) as the conditional distribution for the target day. The relationship between the predictors and predictands relies on some parameters that characterize how and where the similarity between two atmospheric situations is defined. This relationship is usually established by a semi-automatic sequential procedure that has strong limitations: (i) it cannot automatically choose the pressure levels and temporal windows (hour of the day) for a given meteorological variable, (ii) it cannot handle dependencies between parameters, and (iii) it cannot easily handle new degrees of freedom. In this work, a global optimization approach relying on genetic algorithms could optimize all parameters jointly and automatically. The global optimization was applied to some variants of the analogue method for the Rhône catchment in the Swiss Alps. The performance scores increased compared to reference methods, especially for days with high precipitation totals. The resulting parameters were found to be relevant and coherent between the different subregions of the catchment. Moreover, they were obtained automatically and objectively, which reduces the effort that needs to be invested in exploration attempts when adapting the method to a new region or for a new predictand. For example, it obviates the need to assess a large number of combinations of pressure levels and temporal windows of predictor variables that were manually selected beforehand. The optimization could also take into account parameter inter-dependencies. In addition, the approach allowed for new degrees of freedom, such as a possible weighting between pressure levels, and non-overlapping spatial windows.
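
    The core sampling step of an analogue method can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Euclidean similarity criterion, the function name `analogue_forecast`, and the archive values are all assumptions made for the example.

```python
import math

def analogue_forecast(target, archive, n_analogues=3):
    """Return the precipitation observed on the n most similar archive days.

    target  : predictor values for the day to be predicted
    archive : list of (predictors, observed_precip) pairs (made-up data)
    """
    # Rank archive days by Euclidean distance between predictor vectors.
    # Real analogue methods often use other similarity criteria; Euclidean
    # distance is an assumption of this sketch.
    ranked = sorted(archive, key=lambda day: math.dist(day[0], target))
    # The precipitation observed on the closest days serves as the
    # empirical conditional distribution for the target day.
    return sorted(precip for _, precip in ranked[:n_analogues])

# Hypothetical archive: (pressure in hPa, temperature in deg C) -> precipitation in mm
archive = [
    ([1000.0, 5.0], 0.0),
    ([1002.0, 6.0], 1.5),
    ([990.0, 12.0], 20.0),
    ([1001.0, 5.5], 0.5),
    ([985.0, 14.0], 35.0),
]
sample = analogue_forecast([1001.0, 5.0], archive, n_analogues=3)
print(sample)
```

    The number of analogues and the choice of predictors are exactly the kind of parameters that the abstract describes optimizing jointly with a genetic algorithm.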

  16. Computational modeling of membrane proteins

    PubMed Central

    Leman, Julia Koehler; Ulmschneider, Martin B.; Gray, Jeffrey J.

    2014-01-01

    The determination of membrane protein (MP) structures has always trailed that of soluble proteins due to difficulties in their overexpression, reconstitution into membrane mimetics, and subsequent structure determination. The percentage of MP structures in the protein databank (PDB) has been at a constant 1-2% for the last decade. In contrast, over half of all drugs target MPs, only highlighting how little we understand about drug-specific effects in the human body. To reduce this gap, researchers have attempted to predict structural features of MPs even before the first structure was experimentally elucidated. In this review, we present current computational methods to predict MP structure, starting with secondary structure prediction, prediction of trans-membrane spans, and topology. Even though these methods generate reliable predictions, challenges such as predicting kinks or precise beginnings and ends of secondary structure elements are still waiting to be addressed. We describe recent developments in the prediction of 3D structures of both α-helical MPs as well as β-barrels using comparative modeling techniques, de novo methods, and molecular dynamics (MD) simulations. The increase of MP structures has (1) facilitated comparative modeling due to availability of more and better templates, and (2) improved the statistics for knowledge-based scoring functions. Moreover, de novo methods have benefitted from the use of correlated mutations as restraints. Finally, we outline current advances that will likely shape the field in the forthcoming decade. PMID:25355688

  17. An efficient hybrid technique in RCS predictions of complex targets at high frequencies

    NASA Astrophysics Data System (ADS)

    Algar, María-Jesús; Lozano, Lorena; Moreno, Javier; González, Iván; Cátedra, Felipe

    2017-09-01

    Most computer codes in Radar Cross Section (RCS) prediction use Physical Optics (PO) and Physical Theory of Diffraction (PTD) combined with Geometrical Optics (GO) and Geometrical Theory of Diffraction (GTD). The latter approaches are computationally cheaper and much more accurate for curved surfaces, but not applicable for the computation of the RCS of all surfaces of a complex object due to the presence of caustic problems in the analysis of concave surfaces or flat surfaces in the far field. The main contribution of this paper is the development of a hybrid method based on a new combination of two asymptotic techniques: GTD and PO, considering the advantages and avoiding the disadvantages of each of them. A very efficient and accurate method to analyze the RCS of complex structures at high frequencies is obtained with the new combination. The proposed new method has been validated by comparing RCS results obtained for some simple cases using the proposed approach with RCS results obtained using the rigorous Method of Moments (MoM). Some complex cases have been examined at high frequencies, contrasting the results with PO. This study shows the accuracy and the efficiency of the hybrid method and its suitability for the computation of the RCS of very large and complex targets at high frequencies.

  18. Spatiotemporal Analysis of Malaria in Urban Ahmedabad (Gujarat), India: Identification of Hot Spots and Risk Factors for Targeted Intervention

    PubMed Central

    Parizo, Justin; Sturrock, Hugh J. W.; Dhiman, Ramesh C.; Greenhouse, Bryan

    2016-01-01

    The world population, especially in developing countries, has experienced a rapid progression of urbanization over the last half century. Urbanization has been accompanied by a rise in cases of urban infectious diseases, such as malaria. The complexity and heterogeneity of the urban environment has made study of specific urban centers vital for urban malaria control programs, whereas more generalizable risk factor identification also remains essential. Ahmedabad city, India, is a large urban center located in the state of Gujarat, which has experienced a significant Plasmodium vivax and Plasmodium falciparum disease burden. Therefore, a targeted analysis of malaria in Ahmedabad city was undertaken to identify spatiotemporal patterns of malaria, risk factors, and methods of predicting future malaria cases. Malaria incidence in Ahmedabad city was found to be spatially heterogeneous, but temporally stable, with high spatial correlation between species. Because of this stability, a prediction method utilizing historic cases from prior years and seasons was used successfully to predict which areas of Ahmedabad city would experience the highest malaria burden and could be used to prospectively target interventions. Finally, spatial analysis showed that normalized difference vegetation index, proximity to water sources, and location within Ahmedabad city relative to the dense urban core were the best predictors of malaria incidence. Because of the heterogeneity of urban environments and urban malaria itself, the study of specific large urban centers is vital to assist in allocating resources and informing future urban planning. PMID:27382081

  19. In vivo dose measurement using TLDs and MOSFET dosimeters for cardiac radiosurgery.

    PubMed

    Gardner, Edward A; Sumanaweera, Thilaka S; Blanck, Oliver; Iwamura, Alyson K; Steel, James P; Dieterich, Sonja; Maguire, Patrick

    2012-05-10

    In vivo measurements were made of the dose delivered to animal models in an effort to develop a method for treating cardiac arrhythmia using radiation. This treatment would replace RF energy (currently used to create cardiac scar) with ionizing radiation. In the current study, the pulmonary vein ostia of animal models were irradiated with 6 MV X-rays in order to produce a scar that would block aberrant signals characteristic of atrial fibrillation. The CyberKnife radiosurgery system was used to deliver planned treatments of 20-35 Gy in a single fraction to four animals. The Synchrony system was used to track respiratory motion of the heart, while the contractile motion of the heart was untracked. The dose was measured on the epicardial surface near the right pulmonary vein and on the esophagus using surgically implanted TLD dosimeters, or in the coronary sinus using a MOSFET dosimeter placed using a catheter. The doses measured on the epicardium with TLDs averaged 5% less than predicted for those locations, while doses measured in the coronary sinus with the MOSFET sensor nearest the target averaged 6% less than the predicted dose. The measurements on the esophagus averaged 25% less than predicted. These results provide an indication of the accuracy with which the treatment planning methods accounted for the motion of the target, with its respiratory and cardiac components. This is the first report on the accuracy of CyberKnife dose delivery to cardiac targets.

  20. Identification of potential inhibitors based on compound proposal contest: Tyrosine-protein kinase Yes as a target.

    PubMed

    Chiba, Shuntaro; Ikeda, Kazuyoshi; Ishida, Takashi; Gromiha, M Michael; Taguchi, Y-H; Iwadate, Mitsuo; Umeyama, Hideaki; Hsin, Kun-Yi; Kitano, Hiroaki; Yamamoto, Kazuki; Sugaya, Nobuyoshi; Kato, Koya; Okuno, Tatsuya; Chikenji, George; Mochizuki, Masahiro; Yasuo, Nobuaki; Yoshino, Ryunosuke; Yanagisawa, Keisuke; Ban, Tomohiro; Teramoto, Reiji; Ramakrishnan, Chandrasekaran; Thangakani, A Mary; Velmurugan, D; Prathipati, Philip; Ito, Junichi; Tsuchiya, Yuko; Mizuguchi, Kenji; Honma, Teruki; Hirokawa, Takatsugu; Akiyama, Yutaka; Sekijima, Masakazu

    2015-11-26

    A search of a broader range of chemical space is important for drug discovery. Different methods of computer-aided drug discovery (CADD) are known to propose compounds in different chemical spaces as hit molecules for the same target protein. This study aimed at using multiple CADD methods through open innovation to achieve a level of hit molecule diversity that is not achievable with any particular single method. We held a compound proposal contest, in which multiple research groups participated and predicted inhibitors of tyrosine-protein kinase Yes. This showed whether collective knowledge based on individual approaches helped to obtain hit compounds from a broad range of chemical space and whether the contest-based approach was effective.

  1. Identification of potential inhibitors based on compound proposal contest: Tyrosine-protein kinase Yes as a target

    PubMed Central

    Chiba, Shuntaro; Ikeda, Kazuyoshi; Ishida, Takashi; Gromiha, M. Michael; Taguchi, Y-h.; Iwadate, Mitsuo; Umeyama, Hideaki; Hsin, Kun-Yi; Kitano, Hiroaki; Yamamoto, Kazuki; Sugaya, Nobuyoshi; Kato, Koya; Okuno, Tatsuya; Chikenji, George; Mochizuki, Masahiro; Yasuo, Nobuaki; Yoshino, Ryunosuke; Yanagisawa, Keisuke; Ban, Tomohiro; Teramoto, Reiji; Ramakrishnan, Chandrasekaran; Thangakani, A. Mary; Velmurugan, D.; Prathipati, Philip; Ito, Junichi; Tsuchiya, Yuko; Mizuguchi, Kenji; Honma, Teruki; Hirokawa, Takatsugu; Akiyama, Yutaka; Sekijima, Masakazu

    2015-01-01

    A search of a broader range of chemical space is important for drug discovery. Different methods of computer-aided drug discovery (CADD) are known to propose compounds in different chemical spaces as hit molecules for the same target protein. This study aimed at using multiple CADD methods through open innovation to achieve a level of hit molecule diversity that is not achievable with any particular single method. We held a compound proposal contest, in which multiple research groups participated and predicted inhibitors of tyrosine-protein kinase Yes. This showed whether collective knowledge based on individual approaches helped to obtain hit compounds from a broad range of chemical space and whether the contest-based approach was effective. PMID:26607293

  2. A critique of the molecular target-based drug discovery paradigm based on principles of metabolic control: advantages of pathway-based discovery.

    PubMed

    Hellerstein, Marc K

    2008-01-01

    Contemporary drug discovery and development (DDD) is dominated by a molecular target-based paradigm. Molecular targets that are potentially important in disease are physically characterized; chemical entities that interact with these targets are identified by ex vivo high-throughput screening assays, and optimized lead compounds enter testing as drugs. Contrary to highly publicized claims, the ascendance of this approach has in fact resulted in the lowest rate of new drug approvals in a generation. The primary explanation for low rates of new drugs is attrition, or the failure of candidates identified by molecular target-based methods to advance successfully through the DDD process. In this essay, I advance the thesis that this failure was predictable, based on modern principles of metabolic control that have emerged and been applied most forcefully in the field of metabolic engineering. These principles, such as the robustness of flux distributions, address connectivity relationships in complex metabolic networks and make it unlikely a priori that modulating most molecular targets will have predictable, beneficial functional outcomes. These same principles also suggest, however, that unexpected therapeutic actions will be common for agents that have any effect (i.e., that complexity can be exploited therapeutically). A potential operational solution (pathway-based DDD), based on observability rather than predictability, is described, focusing on emergent properties of key metabolic pathways in vivo. Recent examples of pathway-based DDD are described. In summary, the molecular target-based DDD paradigm is built on a naïve and misleading model of biologic control and is not heuristically adequate for advancing the mission of modern therapeutics. New approaches that take account of and are built on principles described by metabolic engineers are needed for the next generation of DDD.

  3. Conflict Adaptation and Cue Competition during Learning in an Eriksen Flanker Task

    PubMed Central

    Ghinescu, Rodica; Ramsey, Ashley K.; Gratton, Gabriele; Fabiani, Monica

    2016-01-01

    Two experiments investigated competition between cues that predicted the correct target response to a target stimulus in a response conflict procedure using a flanker task. Subjects received trials with five-character arrays with a central target character and distractor flanker characters that matched (compatible) or did not match (incompatible) the central target. Subjects’ expectancies for compatible and incompatible trials were manipulated by presenting pre-trial cues that signaled the occurrence of compatible or incompatible trials. On some trials, a single cue predicted the target stimulus and the required target response. On other trials, a second, redundant predictive cue was also present. The results showed an effect of competition between cues for control over strategic responding to the target stimuli, a finding that is predicted by associative learning theories. The finding of competition between pre-trial cues that predict incompatible trials, but not cues that predict compatible trials, suggests that different strategic processes may occur during adaptation to conflict when different kinds of trials are expected. PMID:27941977

  4. GalaxyGPCRloop: Template-Based and Ab Initio Structure Sampling of the Extracellular Loops of G-Protein-Coupled Receptors.

    PubMed

    Won, Jonghun; Lee, Gyu Rie; Park, Hahnbeom; Seok, Chaok

    2018-06-07

    The second extracellular loops (ECL2s) of G-protein-coupled receptors (GPCRs) are often involved in GPCR functions, and their structures have important implications in drug discovery. However, structure prediction of ECL2 is difficult because of its long length and the structural diversity among different GPCRs. In this study, a new ECL2 conformational sampling method involving both template-based and ab initio sampling was developed. Inspired by the observation of similar ECL2 structures of closely related GPCRs, a template-based sampling method employing loop structure templates selected from the structure database was developed. A new metric for evaluating similarity of the target loop to templates was introduced for template selection. An ab initio loop sampling method was also developed to treat cases without highly similar templates. The ab initio method is based on the previously developed fragment assembly and loop closure method. A new sampling component that takes advantage of secondary structure prediction was added. In addition, a conserved disulfide bridge restraining ECL2 conformation was predicted and analytically incorporated into sampling, reducing the effective dimension of the conformational search space. The sampling method was combined with an existing energy function for comparison with previously reported loop structure prediction methods, and the benchmark test demonstrated outstanding performance.

  5. Analysis of Artificial Neural Network Backpropagation Using Conjugate Gradient Fletcher Reeves In The Predicting Process

    NASA Astrophysics Data System (ADS)

    Wanto, Anjar; Zarlis, Muhammad; Sawaluddin; Hartama, Dedy

    2017-12-01

    Backpropagation is an artificial neural network training algorithm that is well suited to prediction tasks, such as predicting the Consumer Price Index (CPI) for the foodstuff sector. Conjugate gradient Fletcher-Reeves is a suitable optimization method to pair with backpropagation because it can shorten the iterations without reducing the quality of the training and testing results. The CPI data to be predicted come from the Central Statistics Agency (BPS) of Pematangsiantar. The results of this study are expected to help the government make policies to improve economic growth. In this study, the data are processed by training and testing a backpropagation neural network with a learning rate of 0.01 and a minimum target error of 0.001-0.09. The network is built with binary and bipolar sigmoid activation functions. After the backpropagation results are obtained, the network is optimized using the conjugate gradient Fletcher-Reeves method, with the same training and testing carried out on 5 predefined network architectures. The results show that the method can increase both speed and accuracy.
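
    The Fletcher-Reeves update itself can be illustrated on a toy problem. The sketch below minimizes a small quadratic function rather than a neural network loss, so it is not the study's training setup; in network training the gradient would come from backpropagation and the step length from a line search. The matrix `A`, vector `b`, and starting point are made-up values.

```python
def fletcher_reeves(A, b, x, iters=10, tol=1e-12):
    """Minimize 0.5*x^T A x - b^T x with Fletcher-Reeves conjugate gradient."""
    def matvec(M, v):
        return [sum(m * vi for m, vi in zip(row, v)) for row in M]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    g = [axi - bi for axi, bi in zip(matvec(A, x), b)]  # gradient: A x - b
    d = [-gi for gi in g]                               # first direction: steepest descent
    for _ in range(iters):
        gg = dot(g, g)
        if gg < tol:                                    # squared gradient norm small enough
            break
        Ad = matvec(A, d)
        alpha = -dot(g, d) / dot(d, Ad)                 # exact line search (quadratic only)
        x = [xi + alpha * di for xi, di in zip(x, d)]
        g_new = [gi + alpha * adi for gi, adi in zip(g, Ad)]
        beta = dot(g_new, g_new) / gg                   # Fletcher-Reeves coefficient
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
    return x

A = [[4.0, 1.0], [1.0, 3.0]]   # symmetric positive definite (made-up)
b = [1.0, 2.0]
x_min = fletcher_reeves(A, b, [2.0, 1.0])
print(x_min)
```

    The coefficient `beta` is the ratio of the new to the old squared gradient norm. Reusing the previous direction in this way is what lets conjugate gradient methods need fewer iterations than plain gradient descent.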

  6. Quantitative and Systems Pharmacology. 1. In Silico Prediction of Drug-Target Interactions of Natural Products Enables New Targeted Cancer Therapy.

    PubMed

    Fang, Jiansong; Wu, Zengrui; Cai, Chuipu; Wang, Qi; Tang, Yun; Cheng, Feixiong

    2017-11-27

    Natural products with diverse chemical scaffolds have been recognized as an invaluable source of compounds in drug discovery and development. However, systematic identification of drug targets for natural products at the human proteome level via various experimental assays is highly expensive and time-consuming. In this study, we proposed a systems pharmacology infrastructure to predict new drug targets and anticancer indications of natural products. Specifically, we reconstructed a global drug-target network with 7,314 interactions connecting 751 targets and 2,388 natural products and built predictive network models via a balanced substructure-drug-target network-based inference approach. A high area under the receiver operating characteristic curve of 0.96 was obtained for predicting new targets of natural products during cross-validation. The newly predicted targets of natural products (e.g., resveratrol, genistein, and kaempferol) with high scores were validated by various literature studies. We further built the statistical network models for identification of new anticancer indications of natural products through integration of both experimentally validated and computationally predicted drug-target interactions of natural products with known cancer proteins. We showed that the significantly predicted anticancer indications of multiple natural products (e.g., naringenin, disulfiram, and metformin) with new mechanisms of action were validated by various published experimental evidence. In summary, this study offers powerful computational systems pharmacology approaches and tools for the development of novel targeted cancer therapies by exploiting the polypharmacology of natural products.

  7. Ensemble method for dengue prediction.

    PubMed

    Buczak, Anna L; Baugher, Benjamin; Moniz, Linda J; Bagley, Thomas; Babin, Steven M; Guven, Erhan

    2018-01-01

    In the 2015 NOAA Dengue Challenge, participants made three dengue target predictions for two locations (Iquitos, Peru, and San Juan, Puerto Rico) during four dengue seasons: 1) peak height (i.e., maximum weekly number of cases during a transmission season); 2) peak week (i.e., week in which the maximum weekly number of cases occurred); and 3) total number of cases reported during a transmission season. A dengue transmission season is the 12-month period commencing with the location-specific, historical week with the lowest number of cases. At the beginning of the Dengue Challenge, participants were provided with the same input data for developing the models, with the prediction testing data provided at a later date. Our approach used ensemble models created by combining three disparate types of component models: 1) two-dimensional Method of Analogues models incorporating both dengue and climate data; 2) additive seasonal Holt-Winters models with and without wavelet smoothing; and 3) simple historical models. Of the individual component models created, those with the best performance on the prior four years of data were incorporated into the ensemble models. There were separate ensembles for predicting each of the three targets at each of the two locations. Our ensemble models scored higher for peak height and total dengue case counts reported in a transmission season for Iquitos than all other models submitted to the Dengue Challenge. However, the ensemble models did not do nearly as well when predicting the peak week. The Dengue Challenge organizers scored the dengue predictions of the Challenge participant groups. Our ensemble approach was the best in predicting the total number of dengue cases reported for a transmission season and peak height for Iquitos, Peru.
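
    The three prediction targets defined above can be computed from a weekly case series as in the following sketch. The case counts are made up, and the equal-weight averaging in `ensemble_mean` is an assumption for illustration only; the abstract does not state how the component models were combined.

```python
def season_targets(weekly_cases):
    """Return (peak_height, peak_week, total_cases) for one transmission season.

    weekly_cases: weekly case counts, index 0 = first week of the season.
    Weeks are reported 1-based; ties go to the earliest week (an assumption).
    """
    peak_height = max(weekly_cases)
    peak_week = weekly_cases.index(peak_height) + 1
    return peak_height, peak_week, sum(weekly_cases)

def ensemble_mean(predictions):
    """Average component-model predictions (equal weights are an assumption)."""
    return sum(predictions) / len(predictions)

cases = [2, 5, 9, 14, 30, 22, 11, 4]       # made-up weekly counts
targets = season_targets(cases)
combined = ensemble_mean([28, 32, 30])     # made-up peak-height predictions
print(targets, combined)
```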

  8. Identification and characterization of microRNAs and their target genes from Nile tilapia (Oreochromis niloticus).

    PubMed

    Huang, Yong; Ma, Xiu Ying; Yang, You Bing; Ren, Hong Tao; Sun, Xi Hong; Wang, Li Rui

    MicroRNAs (miRNAs) are a class of small single-stranded, endogenous 21-22 nt non-coding RNAs that regulate their target mRNA levels by causing either inactivation or degradation of the mRNAs. In recent years, miRNA genes have been identified from mammals, insects, worms, plants, and viruses. In this research, bioinformatics approaches were used to predict potential miRNAs and their targets in Nile tilapia from the expressed sequence tag (EST) and genomic survey sequence (GSS) databases, respectively, based on the conservation of miRNAs in many animal species. A total of 19 potential miRNAs were detected following a range of strict filtering criteria. To test the validity of the bioinformatics method, seven predicted Nile tilapia miRNA genes were selected for further biological validation, and their mature miRNA transcripts were successfully detected by stem-loop RT-PCR experiments. Using these potential miRNAs, we found 56 potential targets in this species. Most of the target mRNAs appear to be involved in development, metabolism, signal transduction, transcription regulation and stress responses. Overall, our findings will provide an important foundation for further research on miRNA function in the Nile tilapia.

  9. Relationship between the Prediction Accuracy of Tsunami Inundation and Relative Distribution of Tsunami Source and Observation Arrays: A Case Study in Tokyo Bay

    NASA Astrophysics Data System (ADS)

    Takagawa, T.

    2017-12-01

    Rapid and precise tsunami forecasting based on offshore monitoring is attracting attention as a way to reduce human losses due to devastating tsunami inundation. We developed a forecast method based on the combination of hierarchical Bayesian inversion with a pre-computed database and rapid post-computing of tsunami inundation. The method was applied to Tokyo Bay to evaluate the efficiency of observation arrays against three tsunamigenic earthquakes. One is a scenario earthquake at the Nankai Trough and the other two are the historic earthquakes of Genroku in 1703 and Enpo in 1677. In general, a rich observation array near the tsunami source has an advantage in both the accuracy and the speed of the tsunami forecast. To examine the effect of observation time length, we used four types of data with lengths of 5, 10, 20 and 45 minutes after the earthquake occurrences. Prediction accuracy of tsunami inundation was evaluated by the simulated tsunami inundation areas around Tokyo Bay due to the target earthquakes. The shortest time length for accurate prediction varied with the target earthquake. Here, accurate prediction means that the simulated values fall within the 95% credible intervals of the prediction. In the Enpo earthquake case, 5 minutes of observation is enough for accurate prediction for Tokyo Bay, but 10 minutes and 45 minutes are needed in the cases of the Nankai Trough and Genroku, respectively. The difference in the shortest time length for accurate prediction shows a strong relationship with the relative distance between the tsunami source and the observation arrays. In the Enpo case, offshore tsunami observation points are densely distributed even in the source region, so accurate prediction can be achieved within 5 minutes. This precise prediction is useful for early warnings. Even in the worst case of Genroku, where fewer observation points are available near the source, accurate prediction can be obtained within 45 minutes. This information can be useful for outlining the hazard at an early stage of the response.
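
    The accuracy criterion used above (a value counts as accurately predicted when it falls within the 95% credible interval of the prediction) can be sketched as follows. The equal-tailed, rank-based interval and the posterior samples are assumptions made for illustration; the abstract does not specify how the intervals were computed.

```python
import math

def credible_interval(samples, level=0.95):
    """Equal-tailed credible interval taken from sorted posterior samples."""
    ordered = sorted(samples)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lower = ordered[math.floor(tail * (n - 1))]
    upper = ordered[math.ceil((1.0 - tail) * (n - 1))]
    return lower, upper

def is_accurate(reference_value, samples, level=0.95):
    """True when the reference value lies inside the credible interval."""
    lower, upper = credible_interval(samples, level)
    return lower <= reference_value <= upper

# Made-up posterior samples of an inundation area (arbitrary units).
posterior_samples = [float(v) for v in range(101)]
print(is_accurate(50.0, posterior_samples), is_accurate(99.5, posterior_samples))
```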

  10. Combined target factor analysis and Bayesian soft-classification of interference-contaminated samples: forensic fire debris analysis.

    PubMed

    Williams, Mary R; Sigman, Michael E; Lewis, Jennifer; Pitan, Kelly McHugh

    2012-10-10

    A Bayesian soft classification method combined with target factor analysis (TFA) is described and tested for the analysis of fire debris data. The method relies on analysis of the average mass spectrum across the chromatographic profile (i.e., the total ion spectrum, TIS) from multiple samples taken from a single fire scene. A library of TIS from reference ignitable liquids with assigned ASTM classification is used as the target factors in TFA. The class-conditional distributions of correlations between the target and predicted factors for each ASTM class are represented by kernel functions and analyzed by Bayesian decision theory. The soft classification approach assists in assessing the probability that ignitable liquid residue from a specific ASTM E1618 class is present in a set of samples from a single fire scene, even in the presence of unspecified background contributions from pyrolysis products. The method is demonstrated with sample data sets and then tested on laboratory-scale burn data and large-scale field test burns. The overall performance achieved in laboratory and field test of the method is approximately 80% correct classification of fire debris samples. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
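
    The idea of representing class-conditional distributions of correlations by kernel functions and combining them through Bayes' rule can be sketched as below. The Gaussian kernel, the bandwidth, the class labels, the priors, and the correlation values are all assumptions for illustration and are not taken from the study.

```python
import math

def kde(x, points, bandwidth=0.05):
    """Gaussian kernel density estimate at x from 1-D training points."""
    norm = len(points) * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points) / norm

def posterior(x, class_points, priors):
    """Posterior class probabilities for an observed correlation x."""
    joint = {c: priors[c] * kde(x, pts) for c, pts in class_points.items()}
    total = sum(joint.values())
    return {c: value / total for c, value in joint.items()}

# Made-up correlations between target and predicted factors for two cases.
class_points = {
    "present": [0.85, 0.90, 0.95],
    "absent": [0.30, 0.40, 0.50],
}
priors = {"present": 0.5, "absent": 0.5}
probs = posterior(0.88, class_points, priors)
print(probs)
```

    The output is a probability for each class rather than a hard label, which is what makes the classification "soft".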

  11. In silico prediction of novel therapeutic targets using gene-disease association data.

    PubMed

    Ferrero, Enrico; Dunham, Ian; Sanseau, Philippe

    2017-08-29

    Target identification and validation is a pressing challenge in the pharmaceutical industry, with many of the programmes that fail for efficacy reasons showing poor association between the drug target and the disease. Computational prediction of successful targets could have a considerable impact on attrition rates in the drug discovery pipeline by significantly reducing the initial search space. Here, we explore whether gene-disease association data from the Open Targets platform is sufficient to predict therapeutic targets that are actively being pursued by pharmaceutical companies or are already on the market. To test our hypothesis, we train four different classifiers (a random forest, a support vector machine, a neural network and a gradient boosting machine) on partially labelled data and evaluate their performance using nested cross-validation and testing on an independent set. We then select the best performing model and use it to make predictions on more than 15,000 genes. Finally, we validate our predictions by mining the scientific literature for proposed therapeutic targets. We observe that the data types with the best predictive power are animal models showing a disease-relevant phenotype, differential expression in diseased tissue and genetic association with the disease under investigation. On a test set, the neural network classifier achieves over 71% accuracy with an AUC of 0.76 when predicting therapeutic targets in a semi-supervised learning setting. We use this model to gain insights into current and failed programmes and to predict 1431 novel targets, of which a highly significant proportion has been independently proposed in the literature. Our in silico approach shows that data linking genes and diseases is sufficient to predict novel therapeutic targets effectively and confirms that this type of evidence is essential for formulating or strengthening hypotheses in the target discovery process. Ultimately, more rapid and automated target prioritisation holds the potential to reduce both the costs and the development times associated with bringing new medicines to patients.

  12. Using iRT, a normalized retention time for more targeted measurement of peptides

    PubMed Central

    Escher, Claudia; Reiter, Lukas; MacLean, Brendan; Ossola, Reto; Herzog, Franz; Chilton, John; MacCoss, Michael J.; Rinner, Oliver

    2014-01-01

    Multiple reaction monitoring (MRM) has recently become the method of choice for targeted quantitative measurement of proteins using mass spectrometry. The method, however, is limited in the number of peptides that can be measured in one run. This number can be markedly increased by scheduling the acquisition if the accurate retention time (RT) of each peptide is known. Here we present iRT, an empirically derived dimensionless peptide-specific value that allows for highly accurate RT prediction. The iRT of a peptide is a fixed number relative to a standard set of reference iRT-peptides that can be transferred across laboratories and chromatographic systems. We show that iRT facilitates the setup of multiplexed experiments with acquisition windows more than 4 times smaller compared to in silico RT predictions resulting in improved quantification accuracy. iRTs can be determined by any laboratory and shared transparently. The iRT concept has been implemented in Skyline, the most widely used software for MRM experiments. PMID:22577012
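
    Transferring iRT values to a particular chromatographic system amounts to a calibration against the reference peptides. The sketch below assumes a simple least-squares line between iRT and measured retention time; the reference values, retention times, and window half-width are made up, and this is not Skyline's implementation.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def acquisition_window(irt, slope, intercept, half_width=1.0):
    """Predicted RT (minutes) and a scheduling window around it."""
    rt = slope * irt + intercept
    return rt - half_width, rt, rt + half_width

# Made-up calibration run: iRT of reference peptides vs. measured RT in minutes.
reference_irt = [0.0, 25.0, 50.0, 100.0]
measured_rt = [5.0, 10.0, 15.0, 25.0]
slope, intercept = fit_line(reference_irt, measured_rt)
start, predicted_rt, end = acquisition_window(75.0, slope, intercept)
print(predicted_rt, (start, end))
```

    Because the iRT of a peptide is fixed, only the calibration line has to be re-fitted when the gradient or the instrument changes.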

  13. A continuous function model for path prediction of entities

    NASA Astrophysics Data System (ADS)

    Nanda, S.; Pray, R.

    2007-04-01

    As militaries across the world continue to evolve, the roles of humans in various theatres of operation are being increasingly targeted by military planners for substitution with automation. Forward observation and direction of supporting arms to neutralize threats from dynamic adversaries is one such example. However, contemporary tracking and targeting systems are incapable of serving autonomously for they do not embody the sophisticated algorithms necessary to predict the future positions of adversaries with the accuracy offered by the cognitive and analytical abilities of human operators. The need for these systems to incorporate methods characterizing such intelligence is therefore compelling. In this paper, we present a novel technique to achieve this goal by modeling the path of an entity as a continuous polynomial function of multiple variables expressed as a Taylor series with a finite number of terms. We demonstrate the method for evaluating the coefficient of each term to define this function unambiguously for any given entity, and illustrate its use to determine the entity's position at any point in time in the future.
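
    A truncated Taylor series can be used for extrapolation as in the following sketch. It is a deliberate simplification of the multi-variable model described above: each coordinate is treated as a function of time only, the series is cut at second order, and the derivatives are estimated by finite differences from three equally spaced observations. The function name and sample values are hypothetical.

```python
def taylor_predict(p0, p1, p2, dt, horizon):
    """Second-order Taylor extrapolation from three equally spaced samples.

    p0, p1, p2 : positions at times t - 2*dt, t - dt and t (oldest first)
    horizon    : how far beyond t to predict, in the same time units as dt
    """
    velocity = (3.0 * p2 - 4.0 * p1 + p0) / (2.0 * dt)  # backward-difference first derivative at t
    accel = (p2 - 2.0 * p1 + p0) / dt ** 2              # second-derivative estimate
    return p2 + velocity * horizon + 0.5 * accel * horizon ** 2

# Made-up track: x(t) = t**2 and y(t) = 3*t + 1, observed at t = 0, 1, 2.
x_next = taylor_predict(0.0, 1.0, 4.0, dt=1.0, horizon=1.0)
y_next = taylor_predict(1.0, 4.0, 7.0, dt=1.0, horizon=1.0)
print(x_next, y_next)
```

    For paths that really are polynomials of degree two or lower, as in this example, the extrapolation is exact. For other paths the error grows with the prediction horizon.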

  14. Prediction of miRNA-mRNA associations in Alzheimer's disease mice using network topology.

    PubMed

    Noh, Haneul; Park, Charny; Park, Soojun; Lee, Young Seek; Cho, Soo Young; Seo, Hyemyung

    2014-08-03

    Little is known about the relationship between miRNA and mRNA expression in Alzheimer's disease (AD) at early- or late-symptomatic stages. Sequence-based target prediction algorithms and anti-correlation profiles have been applied to predict miRNA targets using omics data, but this approach often leads to false positive predictions. Here, we applied the joint profiling analysis of mRNA and miRNA expression levels to Tg6799 AD model mice at 4 and 8 months of age using a network topology-based method. We constructed gene regulatory networks and used the PageRank algorithm to predict significant interactions between miRNA and mRNA. In total, 8 cluster modules were predicted by the transcriptome data for co-expression networks of AD pathology. In total, 54 miRNAs were identified as being differentially expressed in AD. Among these, 50 significant miRNA-mRNA interactions were predicted by integrating sequence target prediction, expression analysis, and the PageRank algorithm. We identified a set of miRNA-mRNA interactions that were changed in the hippocampus of Tg6799 AD model mice. We determined the expression levels of several candidate genes and miRNA. For functional validation in primary cultured neurons from Tg6799 mice (MT) and littermate (LM) controls, the overexpression of ARRDC3 enhanced PPP1R3C expression. ARRDC3 overexpression showed the tendency to decrease the expression of miR139-5p and miR3470a in both LM and MT primary cells. Pathological environment created by Aβ treatment increased the gene expression of PPP1R3C and Sfpq but did not significantly alter the expression of miR139-5p or miR3470a. Aβ treatment increased the promoter activity of ARRDC3 gene in LM primary cells but not in MT primary cells. Our results demonstrate AD-specific changes in the miRNA regulatory system as well as the relationship between the expression levels of miRNAs and their targets in the hippocampus of Tg6799 mice. 
    These data help further our understanding of the function and mechanism of various miRNAs and their target genes in the molecular pathology of AD.
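
    For readers unfamiliar with PageRank, the following is a generic power-iteration sketch on a small directed graph. The abstract does not describe how the regulatory network was built or how PageRank scores were turned into predicted interactions, so the node names, edges and damping factor below are purely hypothetical.

```python
# Minimal sketch: PageRank by power iteration on a directed edge list.

def pagerank(edges, damping=0.85, iterations=100):
    """Return a dict mapping each node to its PageRank score."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)

    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / len(nodes) for node in nodes}
        for source in nodes:
            # A node without outgoing edges spreads its score over all nodes.
            targets = out_links[source] or nodes
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical toy network: two miRNAs pointing at two genes.
edges = [
    ("miR-A", "gene1"),
    ("miR-B", "gene1"),
    ("miR-B", "gene2"),
    ("gene1", "gene2"),
]
scores = pagerank(edges)
for node, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(node, round(score, 4))
```

    Nodes that receive score from many or highly ranked neighbours end up with higher values, which is the property a topology-based ranking relies on.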

  15. Development of companion diagnostics

    DOE PAGES

    Mankoff, David A.; Edmonds, Christine E.; Farwell, Michael D.; ...

    2015-12-12

    The goal of individualized and targeted treatment and precision medicine requires the assessment of potential therapeutic targets to direct treatment selection. The biomarkers used to direct precision medicine, often termed companion diagnostics, for highly targeted drugs have thus far been almost entirely based on in vitro assay of biopsy material. Molecular imaging companion diagnostics offer a number of features complementary to those from in vitro assay, including the ability to measure the heterogeneity of each patient’s cancer across the entire disease burden and to measure early changes in response to treatment. We discuss the use of molecular imaging methods as companion diagnostics for cancer therapy with the goal of predicting response to targeted therapy and measuring early (pharmacodynamic) response as an indication of whether the treatment has “hit” the target. We also discuss considerations for probe development for molecular imaging companion diagnostics, including both small-molecule probes and larger molecules such as labeled antibodies and related constructs. We then describe two examples where both predictive and pharmacodynamic molecular imaging markers have been tested in humans: endocrine therapy for breast cancer and human epidermal growth factor receptor type 2–targeted therapy. Lastly, the review closes with a summary of the items needed to move molecular imaging companion diagnostics from early studies into multicenter trials and into the clinic.

  16. Development of companion diagnostics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mankoff, David A.; Edmonds, Christine E.; Farwell, Michael D.

    The goal of individualized and targeted treatment and precision medicine requires the assessment of potential therapeutic targets to direct treatment selection. The biomarkers used to direct precision medicine, often termed companion diagnostics, for highly targeted drugs have thus far been almost entirely based on in vitro assay of biopsy material. Molecular imaging companion diagnostics offer a number of features complementary to those from in vitro assay, including the ability to measure the heterogeneity of each patient’s cancer across the entire disease burden and to measure early changes in response to treatment. We discuss the use of molecular imaging methods as companion diagnostics for cancer therapy with the goal of predicting response to targeted therapy and measuring early (pharmacodynamic) response as an indication of whether the treatment has “hit” the target. We also discuss considerations for probe development for molecular imaging companion diagnostics, including both small-molecule probes and larger molecules such as labeled antibodies and related constructs. We then describe two examples where both predictive and pharmacodynamic molecular imaging markers have been tested in humans: endocrine therapy for breast cancer and human epidermal growth factor receptor type 2–targeted therapy. Lastly, the review closes with a summary of the items needed to move molecular imaging companion diagnostics from early studies into multicenter trials and into the clinic.

  17. Development of Companion Diagnostics

    PubMed Central

    Mankoff, David A.; Edmonds, Christine E.; Farwell, Michael D.; Pryma, Daniel A.

    2016-01-01

    The goal of individualized and targeted treatment and precision medicine requires the assessment of potential therapeutic targets to direct treatment selection. The biomarkers used to direct precision medicine, often termed companion diagnostics, for highly targeted drugs have thus far been almost entirely based on in vitro assay of biopsy material. Molecular imaging companion diagnostics offer a number of features complementary to those from in vitro assay, including the ability to measure the heterogeneity of each patient’s cancer across the entire disease burden and to measure early changes in response to treatment. We discuss the use of molecular imaging methods as companion diagnostics for cancer therapy with the goal of predicting response to targeted therapy and measuring early (pharmacodynamic) response as an indication of whether the treatment has “hit” the target. We also discuss considerations for probe development for molecular imaging companion diagnostics, including both small-molecule probes and larger molecules such as labeled antibodies and related constructs. We then describe two examples where both predictive and pharmacodynamic molecular imaging markers have been tested in humans: endocrine therapy for breast cancer and human epidermal growth factor receptor type 2–targeted therapy. The review closes with a summary of the items needed to move molecular imaging companion diagnostics from early studies into multicenter trials and into the clinic. PMID:26687857

  18. Predict subcellular locations of singleplex and multiplex proteins by semi-supervised learning and dimension-reducing general mode of Chou's PseAAC.

    PubMed

    Pacharawongsakda, Eakasit; Theeramunkong, Thanaruk

    2013-12-01

    Predicting protein subcellular location is one of the major challenges in bioinformatics, since such knowledge helps us understand protein functions and enables us to select the targeted proteins during the drug discovery process. While many computational techniques have been proposed to improve predictive performance for protein subcellular location, they have several shortcomings. In this work, we propose a method to solve three main issues in such techniques: i) manipulation of multiplex proteins which may exist or move between multiple cellular compartments, ii) handling of high dimensionality in input and output spaces and iii) requirement of sufficient labeled data for model training. To address these issues, this work presents a new computational method for predicting proteins which have either single or multiple locations. The proposed technique, namely iFLAST-CORE, incorporates dimensionality reduction in the feature and label spaces with a co-training paradigm for semi-supervised multi-label classification. For this purpose, the Singular Value Decomposition (SVD) is applied to transform the high-dimensional feature space and label space into lower-dimensional spaces. After that, because labeled data are limited, the co-training regression makes use of unlabeled data by predicting the target values in the lower-dimensional spaces of unlabeled data. In the last step, the component of SVD is used to project labels in the lower-dimensional space back to those in the original space and an adaptive threshold is used to map a numeric value to a binary value for label determination. A set of experiments on viral proteins and gram-negative bacterial proteins shows that our proposed method improves the classification performance in terms of various evaluation metrics such as Aiming (or Precision), Coverage (or Recall) and macro F-measure, compared to the traditional method that uses only labeled data.

  19. Prediction of Protein-Protein Interaction Sites with Machine-Learning-Based Data-Cleaning and Post-Filtering Procedures.

    PubMed

    Liu, Guang-Hui; Shen, Hong-Bin; Yu, Dong-Jun

    2016-04-01

    Accurately predicting protein-protein interaction sites (PPIs) is currently a hot topic because it has been demonstrated to be very useful for understanding disease mechanisms and designing drugs. Machine-learning-based computational approaches have been broadly utilized and demonstrated to be useful for PPI prediction. However, directly applying traditional machine learning algorithms, which often assume that samples in different classes are balanced, often leads to poor performance because of the severe class imbalance that exists in the PPI prediction problem. In this study, we propose a novel method for improving PPI prediction performance by relieving the severity of class imbalance using a data-cleaning procedure and reducing predicted false positives with a post-filtering procedure: First, a machine-learning-based data-cleaning procedure is applied to remove those marginal targets, which may potentially have a negative effect on training a model with a clear classification boundary, from the majority samples to relieve the severity of class imbalance in the original training dataset; then, a prediction model is trained on the cleaned dataset; finally, an effective post-filtering procedure is further used to reduce potential false positive predictions. Stringent cross-validation and independent validation tests on benchmark datasets demonstrated the efficacy of the proposed method, which exhibits highly competitive performance compared with existing state-of-the-art sequence-based PPIs predictors and should supplement existing PPI prediction methods.

  20. Improved prediction of peptide detectability for targeted proteomics using a rank-based algorithm and organism-specific data.

    PubMed

    Qeli, Ermir; Omasits, Ulrich; Goetze, Sandra; Stekhoven, Daniel J; Frey, Juerg E; Basler, Konrad; Wollscheid, Bernd; Brunner, Erich; Ahrens, Christian H

    2014-08-28

    The in silico prediction of the best-observable "proteotypic" peptides in mass spectrometry-based workflows is a challenging problem. Being able to accurately predict such peptides would enable the informed selection of proteotypic peptides for targeted quantification of previously observed and non-observed proteins for any organism, with a significant impact on clinical proteomics and systems biology studies. Current prediction algorithms rely on physicochemical parameters in combination with positive and negative training sets to identify those peptide properties that most profoundly affect their general detectability. Here we present PeptideRank, an approach that uses a learning-to-rank algorithm for peptide detectability prediction from shotgun proteomics data, and that eliminates the need to select a negative dataset for the training step. A large number of different peptide properties are used to train ranking models in order to predict a ranking of the best-observable peptides within a protein. Empirical evaluation with rank accuracy metrics showed that PeptideRank complements existing prediction algorithms. Our results indicate that the best performance is achieved when it is trained on organism-specific shotgun proteomics data, and that PeptideRank is most accurate for short to medium-sized and abundant proteins, without any loss in prediction accuracy for the important class of membrane proteins. Targeted proteomics approaches have been gaining a lot of momentum and hold immense potential for systems biology studies and clinical proteomics. However, since only very few complete proteomes have been reported to date, for a considerable fraction of a proteome there is no experimental proteomics evidence that could guide the selection of the best-suited proteotypic peptides (PTPs), i.e. peptides that are specific to a given proteoform and that are repeatedly observed in a mass spectrometer.
    We describe a novel, rank-based approach for the prediction of the best-suited PTPs for targeted proteomics applications. By building on methods developed in the field of information retrieval (e.g. web search engines like Google's PageRank), we circumvent the delicate step of selecting positive and negative training sets and at the same time also more closely reflect the experimentalist's need for selecting e.g. the 5 most promising peptides for targeting a protein of interest. This approach allows the prediction of PTPs for not yet observed proteins or for organisms without prior experimental proteomics data such as many non-model organisms. Copyright © 2014 Elsevier B.V. All rights reserved.

  1. Large scale free energy calculations for blind predictions of protein-ligand binding: the D3R Grand Challenge 2015.

    PubMed

    Deng, Nanjie; Flynn, William F; Xia, Junchao; Vijayan, R S K; Zhang, Baofeng; He, Peng; Mentes, Ahmet; Gallicchio, Emilio; Levy, Ronald M

    2016-09-01

    We describe binding free energy calculations in the D3R Grand Challenge 2015 for blind prediction of the binding affinities of 180 ligands to Hsp90. The present D3R challenge was built around experimental datasets involving Heat shock protein (Hsp) 90, an ATP-dependent molecular chaperone which is an important anticancer drug target. The Hsp90 ATP binding site is known to be a challenging target for accurate calculations of ligand binding affinities because of the ligand-dependent conformational changes in the binding site, the presence of ordered waters and the broad chemical diversity of ligands that can bind at this site. Our primary focus here is to distinguish binders from nonbinders. Large scale absolute binding free energy calculations that cover over 3000 protein-ligand complexes were performed using the BEDAM method starting from docked structures generated by Glide docking. Although the ligand dataset in this study resembles an intermediate to late stage lead optimization project while the BEDAM method is mainly developed for early stage virtual screening of hit molecules, the BEDAM binding free energy scoring has resulted in a moderate enrichment of ligand screening against this challenging drug target. Results show that, using a statistical mechanics based free energy method like BEDAM starting from docked poses offers better enrichment than classical docking scoring functions and rescoring methods like Prime MM-GBSA for the Hsp90 data set in this blind challenge. Importantly, among the three methods tested here, only the mean value of the BEDAM binding free energy scores is able to separate the large group of binders from the small group of nonbinders with a gap of 2.4 kcal/mol. None of the three methods that we have tested provided accurate ranking of the affinities of the 147 active compounds. We discuss the possible sources of errors in the binding free energy calculations. 
    The study suggests that BEDAM can be used strategically to discriminate binders from nonbinders in virtual screening and to more accurately predict the ligand binding modes prior to the more computationally expensive FEP calculations of binding affinity.

  2. Large scale free energy calculations for blind predictions of protein-ligand binding: the D3R Grand Challenge 2015

    NASA Astrophysics Data System (ADS)

    Deng, Nanjie; Flynn, William F.; Xia, Junchao; Vijayan, R. S. K.; Zhang, Baofeng; He, Peng; Mentes, Ahmet; Gallicchio, Emilio; Levy, Ronald M.

    2016-09-01

    We describe binding free energy calculations in the D3R Grand Challenge 2015 for blind prediction of the binding affinities of 180 ligands to Hsp90. The present D3R challenge was built around experimental datasets involving Heat shock protein (Hsp) 90, an ATP-dependent molecular chaperone which is an important anticancer drug target. The Hsp90 ATP binding site is known to be a challenging target for accurate calculations of ligand binding affinities because of the ligand-dependent conformational changes in the binding site, the presence of ordered waters and the broad chemical diversity of ligands that can bind at this site. Our primary focus here is to distinguish binders from nonbinders. Large scale absolute binding free energy calculations that cover over 3000 protein-ligand complexes were performed using the BEDAM method starting from docked structures generated by Glide docking. Although the ligand dataset in this study resembles an intermediate to late stage lead optimization project while the BEDAM method is mainly developed for early stage virtual screening of hit molecules, the BEDAM binding free energy scoring has resulted in a moderate enrichment of ligand screening against this challenging drug target. Results show that, using a statistical mechanics based free energy method like BEDAM starting from docked poses offers better enrichment than classical docking scoring functions and rescoring methods like Prime MM-GBSA for the Hsp90 data set in this blind challenge. Importantly, among the three methods tested here, only the mean value of the BEDAM binding free energy scores is able to separate the large group of binders from the small group of nonbinders with a gap of 2.4 kcal/mol. None of the three methods that we have tested provided accurate ranking of the affinities of the 147 active compounds. We discuss the possible sources of errors in the binding free energy calculations. 
    The study suggests that BEDAM can be used strategically to discriminate binders from nonbinders in virtual screening and to more accurately predict the ligand binding modes prior to the more computationally expensive FEP calculations of binding affinity.

  3. Combinatorial support vector machines approach for virtual screening of selective multi-target serotonin reuptake inhibitors from large compound libraries.

    PubMed

    Shi, Z; Ma, X H; Qin, C; Jia, J; Jiang, Y Y; Tan, C Y; Chen, Y Z

    2012-02-01

    Selective multi-target serotonin reuptake inhibitors enhance antidepressant efficacy. Their discovery can be facilitated by multiple methods, including in silico ones. In this study, we developed and tested an in silico method, combinatorial support vector machines (COMBI-SVMs), for virtual screening (VS) of multi-target serotonin reuptake inhibitors of seven target pairs (serotonin transporter paired with noradrenaline transporter, H(3) receptor, 5-HT(1A) receptor, 5-HT(1B) receptor, 5-HT(2C) receptor, melanocortin 4 receptor and neurokinin 1 receptor respectively) from large compound libraries. COMBI-SVMs trained with 917-1951 individual target inhibitors correctly identified 22-83.3% (majority >31.1%) of the 6-216 dual inhibitors collected from literature as independent testing sets. COMBI-SVMs showed moderate to good target selectivity in misclassifying as dual inhibitors 2.2-29.8% (majority <15.4%) of the individual target inhibitors of the same target pair and 0.58-7.1% of the other 6 targets outside the target pair. COMBI-SVMs showed low dual inhibitor false hit rates (0.006-0.056%, 0.042-0.21%, 0.2-4%) in screening 17 million PubChem compounds, 168,000 MDDR compounds, and 7-8181 MDDR compounds similar to the dual inhibitors. Compared with similarity searching, k-NN and PNN methods, COMBI-SVM produced comparable dual inhibitor yields, similar target selectivity, and lower false hit rate in screening 168,000 MDDR compounds. The annotated classes of many COMBI-SVM-identified MDDR virtual hits correlate with the reported effects of their predicted targets. COMBI-SVM is potentially useful for searching selective multi-target agents without explicit knowledge of these agents. Copyright © 2011 Elsevier Inc. All rights reserved.

  4. Evaluation of axial pile bearing capacity based on pile driving analyzer (PDA) test using Neural Network

    NASA Astrophysics Data System (ADS)

    Maizir, H.; Suryanita, R.

    2018-01-01

    Over the past few decades, many methods have been developed to predict and evaluate the bearing capacity of driven piles. Predicting and assessing pile bearing capacity remains complicated and is not yet well established, because different soil tests and evaluation methods produce widely different results. The most important task, however, is to determine methods that predict and evaluate the bearing capacity of the pile with the required degree of accuracy and consistency. Accurate prediction and evaluation of axial bearing capacity depend on several variables, such as the type of soil and the diameter and length of the pile. This study uses Artificial Neural Networks (ANNs) to obtain a more accurate and consistent axial bearing capacity of a driven pile. ANNs can be described as mapping input data to target output data. The ANN model was developed to predict and evaluate the axial bearing capacity of the pile based on pile driving analyzer (PDA) test data, using more than 200 selected data points. The predictions obtained by the ANN model were then compared with the PDA test results. The study concludes that the neural network models give a good prediction and evaluation of the axial bearing capacity of piles.

  5. Automated use of mutagenesis data in structure prediction.

    PubMed

    Nanda, Vikas; DeGrado, William F

    2005-05-15

    In the absence of experimental structural determination, numerous methods are available to indirectly predict or probe the structure of a target molecule. Genetic modification of a protein sequence is a powerful tool for identifying key residues involved in binding reactions or protein stability. Mutagenesis data is usually incorporated into the modeling process either through manual inspection of model compatibility with empirical data, or through the generation of geometric constraints linking sensitive residues to a binding interface. We present an approach derived from statistical studies of lattice models for introducing mutation information directly into the fitness score. The approach takes into account the phenotype of mutation (neutral or disruptive) and calculates the energy for a given structure over an ensemble of sequences. The structure prediction procedure searches for the optimal conformation where neutral sequences either have no impact or improve stability and disruptive sequences reduce stability relative to wild type. We examine three types of sequence ensembles: information from saturation mutagenesis, scanning mutagenesis, and homologous proteins. Incorporating multiple sequences into a statistical ensemble serves to energetically separate the native state and misfolded structures. As a result, the prediction of structure with a poor force field is sufficiently enhanced by mutational information to improve accuracy. Furthermore, by separating misfolded conformations from the target score, the ensemble energy serves to speed up conformational search algorithms such as Monte Carlo-based methods. Copyright 2005 Wiley-Liss, Inc.

  6. VisitSense: Sensing Place Visit Patterns from Ambient Radio on Smartphones for Targeted Mobile Ads in Shopping Malls

    PubMed Central

    Kim, Byoungjip; Kang, Seungwoo; Ha, Jin-Young; Song, Junehwa

    2015-01-01

    In this paper, we introduce a novel smartphone framework called VisitSense that automatically detects and predicts a smartphone user’s place visits from ambient radio to enable behavioral targeting for mobile ads in large shopping malls. VisitSense enables mobile app developers to adopt visit-pattern-aware mobile advertising for shopping mall visitors in their apps. It also benefits mobile users by allowing them to receive highly relevant mobile ads that are aware of their place visit patterns in shopping malls. To achieve the goal, VisitSense employs accurate visit detection and prediction methods. For accurate visit detection, we develop a change-based detection method to take into consideration the stability change of ambient radio and the mobility change of users. It performs well in large shopping malls where ambient radio is quite noisy and causes existing algorithms to easily fail. In addition, we proposed a causality-based visit prediction model to capture the causality in the sequential visit patterns for effective prediction. We have developed a VisitSense prototype system, and a visit-pattern-aware mobile advertising application that is based on it. Furthermore, we deploy the system in the COEX Mall, one of the largest shopping malls in Korea, and conduct diverse experiments to show the effectiveness of VisitSense. PMID:26193275

  7. Multi-Stage Target Tracking with Drift Correction and Position Prediction

    NASA Astrophysics Data System (ADS)

    Chen, Xin; Ren, Keyan; Hou, Yibin

    2018-04-01

    Most existing tracking methods struggle to combine accuracy and performance, and do not consider the shift between clarity and blur that often occurs. In this paper, we propose a multi-stage tracking framework with two particular modules: position prediction and corrective measure. We conduct tracking based on a correlation filter with a corrective measure module to increase both performance and accuracy. Specifically, a convolutional network is used to solve the blur problem in realistic scenes; it is trained on a dataset of blurred images generated by three blur algorithms. Then, we propose a position prediction module to reduce the computation cost and make the tracker more capable of handling fast motion. Experimental results show that our tracking method is more robust than others and more accurate on the benchmark sequences.

  8. Experimental validation of alternate integral-formulation method for predicting acoustic radiation based on particle velocity measurements.

    PubMed

    Ni, Zhi; Wu, Sean F

    2010-09-01

    This paper presents experimental validation of an alternate integral-formulation method (AIM) for predicting acoustic radiation from an arbitrary structure based on the particle velocities specified on a hypothetical surface enclosing the target source. Both the normal and tangential components of the particle velocity on this hypothetical surface are measured and taken as the input to AIM codes to predict the acoustic pressures in both exterior and interior regions. The results obtained are compared with the benchmark values measured by microphones at the same locations. To gain some insight into practical applications of AIM, laser Doppler anemometer (LDA) and double hotwire sensor (DHS) are used as measurement devices to collect the particle velocities in the air. Measurement limitations of using LDA and DHS are discussed.

  9. QSAR Modeling Using Large-Scale Databases: Case Study for HIV-1 Reverse Transcriptase Inhibitors.

    PubMed

    Tarasova, Olga A; Urusova, Aleksandra F; Filimonov, Dmitry A; Nicklaus, Marc C; Zakharov, Alexey V; Poroikov, Vladimir V

    2015-07-27

    Large-scale databases are important sources of training sets for various QSAR modeling approaches. Generally, these databases contain information extracted from different sources. This variety of sources can produce inconsistency in the data, defined as sometimes widely diverging activity results for the same compound against the same target. Because such inconsistency can reduce the accuracy of predictive models built from these data, we are addressing the question of how best to use data from publicly and commercially accessible databases to create accurate and predictive QSAR models. We investigate the suitability of commercially and publicly available databases to QSAR modeling of antiviral activity (HIV-1 reverse transcriptase (RT) inhibition). We present several methods for the creation of modeling (i.e., training and test) sets from two, either commercially or freely available, databases: Thomson Reuters Integrity and ChEMBL. We found that the typical predictivities of QSAR models obtained using these different modeling set compilation methods differ significantly from each other. The best results were obtained using training sets compiled for compounds tested using only one method and material (i.e., a specific type of biological assay). Compound sets aggregated by target only typically yielded poorly predictive models. We discuss the possibility of "mix-and-matching" assay data across aggregating databases such as ChEMBL and Integrity and their current severe limitations for this purpose. One of them is the general lack of complete and semantic/computer-parsable descriptions of assay methodology carried by these databases that would allow one to determine mix-and-matchability of result sets at the assay level.

  10. A comparative study of family-specific protein-ligand complex affinity prediction based on random forest approach

    NASA Astrophysics Data System (ADS)

    Wang, Yu; Guo, Yanzhi; Kuang, Qifan; Pu, Xuemei; Ji, Yue; Zhang, Zhihang; Li, Menglong

    2015-04-01

    The assessment of binding affinity between ligands and the target proteins plays an essential role in the drug discovery and design process. As an alternative to widely used scoring approaches, machine learning methods have also been proposed for fast prediction of the binding affinity with promising results, but most of them were developed as all-purpose models regardless of the specific functions of different protein families, even though proteins from different functional families always have different structures and physicochemical features. In this study, we proposed a random forest method to predict the protein-ligand binding affinity based on a comprehensive feature set covering protein sequence, binding pocket, ligand structure and intermolecular interaction. Feature processing and compression were implemented separately for each protein family dataset, and the results indicate that different features contribute to different models, so an individual representation for each protein family is necessary. Three family-specific models were constructed for three important protein target families of HIV-1 protease, trypsin and carbonic anhydrase respectively. As a comparison, two generic models including diverse protein families were also built. The evaluation results show that models on family-specific datasets have superior performance to those on the generic datasets, and the Pearson and Spearman correlation coefficients (Rp and Rs) on the test sets are 0.740, 0.874, 0.735 and 0.697, 0.853, 0.723 for HIV-1 protease, trypsin and carbonic anhydrase respectively. Comparisons with the other methods further demonstrate that individual representation and model construction for each protein family is a more reasonable way of predicting the affinity of one particular protein family.
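
    The two evaluation metrics named above can be computed as in the sketch below, which uses only the standard library. The affinity values are hypothetical illustration data, not results from the study, and the function names are chosen for this example.

```python
# Minimal sketch: Pearson (Rp) and Spearman (Rs) correlation between
# experimental and predicted binding affinities.
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(ranks(xs), ranks(ys))

# Hypothetical affinities (e.g. pKd) for five complexes.
experimental = [5.0, 6.0, 7.0, 8.0, 9.0]
predicted = [5.2, 5.9, 7.5, 7.4, 9.1]

rp = pearson(experimental, predicted)
rs = spearman(experimental, predicted)
print(round(rp, 3), round(rs, 3))
```

    Rp measures how linearly the predictions track the measurements, while Rs only checks whether the complexes are ordered correctly, so the two values can differ for the same model.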

  11. Predicting oligonucleotide affinity to nucleic acid targets.

    PubMed Central

    Mathews, D H; Burkard, M E; Freier, S M; Wyatt, J R; Turner, D H

    1999-01-01

    A computer program, OligoWalk, is reported that predicts the equilibrium affinity of complementary DNA or RNA oligonucleotides to an RNA target. This program considers the predicted stability of the oligonucleotide-target helix and the competition with predicted secondary structure of both the target and the oligonucleotide. Both unimolecular and bimolecular oligonucleotide self structure are considered with a user-defined concentration. The application of OligoWalk is illustrated with three comparisons to experimental results drawn from the literature. PMID:10580474

  12. Validation of a method to evaluate future impact of road safety interventions, a comparison between fatal passenger car crashes in Sweden 2000 and 2010.

    PubMed

    Strandroth, Johan

    2015-03-01

    When targeting a society free from serious and fatal road-traffic injuries, it has been common practice in many countries and organizations to set time-limited and quantified targets for the reduction of fatalities and injuries. In setting these targets, the EU and other organizations have recognized the importance of monitoring and predicting the development toward the target as well as the efficiency of road safety policies and interventions. This study aims to validate a method to forecast future road safety challenges by applying it to the fatal crashes in Sweden in 2000 and using the method to explain the change in fatalities based on the road safety interventions made until 2010. The method's estimate is then compared to the true outcome in 2010. The aim of this study was to investigate whether a residual of crashes produced by a partial analysis could constitute a sufficient base to describe the characteristics of future crashes. The results show that out of the 332 car occupants killed in 2000, 197 were estimated to constitute the residual in 2010. Consequently, 135 fatalities from 2000 were estimated by the model to be prevented by 2010. That is a predicted reduction of 41%, compared to the reduction in the real outcome of 53%, from 332 in 2000 to 156 in 2010. The method was found able to generate a residual of crashes in 2010 from the crashes in 2000 that had a very similar nature, with regard to crash type, to the true outcome of 2010. It was also found suitable for handling double counting and system effects. However, future research is needed in order to investigate how external factors as well as random and systematic variation should be taken into account in a reliable manner. Copyright © 2015 Elsevier Ltd. All rights reserved.
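
    The percentages quoted in this abstract follow directly from the three counts it reports. The short check below reproduces that arithmetic; the variable names are chosen here for illustration and do not come from the study.

```python
# Counts taken from the abstract.
killed_2000 = 332         # car occupants killed in 2000
residual_2010 = 197       # fatalities the method estimated would remain in 2010
killed_2010_actual = 156  # car occupants actually killed in 2010

# Fatalities the method estimated to be prevented by 2010.
prevented_estimated = killed_2000 - residual_2010

# Predicted and observed relative reductions, both against the 2000 baseline.
predicted_reduction = prevented_estimated / killed_2000
actual_reduction = (killed_2000 - killed_2010_actual) / killed_2000

print(prevented_estimated)           # 135
print(f"{predicted_reduction:.0%}")  # 41%
print(f"{actual_reduction:.0%}")     # 53%
```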

  13. Using the Detectability Index to Predict P300 Speller Performance

    PubMed Central

    Mainsah, B.O.; Collins, L.M.; Throckmorton, C.S.

    2017-01-01

    Objective The P300 speller is a popular brain-computer interface (BCI) system that has been investigated as a potential communication alternative for individuals with severe neuromuscular limitations. To achieve acceptable accuracy levels for communication, the system requires repeated data measurements in a given signal condition to enhance the signal-to-noise ratio of elicited brain responses. These elicited brain responses, which are used as control signals, are embedded in noisy electroencephalography (EEG) data. The discriminability between target and non-target EEG responses defines a user’s performance with the system. A previous P300 speller model has been proposed to estimate system accuracy given a certain amount of data collection. However, the approach was limited to a static stopping algorithm, i.e. averaging over a fixed number of measurements, and the row-column paradigm. A generalized method that is also applicable to dynamic stopping algorithms and other stimulus paradigms is desirable. Approach We developed a new probabilistic model-based approach to predicting BCI performance, where performance functions can be derived analytically or via Monte Carlo methods. Within this framework, we introduce a new model for the P300 speller with the Bayesian dynamic stopping (DS) algorithm, by simplifying a multi-hypothesis to a binary hypothesis problem using the likelihood ratio test. Under a normality assumption, the performance functions for the Bayesian algorithm can be parameterized with the detectability index, a measure which quantifies the discriminability between target and non-target EEG responses. Main results Simulations with synthetic and empirical data provided initial verification of the proposed method of estimating performance with Bayesian DS using the detectability index. Analysis of results from previous online studies validated the proposed method. 
Significance The proposed method could serve as a useful tool to initially assess BCI performance without extensive online testing, in order to estimate the amount of data required to achieve a desired accuracy level. PMID:27705956

  14. Using the detectability index to predict P300 speller performance

    NASA Astrophysics Data System (ADS)

    Mainsah, B. O.; Collins, L. M.; Throckmorton, C. S.

    2016-12-01

    Objective. The P300 speller is a popular brain-computer interface (BCI) system that has been investigated as a potential communication alternative for individuals with severe neuromuscular limitations. To achieve acceptable accuracy levels for communication, the system requires repeated data measurements in a given signal condition to enhance the signal-to-noise ratio of elicited brain responses. These elicited brain responses, which are used as control signals, are embedded in noisy electroencephalography (EEG) data. The discriminability between target and non-target EEG responses defines a user’s performance with the system. A previous P300 speller model has been proposed to estimate system accuracy given a certain amount of data collection. However, the approach was limited to a static stopping algorithm, i.e. averaging over a fixed number of measurements, and the row-column paradigm. A generalized method that is also applicable to dynamic stopping (DS) algorithms and other stimulus paradigms is desirable. Approach. We developed a new probabilistic model-based approach to predicting BCI performance, where performance functions can be derived analytically or via Monte Carlo methods. Within this framework, we introduce a new model for the P300 speller with the Bayesian DS algorithm, by simplifying a multi-hypothesis to a binary hypothesis problem using the likelihood ratio test. Under a normality assumption, the performance functions for the Bayesian algorithm can be parameterized with the detectability index, a measure which quantifies the discriminability between target and non-target EEG responses. Main results. Simulations with synthetic and empirical data provided initial verification of the proposed method of estimating performance with Bayesian DS using the detectability index. Analysis of results from previous online studies validated the proposed method. Significance. 
The proposed method could serve as a useful tool to initially assess BCI performance without extensive online testing, in order to estimate the amount of data required to achieve a desired accuracy level.
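
    The abstract does not give a formula for the detectability index. As a rough sketch, assuming the standard signal-detection definition under equal-variance normality (the difference between the target and non-target means divided by the common standard deviation), it could be estimated from classifier scores as follows. The function name and the sample scores are hypothetical.

```python
import math
from statistics import mean, variance


def detectability_index(target_scores, nontarget_scores):
    """Estimate d = (mean_target - mean_nontarget) / pooled standard deviation.

    Assumes both score distributions are normal with a shared variance,
    which is estimated here by pooling the two sample variances.
    """
    n_target = len(target_scores)
    n_nontarget = len(nontarget_scores)
    pooled_variance = (
        (n_target - 1) * variance(target_scores)
        + (n_nontarget - 1) * variance(nontarget_scores)
    ) / (n_target + n_nontarget - 2)
    return (mean(target_scores) - mean(nontarget_scores)) / math.sqrt(pooled_variance)


# Hypothetical classifier scores for target and non-target EEG responses.
target = [2.0, 3.0, 4.0]
nontarget = [0.0, 1.0, 2.0]
d = detectability_index(target, nontarget)
print(d)
```

    A larger value means the target and non-target responses are easier to tell apart, so fewer repeated measurements should be needed to reach a given accuracy.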

  15. Prediction of plant pre-microRNAs and their microRNAs in genome-scale sequences using structure-sequence features and support vector machine.

    PubMed

    Meng, Jun; Liu, Dong; Sun, Chao; Luan, Yushi

    2014-12-30

    MicroRNAs (miRNAs) are a family of non-coding RNAs approximately 21 nucleotides in length that play pivotal roles at the post-transcriptional level in animals, plants and viruses. These molecules silence their target genes by degrading transcription or suppressing translation. Studies have shown that miRNAs are involved in biological responses to a variety of biotic and abiotic stresses. Identification of these molecules and their targets can aid the understanding of regulatory processes. Recently, prediction methods based on machine learning have been widely used for miRNA prediction. However, most of these methods were designed for mammalian miRNA prediction, and few are available for predicting miRNAs in the pre-miRNAs of specific plant species. Although the complete Solanum lycopersicum genome has been published, only 77 Solanum lycopersicum miRNAs have been identified, far less than the estimated number. Therefore, it is essential to develop a prediction method based on machine learning to identify new plant miRNAs. A novel classification model based on a support vector machine (SVM) was trained to identify real and pseudo plant pre-miRNAs together with their miRNAs. An initial set of 152 novel features related to sequential structures was used to train the model. By applying feature selection, we obtained the best subset of 47 features for use with the Back Support Vector Machine-Recursive Feature Elimination (B-SVM-RFE) method for the classification of plant pre-miRNAs. Using this method, 63 features were obtained for plant miRNA classification. We then developed an integrated classification model, miPlantPreMat, which comprises MiPlantPre and MiPlantMat, to identify plant pre-miRNAs and their miRNAs. 
This model achieved approximately 90% accuracy using plant datasets from nine plant species, including Arabidopsis thaliana, Glycine max, Oryza sativa, Physcomitrella patens, Medicago truncatula, Sorghum bicolor, Arabidopsis lyrata, Zea mays and Solanum lycopersicum. Using miPlantPreMat, 522 Solanum lycopersicum miRNAs were identified in the Solanum lycopersicum genome sequence. We developed an integrated classification model, miPlantPreMat, based on structure-sequence features and SVM. MiPlantPreMat was used to identify both plant pre-miRNAs and the corresponding mature miRNAs. An improved feature selection method was proposed, resulting in high classification accuracy, sensitivity and specificity.

  16. Identification of MicroRNA Targets of Capsicum spp. Using MiRTrans—a Trans-Omics Approach

    PubMed Central

    Zhang, Lu; Qin, Cheng; Mei, Junpu; Chen, Xiaocui; Wu, Zhiming; Luo, Xirong; Cheng, Jiaowen; Tang, Xiangqun; Hu, Kailin; Li, Shuai C.

    2017-01-01

    MicroRNAs (miRNAs) can regulate transcripts involved in eukaryotic cell proliferation, differentiation, and metabolism. Our understanding of miRNA targets is still limited, especially in plants. Early prediction attempts based on sequence alignment have been plagued by large numbers of false positives. Target prediction specificity can be improved by incorporating other data sources, such as the dependency between miRNA and transcript expression or transcripts cleaved through miRNA regulation, which are referred to as trans-omics data. In this paper, we developed MiRTrans (Prediction of MiRNA targets by Trans-omics data) to explore miRNA targets by incorporating miRNA sequencing, transcriptome sequencing, and degradome sequencing. MiRTrans consists of three major steps. First, the target transcripts of miRNAs were predicted by scrutinizing their sequence characteristics and collected as an initial pool of potential targets. Second, false positive targets were eliminated if lasso regression showed that the expression of a miRNA and its targets was only weakly correlated. Third, degradome sequencing was used to capture miRNA targets by examining the cleaved transcripts that are regulated by miRNAs. Finally, the predicted targets from the second and third steps were combined by Fisher's combination test. MiRTrans was applied to identify the miRNA targets for Capsicum spp. (i.e., pepper). Judged by functional enrichment, it generates more functional miRNA targets than sequence-based predictions. MiRTrans identified 58 miRNA-transcript pairs with high confidence from 18 miRNA families conserved in eudicots. Most of these targets were transcription factors; this lends support to the role of miRNA as a key regulator in pepper. To the best of our knowledge, this work is the first attempt to investigate the miRNA targets of pepper, as well as their regulatory networks. 
Surprisingly, only a small proportion of miRNA-transcript pairs were shared between degradome sequencing and expression dependency predictions, suggesting that miRNA targets predicted by a single technology alone may be prone to report false negatives. PMID:28443105
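
    The abstract states that the expression-based and degradome-based predictions were combined by Fisher's combination test but gives no further detail. The sketch below shows the textbook form of that test, assuming independent p-values: the statistic -2 * sum(ln p_i) is compared against a chi-square distribution with 2k degrees of freedom. The function name and the example p-values are hypothetical and are not taken from MiRTrans.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine k independent p-values (each in (0, 1]) with Fisher's method.

    Returns the test statistic and the combined p-value. For an even number
    of degrees of freedom 2k, the chi-square survival function has the closed
    form exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!, so no external library is
    needed.
    """
    k = len(pvalues)
    statistic = -2.0 * sum(math.log(p) for p in pvalues)
    half = statistic / 2.0
    term = 1.0   # current (x/2)^i / i! term, starting at i = 0
    total = 1.0  # running sum of the series
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total


# Hypothetical p-values for one miRNA-transcript pair: one from the
# expression-dependency step and one from the degradome step.
statistic, combined = fisher_combined_pvalue([0.05, 0.05])
print(f"statistic = {statistic:.3f}, combined p = {combined:.5f}")
```

    Two individually borderline p-values yield a smaller combined p-value, which is how agreement between two data sources can raise confidence in a predicted target.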

  17. Research on orbit prediction for solar-based calibration proper satellite

    NASA Astrophysics Data System (ADS)

    Chen, Xuan; Qi, Wenwen; Xu, Peng

    2018-03-01

    Orbit prediction uses a mathematical model of orbital mechanics to forecast a space target's orbit information at a given time from its orbit at an initial moment. The proper satellite radiometric calibration and calibration orbit prediction processes are introduced briefly. Based on research into the design method for the calibration space position and the radiative transfer model, an orbit prediction method for proper satellite radiometric calibration is proposed to select the appropriate calibration arc for the remote sensor and to predict the orbit information of the proper satellite and the remote sensor. By analyzing the orbit constraints of proper satellite calibration, the GF-1 sun-synchronous orbit is chosen as the proper satellite orbit in order to simulate the visible calibration duration for the different satellites to be calibrated. The results of the simulation and analysis provide a basis for improving the radiometric calibration accuracy of satellite remote sensors, which lays the foundation for high-precision, high-frequency radiometric calibration.

  18. BagReg: Protein inference through machine learning.

    PubMed

    Zhao, Can; Liu, Dao; Teng, Ben; He, Zengyou

    2015-08-01

    Protein inference from identified peptides is of primary importance in shotgun proteomics. The target of protein inference is to identify whether each candidate protein is truly present in the sample. To date, many computational methods have been proposed to solve this problem. However, there is still no method that can fully utilize the information hidden in the input data. In this article, we propose a learning-based method named BagReg for protein inference. The method first artificially extracts five features from the input data, and then chooses each feature in turn as the class feature to build separate models that predict the presence probabilities of proteins. Finally, the weak results from the five prediction models are aggregated to obtain the final result. We test our method on six publicly available data sets. The experimental results show that our method is superior to state-of-the-art protein inference algorithms. Copyright © 2015 Elsevier Ltd. All rights reserved.

  19. Compound Structure-Independent Activity Prediction in High-Dimensional Target Space.

    PubMed

    Balfer, Jenny; Hu, Ye; Bajorath, Jürgen

    2014-08-01

    Profiling of compound libraries against arrays of targets has become an important approach in pharmaceutical research. The prediction of multi-target compound activities also represents an attractive task for machine learning with potential for drug discovery applications. Herein, we have explored activity prediction in high-dimensional target space. Different types of models were derived to predict multi-target activities. The models included naïve Bayesian (NB) and support vector machine (SVM) classifiers based upon compound structure information and NB models derived on the basis of activity profiles, without considering compound structure. Because the latter approach can be applied to incomplete training data and principally depends on the feature independence assumption, SVM modeling was not applicable in this case. Furthermore, iterative hybrid NB models making use of both activity profiles and compound structure information were built. In high-dimensional target space, NB models utilizing activity profile data were found to yield more accurate activity predictions than structure-based NB and SVM models or hybrid models. An in-depth analysis of activity profile-based models revealed the presence of correlation effects across different targets and rationalized prediction accuracy. Taken together, the results indicate that activity profile information can be effectively used to predict the activity of test compounds against novel targets. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  20. GP0.4 from bacteriophage T7: in silico characterisation of its structure and interaction with E. coli FtsZ.

    PubMed

    Simpkin, Adam J; Rigden, Daniel J

    2016-07-13

    Proteins produced by bacteriophages can have potent antimicrobial activity. The study of phage-host interactions can therefore inform small molecule drug discovery by revealing and characterising new drug targets. Here we characterise in silico the predicted interaction of gene protein 0.4 (GP0.4) from the Escherichia coli (E. coli) phage T7 with E. coli filamenting temperature-sensitive mutant Z division protein (FtsZ). FtsZ is a tubulin homolog which plays a key role in bacterial cell division and that has been proposed as a drug target. Using ab initio, fragment assembly structure modelling, we predicted the structure of GP0.4 with two programs. A structure similarity-based network was used to identify a U-shaped helix-turn-helix candidate fold as being favoured. ClusPro was used to dock this structure prediction to a homology model of E. coli FtsZ resulting in a favourable predicted interaction mode. Alternative docking methods supported the proposed mode which offered an immediate explanation for the anti-filamenting activity of GP0.4. Importantly, further strong support derived from a previously characterised insertion mutation, known to abolish GP0.4 activity, that is positioned in close proximity to the proposed GP0.4/FtsZ interface. The mode of interaction predicted by bioinformatics techniques strongly suggests a mechanism through which GP0.4 inhibits FtsZ and further establishes the latter's druggable intrafilament interface as a potential drug target.
