Asymmetric bagging and feature selection for activities prediction of drug molecules.
Li, Guo-Zheng; Meng, Hao-Hua; Lu, Wen-Cong; Yang, Jack Y; Yang, Mary Qu
2008-05-28
Activities of drug molecules can be predicted by QSAR (quantitative structure-activity relationship) models, which overcome the high cost and long cycle of the traditional experimental method. Because far fewer drug molecules show positive activity than negative activity, it is important to take this imbalance into account when predicting molecular activities. Here, asymmetric bagging and feature selection are introduced into the problem, and asymmetric bagging of support vector machines (asBagging) is proposed for predicting drug activities on unbalanced data. At the same time, the features extracted from the structures of drug molecules affect the prediction accuracy of QSAR models. Therefore, a novel algorithm named PRIFEAB is proposed, which applies an embedded feature selection method to remove redundant and irrelevant features for asBagging. Numerical experiments on a data set of molecular activities show that asBagging improves the AUC and sensitivity values of molecular activity prediction, and that PRIFEAB, with feature selection, further improves prediction ability. Asymmetric bagging can thus improve the prediction accuracy of drug molecule activities, and this accuracy can be improved further by selecting relevant features from the drug molecule data sets.
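Illustrative sketch (not taken from the paper): one common reading of asymmetric bagging is that every bag keeps all positive examples and adds a bootstrap sample of negatives of the same size, so each base learner sees balanced data. The function names and toy data below are hypothetical, and a nearest-centroid rule stands in for the support vector machine used in the paper.

```python
import random

def asymmetric_bags(positives, negatives, n_bags, seed=0):
    """Build n_bags training sets: all positives plus an equal-sized
    bootstrap sample (with replacement) of the negatives."""
    rng = random.Random(seed)
    bags = []
    for _ in range(n_bags):
        sampled_neg = [rng.choice(negatives) for _ in positives]
        bags.append((list(positives), sampled_neg))
    return bags

def centroid(rows):
    """Column-wise mean of a list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_centroid_classifier(pos, neg):
    """Stand-in base learner (NOT an SVM): predicts 1 when x is closer
    to the positive centroid than to the negative centroid."""
    cp, cn = centroid(pos), centroid(neg)
    return lambda x: 1 if sq_dist(x, cp) < sq_dist(x, cn) else 0

def as_bagging_predict(models, x):
    """Majority vote over the base learners; ties count as positive."""
    votes = sum(m(x) for m in models)
    return 1 if 2 * votes >= len(models) else 0

# Toy unbalanced data: 3 positives, 30 negatives (hypothetical values).
positives = [[1.0, 1.1], [0.9, 1.0], [1.2, 0.8]]
negatives = [[-1.0 - 0.01 * i, -1.0 + 0.01 * i] for i in range(30)]

bags = asymmetric_bags(positives, negatives, n_bags=11)
models = [train_centroid_classifier(p, n) for p, n in bags]
```

Calling `as_bagging_predict(models, x)` on a new feature vector `x` returns the ensemble's class vote.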
DemQSAR: predicting human volume of distribution and clearance of drugs
NASA Astrophysics Data System (ADS)
Demir-Kavuk, Ozgur; Bentzien, Jörg; Muegge, Ingo; Knapp, Ernst-Walter
2011-12-01
In silico methods characterizing molecular compounds with respect to pharmacologically relevant properties can accelerate the identification of new drugs and reduce their development costs. Quantitative structure-activity/-property relationship (QSAR/QSPR) models correlate structure and physico-chemical properties of molecular compounds with a specific functional activity/property under study. Typically a large number of molecular features are generated for the compounds. In many cases the number of generated features exceeds the number of molecular compounds with known property values that are available for learning. Machine learning methods tend to overfit the training data in such situations, i.e. the method adjusts to very specific features of the training data, which are not characteristic of the considered property. This problem can be alleviated by diminishing the influence of unimportant, redundant or even misleading features. A better strategy is to eliminate such features completely. Ideally, a molecular property can be described by a small number of features that are chemically interpretable. The purpose of the present contribution is to provide a predictive modeling approach, which combines feature generation, feature selection, model building and control of overtraining into a single application called DemQSAR. DemQSAR is used to predict human volume of distribution (VDss) and human clearance (CL). To control overtraining, quadratic and linear regularization terms were employed. A recursive feature selection approach is used to reduce the number of descriptors. The prediction performance is as good as the best predictions reported in the recent literature. The example presented here demonstrates that DemQSAR can generate a model that uses very few features while maintaining high predictive power.
A standalone DemQSAR Java application for model building of any user defined property as well as a web interface for the prediction of human VDss and CL is available on the webpage of DemPRED: http://agknapp.chemie.fu-berlin.de/dempred/.
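Illustrative note (an assumption, since the abstract does not state the loss function): for a linear model with weights w and a squared-error loss, an objective with both a quadratic and a linear regularization term could be written as follows, where lambda_2 and lambda_1 are hypothetical symbols for the two regularization strengths.

```latex
\min_{\mathbf{w}} \;
\sum_{i=1}^{N} \bigl( y_i - \mathbf{w}^{\top}\mathbf{x}_i \bigr)^{2}
\; + \; \lambda_2 \lVert \mathbf{w} \rVert_2^{2}
\; + \; \lambda_1 \lVert \mathbf{w} \rVert_1
```

The quadratic term shrinks all weights smoothly, while the linear term can drive individual weights to exactly zero, which is consistent with the stated goal of models that use very few features.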
DemQSAR: predicting human volume of distribution and clearance of drugs.
Demir-Kavuk, Ozgur; Bentzien, Jörg; Muegge, Ingo; Knapp, Ernst-Walter
2011-12-01
In silico methods characterizing molecular compounds with respect to pharmacologically relevant properties can accelerate the identification of new drugs and reduce their development costs. Quantitative structure-activity/-property relationship (QSAR/QSPR) models correlate structure and physico-chemical properties of molecular compounds with a specific functional activity/property under study. Typically a large number of molecular features are generated for the compounds. In many cases the number of generated features exceeds the number of molecular compounds with known property values that are available for learning. Machine learning methods tend to overfit the training data in such situations, i.e. the method adjusts to very specific features of the training data, which are not characteristic of the considered property. This problem can be alleviated by diminishing the influence of unimportant, redundant or even misleading features. A better strategy is to eliminate such features completely. Ideally, a molecular property can be described by a small number of features that are chemically interpretable. The purpose of the present contribution is to provide a predictive modeling approach, which combines feature generation, feature selection, model building and control of overtraining into a single application called DemQSAR. DemQSAR is used to predict human volume of distribution (VD(ss)) and human clearance (CL). To control overtraining, quadratic and linear regularization terms were employed. A recursive feature selection approach is used to reduce the number of descriptors. The prediction performance is as good as the best predictions reported in the recent literature. The example presented here demonstrates that DemQSAR can generate a model that uses very few features while maintaining high predictive power.
A standalone DemQSAR Java application for model building of any user defined property as well as a web interface for the prediction of human VD(ss) and CL is available on the webpage of DemPRED: http://agknapp.chemie.fu-berlin.de/dempred/ .
Tadayyon, Hadi; Sannachi, Lakshmanan; Gangeh, Mehrdad J.; Kim, Christina; Ghandi, Sonal; Trudeau, Maureen; Pritchard, Kathleen; Tran, William T.; Slodkowska, Elzbieta; Sadeghi-Naini, Ali; Czarnota, Gregory J.
2017-01-01
Quantitative ultrasound (QUS) can probe tissue structure and analyze tumour characteristics. Using a 6-MHz ultrasound system, radiofrequency data were acquired from 56 locally advanced breast cancer patients prior to their neoadjuvant chemotherapy (NAC) and QUS texture features were computed from regions of interest in tumour cores and their margins as potential predictive and prognostic indicators. Breast tumour molecular features were also collected and used for analysis. A multiparametric QUS model was constructed, which demonstrated a response prediction accuracy of 88% and ability to predict patient 5-year survival rates (p = 0.01). QUS features demonstrated superior performance in comparison to molecular markers and the combination of QUS and molecular markers did not improve response prediction. This study demonstrates, for the first time, that non-invasive QUS features in the core and margin of breast tumours can indicate breast cancer response to neoadjuvant chemotherapy (NAC) and predict five-year recurrence-free survival. PMID:28401902
Tadayyon, Hadi; Sannachi, Lakshmanan; Gangeh, Mehrdad J; Kim, Christina; Ghandi, Sonal; Trudeau, Maureen; Pritchard, Kathleen; Tran, William T; Slodkowska, Elzbieta; Sadeghi-Naini, Ali; Czarnota, Gregory J
2017-04-12
Quantitative ultrasound (QUS) can probe tissue structure and analyze tumour characteristics. Using a 6-MHz ultrasound system, radiofrequency data were acquired from 56 locally advanced breast cancer patients prior to their neoadjuvant chemotherapy (NAC) and QUS texture features were computed from regions of interest in tumour cores and their margins as potential predictive and prognostic indicators. Breast tumour molecular features were also collected and used for analysis. A multiparametric QUS model was constructed, which demonstrated a response prediction accuracy of 88% and ability to predict patient 5-year survival rates (p = 0.01). QUS features demonstrated superior performance in comparison to molecular markers and the combination of QUS and molecular markers did not improve response prediction. This study demonstrates, for the first time, that non-invasive QUS features in the core and margin of breast tumours can indicate breast cancer response to neoadjuvant chemotherapy (NAC) and predict five-year recurrence-free survival.
Structural features that predict real-value fluctuations of globular proteins
Jamroz, Michal; Kolinski, Andrzej; Kihara, Daisuke
2012-01-01
It is crucial to consider dynamics for understanding the biological function of proteins. We used a large number of molecular dynamics trajectories of non-homologous proteins as references and examined static structural features of proteins that are most relevant to fluctuations. We examined the correlation of individual structural features with fluctuations and further investigated effective combinations of features for predicting the real value of residue fluctuations using support vector regression. It was found that some structural features have a higher correlation than crystallographic B-factors with fluctuations observed in molecular dynamics trajectories. Moreover, support vector regression that uses combinations of static structural features showed accurate prediction of fluctuations, with an average Pearson's correlation coefficient of 0.669 and a root mean square error of 1.04 Å. This correlation coefficient is higher than the one observed for the prediction by the Gaussian network model. An advantage of the developed method over Gaussian network models is that the former predicts the real value of the fluctuation. The results help improve our understanding of relationships between protein structure and fluctuation. Furthermore, the developed method provides a convenient, practical way to predict fluctuations of proteins using easily computed static structural features of proteins. PMID:22328193
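Illustrative sketch (not from the paper): the two evaluation measures quoted above, Pearson's correlation coefficient and root mean square error, can be computed as follows. The sample values are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def rmse(predicted, reference):
    """Root mean square error between predicted and reference values."""
    n = len(reference)
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, reference)) / n)

# Hypothetical per-residue fluctuations in angstroms.
predicted = [0.8, 1.1, 1.9, 2.6]
reference = [1.0, 1.0, 2.0, 3.0]
print(pearson(predicted, reference), rmse(predicted, reference))
```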
Radiomics biomarkers for accurate tumor progression prediction of oropharyngeal cancer
NASA Astrophysics Data System (ADS)
Hadjiiski, Lubomir; Chan, Heang-Ping; Cha, Kenny H.; Srinivasan, Ashok; Wei, Jun; Zhou, Chuan; Prince, Mark; Papagerakis, Silvana
2017-03-01
Accurate tumor progression prediction for oropharyngeal cancers is crucial for identifying patients who would best be treated with optimized treatment and therefore minimize the risk of under- or over-treatment. An objective decision support system that can merge the available radiomics, histopathologic and molecular biomarkers in a predictive model based on statistical outcomes of previous cases and machine learning may assist clinicians in making more accurate assessment of oropharyngeal tumor progression. In this study, we evaluated the feasibility of developing individual and combined predictive models based on quantitative image analysis from radiomics, histopathology and molecular biomarkers for oropharyngeal tumor progression prediction. With IRB approval, 31, 84, and 127 patients with head and neck CT (CT-HN), tumor tissue microarrays (TMAs) and molecular biomarker expressions, respectively, were collected. For 8 of the patients all 3 types of biomarkers were available and they were sequestered in a test set. The CT-HN lesions were automatically segmented using our level sets based method. Morphological, texture and molecular based features were extracted from CT-HN and TMA images, and selected features were merged by a neural network. The classification accuracy was quantified using the area under the ROC curve (AUC). Test AUCs of 0.87, 0.74, and 0.71 were obtained with the individual predictive models based on radiomics, histopathologic, and molecular features, respectively. Combining the radiomics and molecular models increased the test AUC to 0.90. Combining all 3 models increased the test AUC further to 0.94. This preliminary study demonstrates that the individual domains of biomarkers are useful and the integrated multi-domain approach is most promising for tumor progression prediction.
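Illustrative sketch (not from the paper): the area under the ROC curve used above can be computed directly as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name and inputs are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison of positive and negative scores.

    scores: model outputs, higher means more likely positive.
    labels: 1 for positive cases, 0 for negative cases.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

With very small test sets, such as the 8 sequestered patients mentioned above, an AUC computed this way can take only a few distinct values, so differences between models should be read with care.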
NASA Astrophysics Data System (ADS)
He, Ting; Fan, Ming; Zhang, Peng; Li, Hui; Zhang, Juan; Shao, Guoliang; Li, Lihua
2018-03-01
Breast cancer can be classified into four molecular subtypes, Luminal A, Luminal B, HER2 and Basal-like, which differ significantly in treatment and survival outcomes. In this study we aim to predict immunohistochemistry (IHC)-determined molecular subtypes of breast cancer using image features derived from the tumor and peritumoral stroma regions on diffusion weighted imaging (DWI). A dataset of 126 breast cancer patients who underwent preoperative breast MRI with a 3T scanner was collected. The apparent diffusion coefficients (ADCs) were recorded from DWI, and each breast image was segmented into regions comprising the tumor and the surrounding stroma. Statistical characteristics in various breast tumor and peritumoral regions were computed, including mean, minimum, maximum, variance, interquartile range, range, skewness, and kurtosis of ADC values. Additionally, the differences in features between each pair of regions were calculated. A univariate logistic classifier was used to evaluate the performance of individual features for discriminating subtypes. For multi-class classification, a multivariate logistic regression model was trained and validated. The results showed that features derived from the tumor boundary and proximal peritumoral stroma regions classified better than those from the other regions. Furthermore, the prediction models using statistical features, difference features and all features combined from these regions generated AUC values of 0.774, 0.796 and 0.811, respectively. The results of this study indicate that ADC features in the tumor and peritumoral stromal regions would be valuable for estimating the molecular subtype in breast cancer.
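Illustrative sketch (not from the paper): the eight per-region statistics listed above, and the between-region difference features, could be computed as follows. The function names are hypothetical, population (not sample) moments are assumed, and the quartile convention is the default of Python's `statistics.quantiles`, which may differ from the one the authors used.

```python
import statistics

def adc_region_features(values):
    """Summary statistics of the ADC values inside one region.

    Assumes the values are not all identical (non-zero variance).
    """
    n = len(values)
    mean = statistics.fmean(values)
    var = statistics.pvariance(values, mu=mean)   # population variance
    sd = var ** 0.5
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return {
        "mean": mean,
        "min": min(values),
        "max": max(values),
        "variance": var,
        "iqr": q3 - q1,
        "range": max(values) - min(values),
        "skewness": m3 / sd ** 3,
        "kurtosis": m4 / sd ** 4,
    }

def difference_features(region_a, region_b):
    """Feature-wise difference between two regions' statistics."""
    return {name: region_a[name] - region_b[name] for name in region_a}
```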
Communication: Finding destructive interference features in molecular transport junctions.
Reuter, Matthew G; Hansen, Thorsten
2014-11-14
Associating molecular structure with quantum interference features in electrode-molecule-electrode transport junctions has been difficult because existing guidelines for understanding interferences only apply to conjugated hydrocarbons. Herein we use linear algebra and the Landauer-Büttiker theory for electron transport to derive a general rule for predicting the existence and locations of interference features. Our analysis illustrates that interferences can be directly determined from the molecular Hamiltonian and the molecule-electrode couplings, and we demonstrate its utility with several examples.
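Illustrative note (standard textbook form, not the paper's specific rule): in Landauer-Büttiker theory for coherent transport, the transmission through a junction with molecular Hamiltonian H and left/right electrode self-energies is commonly written as below. Destructive interference features then correspond to energies at which T(E) vanishes. The symbols are the conventional ones and are assumptions relative to the abstract.

```latex
T(E) = \operatorname{Tr}\!\left[ \boldsymbol{\Gamma}_L(E)\, \mathbf{G}(E)\, \boldsymbol{\Gamma}_R(E)\, \mathbf{G}^{\dagger}(E) \right],
\qquad
\mathbf{G}(E) = \left[ E\,\mathbf{I} - \mathbf{H} - \boldsymbol{\Sigma}_L(E) - \boldsymbol{\Sigma}_R(E) \right]^{-1},
\qquad
\boldsymbol{\Gamma}_{L/R}(E) = i\left[ \boldsymbol{\Sigma}_{L/R}(E) - \boldsymbol{\Sigma}_{L/R}^{\dagger}(E) \right]
```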
The molecular basis of breast cancer pathological phenotypes.
Heng, Yujing J; Lester, Susan C; Tse, Gary Mk; Factor, Rachel E; Allison, Kimberly H; Collins, Laura C; Chen, Yunn-Yi; Jensen, Kristin C; Johnson, Nicole B; Jeong, Jong Cheol; Punjabi, Rahi; Shin, Sandra J; Singh, Kamaljeet; Krings, Gregor; Eberhard, David A; Tan, Puay Hoon; Korski, Konstanty; Waldman, Frederic M; Gutman, David A; Sanders, Melinda; Reis-Filho, Jorge S; Flanagan, Sydney R; Gendoo, Deena Ma; Chen, Gregory M; Haibe-Kains, Benjamin; Ciriello, Giovanni; Hoadley, Katherine A; Perou, Charles M; Beck, Andrew H
2017-02-01
The histopathological evaluation of morphological features in breast tumours provides prognostic information to guide therapy. Adjunct molecular analyses provide further diagnostic, prognostic and predictive information. However, there is limited knowledge of the molecular basis of morphological phenotypes in invasive breast cancer. This study integrated genomic, transcriptomic and protein data to provide a comprehensive molecular profiling of morphological features in breast cancer. Fifteen pathologists assessed 850 invasive breast cancer cases from The Cancer Genome Atlas (TCGA). Morphological features were significantly associated with genomic alteration, DNA methylation subtype, PAM50 and microRNA subtypes, proliferation scores, gene expression and/or reverse-phase protein assay subtype. Marked nuclear pleomorphism, necrosis, inflammation and a high mitotic count were associated with the basal-like subtype, and had a similar molecular basis. Omics-based signatures were constructed to predict morphological features. The association of morphology transcriptome signatures with overall survival in oestrogen receptor (ER)-positive and ER-negative breast cancer was first assessed by use of the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) dataset; signatures that remained prognostic in the METABRIC multivariate analysis were further evaluated in five additional datasets. The transcriptomic signature of poorly differentiated epithelial tubules was prognostic in ER-positive breast cancer. No signature was prognostic in ER-negative breast cancer. This study provided new insights into the molecular basis of breast cancer morphological phenotypes. The integration of morphological with molecular data has the potential to refine breast cancer classification, predict response to therapy, enhance our understanding of breast cancer biology, and improve clinical management. This work is publicly accessible at www.dx.ai/tcga_breast. 
Copyright © 2016 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Niu, Yingli; Li, Wenqiang; Peng, Qian; Geng, Hua; Yi, Yuanping; Wang, Linjun; Nan, Guangjun; Wang, Dong; Shuai, Zhigang
2018-04-01
MOlecular MAterials Property Prediction Package (MOMAP) is a software toolkit for molecular materials property prediction. It focuses on luminescent properties and charge mobility properties. This article contains a brief descriptive introduction to the key features, theoretical models and algorithms of the software, together with examples that illustrate its performance. First, we present the theoretical models and algorithms for calculating molecular luminescent properties, which include the excited-state radiative/non-radiative decay rate constants and the optical spectra. Then, a multi-scale simulation approach and its algorithm for molecular charge mobility are described. This approach is based on a hopping model combined with kinetic Monte Carlo and molecular dynamics simulations, and it is especially applicable to a large category of organic semiconductors whose inter-molecular electronic coupling is much smaller than the intra-molecular charge reorganisation energy.
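Illustrative sketch (an assumption: the abstract does not name the rate expression): in hopping models of charge transport, the site-to-site rate is often estimated with the semiclassical Marcus formula, which is valid when the electronic coupling is much smaller than the reorganisation energy. The function and parameter names below are hypothetical and are not MOMAP's API.

```python
import math

HBAR_EV_S = 6.582119569e-16   # reduced Planck constant in eV*s
KB_EV_K = 8.617333262e-5      # Boltzmann constant in eV/K

def marcus_rate(coupling_ev, reorg_ev, temperature_k=300.0, delta_g_ev=0.0):
    """Semiclassical Marcus hopping rate in 1/s.

    coupling_ev: inter-molecular electronic coupling V (eV)
    reorg_ev:    reorganisation energy lambda (eV)
    delta_g_ev:  free-energy change of the hop (0 for identical sites)
    """
    kt = KB_EV_K * temperature_k
    prefactor = (2.0 * math.pi / HBAR_EV_S) * coupling_ev ** 2
    prefactor /= math.sqrt(4.0 * math.pi * reorg_ev * kt)
    activation = (delta_g_ev + reorg_ev) ** 2 / (4.0 * reorg_ev * kt)
    return prefactor * math.exp(-activation)

# Hypothetical values: V = 10 meV, lambda = 200 meV, T = 300 K.
print(marcus_rate(0.01, 0.2))
```

Rates of this kind would typically serve as the hop probabilities in a kinetic Monte Carlo walk over the molecular sites.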
Aben, Nanne; Vis, Daniel J; Michaut, Magali; Wessels, Lodewyk F A
2016-09-01
Clinical response to anti-cancer drugs varies between patients. A large portion of this variation can be explained by differences in molecular features, such as mutation status, copy number alterations, methylation and gene expression profiles. We show that the classic approach for combining these molecular features (Elastic Net regression on all molecular features simultaneously) results in models that are almost exclusively based on gene expression. The gene expression features selected by the classic approach are difficult to interpret as they often represent poorly studied combinations of genes, activated by aberrations in upstream signaling pathways. To utilize all data types in a more balanced way, we developed TANDEM, a two-stage approach in which the first stage explains response using upstream features (mutations, copy number, methylation and cancer type) and the second stage explains the remainder using downstream features (gene expression). Applying TANDEM to 934 cell lines profiled across 265 drugs (GDSC1000), we show that the resulting models are more interpretable, while retaining the same predictive performance as the classic approach. Using the more balanced contributions per data type as determined with TANDEM, we find that response to MAPK pathway inhibitors is largely predicted by mutation data, while predicting response to DNA damaging agents requires gene expression data, in particular SLFN11 expression. TANDEM is available as an R package on CRAN (for more information, see http://ccb.nki.nl/software/tandem). Contact: m.michaut@nki.nl or l.wessels@nki.nl. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
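Illustrative sketch (not the TANDEM R package): the two-stage idea described above can be shown with one upstream and one downstream feature. Stage 1 fits the response on the upstream feature, and stage 2 fits what is left over (the residual) on the downstream feature. Ordinary least squares on a single feature is used here as a simple stand-in for whatever regression each stage uses in the actual method, and all names and data are hypothetical.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for a single feature."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx

def two_stage_fit(upstream, downstream, response):
    """Stage 1: response ~ upstream. Stage 2: residual ~ downstream."""
    s1, b1 = fit_line(upstream, response)
    residual = [r - (s1 * u + b1) for u, r in zip(upstream, response)]
    s2, b2 = fit_line(downstream, residual)
    return (s1, b1), (s2, b2)

def two_stage_predict(model, u, d):
    """Sum of the stage-1 prediction and the stage-2 residual prediction."""
    (s1, b1), (s2, b2) = model
    return (s1 * u + b1) + (s2 * d + b2)

# Toy data: response = 2 * mutation + 3 * expression (hypothetical).
mutation = [0, 1, 0, 1]      # upstream feature
expression = [0, 0, 1, 1]    # downstream feature
response = [0, 2, 3, 5]
model = two_stage_fit(mutation, expression, response)
```

Because stage 2 only sees the residual, any variation that the upstream feature can explain is credited to it first, which is the source of the more balanced contributions per data type described above.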
Prediction of lysine glutarylation sites by maximum relevance minimum redundancy feature selection.
Ju, Zhe; He, Jian-Jun
2018-06-01
Lysine glutarylation is a new type of protein acylation modification in both prokaryotes and eukaryotes. To better understand the molecular mechanism of glutarylation, it is important to identify glutarylated substrates and their corresponding glutarylation sites accurately. In this study, a novel bioinformatics tool named GlutPred is developed to predict glutarylation sites by using multiple feature extraction and maximum relevance minimum redundancy feature selection. On the one hand, amino acid factors, binary encoding, and the composition of k-spaced amino acid pairs are incorporated as features to encode glutarylation sites, and the maximum relevance minimum redundancy method and the incremental feature selection algorithm are adopted to remove redundant features. On the other hand, a biased support vector machine algorithm is used to handle the class imbalance in the glutarylation site training dataset. As illustrated by 10-fold cross-validation, GlutPred achieves a satisfactory performance with a Sensitivity of 64.80%, a Specificity of 76.60%, an Accuracy of 74.90% and a Matthews correlation coefficient of 0.3194. Feature analysis shows that some k-spaced amino acid pair features play the most important roles in the prediction of glutarylation sites. The conclusions derived from this study might provide some clues for understanding the molecular mechanisms of glutarylation. Copyright © 2018 Elsevier Inc. All rights reserved.
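Illustrative sketch (not from the paper): the four performance measures quoted above follow from the confusion-matrix counts as shown here. The function name and the example counts are hypothetical and are not the study's data.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, accuracy and Matthews correlation
    coefficient (MCC) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }

# Hypothetical imbalanced example: 100 positive sites, 100 negative sites.
print(binary_metrics(tp=50, fp=10, tn=90, fn=50))
```

On imbalanced data, accuracy alone can look good while sensitivity is poor, which is why MCC is usually reported alongside it.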
Breast cancer molecular subtype classification using deep features: preliminary results
NASA Astrophysics Data System (ADS)
Zhu, Zhe; Albadawy, Ehab; Saha, Ashirbani; Zhang, Jun; Harowicz, Michael R.; Mazurowski, Maciej A.
2018-02-01
Radiogenomics is a field of investigation that attempts to examine the relationship between imaging characteristics of cancerous lesions and their genomic composition. This could offer a noninvasive alternative to establishing genomic characteristics of tumors and aid cancer treatment planning. While deep learning has shown its superiority in many detection and classification tasks, breast cancer radiogenomic data suffers from a very limited number of training examples, which renders the training of the neural network for this problem directly and with no pretraining a very difficult task. In this study, we investigated an alternative deep learning approach referred to as deep features or off-the-shelf network approach to classify breast cancer molecular subtypes using breast dynamic contrast enhanced MRIs. We used the feature maps of different convolution layers and fully connected layers as features and trained support vector machines using these features for prediction. For the feature maps that have multiple layers, max-pooling was performed along each channel. We focused on distinguishing the Luminal A subtype from other subtypes. To evaluate the models, 10-fold cross-validation was performed and the final AUC was obtained by averaging the performance of all the folds. The highest average AUC obtained was 0.64 (0.95 CI: 0.57-0.71), using the feature maps of the last fully connected layer. This indicates the promise of using this approach to predict the breast cancer molecular subtypes. Since the best performance appears in the last fully connected layer, it also implies that breast cancer molecular subtypes may relate to high level image features.
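Illustrative sketch (an interpretation, not the authors' code): "max-pooling along each channel" is read here as reducing every 2-D feature map of a convolution layer to its single largest activation, which gives one feature per channel for the downstream classifier. The second helper averages per-fold AUC values, as in the evaluation described above. All names are hypothetical.

```python
def global_max_pool(feature_maps):
    """Reduce each channel (a 2-D list of activations) to its maximum.

    feature_maps: list of channels, each a list of rows of numbers.
    Returns a flat feature vector with one value per channel.
    """
    return [max(max(row) for row in channel) for channel in feature_maps]

def mean_fold_auc(fold_aucs):
    """Average the AUC values obtained on the individual folds."""
    return sum(fold_aucs) / len(fold_aucs)

# Two hypothetical 2x2 channels from one convolution layer.
layer_output = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[-1.0, -5.0], [0.0, -2.0]],
]
print(global_max_pool(layer_output))
```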
Macyszyn, Luke; Akbari, Hamed; Pisapia, Jared M.; Da, Xiao; Attiah, Mark; Pigrish, Vadim; Bi, Yingtao; Pal, Sharmistha; Davuluri, Ramana V.; Roccograndi, Laura; Dahmane, Nadia; Martinez-Lage, Maria; Biros, George; Wolf, Ronald L.; Bilello, Michel; O'Rourke, Donald M.; Davatzikos, Christos
2016-01-01
Background: MRI characteristics of brain gliomas have been used to predict clinical outcome and molecular tumor characteristics. However, previously reported imaging biomarkers have not been sufficiently accurate or reproducible to enter routine clinical practice and often rely on relatively simple MRI measures. The current study leverages advanced image analysis and machine learning algorithms to identify complex and reproducible imaging patterns predictive of overall survival and molecular subtype in glioblastoma (GB). Methods: One hundred five patients with GB were first used to extract approximately 60 diverse features from preoperative multiparametric MRIs. These imaging features were used by a machine learning algorithm to derive imaging predictors of patient survival and molecular subtype. Cross-validation ensured generalizability of these predictors to new patients. Subsequently, the predictors were evaluated in a prospective cohort of 29 new patients. Results: Survival curves yielded a hazard ratio of 10.64 for predicted long versus short survivors. The overall, 3-way (long/medium/short survival) accuracy in the prospective cohort approached 80%. Classification of patients into the 4 molecular subtypes of GB achieved 76% accuracy. Conclusions: By employing machine learning techniques, we were able to demonstrate that imaging patterns are highly predictive of patient survival. Additionally, we found that GB subtypes have distinctive imaging phenotypes. These results reveal that when imaging markers related to infiltration, cell density, microvascularity, and blood–brain barrier compromise are integrated via advanced pattern analysis methods, they form very accurate predictive biomarkers. These predictive markers used solely preoperative images, hence they can significantly augment diagnosis and treatment of GB patients. PMID:26188015
Davie, Stuart J; Di Pasquale, Nicodemo; Popelier, Paul L A
2016-10-15
Machine learning algorithms have been demonstrated to predict atomistic properties approaching the accuracy of quantum chemical calculations at significantly less computational cost. Difficulties arise, however, when attempting to apply these techniques to large systems, or systems possessing excessive conformational freedom. In this article, the machine learning method kriging is applied to predict both the intra-atomic and interatomic energies, as well as the electrostatic multipole moments, of the atoms of a water molecule at the center of a 10 water molecule (decamer) cluster. Unlike previous work, where the properties of small water clusters were predicted using a molecular local frame, and where training set inputs (features) were based on atomic index, a variety of feature definitions and coordinate frames are considered here to increase prediction accuracy. It is shown that, for a water molecule at the center of a decamer, no single method of defining features or coordinate schemes is optimal for every property. However, explicitly accounting for the structure of the first solvation shell in the definition of the features of the kriging training set, and centring the coordinate frame on the atom-of-interest will, in general, return better predictions than models that apply the standard methods of feature definition, or a molecular coordinate frame. © 2016 The Authors. Journal of Computational Chemistry Published by Wiley Periodicals, Inc.
Prediction of interface residue based on the features of residue interaction network.
Jiao, Xiong; Ranganathan, Shoba
2017-11-07
Protein-protein interaction plays a crucial role in cellular biological processes. Interface prediction can improve our understanding of the molecular mechanisms of the related processes and functions. In this work, we propose a classification method to recognize interface residues based on the features of a weighted residue interaction network. The random forest algorithm is used for the prediction, and 16 network parameters and the B-factor serve as the elements of the input feature vector. Compared with other similar work, the method is feasible and effective. The relative importance of these features is also analyzed to identify the key features for the prediction, and the biological meaning of the important features is explained. The results of this work can be used in related work on structure-function relationship analysis via a residue interaction network model. Copyright © 2017 Elsevier Ltd. All rights reserved.
Dayde, Delphine; Tanaka, Ichidai; Jain, Rekha; Tai, Mei Chee; Taguchi, Ayumu
2017-03-07
The standard of care in locally advanced rectal cancer is neoadjuvant chemoradiation (nCRT) followed by radical surgery. Response to nCRT varies among patients, and pathological complete response is associated with better outcome. However, there is a lack of effective methods to select rectal cancer patients who would or would not benefit from nCRT. The utility of clinicopathological and radiological features is limited by a lack of adequate sensitivity and specificity. Molecular biomarkers have the potential to predict response to nCRT at an early time point, but none have currently reached the clinic. Integration of diverse types of biomarkers, including clinicopathological and imaging features, identification of mechanistic links to tumor biology, and rigorous validation using samples that represent disease heterogeneity will allow the development of a sensitive and cost-effective molecular biomarker panel for precision medicine in rectal cancer. Here, we aim to review recent advances in tissue- and blood-based molecular biomarker research and illustrate their potential in predicting nCRT response in rectal cancer.
Wan, Cen; Lees, Jonathan G; Minneci, Federico; Orengo, Christine A; Jones, David T
2017-10-01
Accurate gene or protein function prediction is a key challenge in the post-genome era. Most current methods perform well on molecular function prediction, but struggle to provide useful annotations relating to biological process functions due to the limited power of sequence-based features in that functional domain. In this work, we systematically evaluate the predictive power of temporal transcription expression profiles for protein function prediction in Drosophila melanogaster. Our results show significantly better performance on predicting protein function when transcription expression profile-based features are integrated with sequence-derived features, compared with the sequence-derived features alone. We also observe that the combination of expression-based and sequence-based features leads to further improvement of accuracy on predicting all three domains of gene function. Based on the optimal feature combinations, we then propose a novel multi-classifier-based function prediction method for Drosophila melanogaster proteins, FFPred-fly+. Interpreting our machine learning models also allows us to identify some of the underlying links between biological processes and developmental stages of Drosophila melanogaster.
Prediction and Dissection of Protein-RNA Interactions by Molecular Descriptors.
Liu, Zhi-Ping; Chen, Luonan
2016-01-01
Protein-RNA interactions play crucial roles in numerous biological processes. However, detecting the interactions and binding sites between protein and RNA by traditional experiments is still time-consuming and labor-intensive. Thus, it is important to develop bioinformatics methods for predicting protein-RNA interactions and binding sites. Accurate prediction of protein-RNA interactions and recognition will greatly help to decipher the interaction mechanisms between protein and RNA, as well as to improve RNA-related protein engineering and drug design. In this work, we summarize the current bioinformatics strategies for predicting protein-RNA interactions and dissecting protein-RNA interaction mechanisms from local structure binding motifs. In particular, we focus on feature-based machine learning methods, in which the molecular descriptors of protein and RNA are extracted and integrated as feature vectors representing the interaction events and recognition residues. In addition, the available methods are classified and compared comprehensively. The molecular descriptors are expected to elucidate the binding mechanisms of protein-RNA interaction and reveal the functional implications from a structural complementarity perspective.
Macyszyn, Luke; Akbari, Hamed; Pisapia, Jared M; Da, Xiao; Attiah, Mark; Pigrish, Vadim; Bi, Yingtao; Pal, Sharmistha; Davuluri, Ramana V; Roccograndi, Laura; Dahmane, Nadia; Martinez-Lage, Maria; Biros, George; Wolf, Ronald L; Bilello, Michel; O'Rourke, Donald M; Davatzikos, Christos
2016-03-01
MRI characteristics of brain gliomas have been used to predict clinical outcome and molecular tumor characteristics. However, previously reported imaging biomarkers have not been sufficiently accurate or reproducible to enter routine clinical practice and often rely on relatively simple MRI measures. The current study leverages advanced image analysis and machine learning algorithms to identify complex and reproducible imaging patterns predictive of overall survival and molecular subtype in glioblastoma (GB). One hundred five patients with GB were first used to extract approximately 60 diverse features from preoperative multiparametric MRIs. These imaging features were used by a machine learning algorithm to derive imaging predictors of patient survival and molecular subtype. Cross-validation ensured generalizability of these predictors to new patients. Subsequently, the predictors were evaluated in a prospective cohort of 29 new patients. Survival curves yielded a hazard ratio of 10.64 for predicted long versus short survivors. The overall, 3-way (long/medium/short survival) accuracy in the prospective cohort approached 80%. Classification of patients into the 4 molecular subtypes of GB achieved 76% accuracy. By employing machine learning techniques, we were able to demonstrate that imaging patterns are highly predictive of patient survival. Additionally, we found that GB subtypes have distinctive imaging phenotypes. These results reveal that when imaging markers related to infiltration, cell density, microvascularity, and blood-brain barrier compromise are integrated via advanced pattern analysis methods, they form very accurate predictive biomarkers. These predictive markers used solely preoperative images, hence they can significantly augment diagnosis and treatment of GB patients. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Neuro-Oncology. All rights reserved. 
Sharma, Ashok K; Srivastava, Gopal N; Roy, Ankita; Sharma, Vineet K
2017-01-01
The experimental methods for the prediction of molecular toxicity are tedious and time-consuming tasks. Thus, the computational approaches could be used to develop alternative methods for toxicity prediction. We have developed a tool for the prediction of molecular toxicity along with the aqueous solubility and permeability of any molecule/metabolite. Using a comprehensive and curated set of toxin molecules as a training set, the different chemical and structural based features such as descriptors and fingerprints were exploited for feature selection, optimization and development of machine learning based classification and regression models. The compositional differences in the distribution of atoms were apparent between toxins and non-toxins, and hence, the molecular features were used for the classification and regression. On 10-fold cross-validation, the descriptor-based, fingerprint-based and hybrid-based classification models showed similar accuracy (93%) and Matthews's correlation coefficient (0.84). The performances of all the three models were comparable (Matthews's correlation coefficient = 0.84-0.87) on the blind dataset. In addition, the regression-based models using descriptors as input features were also compared and evaluated on the blind dataset. Random forest based regression model for the prediction of solubility performed better (R² = 0.84) than the multi-linear regression (MLR) and partial least square regression (PLSR) models, whereas, the partial least squares based regression model for the prediction of permeability (caco-2) performed better (R² = 0.68) in comparison to the random forest and MLR based regression models. The performance of final classification and regression models was evaluated using the two validation datasets including the known toxins and commonly used constituents of health products, which attests to its accuracy.
The ToxiM web server would be a highly useful and reliable tool for the prediction of toxicity, solubility, and permeability of small molecules.
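The accuracy and Matthews correlation coefficient quoted above are both computed from the four cells of a binary confusion matrix. The following is a minimal sketch of those two formulas only; the function names and the example counts are illustrative assumptions and are not taken from the ToxiM study.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Hypothetical counts for 100 molecules (not data from the paper).
print(accuracy(45, 48, 2, 5))
print(matthews_corrcoef(45, 48, 2, 5))
```

Unlike accuracy, the Matthews coefficient uses all four cells symmetrically, which is why it is often reported alongside accuracy when class sizes differ.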
Sharma, Ashok K.; Srivastava, Gopal N.; Roy, Ankita; Sharma, Vineet K.
2017-01-01
The experimental methods for the prediction of molecular toxicity are tedious and time-consuming tasks. Thus, the computational approaches could be used to develop alternative methods for toxicity prediction. We have developed a tool for the prediction of molecular toxicity along with the aqueous solubility and permeability of any molecule/metabolite. Using a comprehensive and curated set of toxin molecules as a training set, the different chemical and structural based features such as descriptors and fingerprints were exploited for feature selection, optimization and development of machine learning based classification and regression models. The compositional differences in the distribution of atoms were apparent between toxins and non-toxins, and hence, the molecular features were used for the classification and regression. On 10-fold cross-validation, the descriptor-based, fingerprint-based and hybrid-based classification models showed similar accuracy (93%) and Matthews's correlation coefficient (0.84). The performances of all the three models were comparable (Matthews's correlation coefficient = 0.84–0.87) on the blind dataset. In addition, the regression-based models using descriptors as input features were also compared and evaluated on the blind dataset. Random forest based regression model for the prediction of solubility performed better (R² = 0.84) than the multi-linear regression (MLR) and partial least square regression (PLSR) models, whereas, the partial least squares based regression model for the prediction of permeability (caco-2) performed better (R² = 0.68) in comparison to the random forest and MLR based regression models. The performance of final classification and regression models was evaluated using the two validation datasets including the known toxins and commonly used constituents of health products, which attests to its accuracy.
The ToxiM web server would be a highly useful and reliable tool for the prediction of toxicity, solubility, and permeability of small molecules. PMID:29249969
Knowledge-based fragment binding prediction.
Tang, Grace W; Altman, Russ B
2014-04-01
Target-based drug discovery must assess many drug-like compounds for potential activity. Focusing on low-molecular-weight compounds (fragments) can dramatically reduce the chemical search space. However, approaches for determining protein-fragment interactions have limitations. Experimental assays are time-consuming, expensive, and not always applicable. At the same time, computational approaches using physics-based methods have limited accuracy. With increasing high-resolution structural data for protein-ligand complexes, there is now an opportunity for data-driven approaches to fragment binding prediction. We present FragFEATURE, a machine learning approach to predict small molecule fragments preferred by a target protein structure. We first create a knowledge base of protein structural environments annotated with the small molecule substructures they bind. These substructures have low-molecular weight and serve as a proxy for fragments. FragFEATURE then compares the structural environments within a target protein to those in the knowledge base to retrieve statistically preferred fragments. It merges information across diverse ligands with shared substructures to generate predictions. Our results demonstrate FragFEATURE's ability to rediscover fragments corresponding to the ligand bound with 74% precision and 82% recall on average. For many protein targets, it identifies high scoring fragments that are substructures of known inhibitors. FragFEATURE thus predicts fragments that can serve as inputs to fragment-based drug design or serve as refinement criteria for creating target-specific compound libraries for experimental or computational screening.
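Precision and recall for fragment rediscovery can be read as set overlap between predicted fragments and the fragments of the bound ligand. The sketch below shows only that bookkeeping; the function name and the fragment labels are hypothetical, and it does not reproduce FragFEATURE's scoring.

```python
def precision_recall(predicted, actual):
    """Precision and recall of a predicted set against a reference set."""
    predicted, actual = set(predicted), set(actual)
    hits = len(predicted & actual)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical fragment labels, for illustration only.
predicted_fragments = ["benzene", "amide", "pyridine", "carboxylate"]
ligand_fragments = ["benzene", "amide", "pyridine", "sulfonamide", "imidazole"]
print(precision_recall(predicted_fragments, ligand_fragments))
```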
Knowledge-based Fragment Binding Prediction
Tang, Grace W.; Altman, Russ B.
2014-01-01
Target-based drug discovery must assess many drug-like compounds for potential activity. Focusing on low-molecular-weight compounds (fragments) can dramatically reduce the chemical search space. However, approaches for determining protein-fragment interactions have limitations. Experimental assays are time-consuming, expensive, and not always applicable. At the same time, computational approaches using physics-based methods have limited accuracy. With increasing high-resolution structural data for protein-ligand complexes, there is now an opportunity for data-driven approaches to fragment binding prediction. We present FragFEATURE, a machine learning approach to predict small molecule fragments preferred by a target protein structure. We first create a knowledge base of protein structural environments annotated with the small molecule substructures they bind. These substructures have low-molecular weight and serve as a proxy for fragments. FragFEATURE then compares the structural environments within a target protein to those in the knowledge base to retrieve statistically preferred fragments. It merges information across diverse ligands with shared substructures to generate predictions. Our results demonstrate FragFEATURE's ability to rediscover fragments corresponding to the ligand bound with 74% precision and 82% recall on average. For many protein targets, it identifies high scoring fragments that are substructures of known inhibitors. FragFEATURE thus predicts fragments that can serve as inputs to fragment-based drug design or serve as refinement criteria for creating target-specific compound libraries for experimental or computational screening. PMID:24762971
Structural features that predict real-value fluctuations of globular proteins.
Jamroz, Michal; Kolinski, Andrzej; Kihara, Daisuke
2012-05-01
It is crucial to consider dynamics for understanding the biological function of proteins. We used a large number of molecular dynamics (MD) trajectories of nonhomologous proteins as references and examined static structural features of proteins that are most relevant to fluctuations. We examined correlation of individual structural features with fluctuations and further investigated effective combinations of features for predicting the real value of residue fluctuations using support vector regression (SVR). It was found that some structural features have higher correlation than crystallographic B-factors with fluctuations observed in MD trajectories. Moreover, SVR that uses combinations of static structural features showed accurate prediction of fluctuations with an average Pearson's correlation coefficient of 0.669 and a root mean square error of 1.04 Å. This correlation coefficient is higher than the one observed in predictions by the Gaussian network model (GNM). An advantage of the developed method over the GNMs is that the former predicts the real value of fluctuation. The results help improve our understanding of relationships between protein structure and fluctuation. Furthermore, the developed method provides a convenient, practical way to predict fluctuations of proteins using easily computed static structural features of proteins. Copyright © 2012 Wiley Periodicals, Inc.
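The two evaluation measures reported above, Pearson's correlation coefficient and root mean square error, can be sketched as follows. This is a plain illustration of the standard formulas, assuming per-residue predicted and reference fluctuations are given as equal-length lists; the function names and numbers are hypothetical, not values from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rmse(predicted, observed):
    """Root mean square error between predictions and observations."""
    n = len(predicted)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


# Hypothetical per-residue fluctuations in angstroms.
predicted = [0.8, 1.1, 2.0, 3.5]
observed = [0.9, 1.0, 2.4, 3.0]
print(pearson(predicted, observed), rmse(predicted, observed))
```

The correlation measures whether the ranking and relative size of fluctuations is captured, while the RMSE measures the error in absolute value, which is the quantity a real-value predictor is judged on.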
Gangeh, Mehrdad; Tadayyon, Hadi; Sadeghi-Naini, Ali; Gandhi, Sonal; Wright, Frances C.; Slodkowska, Elzbieta; Curpen, Belinda; Tran, William; Czarnota, Gregory J.
2018-01-01
Background: Pathological response of breast cancer to chemotherapy is a prognostic indicator for long-term disease-free and overall survival. Responses of locally advanced breast cancer in the neoadjuvant chemotherapy (NAC) setting are often variable, and the prediction of response is imperfect. The purpose of this study was to detect primary tumor responses early after the start of neoadjuvant chemotherapy using quantitative ultrasound (QUS), textural analysis and molecular features in patients with locally advanced breast cancer. Methods: The study included ninety-six patients treated with neoadjuvant chemotherapy. Breast tumors were scanned with a clinical ultrasound system prior to chemotherapy treatment, during the first, fourth and eighth week of treatment, and prior to surgery. Quantitative ultrasound parameters and scatterer-based features were calculated from ultrasound radio frequency (RF) data within tumor regions of interest. Additionally, texture features were extracted from QUS parametric maps. Prior to therapy, all patients underwent a core needle biopsy, and histological subtypes and biomarker ER, PR, and HER2 status were determined. Patients were classified into three treatment response groups based on a combination of clinical and pathological analyses: complete responders (CR), partial responders (PR), and non-responders (NR). Response classifications from QUS parameters, receptor status and pathology were compared. Discriminant analysis was performed on extracted parameters using a support vector machine classifier to categorize subjects into CR, PR, and NR groups at all scan times. Results: Of the 96 patients, the numbers of CR, PR and NR patients were 21, 52, and 23, respectively. The best prediction of treatment response was achieved with the combination of mean QUS values, texture and molecular features, with accuracies of 78%, 86% and 83% at weeks 1, 4, and 8 of treatment, respectively.
Mean QUS parameters or clinical receptor status alone predicted the three response groups with accuracies of less than 60% at all scan time points. Recurrence-free survival (RFS) of response groups determined from the combined features followed a trend similar to that of groups determined from clinical and pathological assessment. Conclusions: This work demonstrates the potential of using QUS, texture and molecular features for predicting the response of primary breast tumors to chemotherapy early, and for guiding the treatment planning of refractory patients. PMID:29298305
Breaking the polar-nonpolar division in solvation free energy prediction.
Wang, Bao; Wang, Chengzhang; Wu, Kedi; Wei, Guo-Wei
2018-02-05
Implicit solvent models divide solvation free energies into polar and nonpolar additive contributions, whereas polar and nonpolar interactions are inseparable and nonadditive. We present a feature functional theory (FFT) framework to break this ad hoc division. The essential ideas of FFT are as follows: (i) representability assumption: there exists a microscopic feature vector that can uniquely characterize and distinguish one molecule from another; (ii) feature-function relationship assumption: the macroscopic features of a molecule, including solvation free energy, are functionals of microscopic feature vectors; and (iii) similarity assumption: molecules with similar microscopic features have similar macroscopic properties, such as solvation free energies. Based on these assumptions, solvation free energy prediction is carried out in the following protocol. First, we construct a molecular microscopic feature vector that is efficient in characterizing the solvation process using quantum mechanics and Poisson-Boltzmann theory. Microscopic feature vectors are combined with macroscopic features, that is, physical observables, to form extended feature vectors. Additionally, we partition a solvation dataset into queries according to molecular compositions. Moreover, for each target molecule, we adopt a machine learning algorithm for its nearest neighbor search, based on the selected microscopic feature vectors. Finally, from the extended feature vectors of obtained nearest neighbors, we construct a functional of solvation free energy, which is employed to predict the solvation free energy of the target molecule. The proposed FFT model has been extensively validated via a large dataset of 668 molecules. The leave-one-out test gives an optimal root-mean-square error (RMSE) of 1.05 kcal/mol. FFT predictions of SAMPL0, SAMPL1, SAMPL2, SAMPL3, and SAMPL4 challenge sets deliver the RMSEs of 0.61, 1.86, 1.64, 0.86, and 1.14 kcal/mol, respectively.
Using a test set of 94 molecules and its associated training set, the present approach was carefully compared with a classic solvation model based on weighted solvent accessible surface area. © 2017 Wiley Periodicals, Inc.
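The similarity assumption and the leave-one-out test described above can be illustrated with a deliberately simple stand-in: predict a molecule's value as the average over its k nearest neighbors in feature space, then score every molecule with itself held out. This is a sketch under stated assumptions, not the FFT functional; the neighbor average replaces the learned functional, and the function names, feature vectors and values are hypothetical.

```python
import math


def knn_predict(query, features, targets, k=3):
    """Average the targets of the k feature vectors closest to the query."""
    ranked = sorted((math.dist(query, f), t) for f, t in zip(features, targets))
    nearest = ranked[:k]
    return sum(t for _, t in nearest) / len(nearest)


def leave_one_out_rmse(features, targets, k=3):
    """RMSE when each item is predicted from all the other items."""
    squared_errors = []
    for i in range(len(features)):
        rest_features = features[:i] + features[i + 1:]
        rest_targets = targets[:i] + targets[i + 1:]
        prediction = knn_predict(features[i], rest_features, rest_targets, k)
        squared_errors.append((prediction - targets[i]) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical one-dimensional features and target values.
features = [(0.0,), (1.0,), (2.0,), (10.0,)]
targets = [0.0, 1.0, 2.0, 10.0]
print(knn_predict((1.1,), features, targets, k=2))
print(leave_one_out_rmse(features, targets, k=1))
```

The isolated point at 10.0 dominates the leave-one-out error here, which shows why such a scheme depends on every target molecule having close neighbors in the training set.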
Andrabi, Munazah; Hutchins, Andrew Paul; Miranda-Saavedra, Diego; Kono, Hidetoshi; Nussinov, Ruth; Mizuguchi, Kenji; Ahmad, Shandar
2017-06-22
DNA shape is emerging as an important determinant of transcription factor binding beyond just the DNA sequence. The only tool for large-scale DNA shape estimates, DNAshape, was derived from Monte-Carlo simulations and predicts four broad and static DNA shape features: Propeller twist, Helical twist, Minor groove width and Roll. The contributions of other shape features, e.g. Shift, Slide and Opening, cannot be evaluated using DNAshape. Here, we report a novel method, DynaSeq, which predicts molecular dynamics-derived ensembles of a more exhaustive set of DNA shape features. We compared the DNAshape and DynaSeq predictions for the common features and applied both to predict the genome-wide binding sites of 1312 TFs available from protein interaction quantification (PIQ) data. The results indicate a good agreement between the two methods for the common shape features and point to advantages in using DynaSeq. Predictive models employing ensembles from individual conformational parameters revealed that base-pair opening, known to be important in strand separation, was the best predictor of transcription factor-binding sites (TFBS), followed by features employed by DNAshape. Of note, TFBS could be predicted not only from the features at the target motif sites, but also from those as far as 200 nucleotides away from the motif.
Evaluating a variety of text-mined features for automatic protein function prediction with GOstruct.
Funk, Christopher S; Kahanda, Indika; Ben-Hur, Asa; Verspoor, Karin M
2015-01-01
Most computational methods that predict protein function do not take advantage of the large amount of information contained in the biomedical literature. In this work we evaluate both ontology term co-mention and bag-of-words features mined from the biomedical literature and analyze their impact in the context of a structured output support vector machine model, GOstruct. We find that even simple literature-based features are useful for predicting human protein function (F-max: Molecular Function = 0.408, Biological Process = 0.461, Cellular Component = 0.608). One advantage of using literature features is their ability to offer easy verification of automated predictions. We find through manual inspection of misclassifications that some false positive predictions could be biologically valid predictions based upon support extracted from the literature. Additionally, we present a "medium-throughput" pipeline that was used to annotate a large subset of co-mentions; we suggest that this strategy could help to speed up the rate at which proteins are curated.
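F-max is the highest F-measure obtained as the decision threshold on prediction scores is swept. The sketch below is a simplified, pooled version, assuming one flat list of (score, label) pairs; function-prediction benchmarks usually average precision and recall per protein before combining them, which this sketch does not do. The function name and data are hypothetical.

```python
def f_max(scores, labels, thresholds=None):
    """Maximum F1 over a sweep of score thresholds (pooled counts)."""
    if thresholds is None:
        thresholds = [i / 100 for i in range(1, 100)]
    best = 0.0
    for t in thresholds:
        tp = sum(1 for s, l in zip(scores, labels) if s >= t and l)
        fp = sum(1 for s, l in zip(scores, labels) if s >= t and not l)
        fn = sum(1 for s, l in zip(scores, labels) if s < t and l)
        if tp == 0:
            continue  # F1 is zero or undefined without true positives
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        best = max(best, 2 * precision * recall / (precision + recall))
    return best


# Hypothetical prediction scores and true annotations (1 = annotated).
print(f_max([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```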
Molecular Docking for Prediction and Interpretation of Adverse Drug Reactions.
Luo, Heng; Fokoue-Nkoutche, Achille; Singh, Nalini; Yang, Lun; Hu, Jianying; Zhang, Ping
2018-05-23
Adverse drug reactions (ADRs) present a major burden for patients and the healthcare industry. Various computational methods have been developed to predict ADRs for drug molecules. However, many of these methods require experimental or surveillance data and cannot be used when only structural information is available. We collected 1,231 small molecule drugs and 600 human proteins and utilized molecular docking to generate binding features among them. We developed machine learning models that use these docking features to make predictions for 1,533 ADRs. These models obtain an overall area under the receiver operating characteristic curve (AUROC) of 0.843 and an overall area under the precision-recall curve (AUPR) of 0.395, outperforming seven structural fingerprint-based prediction models. Using the method, we predicted skin striae for fluticasone propionate, dermatitis acneiform for mometasone, and decreased libido for irinotecan, as demonstrations. Furthermore, we analyzed the top binding proteins associated with some of the ADRs, which can help to understand and/or generate hypotheses for underlying mechanisms of ADRs. Copyright © Bentham Science Publishers.
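AUROC has a direct probabilistic reading: it is the chance that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it by brute-force pair comparison, which is adequate for small illustrations; the function name and the data are hypothetical, not results from the docking study.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores."""
    positives = [s for s, l in zip(scores, labels) if l]
    negatives = [s for s, l in zip(scores, labels) if not l]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical drug-ADR scores and known associations (1 = known ADR).
print(auroc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

A value of 0.5 corresponds to random ranking. AUPR, by contrast, depends strongly on how rare the positives are, which is one reason the two figures quoted above can differ so much.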
Trabanino, Rene J; Vaidehi, Nagarajan; Hall, Spencer E; Goddard, William A; Floriano, Wely
2013-02-05
The invention provides computer-implemented methods and apparatus implementing a hierarchical protocol using multiscale molecular dynamics and molecular modeling methods to predict the presence of transmembrane regions in proteins, such as G-Protein Coupled Receptors (GPCR), and protein structural models generated according to the protocol. The protocol features a coarse grain sampling method, such as hydrophobicity analysis, to provide a fast and accurate procedure for predicting transmembrane regions. Methods and apparatus of the invention are useful to screen protein or polynucleotide databases for encoded proteins with transmembrane regions, such as GPCRs.
Freitas, Alex A; Limbu, Kriti; Ghafourian, Taravat
2015-01-01
Volume of distribution is an important pharmacokinetic property that indicates the extent of a drug's distribution in the body tissues. This paper addresses the problem of how to estimate the apparent volume of distribution at steady state (Vss) of chemical compounds in the human body using decision tree-based regression methods from the area of data mining (or machine learning). Hence, the pros and cons of several different types of decision tree-based regression methods have been discussed. The regression methods predict Vss using, as predictive features, both the compounds' molecular descriptors and the compounds' tissue:plasma partition coefficients (Kt:p), often used in physiologically-based pharmacokinetics. Therefore, this work has assessed whether the data mining-based prediction of Vss can be made more accurate by using as input not only the compounds' molecular descriptors but also (a subset of) their predicted Kt:p values. Comparison of the models that used only molecular descriptors, in particular, the Bagging decision tree (mean fold error of 2.33), with those employing predicted Kt:p values in addition to the molecular descriptors, such as the Bagging decision tree using adipose Kt:p (mean fold error of 2.29), indicated that the use of predicted Kt:p values as descriptors may be beneficial for accurate prediction of Vss using decision trees if prior feature selection is applied. Decision tree based models presented in this work have an accuracy that is reasonable and similar to the accuracy of reported Vss inter-species extrapolations in the literature. The estimation of Vss for new compounds in drug discovery will benefit from methods that are able to integrate large and varied sources of data and flexible non-linear data mining methods such as decision trees, which can produce interpretable models. Graphical abstract: Decision trees for the prediction of tissue partition coefficient and volume of distribution of drugs.
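Mean fold error summarizes how far predictions are from observations as a multiplicative factor, so a value of 2 means predictions are off by a factor of two on average. The abstract does not state the formula; the sketch below assumes the commonly used geometric form, 10 raised to the mean absolute log10 ratio of predicted to observed values. The function name and the numbers are hypothetical.

```python
import math


def mean_fold_error(predicted, observed):
    """Geometric mean fold error: 10 ** mean(|log10(predicted / observed)|).

    Assumed definition; over- and under-prediction by the same factor
    are penalized equally.
    """
    log_ratios = [abs(math.log10(p / o)) for p, o in zip(predicted, observed)]
    return 10 ** (sum(log_ratios) / len(log_ratios))


# Hypothetical Vss values in L/kg.
print(mean_fold_error([2.0, 0.5, 1.0], [1.0, 1.0, 1.0]))
```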
A feature-based approach to modeling protein-protein interaction hot spots.
Cho, Kyu-il; Kim, Dongsup; Lee, Doheon
2009-05-01
Identifying features that effectively represent the energetic contribution of an individual interface residue to the interactions between proteins remains problematic. Here, we present several new features and show that they are more effective than conventional features. By combining the proposed features with conventional features, we develop a predictive model for interaction hot spots. Initially, 54 multifaceted features, composed of different levels of information including structure, sequence and molecular interaction information, are quantified. Then, to identify the best subset of features for predicting hot spots, feature selection is performed using a decision tree. Based on the selected features, a predictive model for hot spots is created using support vector machine (SVM) and tested on an independent test set. Our model shows better overall predictive accuracy than previous methods such as the alanine scanning methods Robetta and FOLDEF, and the knowledge-based method KFC. Subsequent analysis yields several findings about hot spots. As expected, hot spots have a larger relative surface area burial and are more hydrophobic than other residues. Unexpectedly, however, residue conservation displays a rather complicated tendency depending on the types of protein complexes, indicating that this feature is not good for identifying hot spots. Of the selected features, the weighted atomic packing density, relative surface area burial and weighted hydrophobicity are the top 3, with the weighted atomic packing density proving to be the most effective feature for predicting hot spots. Notably, we find that hot spots are closely related to pi-related interactions, especially pi . . . pi interactions.
Molecular Heterogeneity in Glioblastoma: Potential Clinical Implications
Parker, Nicole Renee; Khong, Peter; Parkinson, Jonathon Fergus; Howell, Viive Maarika; Wheeler, Helen Ruth
2015-01-01
Glioblastomas (grade 4 astrocytomas) are aggressive primary brain tumors characterized by histopathological heterogeneity. High-resolution sequencing technologies have shown that these tumors also feature significant inter-tumoral molecular heterogeneity. Molecular subtyping of these tumors has revealed several predictive and prognostic biomarkers. However, intra-tumoral heterogeneity may undermine the use of single biopsy analysis for determining tumor genotype and has implications for potential targeted therapies. The clinical relevance and theories of tumoral molecular heterogeneity in glioblastoma are discussed. PMID:25785247
Molecular classification of breast cancer: what the pathologist needs to know.
Rakha, Emad A; Green, Andrew R
2017-02-01
Breast cancer is a heterogeneous disease featuring distinct histological, molecular and clinical phenotypes. Although traditional classification systems utilising clinicopathological and few molecular markers are well established and validated, they remain insufficient to reflect the diverse biological and clinical heterogeneity of breast cancer. Advancements in high-throughput molecular techniques and bioinformatics have contributed to the improved understanding of breast cancer biology, refinement of molecular taxonomies and the development of novel prognostic and predictive molecular assays. Application of such technologies is already underway, and is expected to change the way we manage breast cancer. Despite the enormous amount of work that has been carried out to develop and refine breast cancer molecular prognostic and predictive assays, molecular testing is still in evolution. Pathologists should be aware of the new technology and be ready for the challenge. In this review, we provide an update on the application of molecular techniques with regard to breast cancer diagnosis, prognosis and outcome prediction. The current contribution of emerging technology to our understanding of breast cancer is also highlighted. Copyright © 2016 Royal College of Pathologists of Australasia. Published by Elsevier B.V. All rights reserved.
Cross-Platform Toxicogenomics for the Prediction of Non-Genotoxic Hepatocarcinogenesis in Rat
Metzger, Ute; Templin, Markus F.; Plummer, Simon; Ellinger-Ziegelbauer, Heidrun; Zell, Andreas
2014-01-01
In the area of omics profiling in toxicology, i.e. toxicogenomics, characteristic molecular profiles have previously been incorporated into prediction models for early assessment of a carcinogenic potential and mechanism-based classification of compounds. Traditionally, the biomarker signatures used for model construction were derived from individual high-throughput techniques, such as microarrays designed for monitoring global mRNA expression. In this study, we built predictive models by integrating omics data across complementary microarray platforms and introduced new concepts for modeling of pathway alterations and molecular interactions between multiple biological layers. We trained and evaluated diverse machine learning-based models, differing in the incorporated features and learning algorithms on a cross-omics dataset encompassing mRNA, miRNA, and protein expression profiles obtained from rat liver samples treated with a heterogeneous set of substances. Most of these compounds could be unambiguously classified as genotoxic carcinogens, non-genotoxic carcinogens, or non-hepatocarcinogens based on evidence from published studies. Since mixed characteristics were reported for the compounds Cyproterone acetate, Thioacetamide, and Wy-14643, we reclassified these compounds as either genotoxic or non-genotoxic carcinogens based on their molecular profiles. Evaluating our toxicogenomics models in a repeated external cross-validation procedure, we demonstrated that the prediction accuracy of our models could be increased by joining the biomarker signatures across multiple biological layers and by adding complex features derived from cross-platform integration of the omics data. Furthermore, we found that adding these features resulted in a better separation of the compound classes and a more confident reclassification of the three undefined compounds as non-genotoxic carcinogens. PMID:24830643
An ensemble predictive modeling framework for breast cancer classification.
Nagarajan, Radhakrishnan; Upreti, Meenakshi
2017-12-01
Molecular changes often precede clinical presentation of diseases and can be useful surrogates with potential to assist in informed clinical decision making. Recent studies have demonstrated the usefulness of modeling approaches such as classification that can predict the clinical outcomes from molecular expression profiles. While useful, a majority of these approaches implicitly use all molecular markers as features in the classification process often resulting in sparse high-dimensional projection of the samples often comparable to that of the sample size. In this study, a variant of the recently proposed ensemble classification approach is used for predicting good and poor-prognosis breast cancer samples from their molecular expression profiles. In contrast to traditional single and ensemble classifiers, the proposed approach uses multiple base classifiers with varying feature sets obtained from two-dimensional projection of the samples in conjunction with a majority voting strategy for predicting the class labels. In contrast to our earlier implementation, base classifiers in the ensembles are chosen based on maximal sensitivity and minimal redundancy by choosing only those with low average cosine distance. The resulting ensemble sets are subsequently modeled as undirected graphs. Performance of four different classification algorithms is shown to be better within the proposed ensemble framework in contrast to using them as traditional single classifier systems. Significance of a subset of genes with high-degree centrality in the network abstractions across the poor-prognosis samples is also discussed. Copyright © 2017 Elsevier Inc. All rights reserved.
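Two ingredients named above, majority voting across base classifiers and cosine distance between classifiers' outputs, can be sketched in a few lines. This is an illustration of the generic operations only, assuming each base classifier has already produced one label per sample; how the study selects base classifiers from these distances is not reproduced, and all names and data are hypothetical.

```python
import math
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample."""
    return [Counter(column).most_common(1)[0][0] for column in zip(*predictions)]


def cosine_distance(u, v):
    """One minus the cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical outputs of three base classifiers on three samples.
votes = [
    ["good", "poor", "poor"],
    ["good", "good", "poor"],
    ["poor", "good", "poor"],
]
print(majority_vote(votes))
print(cosine_distance([1, 0], [0, 1]))
```

Using an odd number of base classifiers for a two-class problem avoids tied votes, which `Counter.most_common` would otherwise resolve by first occurrence.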
A feature-based approach to modeling protein–protein interaction hot spots
Cho, Kyu-il; Kim, Dongsup; Lee, Doheon
2009-01-01
Identifying features that effectively represent the energetic contribution of an individual interface residue to the interactions between proteins remains problematic. Here, we present several new features and show that they are more effective than conventional features. By combining the proposed features with conventional features, we develop a predictive model for interaction hot spots. Initially, 54 multifaceted features, composed of different levels of information including structure, sequence and molecular interaction information, are quantified. Then, to identify the best subset of features for predicting hot spots, feature selection is performed using a decision tree. Based on the selected features, a predictive model for hot spots is created using support vector machine (SVM) and tested on an independent test set. Our model shows better overall predictive accuracy than previous methods such as the alanine scanning methods Robetta and FOLDEF, and the knowledge-based method KFC. Subsequent analysis yields several findings about hot spots. As expected, hot spots have a larger relative surface area burial and are more hydrophobic than other residues. Unexpectedly, however, residue conservation displays a rather complicated tendency depending on the types of protein complexes, indicating that this feature is not good for identifying hot spots. Of the selected features, the weighted atomic packing density, relative surface area burial and weighted hydrophobicity are the top 3, with the weighted atomic packing density proving to be the most effective feature for predicting hot spots. Notably, we find that hot spots are closely related to π–related interactions, especially π · · · π interactions. PMID:19273533
Iwata, Hiroaki; Sawada, Ryusuke; Mizutani, Sayaka; Yamanishi, Yoshihiro
2015-02-23
Drug repositioning, or the application of known drugs to new indications, is a challenging issue in pharmaceutical science. In this study, we developed a new computational method to predict unknown drug indications for systematic drug repositioning in a framework of supervised network inference. We defined a descriptor for each drug-disease pair based on the phenotypic features of drugs (e.g., medicinal effects and side effects) and various molecular features of diseases (e.g., disease-causing genes, diagnostic markers, disease-related pathways, and environmental factors) and constructed a statistical model to predict new drug-disease associations for a wide range of diseases in the International Classification of Diseases. Our results show that the proposed method outperforms previous methods in terms of accuracy and applicability, and its performance does not depend on drug chemical structure similarity. Finally, we performed a comprehensive prediction of a drug-disease association network consisting of 2349 drugs and 858 diseases and described biologically meaningful examples of newly predicted drug indications for several types of cancers and nonhereditary diseases.
ROLE OF MOLECULAR MARKERS IN THYROID NODULE MANAGEMENT: THEN AND NOW.
Nikiforov, Yuri E
2017-08-01
To describe the evolution and clinical utility of molecular testing for thyroid nodules and cancer achieved over the last 2 decades. Scientific reports on thyroid cancer genetics and molecular diagnostics in thyroid nodules. Over the last 2 decades, our understanding of the genetic mechanisms of thyroid cancer has dramatically expanded, such that most thyroid cancers now have known gene driver events. This knowledge provides the basis for establishing and further improving molecular tests for thyroid nodules and cancer and for the introduction of new entities such as noninvasive follicular thyroid neoplasm with papillary-like nuclear features. The progress with molecular tests for thyroid nodules started in the 1990s from demonstrating feasibility of detecting various molecular alterations in fine-needle aspiration (FNA) material collected from thyroid nodules. It was followed by the introduction of the first single-gene mutational markers, such as BRAF, and a small mutational panel into clinical practice in the mid 2000s. Currently, several more advanced molecular tests are available for clinical use. They are based on multiple molecular markers and have increasing impact on the clinical management of patients with thyroid nodules. The evolution of molecular tests for thyroid nodules followed the discovery of various diagnostic and prognostic molecular markers of thyroid cancer that can be applied to thyroid FNA samples to inform more individualized management of these patients. Abbreviations: FNA = fine-needle aspiration; miRNA = micro RNA; NGS = next-generation sequencing; NIFTP = noninvasive follicular thyroid neoplasm with papillary-like nuclear features; NPV = negative predictive value; PPV = positive predictive value; PTC = papillary thyroid carcinoma; RAI = radioactive iodine.
Mahapatra, Manoj Kumar; Bera, Krishnendu; Singh, Durg Vijay; Kumar, Rajnish; Kumar, Manoj
2018-04-01
Protein tyrosine phosphatase 1B (PTP1B) has been identified as a negative regulator of the insulin and leptin signalling pathways; hence, it can be considered a new therapeutic target for the treatment of type 2 diabetes. Inhibition of this molecular target addresses both diabetes and obesity, i.e. diabesity. In order to get more information on identification and optimization of leads, pharmacophore modelling, atom-based 3D QSAR, docking and molecular dynamics studies were carried out on a set of ligands containing a thiazolidine scaffold. A six-point pharmacophore model consisting of three hydrogen bond acceptors (A), one negative ionic group (N) and two aromatic rings (R) with discrete geometries as pharmacophoric features was developed for a predictive 3D QSAR model. The probable binding conformation of the ligands within the active site was studied through molecular docking. The molecular interactions and the structural features responsible for PTP1B inhibition and selectivity were further supplemented by a molecular dynamics simulation study on a time scale of 30 ns. The present investigation has identified some of the indispensable structural features of thiazolidine analogues which can further be explored to optimize PTP1B inhibitors.
NASA Astrophysics Data System (ADS)
Allis, Damian G.; Hakey, Patrick M.; Korter, Timothy M.
2008-10-01
The terahertz (THz, far-infrared) spectrum of 3,4-methylene-dioxymethamphetamine hydrochloride (Ecstasy) is simulated using solid-state density functional theory. While a previously reported isolated-molecule calculation is noteworthy for the precision of its solid-state THz reproduction, the solid-state calculation predicts that the isolated-molecule modes account for only half of the spectral features in the THz region, with the remaining structure arising from lattice vibrations that cannot be predicted without solid-state molecular modeling. The molecular origins of the internal mode contributions to the solid-state THz spectrum, as well as the proper consideration of the protonation state of the molecule, are also considered.
Tamayo, Pablo; Cho, Yoon-Jae; Tsherniak, Aviad; Greulich, Heidi; Ambrogio, Lauren; Schouten-van Meeteren, Netteke; Zhou, Tianni; Buxton, Allen; Kool, Marcel; Meyerson, Matthew; Pomeroy, Scott L.; Mesirov, Jill P.
2011-01-01
Purpose: Despite significant progress in the molecular understanding of medulloblastoma, stratification of risk in patients remains a challenge. Focus has shifted from clinical parameters to molecular markers, such as expression of specific genes and selected genomic abnormalities, to improve accuracy of treatment outcome prediction. Here, we show how integration of high-level clinical and genomic features or risk factors, including disease subtype, can yield more comprehensive, accurate, and biologically interpretable prediction models for relapse versus no-relapse classification. We also introduce a novel Bayesian nomogram indicating the amount of evidence that each feature contributes on a patient-by-patient basis. Patients and Methods: A Bayesian cumulative log-odds model of outcome was developed from a training cohort of 96 children treated for medulloblastoma, starting with the evidence provided by clinical features of metastasis and histology (model A) and incrementally adding the evidence from gene-expression–derived features representing disease subtype–independent (model B) and disease subtype–dependent (model C) pathways, and finally high-level copy-number genomic abnormalities (model D). The models were validated on an independent test cohort (n = 78). Results: On an independent multi-institutional test data set, models A to D attain an area under receiver operating characteristic (au-ROC) curve of 0.73 (95% CI, 0.60 to 0.84), 0.75 (95% CI, 0.64 to 0.86), 0.80 (95% CI, 0.70 to 0.90), and 0.78 (95% CI, 0.68 to 0.88), respectively, for predicting relapse versus no relapse. Conclusion: The proposed models C and D outperform the current clinical classification schema (au-ROC, 0.68), our previously published eight-gene outcome signature (au-ROC, 0.71), and several new schemas recently proposed in the literature for medulloblastoma risk stratification. PMID:21357789
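The idea of a cumulative log-odds model, in which each feature adds its own amount of evidence to a running total, can be sketched as follows. This is an illustration under stated assumptions and not the authors' implementation: the prior, the feature names and the evidence values are hypothetical, and features are treated as contributing independent log likelihood ratios.

```python
from math import exp, log


def log_odds(probability):
    """Convert a probability in (0, 1) to log-odds."""
    return log(probability / (1.0 - probability))


def to_probability(value):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + exp(-value))


def cumulative_log_odds(prior_probability, evidence):
    """Add evidence terms one at a time and record the running probability.

    evidence: list of (feature_name, log_likelihood_ratio) pairs; positive
    values favour relapse, negative values favour no relapse.
    """
    total = log_odds(prior_probability)
    trace = []
    for name, contribution in evidence:
        total += contribution
        trace.append((name, to_probability(total)))
    return trace


if __name__ == "__main__":
    # Hypothetical evidence values for a single patient, in nats.
    evidence = [("metastasis", 0.8), ("histology", -0.2), ("subtype pathway", 0.5)]
    for name, p in cumulative_log_odds(0.3, evidence):
        print(f"after {name}: P(relapse) = {p:.3f}")
```

Printing the running probability after each term mirrors how a nomogram shows the contribution of each feature for an individual patient.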
Bahrami, Naeim; Hartman, Stephen J; Chang, Yu-Hsuan; Delfanti, Rachel; White, Nathan S; Karunamuni, Roshan; Seibert, Tyler M; Dale, Anders M; Hattangadi-Gluth, Jona A; Piccioni, David; Farid, Nikdokht; McDonald, Carrie R
2018-06-02
Molecular markers of WHO grade II/III glioma are known to have important prognostic and predictive implications and may be associated with unique imaging phenotypes. The purpose of this study is to determine whether three clinically relevant molecular markers identified in gliomas (IDH, 1p/19q, and MGMT status) show distinct quantitative MRI characteristics on FLAIR imaging. Sixty-one patients with grade II/III gliomas who had molecular data and MRI available prior to radiation were included. Quantitative MRI features were extracted that measured tissue heterogeneity (homogeneity and pixel correlation) and FLAIR border distinctiveness (edge contrast; EC). T-tests were conducted to determine whether patients with different genotypes differ across the features. Logistic regression with LASSO regularization was used to determine the optimal combination of MRI and clinical features for predicting molecular subtypes. Patients with IDH wildtype tumors showed greater signal heterogeneity (p = 0.001) and lower EC (p = 0.008) within the FLAIR region compared to IDH mutant tumors. Among patients with IDH mutant tumors, 1p/19q co-deleted tumors had greater signal heterogeneity (p = 0.002) and lower EC (p = 0.005) compared to 1p/19q intact tumors. MGMT methylated tumors showed lower EC (p = 0.03) compared to the unmethylated group. The combination of FLAIR border distinctness, heterogeneity, and pixel correlation optimally classified tumors by IDH status. Quantitative imaging characteristics of FLAIR heterogeneity and border pattern in grade II/III gliomas may provide unique information for determining molecular status at time of initial diagnostic imaging, which may then guide subsequent surgical and medical management.
NASA Astrophysics Data System (ADS)
Draghici, Sorin; Cumberland, Lonnie T., Jr.; Kovari, Ladislau C.
2000-04-01
This paper presents some results of data mining HIV genotypic and structural data. Our aim is to try to relate structural features of HIV enzymes essential to its reproductive abilities to the drug resistance phenomenon. This paper concentrates on the HIV protease enzyme and Indinavir which is one of the FDA approved protease inhibitors. Our starting point was the current list of HIV mutations related to drug resistance. We used the fact that some molecular structures determined through high resolution X-ray crystallography were available for the protease-Indinavir complex. Starting with these structures and the known mutations, we modelled the mutant proteases and studied the pattern of atomic contacts between the protease and the drug. After suitable pre-processing, these patterns have been used as the input of our data mining process. We have used both supervised and unsupervised learning techniques with the aim of understanding the relationship between structural features at a molecular level and resistance to Indinavir. The supervised learning was aimed at predicting IC90 values for arbitrary mutants. The SOFM was aimed at identifying those structural features that are important for drug resistance and discovering a classifier based on such features. We have used validation and cross validation to test the generalization abilities of the learning paradigm we have designed. The straightforward supervised learning was able to learn very successfully but validation results are less than satisfactory. This is due to the insufficient number of patterns in the training set which in turn is due to the scarcity of the available data. The data mining using SOFM was very successful. We have managed to distinguish between resistant and non-resistant mutants using structural features. 
We have been able to divide all reported HIV mutants into several categories based on their 3-dimensional molecular structures and the pattern of contacts between the mutant protease and Indinavir. Our classifier shows reasonably good prediction performance being able to predict the drug resistance of previously unseen mutants with an accuracy of between 60% and 70%. We believe that this performance can be greatly improved once more data becomes available. The results presented here support the hypothesis that structural features of the molecular structure can be used in antiviral drug treatment selection and drug design.
Quantitative radiomic profiling of glioblastoma represents transcriptomic expression.
Kong, Doo-Sik; Kim, Junhyung; Ryu, Gyuha; You, Hye-Jin; Sung, Joon Kyung; Han, Yong Hee; Shin, Hye-Mi; Lee, In-Hee; Kim, Sung-Tae; Park, Chul-Kee; Choi, Seung Hong; Choi, Jeong Won; Seol, Ho Jun; Lee, Jung-Il; Nam, Do-Hyun
2018-01-19
Quantitative imaging biomarkers have increasingly emerged in the field of research utilizing available imaging modalities. We aimed to identify good surrogate radiomic features that can represent genetic changes of tumors, thereby establishing noninvasive means for predicting treatment outcome. From May 2012 to June 2014, we retrospectively identified 65 patients with treatment-naïve glioblastoma with available clinical information from the Samsung Medical Center data registry. Preoperative MR imaging data were obtained for all 65 patients with primary glioblastoma. A total of 82 imaging features, including first-order statistics, volume, and size features, were semi-automatically extracted from structural and physiologic images such as apparent diffusion coefficient and perfusion images. Using commercially available software, NordicICE, we performed quantitative imaging analysis and collected the dataset composed of radiophenotypic parameters. Unsupervised clustering methods revealed that the radiophenotypic dataset was composed of three clusters. Each cluster represented a distinct molecular classification of glioblastoma: classical type, proneural and neural types, and mesenchymal type. These clusters also reflected differential clinical outcomes. We found that extracted imaging signatures do not represent copy number variation and somatic mutation. Quantitative radiomic features provide potential evidence to predict molecular phenotype and treatment outcome. Radiomic profiles represent transcriptomic phenotypes better.
Yuan, Yaxia; Zheng, Fang; Zhan, Chang-Guo
2018-03-21
Blood-brain barrier (BBB) permeability of a compound determines whether the compound can effectively enter the brain. It is an essential property which must be accounted for in drug discovery with a target in the brain. Several computational methods have been used to predict BBB permeability. In particular, the support vector machine (SVM), a kernel-based machine learning method, has been widely used in this field. For SVM training and prediction, the compounds are characterized by molecular descriptors. Some SVM models were based on the use of molecular property-based descriptors (including 1D, 2D, and 3D descriptors) or fragment-based descriptors (known as the fingerprints of a molecule). The selection of descriptors is critical for the performance of an SVM model. In this study, we aimed to develop a generally applicable new SVM model by combining all of the features of the molecular property-based descriptors and fingerprints to improve the accuracy of BBB permeability prediction. The results indicate that our SVM model has improved accuracy compared to the currently available models for BBB permeability prediction.
Final report for the DOE Early Career Award #DE-SC0003912
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jayaraman, Arthi
This DOE-supported early career project was aimed at developing computational models, theory and simulation methods that would then be used to predict assembly and morphology in polymer nanocomposites. In particular, the focus was on composites in active layers of devices, containing conducting polymers that act as electron donors and nanoscale additives that act as electron acceptors. During the course of this work, we developed the first-of-its-kind molecular models to represent conducting polymers, enabling simulations at the experimentally relevant length and time scales. By comparison with experimentally observed morphologies we validated these models. Furthermore, using these models and molecular dynamics simulations on graphical processing units (GPUs) we predicted the molecular-level design features in polymers and additives that lead to morphologies with optimal features for charge carrier behavior in solar cells. Additionally, we also predicted computationally new design rules for better dispersion of additives in polymers that have been confirmed through experiments. Achieving dispersion in polymer nanocomposites is valuable to achieve controlled macroscopic properties of the composite. The results obtained during the course of this DOE-funded project enable optimal design of higher-efficiency organic electronic and photovoltaic devices and improve everyday life through the engineering of these higher-efficiency devices.
Selby-Pham, Sophie N B; Howell, Kate S; Dunshea, Frank R; Ludbey, Joel; Lutz, Adrian; Bennett, Louise
2018-04-15
A diet rich in phytochemicals confers benefits for health by reducing the risk of chronic diseases via regulation of oxidative stress and inflammation (OSI). For optimal protective bio-efficacy, the time required for phytochemicals and their metabolites to reach maximal plasma concentrations (Tmax) should be synchronised with the time of increased OSI. A statistical model has been reported to predict Tmax of individual phytochemicals based on molecular mass and lipophilicity. We report the application of the model for predicting the absorption profile of an uncharacterised phytochemical mixture, herein referred to as the 'functional fingerprint'. First, chemical profiles of phytochemical extracts were acquired using liquid chromatography mass spectrometry (LC-MS), then the molecular features for respective components were used to predict their plasma absorption maximum, based on molecular mass and lipophilicity. This method of 'functional fingerprinting' of plant extracts represents a novel tool for understanding and optimising the health efficacy of plant extracts. Copyright © 2017 Elsevier Ltd. All rights reserved.
Predicting beta-turns in proteins using support vector machines with fractional polynomials
2013-01-01
Background: β-turns are a secondary structure type with an essential role in molecular recognition, protein folding, and stability. They are found to be the most common type of non-repetitive structures since 25% of amino acids in protein structures are situated on them. Their prediction is considered to be one of the crucial problems in bioinformatics and molecular biology, which can provide valuable insights and inputs for fold recognition and drug design. Results: We propose an approach that combines support vector machines (SVMs) and logistic regression (LR) in a hybrid prediction method, which we call H-SVM-LR, to predict β-turns in proteins. Fractional polynomials are used for LR modeling. We utilize position specific scoring matrices (PSSMs) and predicted secondary structure (PSS) as features. Our simulation studies show that H-SVM-LR achieves Qtotal of 82.87%, 82.84%, and 82.32% on the BT426, BT547, and BT823 datasets respectively. These values are the highest among β-turn prediction methods that are based on PSSMs and secondary structure information. H-SVM-LR also achieves favorable performance in predicting β-turns as measured by the Matthews correlation coefficient (MCC) on these datasets. Furthermore, H-SVM-LR shows good performance when considering shape strings as additional features. Conclusions: In this paper, we present a comprehensive approach for β-turn prediction. Experiments show that our proposed approach achieves better performance compared to other competing prediction methods. PMID:24565438
Predicting beta-turns in proteins using support vector machines with fractional polynomials.
Elbashir, Murtada; Wang, Jianxin; Wu, Fang-Xiang; Wang, Lusheng
2013-11-07
β-turns are a secondary structure type with an essential role in molecular recognition, protein folding, and stability. They are found to be the most common type of non-repetitive structures since 25% of amino acids in protein structures are situated on them. Their prediction is considered to be one of the crucial problems in bioinformatics and molecular biology, which can provide valuable insights and inputs for fold recognition and drug design. We propose an approach that combines support vector machines (SVMs) and logistic regression (LR) in a hybrid prediction method, which we call H-SVM-LR, to predict β-turns in proteins. Fractional polynomials are used for LR modeling. We utilize position specific scoring matrices (PSSMs) and predicted secondary structure (PSS) as features. Our simulation studies show that H-SVM-LR achieves Qtotal of 82.87%, 82.84%, and 82.32% on the BT426, BT547, and BT823 datasets respectively. These values are the highest among β-turn prediction methods that are based on PSSMs and secondary structure information. H-SVM-LR also achieves favorable performance in predicting β-turns as measured by the Matthews correlation coefficient (MCC) on these datasets. Furthermore, H-SVM-LR shows good performance when considering shape strings as additional features. In this paper, we present a comprehensive approach for β-turn prediction. Experiments show that our proposed approach achieves better performance compared to other competing prediction methods.
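For readers unfamiliar with the two scores quoted above, the sketch below computes Qtotal (overall percentage of correctly classified residues) and the Matthews correlation coefficient from confusion-matrix counts, using their standard definitions. The counts in the example are made up for illustration and do not come from the paper.

```python
from math import sqrt


def q_total(tp, tn, fp, fn):
    """Percentage of residues classified correctly (turn or non-turn)."""
    return 100.0 * (tp + tn) / (tp + tn + fp + fn)


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when it is undefined."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Hypothetical counts: 40 turns found, 10 missed, 5 false alarms.
    counts = dict(tp=40, tn=45, fp=5, fn=10)
    print(f"Qtotal = {q_total(**counts):.2f}%")
    print(f"MCC    = {matthews_cc(**counts):.3f}")
```

Because β-turn residues are a minority class, Qtotal alone can look high for a predictor that rarely predicts turns, which is why MCC is usually reported alongside it.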
Predicting DNA hybridization kinetics from sequence
NASA Astrophysics Data System (ADS)
Zhang, Jinny X.; Fang, John Z.; Duan, Wei; Wu, Lucia R.; Zhang, Angela W.; Dalchau, Neil; Yordanov, Boyan; Petersen, Rasmus; Phillips, Andrew; Zhang, David Yu
2018-01-01
Hybridization is a key molecular process in biology and biotechnology, but so far there is no predictive model for accurately determining hybridization rate constants based on sequence information. Here, we report a weighted neighbour voting (WNV) prediction algorithm, in which the hybridization rate constant of an unknown sequence is predicted based on similarity reactions with known rate constants. To construct this algorithm we first performed 210 fluorescence kinetics experiments to observe the hybridization kinetics of 100 different DNA target and probe pairs (36 nt sub-sequences of the CYCS and VEGF genes) at temperatures ranging from 28 to 55 °C. Automated feature selection and weighting optimization resulted in a final six-feature WNV model, which can predict hybridization rate constants of new sequences to within a factor of 3 with ∼91% accuracy, based on leave-one-out cross-validation. Accurate prediction of hybridization kinetics allows the design of efficient probe sequences for genomics research.
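A generic weighted-neighbour prediction can be sketched as below: known reactions vote on the log rate constant of a new reaction, with closer neighbours in a weighted feature space counting more. The exponential vote kernel, the `bandwidth` parameter, the two-feature vectors and all numbers are assumptions for illustration; the six features and the weighting scheme of the published WNV model are not given in the abstract and are not reproduced here.

```python
from math import exp, log10, sqrt


def weighted_distance(x, y, feature_weights):
    """Euclidean distance with a non-negative weight per feature."""
    return sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, y, feature_weights)))


def wnv_predict(query, known, feature_weights, bandwidth=1.0):
    """Predict log10(k) for `query` as a distance-weighted mean over neighbours.

    known: list of (feature_vector, log10_rate_constant) pairs.
    """
    numerator = 0.0
    denominator = 0.0
    for features, log_k in known:
        vote = exp(-weighted_distance(query, features, feature_weights) / bandwidth)
        numerator += vote * log_k
        denominator += vote
    return numerator / denominator


def within_factor(predicted_log_k, true_log_k, factor=3.0):
    """True if the predicted rate constant is within `factor` of the true one."""
    return abs(predicted_log_k - true_log_k) <= log10(factor)


if __name__ == "__main__":
    # Hypothetical reactions described by two made-up features.
    known = [([0.0, 0.0], 5.0), ([2.0, 0.0], 7.0), ([1.0, 3.0], 6.5)]
    prediction = wnv_predict([1.0, 0.5], known, feature_weights=[1.0, 0.5])
    print(f"predicted log10(k) = {prediction:.2f}")
    print("within 3x of log10(k) = 6.0:", within_factor(prediction, 6.0))
```

Working in log10 of the rate constant makes "within a factor of 3" a simple absolute-difference check, and leave-one-out validation amounts to calling `wnv_predict` for each reaction with that reaction removed from `known`.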
Guo, Song; Liu, Chunhua; Zhou, Peng; Li, Yanling
2016-01-01
Tyrosine sulfation is one of the ubiquitous protein posttranslational modifications, where some sulfate groups are added to the tyrosine residues. It plays significant roles in various physiological processes in eukaryotic cells. To explore the molecular mechanism of tyrosine sulfation, one of the prerequisites is to correctly identify possible protein tyrosine sulfation residues. In this paper, a novel method was presented to predict protein tyrosine sulfation residues from primary sequences. By means of informative feature construction and elaborate feature selection and parameter optimization scheme, the proposed predictor achieved promising results and outperformed many other state-of-the-art predictors. Using the optimal features subset, the proposed method achieved mean MCC of 94.41% on the benchmark dataset, and a MCC of 90.09% on the independent dataset. The experimental performance indicated that our new proposed method could be effective in identifying the important protein posttranslational modifications and the feature selection scheme would be powerful in protein functional residues prediction research fields.
Liu, Chunhua; Zhou, Peng; Li, Yanling
2016-01-01
Tyrosine sulfation is one of the ubiquitous protein posttranslational modifications, where some sulfate groups are added to the tyrosine residues. It plays significant roles in various physiological processes in eukaryotic cells. To explore the molecular mechanism of tyrosine sulfation, one of the prerequisites is to correctly identify possible protein tyrosine sulfation residues. In this paper, a novel method was presented to predict protein tyrosine sulfation residues from primary sequences. By means of informative feature construction and elaborate feature selection and parameter optimization scheme, the proposed predictor achieved promising results and outperformed many other state-of-the-art predictors. Using the optimal features subset, the proposed method achieved mean MCC of 94.41% on the benchmark dataset, and a MCC of 90.09% on the independent dataset. The experimental performance indicated that our new proposed method could be effective in identifying the important protein posttranslational modifications and the feature selection scheme would be powerful in protein functional residues prediction research fields. PMID:27034949
Models of the elastic x-ray scattering feature for warm dense aluminum
Starrett, Charles Edward; Saumon, Didier
2015-09-03
The elastic feature of x-ray scattering from warm dense aluminum has recently been measured by Fletcher et al. [Nature Photonics 9, 274 (2015)] with much higher accuracy than had hitherto been possible. This measurement is a direct test of the ionic structure predicted by models of warm dense matter. We use the method of pseudoatom molecular dynamics to predict this elastic feature for warm dense aluminum with temperatures of 1–100 eV and densities of 2.7–8.1 g/cm³. We compare these predictions to experiments, finding good agreement with Fletcher et al. and corroborating the discrepancy found in analyses of an earlier experiment of Ma et al. [Phys. Rev. Lett. 110, 065001 (2013)]. Lastly, we also evaluate the validity of the Thomas-Fermi model of the electrons and of the hypernetted chain approximation in computing the elastic feature and find them both wanting in the regime currently probed by experiments.
Molecular Diagnosis and Biomarker Identification on SELDI proteomics data by ADTBoost method.
Wang, Lu-Yong; Chakraborty, Amit; Comaniciu, Dorin
2005-01-01
Clinical proteomics is an emerging field that will have great impact on molecular diagnosis, identification of disease biomarkers, drug discovery and clinical trials in the post-genomic era. Protein profiling in tissues and fluids in disease and pathological control and other proteomics techniques will play an important role in molecular diagnosis with therapeutics and personalized healthcare. We introduced a new robust diagnostic method based on ADTboost algorithm, a novel algorithm in proteomics data analysis to improve classification accuracy. It generates classification rules, which are often smaller and easier to interpret. This method often gives most discriminative features, which can be utilized as biomarkers for diagnostic purpose. Also, it has a nice feature of providing a measure of prediction confidence. We carried out this method in amyotrophic lateral sclerosis (ALS) disease data acquired by surface enhanced laser-desorption/ionization-time-of-flight mass spectrometry (SELDI-TOF MS) experiments. Our method is shown to have outstanding prediction capacity through the cross-validation, ROC analysis results and comparative study. Our molecular diagnosis method provides an efficient way to distinguish ALS disease from neurological controls. The results are expressed in a simple and straightforward alternating decision tree format or conditional format. We identified most discriminative peaks in proteomic data, which can be utilized as biomarkers for diagnosis. It will have broad application in molecular diagnosis through proteomics data analysis and personalized medicine in this post-genomic era.
Binding Affinity prediction with Property Encoded Shape Distribution signatures
Das, Sourav; Krein, Michael P.
2010-01-01
We report the use of the molecular signatures known as “Property-Encoded Shape Distributions” (PESD) together with standard Support Vector Machine (SVM) techniques to produce validated models that can predict the binding affinity of a large number of protein ligand complexes. This “PESD-SVM” method uses PESD signatures that encode molecular shapes and property distributions on protein and ligand surfaces as features to build SVM models that require no subjective feature selection. A simple protocol was employed for tuning the SVM models during their development, and the results were compared to SFCscore – a regression-based method that was previously shown to perform better than 14 other scoring functions. Although the PESD-SVM method is based on only two surface property maps, the overall results were comparable. For most complexes with a dominant enthalpic contribution to binding (ΔH/-TΔS > 3), a good correlation between true and predicted affinities was observed. Entropy and solvent were not considered in the present approach and further improvement in accuracy would require accounting for these components rigorously. PMID:20095526
NASA Astrophysics Data System (ADS)
Nozaki, Daijiro; Avdoshenko, Stanislav M.; Sevinçli, Hâldun; Gutierrez, Rafael; Cuniberti, Gianaurelio
2013-03-01
Recently the interest in quantum interference (QI) phenomena in molecular devices (molecular junctions) has been growing due to the unique features observed in the transmission spectra. In order to design single molecular devices exploiting QI effects as desired, it is necessary to provide simple rules for predicting the appearance of QI effects such as anti-resonances or Fano line shapes and for controlling them. In this study, we derive a transmission function of a generic molecular junction with a side group (T-shaped molecular junction) using a minimal toy model. We developed a simple method to predict the appearance of quantum interference, Fano resonances or anti-resonances, and its position in the conductance spectrum by introducing a simple graphical representation (parabolic model). Using it we can easily visualize the relation between the key electronic parameters and the positions of normal resonant peaks and anti-resonant peaks induced by quantum interference in the conductance spectrum. We also demonstrate Fano and anti-resonance in T-shaped molecular junctions using a simple tight-binding model. This parabolic model enables one to infer on-site energies of T-shaped molecules and the coupling between side group and main conduction channel from transmission spectra.
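A textbook two-site tight-binding sketch illustrates how a side group produces an anti-resonance. It assumes one main site (on-site energy `eps_main`) coupled to two identical wide-band leads of broadening `gamma` each, plus one side site (`eps_side`) attached with hopping `coupling`. This is not the authors' parabolic model, and all parameter values are illustrative. Under these assumptions the transmission is T(E) = gamma² |G(E)|² with G(E) = 1 / (E - eps_main - coupling²/(E - eps_side) + i·gamma).

```python
from math import sqrt


def transmission(energy, eps_main, eps_side, coupling, gamma):
    """Transmission through a main site with one side-attached site."""
    if coupling == 0:
        # No side group: a single Lorentzian resonance at eps_main.
        return gamma ** 2 / ((energy - eps_main) ** 2 + gamma ** 2)
    # G written as (E - eps_side) / [(E - eps_main + i*gamma)(E - eps_side) - t^2]
    # so that the anti-resonance at E = eps_side causes no division by zero.
    numerator = energy - eps_side
    denominator = complex(energy - eps_main, gamma) * (energy - eps_side) - coupling ** 2
    return gamma ** 2 * abs(numerator / denominator) ** 2


def resonance_energies(eps_main, eps_side, coupling):
    """Energies where the real part of the denominator vanishes (T = 1)."""
    centre = 0.5 * (eps_main + eps_side)
    half_gap = sqrt((0.5 * (eps_main - eps_side)) ** 2 + coupling ** 2)
    return centre - half_gap, centre + half_gap


if __name__ == "__main__":
    params = dict(eps_main=0.0, eps_side=0.5, coupling=0.3, gamma=0.1)
    for e in (*resonance_energies(0.0, 0.5, 0.3), 0.5):
        print(f"E = {e:+.3f}  T = {transmission(e, **params):.3f}")
```

In this toy picture the transmission zero sits exactly at the side-group energy, while the two resonant peaks shift with the coupling, which is the kind of relation between electronic parameters and peak positions that the abstract describes.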
Papp, Laszlo; Poetsch, Nina; Grahovac, Marko; Schmidbauer, Victor; Woehrer, Adelheid; Preusser, Matthias; Mitterhauser, Markus; Kiesel, Barbara; Wadsak, Wolfgang; Beyer, Thomas; Hacker, Marcus; Traub-Weidinger, Tatjana
2017-11-24
Gliomas are the most common types of tumors in the brain. While the definite diagnosis is routinely made ex vivo by histopathologic and molecular examination, diagnostic work-up of patients with suspected glioma is mainly done by using magnetic resonance imaging (MRI). Nevertheless, L-S-methyl-¹¹C-methionine (¹¹C-MET) Positron Emission Tomography (PET) holds a great potential in characterization of gliomas. The aim of this study was to establish machine learning (ML) driven survival models for glioma built on ¹¹C-MET-PET, ex vivo and patient characteristics. Methods: 70 patients with a treatment naïve glioma, who had a positive ¹¹C-MET-PET and histopathology-derived ex vivo feature extraction, such as World Health Organization (WHO) 2007 tumor grade, histology and isocitrate dehydrogenase (IDH1-R132H) mutation status were included. The ¹¹C-MET-positive primary tumors were delineated semi-automatically on PET images followed by the feature extraction of tumor-to-background ratio based general and higher-order textural features by applying five different binning approaches. In vivo and ex vivo features, as well as patient characteristics (age, weight, height, body-mass-index, Karnofsky-score) were merged to characterize the tumors. Machine learning approaches were utilized to identify relevant in vivo, ex vivo and patient features and their relative weights for 36 months survival prediction. The resulting feature weights were used to establish three predictive models per binning configuration based on a combination of: in vivo/ex vivo and clinical patient information (M36IEP), in vivo and patient-only information (M36IP), and in vivo only (M36I). In addition a binning-independent ex vivo and patient-only (M36EP) model was created. The established models were validated in a Monte Carlo (MC) cross-validation scheme. Results: Most prominent ML-selected and -weighted features were patient and ex vivo based followed by in vivo features. 
The highest area under the curve (AUC) values of our models as revealed by the MC cross-validation were: 0.9 (M36IEP), 0.87 (M36EP), 0.77 (M36IP) and 0.72 (M36I). Conclusion: Survival prediction of glioma patients based on amino acid PET using computer-supported predictive models based on in vivo, ex vivo and patient features is highly accurate. Copyright © 2017 by the Society of Nuclear Medicine and Molecular Imaging, Inc.
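The Monte Carlo cross-validation and AUC reporting described in this abstract can be sketched in a few lines. This is a minimal illustration only, not the authors' pipeline: the split fraction, the number of rounds, and the toy model (`fit_mean_difference`, `predict_linear`) are all assumptions introduced here.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def fit_mean_difference(rows, labels):
    """Hypothetical toy model: weight = mean(positives) - mean(negatives)."""
    pos = [r for r, y in zip(rows, labels) if y == 1]
    neg = [r for r, y in zip(rows, labels) if y == 0]
    return [
        sum(r[k] for r in pos) / len(pos) - sum(r[k] for r in neg) / len(neg)
        for k in range(len(rows[0]))
    ]


def predict_linear(weights, row):
    """Score a sample as the weighted sum of its features."""
    return sum(w * x for w, x in zip(weights, row))


def monte_carlo_cv(features, labels, fit, predict,
                   n_rounds=100, test_fraction=0.3, seed=0):
    """Repeated random train/test splits; returns one AUC per usable split."""
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    n_test = max(2, int(round(test_fraction * len(indices))))
    aucs = []
    for _ in range(n_rounds):
        rng.shuffle(indices)
        test, train = indices[:n_test], indices[n_test:]
        train_y = [labels[i] for i in train]
        test_y = [labels[i] for i in test]
        # Skip splits in which either side lacks one of the two classes.
        if len(set(train_y)) < 2 or len(set(test_y)) < 2:
            continue
        model = fit([features[i] for i in train], train_y)
        scores = [predict(model, features[i]) for i in test]
        aucs.append(auc(scores, test_y))
    return aucs
```

Averaging the returned list gives a single cross-validated AUC. The abstract reports the highest AUC values per model without specifying how rounds were aggregated, so no aggregation rule is implied here.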
NASA Astrophysics Data System (ADS)
Banerjee, Priyanka; Preissner, Robert
2018-04-01
Taste of a chemical compound present in food stimulates us to take in nutrients and avoid poisons. However, the perception of taste greatly depends on the genetic as well as evolutionary perspectives. The aim of this work was the development and validation of a machine learning model based on molecular fingerprints to discriminate between sweet and bitter taste of molecules. BitterSweetForest is the first open access model based on a KNIME workflow that provides a platform for prediction of bitter and sweet taste of chemical compounds using molecular fingerprints and a Random Forest based classifier. The constructed model yielded an accuracy of 95% and an AUC of 0.98 in cross-validation. In an independent test set, BitterSweetForest achieved an accuracy of 96% and an AUC of 0.98 for bitter and sweet taste prediction. The constructed model was further applied to predict the bitter and sweet taste of natural compounds, approved drugs as well as an acute toxicity compound data set. BitterSweetForest suggests 70% of the natural product space as bitter and 10% of the natural product space as sweet with a confidence score of 0.60 and above. 77% of the approved drug set was predicted as bitter and 2% as sweet with a confidence score of 0.75 and above. Similarly, 75% of the total compounds from the acute oral toxicity class were predicted only as bitter with a minimum confidence score of 0.75, revealing that toxic compounds are mostly bitter. Furthermore, we applied a Bayesian based feature analysis method to discriminate the most frequently occurring chemical features between sweet and bitter compounds from the feature space of a circular fingerprint.
Banerjee, Priyanka; Preissner, Robert
2018-01-01
Taste of a chemical compound present in food stimulates us to take in nutrients and avoid poisons. However, the perception of taste greatly depends on the genetic as well as evolutionary perspectives. The aim of this work was the development and validation of a machine learning model based on molecular fingerprints to discriminate between sweet and bitter taste of molecules. BitterSweetForest is the first open access model based on a KNIME workflow that provides a platform for prediction of bitter and sweet taste of chemical compounds using molecular fingerprints and a Random Forest based classifier. The constructed model yielded an accuracy of 95% and an AUC of 0.98 in cross-validation. In an independent test set, BitterSweetForest achieved an accuracy of 96% and an AUC of 0.98 for bitter and sweet taste prediction. The constructed model was further applied to predict the bitter and sweet taste of natural compounds, approved drugs as well as an acute toxicity compound data set. BitterSweetForest suggests 70% of the natural product space as bitter and 10% of the natural product space as sweet with a confidence score of 0.60 and above. 77% of the approved drug set was predicted as bitter and 2% as sweet with a confidence score of 0.75 and above. Similarly, 75% of the total compounds from the acute oral toxicity class were predicted only as bitter with a minimum confidence score of 0.75, revealing that toxic compounds are mostly bitter. Furthermore, we applied a Bayesian based feature analysis method to discriminate the most frequently occurring chemical features between sweet and bitter compounds using the feature space of a circular fingerprint. PMID:29696137
Bossi, Flavia; Fan, Jue; Xiao, Jun; Chandra, Lilyana; Shen, Max; Dorone, Yanniv; Wagner, Doris; Rhee, Seung Y
2017-06-26
The molecular function of a gene is most commonly inferred by sequence similarity. Therefore, genes that lack sufficient sequence similarity to characterized genes (such as certain classes of transcriptional regulators) are difficult to classify using most function prediction algorithms and have remained uncharacterized. To identify novel transcriptional regulators systematically, we used a feature-based pipeline to screen protein families of unknown function. This method predicted 43 transcriptional regulator families in Arabidopsis thaliana, 7 families in Drosophila melanogaster, and 9 families in Homo sapiens. Literature curation validated 12 of the predicted families to be involved in transcriptional regulation. We tested 33 out of the 195 Arabidopsis putative transcriptional regulators for their ability to activate transcription of a reporter gene in planta and found twelve coactivators, five of which had no prior literature support. To investigate mechanisms of action in which the predicted regulators might work, we looked for interactors of an Arabidopsis candidate that did not show transactivation activity in planta and found that it might work with other members of its own family and a subunit of the Polycomb Repressive Complex 2 to regulate transcription. Our results demonstrate the feasibility of assigning molecular function to proteins of unknown function without depending on sequence similarity. In particular, we identified novel transcriptional regulators using biological features enriched in transcription factors. The predictions reported here should accelerate the characterization of novel regulators.
Dienstmann, R; Mason, M J; Sinicrope, F A; Phipps, A I; Tejpar, S; Nesbakken, A; Danielsen, S A; Sveen, A; Buchanan, D D; Clendenning, M; Rosty, C; Bot, B; Alberts, S R; Milburn Jessup, J; Lothe, R A; Delorenzi, M; Newcomb, P A; Sargent, D; Guinney, J
2017-05-01
TNM staging alone does not accurately predict outcome in colon cancer (CC) patients who may be eligible for adjuvant chemotherapy. It is unknown to what extent the molecular markers microsatellite instability (MSI) and mutations in BRAF or KRAS improve prognostic estimation in multivariable models that include detailed clinicopathological annotation. After imputation of data missing at random, a subset of patients accrued in phase 3 trials with adjuvant chemotherapy (n = 3016), drawn from N0147 (NCT00079274) and PETACC3 (NCT00026273), was aggregated to construct multivariable Cox models for 5-year overall survival that were subsequently validated internally in the remaining clinical trial samples (n = 1499), and also externally in different population cohorts of chemotherapy-treated (n = 949) or -untreated (n = 1080) CC patients, and an additional series without treatment annotation (n = 782). TNM staging, MSI and BRAFV600E mutation status remained independent prognostic factors in multivariable models across clinical trial cohorts and observational studies. Concordance indices increased from 0.61-0.68 in the TNM alone model to 0.63-0.71 in models with added molecular markers, 0.65-0.73 with clinicopathological features and 0.66-0.74 with all covariates. In validation cohorts with complete annotation, the integrated time-dependent AUC rose from 0.64 for the TNM alone model to 0.67 for models that included clinicopathological features, with or without molecular markers. In patient cohorts that received adjuvant chemotherapy, the relative proportion of variance explained (R2) by TNM, clinicopathological features and molecular markers was on average 65%, 25% and 10%, respectively.
Incorporation of MSI, BRAFV600E and KRAS mutation status to overall survival models with TNM staging improves the ability to precisely prognosticate in stage II and III CC patients, but only modestly increases prediction accuracy in multivariable models that include clinicopathological features, particularly in chemotherapy-treated patients. © The Author 2017. Published by Oxford University Press on behalf of the European Society for Medical Oncology.
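The concordance indices quoted in this abstract measure how well a model's risk scores order patients by survival time. The abstract does not say which estimator was used; the sketch below assumes Harrell's C, a common choice for Cox models, and the function name and argument layout are illustrative.

```python
def concordance_index(times, events, risks):
    """Harrell's C: share of comparable pairs ranked correctly by risk.

    times  -- follow-up time per patient
    events -- 1 if death was observed, 0 if the patient was censored
    risks  -- model risk score (higher means earlier expected death)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which puts the reported move from 0.61-0.68 to 0.66-0.74 in context as a modest gain.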
Wang, Tsang-Hsiu; Chu, Hsing-Yu; Wang, I-Teng
2014-10-15
Methyl 1-benzyl-1H-1,2,3-triazole-4-carboxylate (C11H11N3O2) has been studied by theoretical methods. The structure of this compound is optimized at the density functional theory (DFT), second-order Møller-Plesset perturbation theory (MP2) and G3 theory (G3(MP2)) levels. Our calculation results are in very good agreement with experimental values. Compared to a perfect pentagonal structure, the geometrical structures of C11H11N3O2 show a slight distortion of the 1,2,3-triazole ring due to the high electronegativity of the substituent groups. In addition, the dipole moment and frontier molecular orbitals (FMOs) of C11H11N3O2 are calculated as well. Because of the solvent effect, the HOMO-LUMO energy gap in methanol is predicted to be smaller than in the gas phase by 0.367 eV. The simulated UV-vis spectra are investigated by time-dependent density functional theory (TD-DFT), and two distinct absorption features have been predicted. These two absorption features are located between 170 nm and 210 nm, which is in the ultraviolet C range. Moreover, the UV absorption features in methanol are predicted to be more intense than in the gas phase; in addition, a red shift is predicted in methanol as well. Copyright © 2014 Elsevier B.V. All rights reserved.
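To relate the quoted wavelengths to the energies discussed in this abstract, photon energy and wavelength are linked by E = hc/λ. The short sketch below applies that conversion; the helper names are illustrative, and the 100-280 nm bounds are the conventional definition of the UV-C band rather than something stated in the abstract.

```python
HC_EV_NM = 1239.841984  # Planck constant times speed of light, in eV*nm


def wavelength_nm_to_ev(wavelength_nm):
    """Photon energy in eV for a wavelength given in nanometres."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return HC_EV_NM / wavelength_nm


def is_uvc(wavelength_nm):
    """True if the wavelength lies in the conventional UV-C band."""
    return 100.0 <= wavelength_nm <= 280.0


# The two predicted absorption features lie between 170 nm and 210 nm.
for nm in (170.0, 210.0):
    print(nm, "nm ->", round(wavelength_nm_to_ev(nm), 2), "eV; UV-C:", is_uvc(nm))
```

Because energy is inversely proportional to wavelength, a red shift in methanol corresponds to a lower transition energy.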
Cultivating cohort studies for observational translational research.
Ransohoff, David F
2013-04-01
"Discovery" research about molecular markers for diagnosis, prognosis, or prediction of response to therapy has frequently produced results that were not reproducible in subsequent studies. What are the reasons, and can observational cohorts be cultivated to provide strong and reliable answers to those questions? Experimental Selected examples are used to illustrate: (i) what features of research design provide strength and reliability in observational studies about markers of diagnosis, prognosis, and response to therapy? (ii) How can those design features be cultivated in existing observational cohorts, for example, within randomized controlled clinical trial (RCT), other existing observational research studies, or practice settings like health maintenance organization (HMOs)? Examples include a study of RNA expression profiles of tumor tissue to predict prognosis of breast cancer, a study of serum proteomics profiles to diagnose ovarian cancer, and a study of stool-based DNA assays to screen for colon cancer. Strengths and weaknesses of observational study design features are discussed, along with lessons about how features that help assure strength might be "cultivated" in the future. By considering these examples and others, it may be possible to develop a process of "cultivating cohorts" in ongoing RCTs, observational cohort studies, and practice settings like HMOs that have strong features of study design. Such an effort could produce sources of data and specimens to reliably answer questions about the use of molecular markers in diagnosis, prognosis, and response to therapy.
Molecular modeling of the microstructure evolution during carbon fiber processing
NASA Astrophysics Data System (ADS)
Desai, Saaketh; Li, Chunyu; Shen, Tongtong; Strachan, Alejandro
2017-12-01
The rational design of carbon fibers with desired properties requires quantitative relationships between the processing conditions, microstructure, and resulting properties. We developed a molecular model that combines kinetic Monte Carlo and molecular dynamics techniques to predict the microstructure evolution during the processes of carbonization and graphitization of polyacrylonitrile (PAN)-based carbon fibers. The model accurately predicts the cross-sectional microstructure of the fibers with the molecular structure of the stabilized PAN fibers and physics-based chemical reaction rates as the only inputs. The resulting structures exhibit key features observed in electron microscopy studies such as curved graphitic sheets and hairpin structures. In addition, computed X-ray diffraction patterns are in good agreement with experiments. We predict the transverse moduli of the resulting fibers between 1 GPa and 5 GPa, in good agreement with experimental results for high modulus fibers and slightly lower than those of high-strength fibers. The transverse modulus is governed by sliding between graphitic sheets, and the relatively low value for the predicted microstructures can be attributed to their perfect longitudinal texture. Finally, the simulations provide insight into the relationships between chemical kinetics and the final microstructure; we observe that high reaction rates result in porous structures with lower moduli.
Edwards, Stefan M.; Sørensen, Izel F.; Sarup, Pernille; Mackay, Trudy F. C.; Sørensen, Peter
2016-01-01
Predicting individual quantitative trait phenotypes from high-resolution genomic polymorphism data is important for personalized medicine in humans, plant and animal breeding, and adaptive evolution. However, this is difficult for populations of unrelated individuals when the number of causal variants is low relative to the total number of polymorphisms and causal variants individually have small effects on the traits. We hypothesized that mapping molecular polymorphisms to genomic features such as genes and their gene ontology categories could increase the accuracy of genomic prediction models. We developed a genomic feature best linear unbiased prediction (GFBLUP) model that implements this strategy and applied it to three quantitative traits (startle response, starvation resistance, and chill coma recovery) in the unrelated, sequenced inbred lines of the Drosophila melanogaster Genetic Reference Panel. Our results indicate that subsetting markers based on genomic features increases the predictive ability relative to the standard genomic best linear unbiased prediction (GBLUP) model. Both models use all markers, but GFBLUP allows differential weighting of the individual genetic marker relationships, whereas GBLUP weighs the genetic marker relationships equally. Simulation studies show that it is possible to further increase the accuracy of genomic prediction for complex traits using this model, provided the genomic features are enriched for causal variants. Our GFBLUP model using prior information on genomic features enriched for causal variants can increase the accuracy of genomic predictions in populations of unrelated individuals and provides a formal statistical framework for leveraging and evaluating information across multiple experimental studies to provide novel insights into the genetic architecture of complex traits. PMID:27235308
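The contrast between GBLUP and GFBLUP described in this abstract can be written as two linear mixed models. The notation below is a common textbook formulation and is an assumption here, since the abstract gives no equations: $\mathbf{G}$ is a genomic relationship matrix built from all markers, $\mathbf{G}_f$ is built from markers inside the genomic feature, and $\mathbf{G}_r$ from the remaining markers.

```latex
\begin{align}
\text{GBLUP:}\quad
  \mathbf{y} &= \mathbf{X}\mathbf{b} + \mathbf{g} + \mathbf{e},
  & \mathbf{g} &\sim N(\mathbf{0}, \mathbf{G}\sigma^2_g), \\
\text{GFBLUP:}\quad
  \mathbf{y} &= \mathbf{X}\mathbf{b} + \mathbf{f} + \mathbf{r} + \mathbf{e},
  & \mathbf{f} &\sim N(\mathbf{0}, \mathbf{G}_f\sigma^2_f),\;
    \mathbf{r} \sim N(\mathbf{0}, \mathbf{G}_r\sigma^2_r),
\end{align}
% In both models the residual is e ~ N(0, I sigma^2_e).
```

With a single variance component, GBLUP weights all marker relationships equally. Giving the feature its own variance $\sigma^2_f$ lets the data assign it a different weight, which helps when the feature is enriched for causal variants.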
NASA Astrophysics Data System (ADS)
Cao, Kunlin; Bhagalia, Roshni; Sood, Anup; Brogi, Edi; Mellinghoff, Ingo K.; Larson, Steven M.
2015-03-01
Positron emission tomography (PET) using fluorodeoxyglucose (18F-FDG) is commonly used in the assessment of breast lesions by computing voxel-wise standardized uptake value (SUV) maps. Simple metrics derived from ensemble properties of SUVs within each identified breast lesion are routinely used for disease diagnosis. The maximum SUV within the lesion (SUVmax) is the most popular of these metrics. However, these simple metrics are known to be error-prone and are susceptible to image noise. Finding reliable SUV map-based features that correlate to established molecular phenotypes of breast cancer (viz. estrogen receptor (ER), progesterone receptor (PR) and human epidermal growth factor receptor 2 (HER2) expression) will enable non-invasive disease management. This study investigated 36 SUV features based on first and second order statistics, local histograms and texture of segmented lesions to predict ER and PR expression in 51 breast cancer patients. True ER and PR expression was obtained via immunohistochemistry (IHC) of tissue samples from each lesion. A supervised learning, adaptive boosting-support vector machine (AdaBoost-SVM), framework was used to select a subset of features to classify breast lesions into distinct phenotypes. Performance of the trained multi-feature classifier was compared against the baseline single-feature SUVmax classifier using receiver operating characteristic (ROC) curves. Results show that texture features encoding local lesion homogeneity extracted from gray-level co-occurrence matrices are the strongest discriminator of lesion ER expression. In particular, classifiers including these features increased prediction accuracy from 0.75 (baseline) to 0.82 and the area under the ROC curve from 0.64 (baseline) to 0.75.
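The homogeneity texture feature mentioned in this abstract is derived from a gray-level co-occurrence matrix (GLCM). The sketch below shows one common definition on a small quantized 2D patch. The quantization, the pixel offset, and the squared-difference form of homogeneity are assumptions, as the abstract does not specify them.

```python
def glcm(image, levels, d_row=0, d_col=1, symmetric=True):
    """Normalized co-occurrence matrix for one pixel offset.

    image  -- 2D list of integer gray levels in range(levels)
    levels -- number of gray levels after quantization
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[a][b] += 1
                if symmetric:
                    counts[b][a] += 1  # count the pair in both directions
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("offset leaves no pixel pairs")
    return [[v / total for v in row] for row in counts]


def homogeneity(p):
    """Sum of p(i, j) / (1 + (i - j)^2); equals 1.0 for a uniform patch."""
    n = len(p)
    return sum(p[i][j] / (1 + (i - j) ** 2) for i in range(n) for j in range(n))
```

Values near 1 indicate that neighbouring voxels share similar uptake levels, while lower values indicate a heterogeneous lesion.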
Fast metabolite identification with Input Output Kernel Regression.
Brouard, Céline; Shen, Huibin; Dührkop, Kai; d'Alché-Buc, Florence; Böcker, Sebastian; Rousu, Juho
2016-06-15
An important problem in metabolomics is to identify metabolites using tandem mass spectrometry data. Machine learning methods have been proposed recently to solve this problem by predicting molecular fingerprint vectors and matching these fingerprints against existing molecular structure databases. In this work we propose to address the metabolite identification problem using a structured output prediction approach. This type of approach is not limited to vector output space and can handle structured output space such as the molecule space. We use the Input Output Kernel Regression method to learn the mapping between tandem mass spectra and molecular structures. The principle of this method is to encode the similarities in the input (spectra) space and the similarities in the output (molecule) space using two kernel functions. This method approximates the spectra-molecule mapping in two phases. The first phase corresponds to a regression problem from the input space to the feature space associated with the output kernel. The second phase is a preimage problem, which consists of mapping the predicted output feature vectors back to the molecule space. We show that our approach achieves state-of-the-art accuracy in metabolite identification. Moreover, our method has the advantage of decreasing the running times for the training step and the test step by several orders of magnitude over the preceding methods. Contact: celine.brouard@aalto.fi. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
Fast metabolite identification with Input Output Kernel Regression
Brouard, Céline; Shen, Huibin; Dührkop, Kai; d'Alché-Buc, Florence; Böcker, Sebastian; Rousu, Juho
2016-01-01
Motivation: An important problem in metabolomics is to identify metabolites using tandem mass spectrometry data. Machine learning methods have been proposed recently to solve this problem by predicting molecular fingerprint vectors and matching these fingerprints against existing molecular structure databases. In this work we propose to address the metabolite identification problem using a structured output prediction approach. This type of approach is not limited to vector output space and can handle structured output space such as the molecule space. Results: We use the Input Output Kernel Regression method to learn the mapping between tandem mass spectra and molecular structures. The principle of this method is to encode the similarities in the input (spectra) space and the similarities in the output (molecule) space using two kernel functions. This method approximates the spectra-molecule mapping in two phases. The first phase corresponds to a regression problem from the input space to the feature space associated with the output kernel. The second phase is a preimage problem, which consists of mapping the predicted output feature vectors back to the molecule space. We show that our approach achieves state-of-the-art accuracy in metabolite identification. Moreover, our method has the advantage of decreasing the running times for the training step and the test step by several orders of magnitude over the preceding methods. Availability and implementation: Contact: celine.brouard@aalto.fi Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27307628
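The two phases described in this abstract can be written compactly. The formulation below is a sketch under assumptions the abstract does not state: a ridge-regularized least-squares fit with parameter $\lambda$, training pairs $(x_i, y_i)_{i=1}^{n}$, input kernel $k_X$ with Gram matrix $\mathbf{K}_X$, output kernel $k_Y$ with feature map $\psi$, and a finite candidate set $\mathcal{Y}^{*}$ of molecules.

```latex
\begin{align}
% Phase 1: kernel ridge regression into the output feature space
h(x) &= \sum_{i=1}^{n} \alpha_i(x)\,\psi(y_i),
  \qquad \boldsymbol{\alpha}(x) = (\mathbf{K}_X + \lambda \mathbf{I})^{-1}\,\mathbf{k}_X(x), \\
% Phase 2: preimage, i.e. the candidate whose feature vector is closest to h(x)
\hat{y}(x) &= \operatorname*{arg\,min}_{y \in \mathcal{Y}^{*}} \lVert h(x) - \psi(y) \rVert^{2}
  = \operatorname*{arg\,min}_{y \in \mathcal{Y}^{*}}
    \Big[\, k_Y(y, y) - 2 \sum_{i=1}^{n} \alpha_i(x)\, k_Y(y_i, y) \Big].
\end{align}
```

Here $\mathbf{k}_X(x)$ collects the values $k_X(x_i, x)$. The second form drops the term $\lVert h(x) \rVert^2$, which does not depend on $y$, so the preimage step needs only kernel evaluations and never an explicit feature vector for a molecule.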
NASA Astrophysics Data System (ADS)
Xu, Ningning; Liu, Jianxin; Yu, Peiqiang
2018-04-01
Advanced vibrational molecular spectroscopy has been developed as a rapid and non-destructive tool to reveal intrinsic molecular structure conformation of biological tissues. However, this technique has not been used to systematically study flaking induced structure changes at a molecular level. The objective of this study was to use vibrational molecular spectroscopy to reveal the association between steam flaking induced carbohydrate (CHO) molecular structural changes and grain CHO fractionation, predicted CHO biodegradation and biodigestion in the ruminant system. The Attenuated Total Reflectance Fourier-transform Vibrational Molecular Spectroscopy (ATR-Ft/VMS) at SRP Key Lab of Molecular Structure and Molecular Nutrition, Ministry of Agriculture Strategic Research Chair Program (SRP, University of Saskatchewan) was applied in this study. The fractionation, predicted biodegradation and biodigestion were evaluated using the Cornell Net Carbohydrate Protein System. The results show that: (1) The steam flaking induced significant changes in CHO subfractions, CHO biodegradation and biodigestion in the ruminant system. There were significant differences between non-processed (raw) and steam flaked grain corn (P < .01); (2) The ATR-Ft/VMS molecular technique was able to detect the processing induced CHO molecular structure changes; (3) Induced CHO molecular structure spectral features are significantly correlated (P < .05) to CHO subfractions, CHO biodegradation and biodigestion and could be applied to potentially predict CHO biodegradation (R2 = 0.87, RSD = 0.74, P < .01) and intestinal digestible undegraded CHO (R2 = 0.87, RSD = 0.24, P < .01). In summary, the processing induced molecular CHO structure changes in grain corn could be revealed by the ATR-Ft/VMS vibrational molecular spectroscopy. These molecular structure changes in grain were potentially associated with CHO biodegradation and biodigestion.
NASA Astrophysics Data System (ADS)
Lalit, Manisha; Gangwal, Rahul P.; Dhoke, Gaurao V.; Damre, Mangesh V.; Khandelwal, Kanchan; Sangamwar, Abhay T.
2013-10-01
A combined pharmacophore modelling, 3D-QSAR and molecular docking approach was employed to reveal structural and chemical features essential for the development of small molecules as LRH-1 agonists. The best HypoGen pharmacophore hypothesis (Hypo1) consists of one hydrogen-bond donor (HBD), two general hydrophobic (H), one hydrophobic aromatic (HYAr) and one hydrophobic aliphatic (HYA) feature. It exhibited a high correlation coefficient of 0.927, a cost difference of 85.178 bits and a low RMS value of 1.411. This pharmacophore hypothesis was cross-validated using test set, decoy set and Cat-Scramble methodology. Subsequently, the validated pharmacophore hypothesis was used in the screening of small chemical databases. Further, 3D-QSAR models were developed based on the alignment obtained using substructure alignment. The best CoMFA and CoMSIA models exhibited excellent r²(ncv) values of 0.991 and 0.987, and r²(cv) values of 0.767 and 0.703, respectively. The CoMFA r²(pred) of 0.87 and the CoMSIA r²(pred) of 0.78 showed that the predicted values were in good agreement with the experimental values. Molecular docking analysis reveals that π-π interaction with His390 and hydrogen bond interaction with His390/Arg393 are essential for LRH-1 agonistic activity. The results from pharmacophore modelling, 3D-QSAR and molecular docking are complementary to each other and could serve as a powerful tool for the discovery of potent small molecules as LRH-1 agonists.
Sengupta, Durba; Prasanna, Xavier; Mohole, Madhura; Chattopadhyay, Amitabha
2018-06-07
G protein-coupled receptors (GPCRs) are seven transmembrane receptors that mediate a large number of cellular responses and are important drug targets. One of the current challenges in GPCR biology is to analyze the molecular signatures of receptor-lipid interactions and their subsequent effects on GPCR structure, organization, and function. Molecular dynamics simulation studies have been successful in predicting molecular determinants of receptor-lipid interactions. In particular, predicted cholesterol interaction sites appear to correspond well with experimentally determined binding sites and estimated time scales of association. In spite of several success stories, the methodologies in molecular dynamics simulations are still emerging. In this Feature Article, we provide a comprehensive overview of coarse-grain and atomistic molecular dynamics simulations of GPCR-lipid interaction in the context of experimental observations. In addition, we discuss the effect of secondary and tertiary structural constraints in coarse-grain simulations in the context of functional dynamics and structural plasticity of GPCRs. We envision that this comprehensive overview will help resolve differences in computational studies and provide a way forward.
Chen, Shangying; Zhang, Peng; Liu, Xin; Qin, Chu; Tao, Lin; Zhang, Cheng; Yang, Sheng Yong; Chen, Yu Zong; Chui, Wai Keung
2016-06-01
The overall efficacy and safety profile of a new drug is partially evaluated by the therapeutic index in clinical studies and by the protective index (PI) in preclinical studies. In-silico predictive methods may facilitate the assessment of these indicators. Although QSAR and QSTR models can be used for predicting PI, their predictive capability has not been evaluated. To test this capability, we developed QSAR and QSTR models for predicting the activity and toxicity of anticonvulsants at accuracy levels above the literature-reported threshold (LT) of good QSAR models as tested by both the internal 5-fold cross validation and external validation method. These models showed significantly compromised PI predictive capability due to the cumulative errors of the QSAR and QSTR models. Therefore, in this investigation a new quantitative structure-index relationship (QSIR) model was devised and it showed improved PI predictive capability that superseded the LT of good QSAR models. The QSAR, QSTR and QSIR models were developed using support vector regression (SVR) method with the parameters optimized by using the greedy search method. The molecular descriptors relevant to the prediction of anticonvulsant activities, toxicities and PIs were analyzed by a recursive feature elimination method. The selected molecular descriptors are primarily associated with the drug-like, pharmacological and toxicological features and those used in the published anticonvulsant QSAR and QSTR models. This study suggested that QSIR is useful for estimating the therapeutic index of drug candidates. Copyright © 2016. Published by Elsevier Inc.
Lattice-free prediction of three-dimensional structure of programmed DNA assemblies
Pan, Keyao; Kim, Do-Nyun; Zhang, Fei; Adendorff, Matthew R.; Yan, Hao; Bathe, Mark
2014-01-01
DNA can be programmed to self-assemble into high molecular weight 3D assemblies with precise nanometer-scale structural features. Although numerous sequence design strategies exist to realize these assemblies in solution, there is currently no computational framework to predict their 3D structures on the basis of programmed underlying multi-way junction topologies constrained by DNA duplexes. Here, we introduce such an approach and apply it to assemblies designed using the canonical immobile four-way junction. The procedure is used to predict the 3D structure of high molecular weight planar and spherical ring-like origami objects, a tile-based sheet-like ribbon, and a 3D crystalline tensegrity motif, in quantitative agreement with experiments. Our framework provides a new approach to predict programmed nucleic acid 3D structure on the basis of prescribed secondary structure motifs, with possible application to the design of such assemblies for use in biomolecular and materials science. PMID:25470497
Predicting the performance of fingerprint similarity searching.
Vogt, Martin; Bajorath, Jürgen
2011-01-01
Fingerprints are bit string representations of molecular structure that typically encode structural fragments, topological features, or pharmacophore patterns. Various fingerprint designs are utilized in virtual screening and their search performance essentially depends on three parameters: the nature of the fingerprint, the active compounds serving as reference molecules, and the composition of the screening database. It is of considerable interest and practical relevance to predict the performance of fingerprint similarity searching. A quantitative assessment of the potential that a fingerprint search might successfully retrieve active compounds, if available in the screening database, would substantially help to select the type of fingerprint most suitable for a given search problem. The method presented herein utilizes concepts from information theory to relate the fingerprint feature distributions of reference compounds to screening libraries. If these feature distributions do not sufficiently differ, active database compounds that are similar to reference molecules cannot be retrieved because they disappear in the "background." By quantifying the difference in feature distribution using the Kullback-Leibler divergence and relating the divergence to compound recovery rates obtained for different benchmark classes, fingerprint search performance can be quantitatively predicted.
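The idea of comparing fingerprint feature distributions with the Kullback-Leibler divergence can be illustrated with a small sketch. The modelling choices here are assumptions rather than the authors' exact method: each bit is treated as an independent Bernoulli variable, frequencies are smoothed with a pseudo-count so the logarithms stay finite, and all function names are illustrative.

```python
import math


def bit_frequencies(fingerprints, pseudo_count=1.0):
    """Smoothed probability that each bit is set across a compound set."""
    n = len(fingerprints)
    n_bits = len(fingerprints[0])
    return [
        (sum(fp[b] for fp in fingerprints) + pseudo_count) / (n + 2 * pseudo_count)
        for b in range(n_bits)
    ]


def bernoulli_kl(p, q):
    """KL divergence between two Bernoulli distributions with parameters p, q."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def fingerprint_kl(reference_fps, database_fps):
    """Sum of per-bit divergences D(reference || database)."""
    p = bit_frequencies(reference_fps)
    q = bit_frequencies(database_fps)
    return sum(bernoulli_kl(pi, qi) for pi, qi in zip(p, q))


# Hypothetical 4-bit fingerprints: actives favour bits 0 and 1.
actives = [[1, 1, 0, 0], [1, 1, 0, 1], [1, 0, 0, 0]]
database = [[0, 0, 1, 1], [0, 1, 1, 0], [1, 0, 1, 1], [0, 0, 0, 1]]
print(fingerprint_kl(actives, database))
```

A divergence near zero means the reference compounds look like the database background, which is the situation in which the abstract says actives "disappear" and cannot be retrieved.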
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bossi, Flavia; Fan, Jue; Xiao, Jun; ...
2017-06-26
Here, the molecular function of a gene is most commonly inferred by sequence similarity. Therefore, genes that lack sufficient sequence similarity to characterized genes (such as certain classes of transcriptional regulators) are difficult to classify using most function prediction algorithms and have remained uncharacterized. As a result, to identify novel transcriptional regulators systematically, we used a feature-based pipeline to screen protein families of unknown function. This method predicted 43 transcriptional regulator families in Arabidopsis thaliana, 7 families in Drosophila melanogaster, and 9 families in Homo sapiens. Literature curation validated 12 of the predicted families to be involved in transcriptional regulation. We tested 33 out of the 195 Arabidopsis putative transcriptional regulators for their ability to activate transcription of a reporter gene in planta and found twelve coactivators, five of which had no prior literature support. To investigate mechanisms of action in which the predicted regulators might work, we looked for interactors of an Arabidopsis candidate that did not show transactivation activity in planta and found that it might work with other members of its own family and a subunit of the Polycomb Repressive Complex 2 to regulate transcription. Our results demonstrate the feasibility of assigning molecular function to proteins of unknown function without depending on sequence similarity. In particular, we identified novel transcriptional regulators using biological features enriched in transcription factors. The predictions reported here should accelerate the characterization of novel regulators.
Prediction of response to neoadjuvant chemotherapy in breast cancer: a radiomic study
NASA Astrophysics Data System (ADS)
Wu, Guolin; Fan, Ming; Zhang, Juan; Zheng, Bin; Li, Lihua
2017-03-01
Breast cancer is one of the most common malignancies among women worldwide. Neoadjuvant chemotherapy (NACT) has gained interest and has been increasingly used in the treatment of breast cancer in recent years. Therefore, it is necessary to find a reliable non-invasive assessment and prediction method that can evaluate and predict the response to NACT. Recent studies have highlighted the use of MRI for predicting response to NACT. In addition, molecular subtype could also effectively identify patients who are likely to have a better prognosis in breast cancer. In this study, a radiomic analysis was performed by extracting features from dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) and using immunohistochemistry (IHC) to determine subtypes. A dataset of fifty-seven breast cancer patients was included, all of whom received preoperative MRI examination. Among them, 47 patients had complete response (CR) or partial response (PR) and 10 had stable disease (SD) in response to chemotherapy based on the RECIST criterion. A total of 216 imaging features, including statistical characteristics, morphology, texture and dynamic enhancement, were extracted from DCE-MRI. In multivariate analysis, the proposed imaging predictors achieved an AUC of 0.923 (P = 0.0002) in leave-one-out cross-validation. The performance of the classifier increased to 0.960, 0.950 and 0.936 when the status of HER2, Luminal A and Luminal B subtypes was added into the statistical model, respectively. The results of this study demonstrate that IHC-determined molecular status combined with radiomic features from DCE-MRI could be used as a clinical marker that is associated with response to NACT.
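The AUC quoted in the abstract above can be read as the probability that a randomly chosen responder receives a higher classifier score than a randomly chosen non-responder. The following is a minimal sketch of that computation, assuming one held-out score per patient (as leave-one-out cross-validation would produce). The function name and the example scores are hypothetical and do not come from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    labels: 1 for the positive class (e.g. responder), 0 for the negative class.
    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical held-out scores, one per patient (illustrative values only).
example_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
example_labels = [1, 1, 0, 1, 0, 0]
example_auc = auc(example_scores, example_labels)
```

The pairwise form is quadratic in the number of patients, which is acceptable for a cohort of a few dozen; a rank-based implementation would be preferable for large datasets.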
Deep-Learning Convolutional Neural Networks Accurately Classify Genetic Mutations in Gliomas.
Chang, P; Grinband, J; Weinberg, B D; Bardis, M; Khy, M; Cadena, G; Su, M-Y; Cha, S; Filippi, C G; Bota, D; Baldi, P; Poisson, L M; Jain, R; Chow, D
2018-05-10
The World Health Organization has recently placed new emphasis on the integration of genetic information for gliomas. While tissue sampling remains the criterion standard, noninvasive imaging techniques may provide complementary insight into clinically relevant genetic mutations. Our aim was to train a convolutional neural network to independently predict underlying molecular genetic mutation status in gliomas with high accuracy and identify the most predictive imaging features for each mutation. MR imaging data and molecular information were retrospectively obtained from The Cancer Imaging Archives for 259 patients with either low- or high-grade gliomas. A convolutional neural network was trained to classify isocitrate dehydrogenase 1 (IDH1) mutation status, 1p/19q codeletion, and O6-methylguanine-DNA methyltransferase (MGMT) promoter methylation status. Principal component analysis of the final convolutional neural network layer was used to extract the key imaging features critical for successful classification. Classification had high accuracy: IDH1 mutation status, 94%; 1p/19q codeletion, 92%; and MGMT promoter methylation status, 83%. Each genetic category was also associated with distinctive imaging features such as definition of tumor margins, T1 and FLAIR suppression, extent of edema, extent of necrosis, and textural features. Our results indicate that for The Cancer Imaging Archives dataset, machine-learning approaches allow classification of individual genetic mutations of both low- and high-grade gliomas. We show that relevant MR imaging features acquired from an added dimensionality-reduction technique demonstrate that neural networks are capable of learning key imaging components without prior feature selection or human-directed training. © 2018 by American Journal of Neuroradiology.
Kim, Sungjin; Jinich, Adrián; Aspuru-Guzik, Alán
2017-04-24
We propose a multiple descriptor multiple kernel (MultiDK) method for efficient molecular discovery using machine learning. We show that the MultiDK method improves both the speed and accuracy of molecular property prediction. We apply the method to the discovery of electrolyte molecules for aqueous redox flow batteries. Using multiple-type descriptors, as opposed to single-type descriptors, we obtain more relevant features for machine learning. Following the principle of "wisdom of the crowds", the combination of multiple-type descriptors significantly boosts prediction performance. Moreover, by employing multiple kernels (more than one kernel function for a set of the input descriptors), MultiDK exploits nonlinear relations between molecular structure and properties better than a linear regression approach. The multiple kernels consist of a Tanimoto similarity kernel and a linear kernel for a set of binary descriptors and a set of nonbinary descriptors, respectively. Using MultiDK, we achieve an average performance of r² = 0.92 with a test set of molecules for solubility prediction. We also extend MultiDK to predict pH-dependent solubility and apply it to a set of quinone molecules with different ionizable functional groups to assess their performance as flow battery electrolytes.
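The pairing described in the abstract above (a Tanimoto kernel on binary descriptors, a linear kernel on nonbinary descriptors) can be illustrated with a small sketch. How MultiDK actually combines the two kernels is not stated in the abstract; the weighted sum below, the function names, and the weights `w_binary` and `w_real` are assumptions made purely for illustration.

```python
def tanimoto_kernel(a, b):
    """Tanimoto similarity between two binary fingerprints (sequences of 0/1)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Convention chosen here (an assumption): two all-zero fingerprints are identical.
    return both / either if either else 1.0


def linear_kernel(u, v):
    """Plain dot product between two real-valued descriptor vectors."""
    return sum(x * y for x, y in zip(u, v))


def combined_kernel(mol_i, mol_j, w_binary=0.5, w_real=0.5):
    """Weighted sum of the two kernels; each molecule is (fingerprint, descriptors).

    The weighted-sum combination is a hypothetical choice, not taken from the paper.
    """
    fp_i, desc_i = mol_i
    fp_j, desc_j = mol_j
    return (w_binary * tanimoto_kernel(fp_i, fp_j)
            + w_real * linear_kernel(desc_i, desc_j))


# Hypothetical molecules: (binary fingerprint, nonbinary descriptors).
mol_a = ([1, 1, 0, 0], [1.0, 2.0])
mol_b = ([1, 0, 1, 0], [3.0, 4.0])
k_ab = combined_kernel(mol_a, mol_b)
```

A sum of valid kernels with nonnegative weights is itself a valid kernel, so a combination of this kind could be passed to any kernel-based learner.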
Navigating at Will on the Water Phase Diagram
NASA Astrophysics Data System (ADS)
Pipolo, S.; Salanne, M.; Ferlat, G.; Klotz, S.; Saitta, A. M.; Pietrucci, F.
2017-12-01
Despite the simplicity of its molecular unit, water is a challenging system because of its uniquely rich polymorphism and predicted but yet unconfirmed features. Introducing a novel space of generalized coordinates that capture changes in the topology of the interatomic network, we are able to systematically track transitions among liquid, amorphous, and crystalline forms throughout the whole phase diagram of water, including the nucleation of crystals above and below the melting point. Our approach, based on molecular dynamics and enhanced sampling or free energy calculation techniques, is not specific to water and could be applied to very different structural phase transitions, paving the way towards the prediction of kinetic routes connecting polymorphic structures in a range of materials.
Zhang, Yanxi; Ye, Gang; Soni, Saurabh; Qiu, Xinkai; Krijger, Theodorus L.; Jonkman, Harry T.; Carlotti, Marco; Sauter, Eric; Zharnikov, Michael
2018-01-01
Quantum interference effects (QI) are of interest in nano-scale devices based on molecular tunneling junctions because they can affect conductance exponentially through minor structural changes. However, their utilization requires prediction of, and deterministic control over, the position and magnitude of QI features, which remains a significant challenge. In this context, we designed and synthesized three benzodithiophene-based molecular wires: one linearly conjugated, one cross-conjugated and one cross-conjugated quinone. Using eutectic Ga–In (EGaIn) and CP-AFM, we compared them to a well-known anthraquinone in molecular junctions comprising self-assembled monolayers (SAMs). By combining density functional theory and transition voltage spectroscopy, we show that the presence of an interference feature and its position can be controlled independently by manipulating bond topology and electronegativity. This is the first study to separate these two parameters experimentally, demonstrating that the conductance of a tunneling junction depends on the position and depth of a QI feature, both of which can be controlled synthetically. PMID:29896382
Poly(A) code analyses reveal key determinants for tissue-specific mRNA alternative polyadenylation
Weng, Lingjie; Li, Yi; Xie, Xiaohui; Shi, Yongsheng
2016-01-01
mRNA alternative polyadenylation (APA) is a critical mechanism for post-transcriptional gene regulation and is often regulated in a tissue- and/or developmental stage-specific manner. An ultimate goal for the APA field has been to be able to computationally predict APA profiles under different physiological or pathological conditions. As a first step toward this goal, we have assembled a poly(A) code for predicting tissue-specific poly(A) sites (PASs). Based on a compendium of over 600 features that have known or potential roles in PAS selection, we have generated and refined a machine-learning algorithm using multiple high-throughput sequencing-based data sets of tissue-specific and constitutive PASs. This code can predict tissue-specific PASs with >85% accuracy. Importantly, by analyzing the prediction performance based on different RNA features, we found that PAS context, including the distance between alternative PASs and the relative position of a PAS within the gene, is a key feature for determining the susceptibility of a PAS to tissue-specific regulation. Our poly(A) code provides a useful tool for not only predicting tissue-specific APA regulation, but also for studying its underlying molecular mechanisms. PMID:27095026
Iwata, Hiroaki; Mizutani, Sayaka; Tabei, Yasuo; Kotera, Masaaki; Goto, Susumu; Yamanishi, Yoshihiro
2013-01-01
Most phenotypic effects of drugs involve the interactions between drugs and their target proteins; however, our knowledge about the molecular mechanism of drug-target interactions is very limited. One of the challenging issues in recent pharmaceutical science is to identify the underlying molecular features that govern drug-target interactions. In this paper, we make a systematic analysis of the correlation between drug side effects and protein domains, which we call "pharmacogenomic features," based on the drug-target interaction network. We detect drug side effects and protein domains that appear jointly in known drug-target interactions, which is made possible by using classifiers with sparse models. It is shown that the inferred pharmacogenomic features can be used for predicting potential drug-target interactions. We also discuss advantages and limitations of the pharmacogenomic features, compared with the chemogenomic features that are the associations between drug chemical substructures and protein domains. The inferred side effect-domain association network is expected to be useful for estimating common drug side effects for different protein families and characteristic drug side effects for specific protein domains.
Wu, Zhenqin; Ramsundar, Bharath; Feinberg, Evan N.; Gomes, Joseph; Geniesse, Caleb; Pappu, Aneesh S.; Leswing, Karl
2017-01-01
Molecular machine learning has been maturing rapidly over the last few years. Improved methods and the presence of larger datasets have enabled machine learning algorithms to make increasingly accurate predictions about molecular properties. However, algorithmic progress has been limited due to the lack of a standard benchmark to compare the efficacy of proposed methods; most new algorithms are benchmarked on different datasets making it challenging to gauge the quality of proposed methods. This work introduces MoleculeNet, a large scale benchmark for molecular machine learning. MoleculeNet curates multiple public datasets, establishes metrics for evaluation, and offers high quality open-source implementations of multiple previously proposed molecular featurization and learning algorithms (released as part of the DeepChem open source library). MoleculeNet benchmarks demonstrate that learnable representations are powerful tools for molecular machine learning and broadly offer the best performance. However, this result comes with caveats. Learnable representations still struggle to deal with complex tasks under data scarcity and highly imbalanced classification. For quantum mechanical and biophysical datasets, the use of physics-aware featurizations can be more important than choice of particular learning algorithm. PMID:29629118
Zheng, Ce; Kurgan, Lukasz
2008-10-10
The beta-turn is a secondary protein structure type that plays a significant role in protein folding, stability, and molecular recognition. To date, several methods for prediction of beta-turns from protein sequences have been developed, but they are characterized by relatively poor prediction quality. The novelty of the proposed sequence-based beta-turn predictor stems from the use of window-based information extracted from four predicted three-state secondary structures, which together with a selected set of position specific scoring matrix (PSSM) values serve as an input to the support vector machine (SVM) predictor. We show that (1) all four predicted secondary structures are useful; (2) the most useful information extracted from the predicted secondary structure includes the structure of the predicted residue, secondary structure content in a window around the predicted residue, and features that indicate whether the predicted residue is inside a secondary structure segment; (3) the PSSM values of Asn, Asp, Gly, Ile, Leu, Met, Pro, and Val were among the top ranked features, which corroborates recent studies. The Asn, Asp, Gly, and Pro indicate potential beta-turns, while the remaining four amino acids are useful to predict non-beta-turns. Empirical evaluation using three nonredundant datasets shows favorable Q total, Q predicted and MCC values when compared with over a dozen modern competing methods. Our method is the first to break the 80% Q total barrier and achieves Q total = 80.9%, MCC = 0.47, and Q predicted higher by over 6% when compared with the second best method. We use feature selection to reduce the dimensionality of the feature vector used as the input for the proposed prediction method. The applied feature set is smaller by 86, 62 and 37% when compared with the second and two third-best (with respect to MCC) competing methods, respectively.
Experiments show that the proposed method constitutes an improvement over the competing prediction methods. The proposed prediction model can better discriminate between beta-turns and non-beta-turns due to obtaining lower numbers of false positive predictions. The prediction model and datasets are freely available at http://biomine.ece.ualberta.ca/BTNpred/BTNpred.html.
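The quality measures named in the abstract above can be computed from the four cells of a per-residue confusion matrix. The sketch below assumes the definitions customary in the beta-turn prediction literature (Q total as overall percent accuracy, Q predicted as the percentage of predicted beta-turns that are correct); the abstract itself does not define them, and the counts used are hypothetical.

```python
import math


def beta_turn_metrics(tp, tn, fp, fn):
    """Return (Q_total, Q_predicted, MCC) from confusion-matrix counts.

    tp/fn: beta-turn residues predicted as turn / non-turn.
    tn/fp: non-turn residues predicted as non-turn / turn.
    Definitions are assumed conventions, not taken from the abstract.
    """
    total = tp + tn + fp + fn
    q_total = 100.0 * (tp + tn) / total
    q_predicted = 100.0 * tp / (tp + fp) if (tp + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return q_total, q_predicted, mcc


# Hypothetical counts for illustration only.
q_total, q_predicted, mcc = beta_turn_metrics(tp=50, tn=120, fp=10, fn=20)
```

Because beta-turn residues are the minority class, MCC is usually the more informative of the three: a predictor that labels every residue as non-turn can reach a high Q total while its MCC stays at zero.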
The Role of Molecular Diagnostics in the Management of Patients with Gliomas.
Wirsching, Hans-Georg; Weller, Michael
2016-10-01
The revised World Health Organization (WHO) classification of tumors of the central nervous system of 2016 combines biology-driven molecular marker diagnostics with classical histological cancer diagnosis. Reclassification of gliomas by molecular similarity beyond histological boundaries improves outcome prediction and will increasingly guide treatment decisions. This change in paradigms implies more personalized and eventually more efficient therapeutic approaches, but the era of molecular targeted therapies for gliomas is yet at its onset. Promising results of molecularly targeted therapies in genetically less complex gliomas with circumscribed growth such as subependymal giant cell astrocytoma or pilocytic astrocytoma support further development of molecularly targeted therapies. In diffuse gliomas, several molecular markers that predict benefit from alkylating agent chemotherapy have been identified in recent years. For example, co-deletion of chromosome arms 1p and 19q predicts benefit from polychemotherapy with procarbazine, CCNU (lomustine), and vincristine (PCV) in patients with anaplastic oligodendroglioma, and the presence of 1p/19q co-deletion was integrated as a defining feature of oligodendroglial tumors in the revised WHO classification. However, the tremendous increase in knowledge of molecular drivers of diffuse gliomas on genomic, epigenetic, and gene expression levels has not yet translated into effective molecular targeted therapies. Multiple reasons account for the failure of early clinical trials of molecularly targeted therapies in diffuse gliomas, including the lack of molecular entry controls as well as pharmacokinetic and pharmacodynamics issues, but the key challenge of specifically targeting the molecular backbone of diffuse gliomas is probably extensive clonal heterogeneity. 
A more profound understanding of clonal selection, alternative activation of oncogenic signaling pathways, and genomic instability is warranted to identify effective combination treatments and ultimately improve survival.
Garcia Lopez, Sebastian; Kim, Philip M.
2014-01-01
Advances in sequencing have led to a rapid accumulation of mutations, some of which are associated with diseases. However, to draw mechanistic conclusions, a biochemical understanding of these mutations is necessary. For coding mutations, accurate prediction of significant changes in either the stability of proteins or their affinity to their binding partners is required. Traditional methods have used semi-empirical force fields, while newer methods employ machine learning of sequence and structural features. Here, we show how combining both of these approaches leads to a marked boost in accuracy. We introduce ELASPIC, a novel ensemble machine learning approach that is able to predict stability effects upon mutation in both domain cores and domain-domain interfaces. We combine semi-empirical energy terms, sequence conservation, and a wide variety of molecular details with a Stochastic Gradient Boosting of Decision Trees (SGB-DT) algorithm. The accuracy of our predictions surpasses existing methods by a considerable margin, achieving correlation coefficients of 0.77 for stability, and 0.75 for affinity predictions. Notably, we integrated homology modeling to enable proteome-wide prediction and show that accurate prediction on modeled structures is possible. Lastly, ELASPIC showed significant differences between various types of disease-associated mutations, as well as between disease and common neutral mutations. Unlike pure sequence-based prediction methods that try to predict phenotypic effects of mutations, our predictions unravel the molecular details governing the protein instability, and help us better understand the molecular causes of diseases. PMID:25243403
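For readers unfamiliar with stochastic gradient boosting of decision trees, the idea can be sketched in a few dozen lines: each round fits a small tree to the current residuals on a random subsample of the training set and adds a damped copy of it to the ensemble. The sketch below is a toy illustration with depth-1 trees (stumps) and squared-error loss. It is not the ELASPIC implementation, and every name and default parameter in it is an assumption.

```python
import random


def fit_stump(X, r):
    """Best single-feature threshold split minimising squared error on residuals r."""
    best = (float("inf"), 0, 0.0, 0.0, 0.0)  # (sse, feature, threshold, left, right)
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [ri for row, ri in zip(X, r) if row[f] <= t]
            right = [ri for row, ri in zip(X, r) if row[f] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
            if sse < best[0]:
                best = (sse, f, t, lm, rm)
    return best[1:]


def predict_stump(stump, row):
    f, t, left_value, right_value = stump
    return left_value if row[f] <= t else right_value


def fit_sgb(X, y, n_rounds=50, learning_rate=0.1, subsample=0.5, seed=0):
    """Stochastic gradient boosting: each stump sees only a random subsample."""
    rng = random.Random(seed)
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    k = min(len(y), max(2, int(subsample * len(y))))
    for _ in range(n_rounds):
        idx = rng.sample(range(len(y)), k)
        stump = fit_stump([X[i] for i in idx], [y[i] - pred[i] for i in idx])
        stumps.append(stump)
        pred = [p + learning_rate * predict_stump(stump, row)
                for p, row in zip(pred, X)]
    return base, learning_rate, stumps


def predict_sgb(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)
```

Setting `subsample=1.0` recovers ordinary (non-stochastic) gradient boosting; values below 1 add randomness that typically reduces overfitting at the cost of noisier individual trees.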
Analysis of A Drug Target-based Classification System using Molecular Descriptors.
Lu, Jing; Zhang, Pin; Bi, Yi; Luo, Xiaomin
2016-01-01
Drug-target interaction is an important topic in drug discovery and drug repositioning. The KEGG database offers drug annotation and classification using a target-based classification system. In this study, we investigated five target-based classes: (I) G protein-coupled receptors; (II) Nuclear receptors; (III) Ion channels; (IV) Enzymes; (V) Pathogens, using molecular descriptors to represent each drug compound. Two popular feature selection methods, maximum relevance minimum redundancy and incremental feature selection, were adopted to extract the important descriptors. Meanwhile, an optimal prediction model based on the nearest neighbor algorithm was constructed, which achieved the best result in identifying drug target-based classes. Finally, some key descriptors were discussed to uncover their important roles in the identification of drug-target classes.
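Incremental feature selection, as mentioned in the abstract above, is commonly carried out by taking a ranked descriptor list (for example from maximum relevance minimum redundancy), growing the feature set one descriptor at a time, and keeping the prefix that scores best with the chosen classifier. The sketch below assumes the ranking is already given and uses a 1-nearest-neighbor classifier with Euclidean distance and leave-one-out evaluation. These choices, the function names, and the toy data are illustrative assumptions, not the authors' exact protocol.

```python
def loo_1nn_accuracy(X, y, feature_idx):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on chosen features."""
    correct = 0
    for i in range(len(X)):
        best_j, best_d = None, float("inf")
        for j in range(len(X)):
            if j == i:
                continue
            d = sum((X[i][k] - X[j][k]) ** 2 for k in feature_idx)
            if d < best_d:
                best_j, best_d = j, d
        if y[best_j] == y[i]:
            correct += 1
    return correct / len(X)


def incremental_feature_selection(X, y, ranked_features):
    """Evaluate growing prefixes of the ranked list; return the best (subset, accuracy)."""
    best_acc, best_subset = -1.0, []
    for n in range(1, len(ranked_features) + 1):
        subset = ranked_features[:n]
        acc = loo_1nn_accuracy(X, y, subset)
        if acc > best_acc:  # strict: prefer the smaller subset on ties
            best_acc, best_subset = acc, list(subset)
    return best_subset, best_acc


# Toy data: descriptor 0 separates the classes, descriptor 1 is large-scale noise.
X_toy = [[0.0, 5.0], [0.1, -4.0], [0.2, 3.0],
         [1.0, -5.0], [1.1, 4.0], [1.2, -3.0]]
y_toy = [0, 0, 0, 1, 1, 1]
subset, acc = incremental_feature_selection(X_toy, y_toy, ranked_features=[0, 1])
```

In the toy data the noisy second descriptor dominates the distance once it is included, so the procedure stops at the first descriptor alone.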
Tulla, Kiara A; Maker, Ajay V
2018-03-01
Predicting the biologic behavior of intraductal papillary mucinous neoplasm (IPMN) remains challenging. Current guidelines utilize patient symptoms and imaging characteristics to determine appropriate surgical candidates. However, the majority of resected cysts remain low-risk lesions, many of which may be feasible to have under surveillance. We herein characterize the most promising and up-to-date molecular diagnostics in order to identify optimal components of a molecular signature to distinguish levels of IPMN dysplasia. A comprehensive systematic review of pertinent literature, including our own experience, was conducted based on the PRISMA guidelines. Molecular diagnostics in IPMN patient tissue, duodenal secretions, cyst fluid, saliva, and serum were evaluated and organized into the following categories: oncogenes, tumor suppressor genes, glycoproteins, markers of the immune response, proteomics, DNA/RNA mutations, and next-generation sequencing/microRNA. Specific targets in each of these categories, and in aggregate, were identified by their ability to both characterize a cyst as an IPMN and determine the level of cyst dysplasia. Combining molecular signatures with clinical and imaging features in this era of next-generation sequencing and advanced computational analysis will enable enhanced sensitivity and specificity of current models to predict the biologic behavior of IPMN.
Sparse feature selection for classification and prediction of metastasis in endometrial cancer.
Ahsen, Mehmet Eren; Boren, Todd P; Singh, Nitin K; Misganaw, Burook; Mutch, David G; Moore, Kathleen N; Backes, Floor J; McCourt, Carolyn K; Lea, Jayanthi S; Miller, David S; White, Michael A; Vidyasagar, Mathukumalli
2017-03-27
Metastasis via pelvic and/or para-aortic lymph nodes is a major risk factor for endometrial cancer. Lymph-node resection ameliorates risk but is associated with significant co-morbidities. Incidence in patients with stage I disease is 4-22% but no mechanism exists to accurately predict it. Therefore, national guidelines for primary staging surgery include pelvic and para-aortic lymph node dissection for all patients whose tumor exceeds 2cm in diameter. We sought to identify a robust molecular signature that can accurately classify risk of lymph node metastasis in endometrial cancer patients. 86 tumors matched for age and race, and evenly distributed between lymph node-positive and lymph node-negative cases, were selected as a training cohort. Genomic micro-RNA expression was profiled for each sample to serve as the predictive feature matrix. An independent set of 28 tumor samples was collected and similarly characterized to serve as a test cohort. A feature selection algorithm was designed for applications where the number of samples is far smaller than the number of measured features per sample. A predictive miRNA expression signature was developed using this algorithm, which was then used to predict the metastatic status of the independent test cohort. A weighted classifier, using 18 micro-RNAs, achieved 100% accuracy on the training cohort. When applied to the testing cohort, the classifier correctly predicted 90% of node-positive cases, and 80% of node-negative cases (FDR = 6.25%). Results indicate that the evaluation of the quantitative sparse-feature classifier proposed here in clinical trials may lead to significant improvement in the prediction of lymphatic metastases in endometrial cancer patients.
Schütte, Moritz; Risch, Thomas; Abdavi-Azar, Nilofar; Boehnke, Karsten; Schumacher, Dirk; Keil, Marlen; Yildiriman, Reha; Jandrasits, Christine; Borodina, Tatiana; Amstislavskiy, Vyacheslav; Worth, Catherine L.; Schweiger, Caroline; Liebs, Sandra; Lange, Martin; Warnatz, Hans- Jörg; Butcher, Lee M.; Barrett, James E.; Sultan, Marc; Wierling, Christoph; Golob-Schwarzl, Nicole; Lax, Sigurd; Uranitsch, Stefan; Becker, Michael; Welte, Yvonne; Regan, Joseph Lewis; Silvestrov, Maxine; Kehler, Inge; Fusi, Alberto; Kessler, Thomas; Herwig, Ralf; Landegren, Ulf; Wienke, Dirk; Nilsson, Mats; Velasco, Juan A.; Garin-Chesa, Pilar; Reinhard, Christoph; Beck, Stephan; Schäfer, Reinhold; Regenbrecht, Christian R. A.; Henderson, David; Lange, Bodo; Haybaeck, Johannes; Keilholz, Ulrich; Hoffmann, Jens; Lehrach, Hans; Yaspo, Marie-Laure
2017-01-01
Colorectal carcinoma represents a heterogeneous entity, with only a fraction of the tumours responding to available therapies, requiring a better molecular understanding of the disease in precision oncology. To address this challenge, the OncoTrack consortium recruited 106 CRC patients (stages I–IV) and developed a pre-clinical platform generating a compendium of drug sensitivity data totalling >4,000 assays testing 16 clinical drugs on patient-derived in vivo and in vitro models. This large biobank of 106 tumours, 35 organoids and 59 xenografts, with extensive omics data comparing donor tumours and derived models provides a resource for advancing our understanding of CRC. Models recapitulate many of the genetic and transcriptomic features of the donors, but defined less complex molecular sub-groups because of the loss of human stroma. Linking molecular profiles with drug sensitivity patterns identifies novel biomarkers, including a signature outperforming RAS/RAF mutations in predicting sensitivity to the EGFR inhibitor cetuximab. PMID:28186126
Zheng, Ce; Kurgan, Lukasz
2008-01-01
Background: The β-turn is a secondary protein structure type that plays a significant role in protein folding, stability, and molecular recognition. To date, several methods for prediction of β-turns from protein sequences have been developed, but they are characterized by relatively poor prediction quality. The novelty of the proposed sequence-based β-turn predictor stems from the use of window-based information extracted from four predicted three-state secondary structures, which together with a selected set of position specific scoring matrix (PSSM) values serve as an input to the support vector machine (SVM) predictor. Results: We show that (1) all four predicted secondary structures are useful; (2) the most useful information extracted from the predicted secondary structure includes the structure of the predicted residue, secondary structure content in a window around the predicted residue, and features that indicate whether the predicted residue is inside a secondary structure segment; (3) the PSSM values of Asn, Asp, Gly, Ile, Leu, Met, Pro, and Val were among the top ranked features, which corroborates recent studies. The Asn, Asp, Gly, and Pro indicate potential β-turns, while the remaining four amino acids are useful to predict non-β-turns. Empirical evaluation using three nonredundant datasets shows favorable Qtotal, Qpredicted and MCC values when compared with over a dozen modern competing methods. Our method is the first to break the 80% Qtotal barrier and achieves Qtotal = 80.9%, MCC = 0.47, and Qpredicted higher by over 6% when compared with the second best method. We use feature selection to reduce the dimensionality of the feature vector used as the input for the proposed prediction method. The applied feature set is smaller by 86, 62 and 37% when compared with the second and two third-best (with respect to MCC) competing methods, respectively.
Conclusion: Experiments show that the proposed method constitutes an improvement over the competing prediction methods. The proposed prediction model can better discriminate between β-turns and non-β-turns due to obtaining lower numbers of false positive predictions. The prediction model and datasets are freely available at . PMID:18847492
NASA Technical Reports Server (NTRS)
Smith, R. G.; Charnely, S. B.; Pendleton, Y. J.; Wright, C. M.; Maldoni, M. M.; Robinson, G.
2011-01-01
Recent surface chemistry experiments have shown that the hydrogenation of molecular oxygen on interstellar dust grains is a plausible formation mechanism, via hydrogen peroxide (H2O2), for the production of water (H2O) ice mantles in the dense interstellar medium. Theoretical chemistry models also predict the formation of a significant abundance of H2O2 ice in grain mantles by this route. At their upper limits, the predicted and experimental abundances are sufficiently high that H2O2 should be detectable in molecular cloud ice spectra. To investigate this further, laboratory spectra have been obtained for H2O2/H2O ice films between 2.5 and 200 micron, from 10 to 180 K, containing 3%, 30%, and 97% H2O2 ice. Integrated absorbances for all the absorption features in low-temperature H2O2 ice have been derived from these spectra. For identifying H2O2 ice, the key results are the presence of unique features near 3.5, 7.0, and 11.3 micron. Comparing the laboratory spectra with the spectra of a group of 24 protostars and field stars, all of which have strong H2O ice absorption bands, no absorption features are found that can definitely be identified with H2O2 ice. In the absence of definite H2O2 features, the H2O2 abundance is constrained by its possible contribution to the weak absorption feature near 3.47 micron found on the long-wavelength wing of the 3 micron H2O ice band. This gives an average upper limit for H2O2, as a percentage of H2O, of 9% +/- 4%. This is a strong constraint on parameters for surface chemistry experiments and dense cloud chemistry models.
Molecular diagnostics in the management of rhabdomyosarcoma.
Arnold, Michael A; Barr, Fredric G
2017-02-01
A classification of rhabdomyosarcoma (RMS) with prognostic relevance has primarily relied on clinical features and histologic classification as either embryonal or alveolar RMS. The PAX3-FOXO1 and PAX7-FOXO1 gene fusions occur in 80% of cases with the alveolar subtype and are more predictive of outcome than histologic classification. Identifying additional molecular hallmarks that further subclassify RMS is an active area of research. Areas Covered: The authors review the current state of the PAX3-FOXO1 and PAX7-FOXO1 fusions as prognostic biomarkers. Emerging biomarkers, including mRNA expression profiling, MYOD1 mutations, RAS pathway mutations and gene fusions involving NCOA2 or VGLL2 are also reviewed. Expert commentary: Strategies for modifying RMS risk stratification based on molecular biomarkers are emerging with the potential to transform the clinical management of RMS, ultimately improving patient outcomes by tailoring therapy to predicted patient risk and identifying targets for novel therapies.
Soler, Miguel A; de Marco, Ario; Fortuna, Sara
2016-10-10
Nanobodies (VHHs) have proved to be valuable substitutes for conventional antibodies for molecular recognition. Their small size represents a precious advantage for rational mutagenesis based on modelling. Here we address the problem of predicting how Camelidae nanobody sequences can tolerate mutations by developing a simulation protocol based on all-atom molecular dynamics and whole-molecule docking. The method was tested on two sets of nanobodies characterized experimentally for their biophysical features. One set contained point mutations introduced to humanize a wild type sequence; in the second, the CDRs were swapped between single-domain frameworks with Camelidae and human hallmarks. The method resulted in accurate scoring approaches to predict experimental yields and enabled identification of the structural modifications induced by mutations. This work is a promising tool for the in silico development of single-domain antibodies and opens the opportunity to customize single functional domains of larger macromolecules.
NASA Astrophysics Data System (ADS)
Soler, Miguel A.; De Marco, Ario; Fortuna, Sara
2016-10-01
Nanobodies (VHHs) have proved to be valuable substitutes for conventional antibodies for molecular recognition. Their small size represents a precious advantage for rational mutagenesis based on modelling. Here we address the problem of predicting how Camelidae nanobody sequences can tolerate mutations by developing a simulation protocol based on all-atom molecular dynamics and whole-molecule docking. The method was tested on two sets of nanobodies characterized experimentally for their biophysical features. One set contained point mutations introduced to humanize a wild type sequence; in the second, the CDRs were swapped between single-domain frameworks with Camelidae and human hallmarks. The method resulted in accurate scoring approaches to predict experimental yields and enabled identification of the structural modifications induced by mutations. This work is a promising tool for the in silico development of single-domain antibodies and opens the opportunity to customize single functional domains of larger macromolecules.
Perco, Paul; Heinzel, Andreas; Leierer, Johannes; Schneeberger, Stefan; Bösmüller, Claudia; Oberhuber, Rupert; Wagner, Silvia; Engler, Franziska; Mayer, Gert
2018-05-03
Donor organ quality affects long term outcome after renal transplantation. A variety of prognostic molecular markers is available, yet their validity often remains undetermined. A network-based molecular model reflecting donor kidney status based on transcriptomics data and molecular features reported in scientific literature to be associated with chronic allograft nephropathy was created. Significantly enriched biological processes were identified and representative markers were selected. An independent kidney pre-implantation transcriptomics dataset of 76 organs was used to predict estimated glomerular filtration rate (eGFR) values twelve months after transplantation using available clinical data and marker expression values. The best-performing regression model solely based on the clinical parameters donor age, donor gender, and recipient gender explained 17% of variance in post-transplant eGFR values. The five molecular markers EGF, CD2BP2, RALBP1, SF3B1, and DDX19B representing key molecular processes of the constructed renal donor organ status molecular model in addition to the clinical parameters significantly improved model performance (p-value = 0.0007) explaining around 33% of the variability of eGFR values twelve months after transplantation. Collectively, molecular markers reflecting donor organ status significantly add to prediction of post-transplant renal function when added to the clinical parameters donor age and gender.
Impact of mutations on the allosteric conformational equilibrium
Weinkam, Patrick; Chen, Yao Chi; Pons, Jaume; Sali, Andrej
2012-01-01
Allostery in a protein involves effector binding at an allosteric site that changes the structure and/or dynamics at a distant, functional site. In addition to the chemical equilibrium of ligand binding, allostery involves a conformational equilibrium between one protein substate that binds the effector and a second substate that less strongly binds the effector. We run molecular dynamics simulations using simple, smooth energy landscapes to sample specific ligand-induced conformational transitions, as defined by the effector-bound and unbound protein structures. These simulations can be performed using our web server: http://salilab.org/allosmod/. We then develop a set of features to analyze the simulations and capture the relevant thermodynamic properties of the allosteric conformational equilibrium. These features are based on molecular mechanics energy functions, stereochemical effects, and structural/dynamic coupling between sites. Using a machine-learning algorithm on a dataset of 10 proteins and 179 mutations, we predict both the magnitude and sign of the allosteric conformational equilibrium shift by the mutation; the impact of a large identifiable fraction of the mutations can be predicted with an average unsigned error of 1 kBT. With similar accuracy, we predict the mutation effects for an 11th protein that was omitted from the initial training and testing of the machine-learning algorithm. We also assess which calculated thermodynamic properties contribute most to the accuracy of the prediction. PMID:23228330
Preoperative Molecular Markers in Thyroid Nodules.
Sahli, Zeyad T; Smith, Philip W; Umbricht, Christopher B; Zeiger, Martha A
2018-01-01
The need for distinguishing benign from malignant thyroid nodules has led to the pursuit of differentiating molecular markers. The most common molecular tests in clinical use are Afirma® Gene Expression Classifier (GEC) and Thyroseq® V2. Despite the rapidly developing field of molecular markers, several limitations exist. These challenges include the recent introduction of the histopathological diagnosis "Non-Invasive Follicular Thyroid neoplasm with Papillary-like nuclear features", the correlation of genetic mutations within both benign and malignant pathologic diagnoses, the lack of follow-up of molecular marker negative nodules, and the cost-effectiveness of molecular markers. In this manuscript, we review the current published literature surrounding the diagnostic value of Afirma® GEC and Thyroseq® V2. Among Afirma® GEC studies, sensitivity (Se), specificity (Sp), positive predictive value (PPV), and negative predictive value (NPV) ranged from 75 to 100%, 5 to 53%, 13 to 100%, and 20 to 100%, respectively. Among Thyroseq® V2 studies, Se, Sp, PPV, and NPV ranged from 40 to 100%, 56 to 93%, 13 to 90%, and 48 to 97%, respectively. We also discuss current challenges to Afirma® GEC and Thyroseq® V2 utility and clinical application, and preview the future directions of these rapidly developing technologies.
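The four diagnostic measures quoted in this abstract all derive from the same confusion-matrix counts. The following is a minimal sketch of those standard definitions; the function name `diagnostic_metrics` and the counts in the example are hypothetical and are not taken from any of the reviewed studies.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return Se, Sp, PPV and NPV as fractions from confusion-matrix counts.

    tp/fn: malignant nodules called positive/negative by the test.
    tn/fp: benign nodules called negative/positive by the test.
    """
    return {
        "Se": tp / (tp + fn),   # sensitivity: share of malignant nodules detected
        "Sp": tn / (tn + fp),   # specificity: share of benign nodules cleared
        "PPV": tp / (tp + fp),  # positive predictive value
        "NPV": tn / (tn + fn),  # negative predictive value
    }


# Hypothetical counts, chosen only for illustration.
example = diagnostic_metrics(tp=45, fp=30, tn=20, fn=5)
```

Unlike Se and Sp, PPV and NPV depend on how common malignancy is in the tested cohort, which is one plausible reason for the wide ranges reported across studies.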
Prediction of lysine ubiquitylation with ensemble classifier and feature selection.
Zhao, Xiaowei; Li, Xiangtao; Ma, Zhiqiang; Yin, Minghao
2011-01-01
Ubiquitylation is an important process of post-translational modification. Correct identification of protein lysine ubiquitylation sites is of fundamental importance to understand the molecular mechanism of lysine ubiquitylation in biological systems. This paper develops a novel computational method to effectively identify the lysine ubiquitylation sites based on the ensemble approach. In the proposed method, 468 ubiquitylation sites from 323 proteins retrieved from the Swiss-Prot database were encoded into feature vectors by using four kinds of protein sequence information. An effective feature selection method was then applied to extract informative feature subsets. After different feature subsets were obtained by setting different starting points in the search procedure, they were used to train multiple random forest classifiers, which were then aggregated into a consensus classifier by majority voting. Evaluated by jackknife tests and independent tests respectively, the accuracy of the proposed predictor reached 76.82% for the training dataset and 79.16% for the test dataset, indicating that this predictor is a useful tool to predict lysine ubiquitylation sites. Furthermore, site-specific feature analysis was performed and it was shown that ubiquitylation is intimately correlated with the features of its surrounding sites in addition to features derived from the lysine site itself. The feature selection method is available upon request.
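The aggregation step described in this abstract, combining several classifiers into a consensus by majority voting, can be sketched as follows. This is an illustration only: the classifiers themselves are not modeled, the function name `majority_vote` is hypothetical, and the label lists are made up.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one consensus label list.

    predictions: list of lists, one inner list per classifier, all of the
    same length (one label per candidate site).
    """
    consensus = []
    for labels in zip(*predictions):
        # most_common(1) returns the label with the highest vote count.
        consensus.append(Counter(labels).most_common(1)[0][0])
    return consensus


# Three hypothetical classifiers voting on four candidate lysine sites
# (1 = ubiquitylated, 0 = not ubiquitylated).
votes = [
    [1, 0, 1, 1],
    [1, 0, 0, 1],
    [0, 0, 1, 1],
]
consensus = majority_vote(votes)
```

With binary labels, an odd number of classifiers avoids ties; with an even number, a tie-breaking rule would have to be chosen explicitly.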
2010-01-01
Background Protein-protein interaction (PPI) plays essential roles in cellular functions. The cost, time and other limitations associated with the current experimental methods have motivated the development of computational methods for predicting PPIs. As protein interactions generally occur via domains instead of the whole molecules, predicting domain-domain interaction (DDI) is an important step toward PPI prediction. Computational methods developed so far have utilized information from various sources at different levels, from primary sequences, to molecular structures, to evolutionary profiles. Results In this paper, we propose a computational method to predict DDI using support vector machines (SVMs), based on domains represented as interaction profile hidden Markov models (ipHMM) where interacting residues in domains are explicitly modeled according to the three dimensional structural information available at the Protein Data Bank (PDB). Features about the domains are extracted first as the Fisher scores derived from the ipHMM and then selected using singular value decomposition (SVD). Domain pairs are represented by concatenating their selected feature vectors, and classified by a support vector machine trained on these feature vectors. The method is tested by leave-one-out cross validation experiments with a set of interacting protein pairs adopted from the 3DID database. The prediction accuracy has shown significant improvement as compared to InterPreTS (Interaction Prediction through Tertiary Structure), an existing method for PPI prediction that also uses the sequences and complexes of known 3D structure. Conclusions We show that domain-domain interaction prediction can be significantly enhanced by exploiting information inherent in the domain profiles via feature selection based on Fisher scores, singular value decomposition and supervised learning based on support vector machines. 
Datasets and source code are freely available on the web at http://liao.cis.udel.edu/pub/svdsvm. Implemented in Matlab and supported on Linux and MS Windows. PMID:21034480
Application of Deep Learning in Automated Analysis of Molecular Images in Cancer: A Survey
Xue, Yong; Chen, Shihui; Liu, Yong
2017-01-01
Molecular imaging enables the visualization and quantitative analysis of the alterations of biological procedures at molecular and/or cellular level, which is of great significance for early detection of cancer. In recent years, deep learning has been widely used in medical imaging analysis, as it overcomes the limitations of visual assessment and traditional machine learning techniques by extracting hierarchical features with powerful representation capability. Research on cancer molecular images using deep learning techniques is also increasing dynamically. Hence, in this paper, we review the applications of deep learning in molecular imaging in terms of tumor lesion segmentation, tumor classification, and survival prediction. We also outline some future directions in which researchers may develop more powerful deep learning models for better performance in the applications in cancer molecular imaging. PMID:29114182
Gas Sensors Based on Molecular Imprinting Technology.
Zhang, Yumin; Zhang, Jin; Liu, Qingju
2017-07-04
Molecular imprinting technology (MIT), often described as a method of designing a material to remember a target molecular structure (template), is a technique for the creation of molecularly imprinted polymers (MIPs) with custom-made binding sites complementary to the target molecules in shape, size and functional groups. MIT has been successfully applied to analyze, separate and detect macromolecular organic compounds. Furthermore, it has been increasingly applied in assays of biological macromolecules. Owing to its unique features of structure specificity, predictability, recognition and universal application, there has been exploration of the possible application of MIPs in the field of highly selective gas sensors. In this present study, we outline the recent advances in gas sensors based on MIT, classify and introduce the existing molecularly imprinted gas sensors, summarize their advantages and disadvantages, and analyze further research directions.
Park, Hahnbeom; Bradley, Philip; Greisen, Per; Liu, Yuan; Mulligan, Vikram Khipple; Kim, David E.; Baker, David; DiMaio, Frank
2017-01-01
Most biomolecular modeling energy functions for structure prediction, sequence design, and molecular docking, have been parameterized using existing macromolecular structural data; this contrasts molecular mechanics force fields which are largely optimized using small-molecule data. In this study, we describe an integrated method that enables optimization of a biomolecular modeling energy function simultaneously against small-molecule thermodynamic data and high-resolution macromolecular structural data. We use this approach to develop a next-generation Rosetta energy function that utilizes a new anisotropic implicit solvation model, and an improved electrostatics and Lennard-Jones model, illustrating how energy functions can be considerably improved in their ability to describe large-scale energy landscapes by incorporating both small-molecule and macromolecule data. The energy function improves performance in a wide range of protein structure prediction challenges, including monomeric structure prediction, protein-protein and protein-ligand docking, protein sequence design, and prediction of the free energy changes by mutation, while reasonably recapitulating small-molecule thermodynamic properties. PMID:27766851
Jaber, Mohammed; Wölfer, Johannes; Ewelt, Christian; Holling, Markus; Hasselblatt, Martin; Niederstadt, Thomas; Zoubi, Tarek; Weckesser, Matthias; Stummer, Walter
2016-03-01
Approximately 20% of grade II and most grade III gliomas fluoresce after 5-aminolevulinic acid (5-ALA) application. Conversely, approximately 30% of nonenhancing gliomas are actually high grade. The aim of this study was to identify preoperative factors (i.e., age, enhancement, 18F-fluoroethyl tyrosine positron emission tomography [F-FET PET] uptake ratios) for predicting fluorescence in gliomas without typical glioblastoma imaging features and to determine whether fluorescence will allow prediction of tumor grade or molecular characteristics. Patients harboring gliomas without typical glioblastoma imaging features were given 5-ALA. Fluorescence was recorded intraoperatively, and biopsy specimens were collected from fluorescing tissue. World Health Organization (WHO) grade, Ki-67/MIB-1 index, IDH1 (R132H) mutation status, O-methylguanine DNA methyltransferase (MGMT) promoter methylation status, and 1p/19q co-deletion status were assessed. Predictive factors for fluorescence were derived from preoperative magnetic resonance imaging and F-FET PET. Classification and regression tree analysis and receiver-operating-characteristic curves were generated for defining predictors. Of 166 tumors, 82 were diagnosed as WHO grade II, 76 as grade III, and 8 as grade IV glioblastomas. Contrast enhancement, tumor volume, and F-FET PET uptake ratio >1.85 predicted fluorescence. Fluorescence correlated with WHO grade (P < .001) and Ki-67/MIB-1 index (P < .001), but not with MGMT promoter methylation status, IDH1 mutation status, or 1p/19q co-deletion status. The Ki-67/MIB-1 index in fluorescing grade III gliomas was higher than in nonfluorescing tumors, whereas in fluorescing and nonfluorescing grade II tumors, no differences were noted. Age, tumor volume, and F-FET PET uptake are factors predicting 5-ALA-induced fluorescence in gliomas without typical glioblastoma imaging features. Fluorescence was associated with an increased Ki-67/MIB-1 index and high-grade pathology. Whether fluorescence in grade II gliomas identifies a subtype with worse prognosis remains to be determined.
Hasan, Md Mehedi; Khatun, Mst Shamima; Mollah, Md Nurul Haque; Yong, Cao; Guo, Dianjing
2017-01-01
Lysine succinylation, an important type of protein posttranslational modification, plays significant roles in many cellular processes. Accurate identification of succinylation sites can facilitate our understanding of the molecular mechanism and potential roles of lysine succinylation. However, even in well-studied systems, a majority of the succinylation sites remain undetected because the traditional experimental approaches to succinylation site identification are often costly, time-consuming, and laborious. An in silico approach, on the other hand, is a potential alternative strategy to predict succinylation substrates. In this paper, a novel computational predictor, SuccinSite2.0, was developed for predicting generic and species-specific protein succinylation sites. This predictor combines profile-based amino acid composition and orthogonal binary features, which were used to train a random forest classifier. We demonstrated that the proposed SuccinSite2.0 predictor outperformed other currently existing implementations on a complementarily independent dataset. Furthermore, the important features that make visible contributions to species-specific and cross-species-specific prediction of protein succinylation sites were analyzed. The proposed predictor is anticipated to be a useful computational resource for lysine succinylation site prediction. The integrated species-specific online tool of SuccinSite2.0 is publicly accessible.
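Orthogonal binary features, as mentioned in this abstract, are commonly understood as a one-hot encoding of each residue in a sequence window around the candidate lysine. The sketch below illustrates that general idea under stated assumptions: the 20-letter alphabet order, the function name `orthogonal_binary`, and the three-residue window are hypothetical, and the window length and encoding details actually used by SuccinSite2.0 are not given in the abstract.

```python
# Assumed residue order; any fixed order of the 20 standard amino acids works.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def orthogonal_binary(window):
    """One-hot encode a peptide window: 20 binary values per residue.

    Residues outside the 20-letter alphabet (e.g. 'X' or a gap symbol)
    are encoded as 20 zeros.
    """
    vector = []
    for residue in window:
        vector.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return vector


# Hypothetical three-residue window centred on a lysine (K).
encoded = orthogonal_binary("AKC")
```

The resulting fixed-length vector is the kind of input a random forest classifier can consume directly.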
Matboli, Marwa; El-Nakeep, Sarah; Hossam, Nourhan; Habieb, Alaa; Azazy, Ahmed E M; Ebrahim, Ali E; Nagy, Ziad; Abdel-Rahman, Omar
2016-07-14
Gastric cancer (GC) is a global health problem and a major cause of cancer-related death, with high recurrence rates ranging from 25% to 40% for patients with stage II-IV GC. Unfortunately, although the majority of GC patients present with advanced-stage tumors, evidence-based therapeutic options remain limited. The current approach to GC management consists mainly of endoscopy followed by gastrectomy and chemotherapy or chemo-radiotherapy. Recent studies in GC have confirmed that it is a heterogeneous disease. Many molecular characterization studies have been performed in GC. Recent discoveries of the molecular pathways underlying the disease have opened the door to more personalized treatment and more predictable outcomes. The identification of molecular markers is a useful tool for clinical management in GC patients, assisting in diagnosis, evaluation of response to treatment and development of novel therapeutic modalities. While chemotherapeutic agents have certain physiological effects on the tumor cells, the prediction of the response differs from one type of tumor to another. The specificity of molecular biomarkers is a principal feature driving their application in anticancer therapies. Here we focus on the role of molecular pathways of GC and well-established molecular markers that can guide the therapeutic management.
Patient-derived xenografts as preclinical neuroblastoma models.
Braekeveldt, Noémie; Bexell, Daniel
2018-05-01
The prognosis for children with high-risk neuroblastoma is often poor and survivors can suffer from severe side effects. Predictive preclinical models and novel therapeutic strategies for high-risk disease are therefore a clinical imperative. However, conventional cancer cell line-derived xenografts can deviate substantially from patient tumors in terms of their molecular and phenotypic features. Patient-derived xenografts (PDXs) recapitulate many biologically and clinically relevant features of human cancers. Importantly, PDXs can closely parallel clinical features and outcome and serve as excellent models for biomarker and preclinical drug development. Here, we review progress in and applications of neuroblastoma PDX models. Neuroblastoma orthotopic PDXs share the molecular characteristics, neuroblastoma markers, invasive properties and tumor stroma of aggressive patient tumors and retain spontaneous metastatic capacity to distant organs including bone marrow. The recent identification of genomic changes in relapsed neuroblastomas opens up opportunities to target treatment-resistant tumors in well-characterized neuroblastoma PDXs. We highlight and discuss the features and various sources of neuroblastoma PDXs, methodological considerations when establishing neuroblastoma PDXs, in vitro 3D models, current limitations of PDX models and their application to preclinical drug testing.
Olfactory perception of chemically diverse molecules.
Keller, Andreas; Vosshall, Leslie B
2016-08-08
Understanding the relationship between a stimulus and how it is perceived reveals fundamental principles about the mechanisms of sensory perception. While this stimulus-percept problem is mostly understood for color vision and tone perception, it is not currently possible to predict how a given molecule smells. While there has been some progress in predicting the pleasantness and intensity of an odorant, perceptual data for a larger number of diverse molecules are needed to improve current predictions. Towards this goal, we tested the olfactory perception of 480 structurally and perceptually diverse molecules at two concentrations using a panel of 55 healthy human subjects. For each stimulus, we collected data on perceived intensity, pleasantness, and familiarity. In addition, subjects were asked to apply 20 semantic odor quality descriptors to these stimuli, and were offered the option to describe the smell in their own words. Using this dataset, we replicated several previous correlations between molecular features of the stimulus and olfactory perception. The number of sulfur atoms in a molecule was correlated with the odor quality descriptors "garlic," "fish," and "decayed," and large and structurally complex molecules were perceived to be more pleasant. We discovered a number of correlations in intensity perception between molecules. We show that familiarity had a strong effect on the ability of subjects to describe a smell. Many subjects used commercial products to describe familiar odorants, highlighting the role of prior experience in verbal reports of olfactory perception. Nonspecific descriptors like "chemical" were applied frequently to unfamiliar odorants, and unfamiliar odorants were generally rated as neither pleasant nor unpleasant. We present a very large psychophysical dataset and use this to correlate molecular features of a stimulus to olfactory percept. 
Our work reveals robust correlations between molecular features and perceptual qualities, and highlights the dominant role of familiarity and experience in assigning verbal descriptors to odorants.
Inferring protein domains associated with drug side effects based on drug-target interaction network
2013-01-01
Background Most phenotypic effects of drugs arise from the interactions between drugs and their target proteins; however, our knowledge about the molecular mechanism of the drug-target interactions is very limited. One of the challenging issues in recent pharmaceutical science is to identify the underlying molecular features which govern drug-target interactions. Results In this paper, we make a systematic analysis of the correlation between drug side effects and protein domains, which we call "pharmacogenomic features," based on the drug-target interaction network. We detect drug side effects and protein domains that appear jointly in known drug-target interactions, which is made possible by using classifiers with sparse models. It is shown that the inferred pharmacogenomic features can be used for predicting potential drug-target interactions. We also discuss advantages and limitations of the pharmacogenomic features, compared with the chemogenomic features that are the associations between drug chemical substructures and protein domains. Conclusion The inferred side effect-domain association network is expected to be useful for estimating common drug side effects for different protein families and characteristic drug side effects for specific protein domains. PMID:24565527
Saravanan, Vijayakumar; Gautham, Namasivayam
2015-10-01
Proteins embody epitopes that serve as their antigenic determinants. Epitopes occupy a central place in integrative biology, not to mention as targets for novel vaccine, pharmaceutical, and systems diagnostics development. The presence of T-cell and B-cell epitopes has been extensively studied due to their potential in synthetic vaccine design. However, reliable prediction of linear B-cell epitope remains a formidable challenge. Earlier studies have reported discrepancy in amino acid composition between the epitopes and non-epitopes. Hence, this study proposed and developed a novel amino acid composition-based feature descriptor, Dipeptide Deviation from Expected Mean (DDE), to distinguish the linear B-cell epitopes from non-epitopes effectively. In this study, for the first time, only exact linear B-cell epitopes and non-epitopes have been utilized for developing the prediction method, unlike the use of epitope-containing regions in earlier reports. To evaluate the performance of the DDE feature vector, models have been developed with two widely used machine-learning techniques Support Vector Machine and AdaBoost-Random Forest. Five-fold cross-validation performance of the proposed method with error-free dataset and dataset from other studies achieved an overall accuracy between nearly 61% and 73%, with balance between sensitivity and specificity metrics. Performance of the DDE feature vector was better (with accuracy difference of about 2% to 12%), in comparison to other amino acid-derived features on different datasets. This study reflects the efficiency of the DDE feature vector in enhancing the linear B-cell epitope prediction performance, compared to other feature representations. The proposed method is made as a stand-alone tool available freely for researchers, particularly for those interested in vaccine design and novel molecular target development for systems therapeutics and diagnostics: https://github.com/brsaran/LBEEP.
Ding, Feng; Sharma, Shantanu; Chalasani, Poornima; Demidov, Vadim V.; Broude, Natalia E.; Dokholyan, Nikolay V.
2008-01-01
RNA molecules with novel functions have revived interest in the accurate prediction of RNA three-dimensional (3D) structure and folding dynamics. However, existing methods are inefficient in automated 3D structure prediction. Here, we report a robust computational approach for rapid folding of RNA molecules. We develop a simplified RNA model for discrete molecular dynamics (DMD) simulations, incorporating base-pairing and base-stacking interactions. We demonstrate correct folding of 150 structurally diverse RNA sequences. The majority of DMD-predicted 3D structures have <4 Å deviations from experimental structures. The secondary structures corresponding to the predicted 3D structures consist of 94% native base-pair interactions. Folding thermodynamics and kinetics of tRNAPhe, pseudoknots, and mRNA fragments in DMD simulations are in agreement with previous experimental findings. Folding of RNA molecules features transient, non-native conformations, suggesting non-hierarchical RNA folding. Our method allows rapid conformational sampling of RNA folding, with computational time increasing linearly with RNA length. We envision this approach as a promising tool for RNA structural and functional analyses. PMID:18456842
deepNF: Deep network fusion for protein function prediction.
Gligorijevic, Vladimir; Barot, Meet; Bonneau, Richard
2018-06-01
The prevalence of high-throughput experimental methods has resulted in an abundance of large-scale molecular and functional interaction networks. The connectivity of these networks provides a rich source of information for inferring functional annotations for genes and proteins. An important challenge has been to develop methods for combining these heterogeneous networks to extract useful protein feature representations for function prediction. Most of the existing approaches for network integration use shallow models that encounter difficulty in capturing complex and highly-nonlinear network structures. Thus, we propose deepNF, a network fusion method based on Multimodal Deep Autoencoders to extract high-level features of proteins from multiple heterogeneous interaction networks. We apply this method to combine STRING networks to construct a common low-dimensional representation containing high-level protein features. We use separate layers for different network types in the early stages of the multimodal autoencoder, later connecting all the layers into a single bottleneck layer from which we extract features to predict protein function. We compare the cross-validation and temporal holdout predictive performance of our method with state-of-the-art methods, including the recently proposed method Mashup. Our results show that our method outperforms previous methods for both human and yeast STRING networks. We also show substantial improvement in the performance of our method in predicting GO terms of varying type and specificity. deepNF is freely available at: https://github.com/VGligorijevic/deepNF. vgligorijevic@flatironinstitute.org, rb133@nyu.edu. Supplementary data are available at Bioinformatics online.
ChemoPy: freely available python package for computational biology and chemoinformatics.
Cao, Dong-Sheng; Xu, Qing-Song; Hu, Qian-Nan; Liang, Yi-Zeng
2013-04-15
Molecular representation for small molecules has been routinely used in QSAR/SAR, virtual screening, database search, ranking, drug ADME/T prediction and other drug discovery processes. To facilitate extensive studies of drug molecules, we developed a freely available, open-source python package called chemoinformatics in python (ChemoPy) for calculating the commonly used structural and physicochemical features. It computes 16 drug feature groups composed of 19 descriptors that include 1135 descriptor values. In addition, it provides seven types of molecular fingerprint systems for drug molecules, including topological fingerprints, electro-topological state (E-state) fingerprints, MACCS keys, FP4 keys, atom pairs fingerprints, topological torsion fingerprints and Morgan/circular fingerprints. By applying a semi-empirical quantum chemistry program MOPAC, ChemoPy can also compute a large number of 3D molecular descriptors conveniently. The python package, ChemoPy, is freely available via http://code.google.com/p/pychem/downloads/list, and it runs on Linux and MS-Windows. Supplementary data are available at Bioinformatics online.
Warth, A
2015-11-01
Tumor diagnostics are based on histomorphology, immunohistochemistry and molecular pathological analysis of mutations, translocations and amplifications which are of diagnostic, prognostic and/or predictive value. In recent decades only histomorphology was used to classify lung cancer as either small (SCLC) or non-small cell lung cancer (NSCLC), although NSCLC was further subdivided into different entities; however, as no specific therapy options were available, classification of specific subtypes was not clinically meaningful. This fundamentally changed with the discovery of specific molecular alterations in adenocarcinoma (ADC), e.g. mutations in KRAS, EGFR and BRAF or translocations of the ALK and ROS1 gene loci, which now form the basis of targeted therapies and have led to a significantly improved patient outcome. The diagnostic, prognostic and predictive value of imaging, morphological, immunohistochemical and molecular characteristics as well as their interaction were systematically assessed in a large cohort with available clinical data including patient survival. Specific and sensitive diagnostic markers and marker panels were defined and diagnostic test algorithms for predictive biomarker assessment were optimized. It was demonstrated that the semi-quantitative assessment of ADC growth patterns is a stage-independent predictor of survival and is reproducibly applicable in the routine setting. Specific histomorphological characteristics correlated with computed tomography (CT) imaging features and thus allowed an improved interdisciplinary classification, especially in the preoperative or palliative setting. Moreover, specific molecular characteristics, for example BRAF mutations and the proliferation index (Ki-67), were identified as clinically relevant prognosticators. Comprehensive clinical, morphological, immunohistochemical and molecular assessment of NSCLCs allows an optimized patient stratification. Respective algorithms now form the backbone of the 2015 lung cancer World Health Organization (WHO) classification.
Pronobis, Wiktor; Tkatchenko, Alexandre; Müller, Klaus-Robert
2018-06-12
Machine learning (ML) based prediction of molecular properties across chemical compound space is an important and alternative approach to efficiently estimate the solutions of highly complex many-electron problems in chemistry and physics. Statistical methods represent molecules as descriptors that should encode molecular symmetries and interactions between atoms. Many such descriptors have been proposed; all of them have advantages and limitations. Here, we propose a set of general two-body and three-body interaction descriptors which are invariant to translation, rotation, and atomic indexing. By adapting the successfully used kernel ridge regression methods of machine learning, we evaluate our descriptors on predicting several properties of small organic molecules calculated using density-functional theory. We use two data sets. The GDB-7 set contains 6868 molecules with up to 7 heavy atoms of type CNO. The GDB-9 set is composed of 131722 molecules with up to 9 heavy atoms containing CNO. When trained on 5000 random molecules, our best model achieves an accuracy of 0.8 kcal/mol (on the remaining 1868 molecules of GDB-7) and 1.5 kcal/mol (on the remaining 126722 molecules of GDB-9) respectively. Applying a linear regression model on our novel many-body descriptors performs almost equal to a nonlinear kernelized model. Linear models are readily interpretable: a feature importance ranking measure helps to obtain qualitative and quantitative insights on the importance of two- and three-body molecular interactions for predicting molecular properties computed with quantum-mechanical methods.
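As a rough illustration of what invariance to translation, rotation, and atomic indexing means for a two-body descriptor, the sketch below collects sorted inverse interatomic distances for each element pair. This is a simplified stand-in and not the descriptors proposed in the paper; the function name `two_body_descriptor` and the example coordinates are hypothetical.

```python
import math
from itertools import combinations


def two_body_descriptor(atoms):
    """Map each unordered element pair to its sorted inverse distances.

    atoms: list of (element_symbol, (x, y, z)) tuples.
    Distances are unchanged by translation and rotation; sorting the values
    and the element pair removes any dependence on atom ordering.
    """
    features = {}
    for (el_a, pos_a), (el_b, pos_b) in combinations(atoms, 2):
        key = tuple(sorted((el_a, el_b)))
        features.setdefault(key, []).append(1.0 / math.dist(pos_a, pos_b))
    return {key: sorted(values) for key, values in features.items()}


# Hypothetical three-atom fragment, then the same fragment translated
# by (5, 5, 5) and listed in a different atom order.
fragment = [("C", (0.0, 0.0, 0.0)), ("O", (1.2, 0.0, 0.0)), ("H", (0.0, 1.0, 0.0))]
moved = [("H", (5.0, 6.0, 5.0)), ("C", (5.0, 5.0, 5.0)), ("O", (6.2, 5.0, 5.0))]

original_features = two_body_descriptor(fragment)
moved_features = two_body_descriptor(moved)
```

A regression model needs a fixed-length input, so a practical descriptor would additionally have to reduce these variable-length lists to a fixed number of values; that step is left out here.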
A Predictive Model of Intein Insertion Site for Use in the Engineering of Molecular Switches
Apgar, James; Ross, Mary; Zuo, Xiao; Dohle, Sarah; Sturtevant, Derek; Shen, Binzhang; de la Vega, Humberto; Lessard, Philip; Lazar, Gabor; Raab, R. Michael
2012-01-01
Inteins are intervening protein domains with self-splicing ability that can be used as molecular switches to control activity of their host protein. Successfully engineering an intein into a host protein requires identifying an insertion site that permits intein insertion and splicing while allowing for proper folding of the mature protein post-splicing. By analyzing sequence and structure based properties of native intein insertion sites we have identified four features that showed significant correlation with the location of the intein insertion sites, and therefore may be useful in predicting insertion sites in other proteins that provide native-like intein function. Three of these properties, the distance to the active site and dimer interface site, the SVM score of the splice site cassette, and the sequence conservation of the site showed statistically significant correlation and strong predictive power, with area under the curve (AUC) values of 0.79, 0.76, and 0.73 respectively, while the distance to secondary structure/loop junction showed significance but with less predictive power (AUC of 0.54). In a case study of 20 insertion sites in the XynB xylanase, two features of native insertion sites showed correlation with the splice sites and demonstrated predictive value in selecting non-native splice sites. Structural modeling of intein insertions at two sites highlighted the role that the insertion site location could play on the ability of the intein to modulate activity of the host protein. These findings can be used to enrich the selection of insertion sites capable of supporting intein splicing and hosting an intein switch. PMID:22649521
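The AUC values quoted for the individual features can be read as the probability that a randomly chosen true insertion site scores higher on that feature than a randomly chosen other site. A minimal sketch of that calculation follows; the function name `auc` and the scores are hypothetical and are not data from the study.

```python
def auc(scores_pos, scores_neg):
    """Fraction of (positive, negative) pairs ranked correctly; ties count one half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical feature values for native insertion sites and for other sites.
native_sites = [0.9, 0.8, 0.4]
other_sites = [0.7, 0.3, 0.2, 0.1]
value = auc(native_sites, other_sites)
```

On this reading, an AUC near 0.5 (such as the 0.54 reported for the distance to a secondary structure/loop junction) means the feature ranks sites only slightly better than chance.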
Latent feature decompositions for integrative analysis of multi-platform genomic data
Gregory, Karl B.; Momin, Amin A.; Coombes, Kevin R.; Baladandayuthapani, Veerabhadran
2015-01-01
Increased availability of multi-platform genomics data on matched samples has sparked research efforts to discover how diverse molecular features interact both within and between platforms. In addition, simultaneous measurements of genetic and epigenetic characteristics illuminate the roles their complex relationships play in disease progression and outcomes. However, integrative methods for diverse genomics data are faced with the challenges of ultra-high dimensionality and the existence of complex interactions both within and between platforms. We propose a novel modeling framework for integrative analysis based on decompositions of the large number of platform-specific features into a smaller number of latent features. Subsequently we build a predictive model for clinical outcomes accounting for both within- and between-platform interactions based on Bayesian model averaging procedures. Principal components, partial least squares and non-negative matrix factorization as well as sparse counterparts of each are used to define the latent features, and the performance of these decompositions is compared both on real and simulated data. The latent feature interactions are shown to preserve interactions between the original features and not only aid prediction but also allow explicit selection of outcome-related features. The methods are motivated by and applied to, a glioblastoma multiforme dataset from The Cancer Genome Atlas to predict patient survival times integrating gene expression, microRNA, copy number and methylation data. For the glioblastoma data, we find a high concordance between our selected prognostic genes and genes with known associations with glioblastoma. In addition, our model discovers several relevant cross-platform interactions such as copy number variation associated gene dosing and epigenetic regulation through promoter methylation. 
On simulated data, we show that our proposed method successfully incorporates interactions within and between genomic platforms to aid accurate prediction and variable selection. Our methods perform best when principal components are used to define the latent features. PMID:26146492
Long-range spin coherence in a strongly coupled all-electronic dot-cavity system
NASA Astrophysics Data System (ADS)
Ferguson, Michael Sven; Oehri, David; Rössler, Clemens; Ihn, Thomas; Ensslin, Klaus; Blatter, Gianni; Zilberberg, Oded
2017-12-01
We present a theoretical analysis of spin-coherent electronic transport across a mesoscopic dot-cavity system. Such spin-coherent transport has been recently demonstrated in an experiment with a dot-cavity hybrid implemented in a high-mobility two-dimensional electron gas [C. Rössler et al., Phys. Rev. Lett. 115, 166603 (2015), 10.1103/PhysRevLett.115.166603] and its spectroscopic signatures have been interpreted in terms of a competition between Kondo-type dot-lead and molecular-type dot-cavity singlet formation. Our analysis brings forward all the transport features observed in the experiments and supports the claim that a spin-coherent molecular singlet forms across the full extent of the dot-cavity device. Our model analysis includes (i) a single-particle numerical investigation of the two-dimensional geometry, its quantum-coral-type eigenstates, and associated spectroscopic transport features, (ii) the derivation of an effective interacting model based on the observations of the numerical and experimental studies, and (iii) the prediction of transport characteristics through the device using a combination of a master-equation approach on top of exact eigenstates of the dot-cavity system, and an equation-of-motion analysis that includes Kondo physics. The latter provides additional temperature scaling predictions for the many-body phase transition between molecular- and Kondo-singlet formation and its associated transport signatures.
Computational prediction of kink properties of helices in membrane proteins
NASA Astrophysics Data System (ADS)
Mai, T.-L.; Chen, C.-M.
2014-02-01
We have combined molecular dynamics simulations and fold identification procedures to investigate the structure of 696 kinked and 120 unkinked transmembrane (TM) helices in the PDBTM database. The main aim of this study is to understand the formation of helical kinks by simulating their quasi-equilibrium heating processes, which might be relevant to the prediction of their structural features. The simulated structural features of these TM helices, including the position and the angle of helical kinks, were analyzed and compared with statistical data from PDBTM. From quasi-equilibrium heating processes of TM helices with four very different relaxation time constants, we found that these processes gave comparable predictions of the structural features of TM helices. Overall, 95 % of our best kink position predictions have an error of no more than two residues and 75 % of our best angle predictions have an error of less than 15°. Various structure assessments have been carried out to assess our predicted models of TM helices in PDBTM. Our results show that, in 696 predicted kinked helices, 70 % have an RMSD less than 2 Å, 71 % have a TM-score greater than 0.5, 69 % have a MaxSub score greater than 0.8, 60 % have a GDT-TS score greater than 85, and 58 % have a GDT-HA score greater than 70. For unkinked helices, our predicted models are also highly consistent with their crystal structure. These results provide strong support for our assumption that kink formation of TM helices in quasi-equilibrium heating processes is relevant to predicting the structure of TM helices.
NASA Astrophysics Data System (ADS)
Andersson, C. David; Hillgren, J. Mikael; Lindgren, Cecilia; Qian, Weixing; Akfur, Christine; Berg, Lotta; Ekström, Fredrik; Linusson, Anna
2015-03-01
Scientific disciplines such as medicinal and environmental chemistry, pharmacology, and toxicology deal with questions related to the effects small organic compounds exert on biological targets and the compounds' physicochemical properties responsible for these effects. A common strategy in this endeavor is to establish structure-activity relationships (SARs). The aim of this work was to illustrate benefits of performing a statistical molecular design (SMD) and proper statistical analysis of the molecules' properties before SAR and quantitative structure-activity relationship (QSAR) analysis. Our SMD followed by synthesis yielded a set of inhibitors of the enzyme acetylcholinesterase (AChE) that had very few inherent dependencies between the substructures in the molecules. If such dependencies exist, they cause severe errors in SAR interpretation and predictions by QSAR models, and leave a set of molecules less suitable for future decision-making. In our study, SAR and QSAR models could show which molecular substructures and physicochemical features were advantageous for AChE inhibition. Finally, the QSAR model was used for the prediction of the inhibition of AChE by an external prediction set of molecules. The accuracy of these predictions was assessed by statistical significance tests and by comparisons to simple but relevant reference models.
Janet, Jon Paul; Kulik, Heather J
2017-11-22
Machine learning (ML) of quantum mechanical properties shows promise for accelerating chemical discovery. For transition metal chemistry where accurate calculations are computationally costly and available training data sets are small, the molecular representation becomes a critical ingredient in ML model predictive accuracy. We introduce a series of revised autocorrelation functions (RACs) that encode relationships of the heuristic atomic properties (e.g., size, connectivity, and electronegativity) on a molecular graph. We alter the starting point, scope, and nature of the quantities evaluated in standard ACs to make these RACs amenable to inorganic chemistry. On an organic molecule set, we first demonstrate superior standard AC performance to other presently available topological descriptors for ML model training, with mean unsigned errors (MUEs) for atomization energies on set-aside test molecules as low as 6 kcal/mol. For inorganic chemistry, our RACs yield 1 kcal/mol ML MUEs on set-aside test molecules in spin-state splitting in comparison to 15-20× higher errors for feature sets that encode whole-molecule structural information. Systematic feature selection methods including univariate filtering, recursive feature elimination, and direct optimization (e.g., random forest and LASSO) are compared. Random-forest- or LASSO-selected subsets 4-5× smaller than the full RAC set produce sub- to 1 kcal/mol spin-splitting MUEs, with good transferability to metal-ligand bond length prediction (0.004-5 Å MUE) and redox potential on a smaller data set (0.2-0.3 eV MUE). Evaluation of feature selection results across property sets reveals the relative importance of local, electronic descriptors (e.g., electronegativity, atomic number) in spin-splitting and distal, steric effects in redox potential and bond lengths.
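As a rough illustration of the standard autocorrelation (AC) idea that the RACs build on, the sketch below sums products of an atomic property over all atom pairs separated by a given number of bonds on the molecular graph. It covers only the standard form, not the revised starting points or scopes introduced in the paper. Counting ordered pairs (so each unordered pair contributes twice) is an assumed convention, and the helper names and the three-atom example graph are hypothetical.

```python
from collections import deque


def graph_distances(adjacency, start):
    """Breadth-first search: number of bonds from `start` to every reachable atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def autocorrelation(adjacency, prop, depth):
    """Sum prop[i] * prop[j] over ordered atom pairs exactly `depth` bonds apart."""
    total = 0.0
    for i in adjacency:
        for j, d in graph_distances(adjacency, i).items():
            if d == depth:
                total += prop[i] * prop[j]
    return total


# Heavy-atom graph C-C-O with Pauling electronegativities as the atomic property.
adjacency = {0: [1], 1: [0, 2], 2: [1]}
chi = {0: 2.55, 1: 2.55, 2: 3.44}
ac = [autocorrelation(adjacency, chi, depth) for depth in range(3)]
```

Each depth and each atomic property yields one feature, so the feature vector has the same length for every molecule, which is what makes such descriptors convenient inputs for feature selection.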
Chen, Lei; Zhang, Yu-Hang; Zheng, Mingyue; Huang, Tao; Cai, Yu-Dong
2016-12-01
Compound-protein interactions play important roles in every cell via the recognition and regulation of specific functional proteins. The correct identification of compound-protein interactions can lead to a good comprehension of this complicated system and provide useful input for the investigation of various attributes of compounds and proteins. In this study, we attempted to understand this system by extracting properties from both proteins and compounds, in which proteins were represented by gene ontology and KEGG pathway enrichment scores and compounds were represented by molecular fragments. Advanced feature selection methods, including minimum redundancy maximum relevance, incremental feature selection, and the basic machine learning algorithm random forest, were used to analyze these properties and extract core factors for the determination of actual compound-protein interactions. Compound-protein interactions reported in The Binding Databases were used as positive samples. To improve the reliability of the results, the analytic procedure was executed five times using different negative samples. Simultaneously, five optimal prediction methods based on a random forest and yielding maximum MCCs of approximately 77.55 % were constructed and may be useful tools for the prediction of compound-protein interactions. This work provides new clues to understanding the system of compound-protein interactions by analyzing extracted core features. Our results indicate that compound-protein interactions are related to biological processes involving immune, developmental and hormone-associated pathways.
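The Matthews correlation coefficient (MCC) used above to score the random forest models is computed from the four cells of a binary confusion matrix. A minimal sketch follows; the counts are hypothetical and do not come from the study, and returning 0 when the denominator vanishes is an assumed convention.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # assumed convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts for one positive/negative sample set.
example = mcc(tp=90, tn=85, fp=15, fn=10)
```

Because MCC uses all four counts, it stays informative when positives and negatives are sampled in different proportions, which matters when the negative set is redrawn several times as described.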
Shahid, Mohammad; Shahzad Cheema, Muhammad; Klenner, Alexander; Younesi, Erfan; Hofmann-Apitius, Martin
2013-03-01
Systems pharmacological modeling of drug mode of action for the next generation of multitarget drugs may open new routes for drug design and discovery. Computational methods are widely used in this context, amongst which support vector machines (SVM) have proven successful in addressing the challenge of classifying drugs with similar features. We have applied a variant of such SVM-based approaches, namely SVM-based recursive feature elimination (SVM-RFE). We use the approach to predict the pharmacological properties of drugs widely used against complex neurodegenerative disorders (NDD) and to build an in-silico computational model for the binary classification of NDD drugs from other drugs. Application of an SVM-RFE model to a set of drugs successfully classified NDD drugs from non-NDD drugs and resulted in an overall accuracy of ∼80 % with 10-fold cross-validation using the 40 top-ranked molecular descriptors selected out of a total of 314 descriptors. Moreover, the SVM-RFE method outperformed linear discriminant analysis (LDA) based feature selection and classification. The model reduced the multidimensional descriptor space of drugs dramatically and predicted NDD drugs with high accuracy, while avoiding overfitting. Based on these results, NDD-specific focused libraries of drug-like compounds can be designed and existing NDD-specific drugs can be characterized by a well-characterized set of molecular descriptors. Copyright © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Hybrid optimal descriptors as a tool to predict skin sensitization in accordance to OECD principles.
Toropova, Alla P; Toropov, Andrey A
2017-06-05
Skin sensitization (allergic contact dermatitis) is a widespread problem arising from the contact of chemicals with the skin. The detection of molecular features with undesired effects on skin is a complex task owing to unclear biochemical mechanisms and unclear conditions of action of chemicals on skin. The development of computational methods for estimation of this endpoint in order to reduce animal testing is recommended (Cosmetics Directive EC regulation 1907/2006; EU Regulation 1223/2009). The CORAL software (http://www.insilico.eu/coral) gives good predictive models for skin sensitization. The simplified molecular input-line entry system (SMILES) together with the molecular graph are used to represent the molecular structure for these models. So-called hybrid optimal descriptors are used to establish quantitative structure-activity relationships (QSARs). The aim of this study is the estimation of the predictive potential of the hybrid descriptors. Three different distributions into the training (≈70%), calibration (≈15%), and validation (≈15%) sets are studied. QSARs for these three distributions are built using the Monte Carlo technique. The statistical characteristics of these models for the external validation set are used as a measure of their predictive potential. The best model, according to the above criterion, is characterized by n_validation = 29, r²_validation = 0.8596, RMSE_validation = 0.489. Mechanistic interpretation and domain of applicability for these models are defined. Copyright © 2017 Elsevier B.V. All rights reserved.
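The split-and-score protocol described above can be sketched in a few lines. This does not reproduce CORAL or its Monte Carlo optimization. It only shows a random split into roughly 70/15/15 parts and the two validation statistics, with r² taken here as the squared Pearson correlation between observed and predicted values (an assumption about the definition). All function names are hypothetical.

```python
import math
import random


def three_way_split(items, fractions=(0.70, 0.15, 0.15), seed=0):
    """Shuffle and cut into training, calibration and validation sets."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_train = round(fractions[0] * len(shuffled))
    n_calib = round(fractions[1] * len(shuffled))
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_calib],
            shuffled[n_train + n_calib:])


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Squared Pearson correlation coefficient."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


# Three different splits of the same (hypothetical) compound indices.
splits = [three_way_split(range(200), seed=s) for s in (1, 2, 3)]
```

Repeating the model building on several splits, as the study does with three distributions, guards against a favorable validation score that is an accident of one particular partition.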
Wu, Jiansheng; Zhang, Qiuming; Wu, Weijian; Pang, Tao; Hu, Haifeng; Chan, Wallace K B; Ke, Xiaoyan; Zhang, Yang; Wren, Jonathan
2018-02-08
Precise assessment of ligand bioactivities (including IC50, EC50, Ki, Kd, etc.) is essential for virtual screening and lead compound identification. However, not all ligands have experimentally determined activities. In particular, many G protein-coupled receptors (GPCRs), which are the largest integral membrane protein family and represent targets of nearly 40% of drugs on the market, lack published experimental data about ligand interactions. Computational methods with the ability to accurately predict the bioactivity of ligands can help efficiently address this problem. We proposed a new method, WDL-RF, using weighted deep learning and random forest, to model the bioactivity of GPCR-associated ligand molecules. The pipeline of our algorithm consists of two consecutive stages: 1) molecular fingerprint generation through a new weighted deep learning method, and 2) bioactivity calculations with a random forest model. One distinctive aspect of the approach is that the model allows end-to-end learning of prediction pipelines with input ligands of arbitrary size. The method was tested on a set of twenty-six non-redundant GPCRs that have a high number of active ligands, each with 200∼4000 ligand associations. The results from our benchmark show that WDL-RF can generate bioactivity predictions with an average root-mean-square error of 1.33 and correlation coefficient (r²) of 0.80 compared to the experimental measurements, which are significantly more accurate than the control predictors with different molecular fingerprints and descriptors. In particular, data-driven molecular fingerprint features, as extracted from the weighted deep learning models, can help solve deficiencies stemming from the use of traditional hand-crafted features and significantly increase the efficiency of short molecular fingerprints in virtual screening.
The WDL-RF web server, as well as source codes and datasets of WDL-RF, is freely available at https://zhanglab.ccmb.med.umich.edu/WDL-RF/ for academic purposes. Contact: Xiaoyan Ke (kexynj@hotmail.com); Yang Zhang (zhng@umich.edu). Supplementary data are available at Bioinformatics online.
A molecular thermodynamic model for the stability of hepatitis B capsids
NASA Astrophysics Data System (ADS)
Kim, Jehoon; Wu, Jianzhong
2014-06-01
Self-assembly of capsid proteins and genome encapsidation are two critical steps in the life cycle of most plant and animal viruses. A theoretical description of such processes from a physicochemical perspective may help better understand viral replication and morphogenesis and thus provide fresh insights into the experimental studies of antiviral strategies. In this work, we propose a molecular thermodynamic model for predicting the stability of Hepatitis B virus (HBV) capsids either with or without loading nucleic materials. With the key components represented by coarse-grained thermodynamic models, the theoretical predictions are in excellent agreement with experimental data for the formation free energies of empty T4 capsids over a broad range of temperature and ion concentrations. The theoretical model predicts T3/T4 dimorphism also in good agreement with the capsid formation at in vivo and in vitro conditions. In addition, we have studied the stability of the viral particles in response to physiological cellular conditions with the explicit consideration of the hydrophobic association of capsid subunits, electrostatic interactions, molecular excluded volume effects, entropy of mixing, and conformational changes of the biomolecular species. The coarse-grained model captures the essential features of the HBV nucleocapsid stability revealed by recent experiments.
A VEGF-dependent gene signature enriched in mesenchymal ovarian cancer predicts patient prognosis.
Yin, Xia; Wang, Xiaojie; Shen, Boqiang; Jing, Ying; Li, Qing; Cai, Mei-Chun; Gu, Zhuowei; Yang, Qi; Zhang, Zhenfeng; Liu, Jin; Li, Hongxia; Di, Wen; Zhuang, Guanglei
2016-08-08
We have previously reported surrogate biomarkers of VEGF pathway activities with the potential to provide predictive information for anti-VEGF therapies. The aim of this study was to systematically evaluate a new VEGF-dependent gene signature (VDGs) in relation to molecular subtypes of ovarian cancer and patient prognosis. Using microarray profiling and cross-species analysis, we identified 140-gene mouse VDGs and corresponding 139-gene human VDGs, which displayed enrichment of vasculature and basement membrane genes. In patients who received bevacizumab therapy and showed partial response, the expressions of VDGs (summarized to yield VDGs scores) were markedly decreased in post-treatment biopsies compared with pre-treatment baselines. In contrast, VDGs scores were not significantly altered following bevacizumab treatment in patients with stable or progressive disease. Analysis of VDGs in ovarian cancer showed that VDGs as a prognostic signature was able to predict patient outcome. Correlation estimation of VDGs scores and molecular features revealed that VDGs was overrepresented in mesenchymal subtype and BRCA mutation carriers. These findings highlighted the prognostic role of VEGF-mediated angiogenesis in ovarian cancer, and proposed a VEGF-dependent gene signature as a molecular basis for developing novel diagnostic strategies to aid patient selection for VEGF-targeted agents.
KRAS Mutation as a Potential Prognostic Biomarker of Biliary Tract Cancers
Yokoyama, Masaaki; Ohnishi, Hiroaki; Ohtsuka, Kouki; Matsushima, Satsuki; Ohkura, Yasuo; Furuse, Junji; Watanabe, Takashi; Mori, Toshiyuki; Sugiyama, Masanori
2016-01-01
BACKGROUND The aim of this study was to identify the unique molecular characteristics of biliary tract cancer (BTC) for the development of novel molecular-targeted therapies. MATERIALS AND METHODS We performed mutational analysis of KRAS, BRAF, PIK3CA, and FBXW7 and immunohistochemical analysis of EGFR and TP53 in 63 Japanese patients with BTC and retrospectively evaluated the association between the molecular characteristics and clinicopathological features of BTC. RESULTS KRAS mutations were identified in 9 (14%) of the 63 BTC patients; no mutations were detected within the analyzed regions of BRAF, PIK3CA, and FBXW7. EGFR overexpression was observed in 5 (8%) of the 63 tumors, while TP53 overexpression was observed in 48% (30/63) of the patients. Overall survival of patients with KRAS mutation was significantly shorter than that of patients with the wild-type KRAS gene (P = 0.005). By multivariate analysis incorporating molecular and clinicopathological features, KRAS mutations and lymph node metastasis were identified to be independently associated with shorter overall survival (KRAS, P = 0.004; lymph node metastasis, P = 0.015). CONCLUSIONS Our data suggest that KRAS mutation is a poor prognosis predictive biomarker for the survival in BTC patients. PMID:28008299
NASA Astrophysics Data System (ADS)
Toropov, Andrey A.; Toropova, Alla P.
2018-06-01
A predictive model of logP for Pt(II) and Pt(IV) complexes, built with the Monte Carlo method using the CORAL software, has been validated with six different splits into training and validation sets. The predictive potential of the models for the six splits has been improved using the so-called index of ideality of correlation. The suggested models make it possible to extract molecular features that cause an increase or, conversely, a decrease of logP.
Kandaswamy, Krishna Kumar; Pugalenthi, Ganesan; Möller, Steffen; Hartmann, Enno; Kalies, Kai-Uwe; Suganthan, P N; Martinetz, Thomas
2010-12-01
Apoptosis is an essential process for controlling tissue homeostasis by regulating a physiological balance between cell proliferation and cell death. The subcellular locations of proteins performing the cell death are determined by mostly independent cellular mechanisms. The regular bioinformatics tools for predicting the subcellular locations of such apoptotic proteins often fail. This work proposes a model for the sorting of proteins that are involved in apoptosis, allowing us to predict both their subcellular locations and the molecular properties that contribute to them. We report a novel hybrid Genetic Algorithm (GA)/Support Vector Machine (SVM) approach to predict apoptotic protein sequences using 119 sequence-derived properties such as frequency of amino acid groups, secondary structure, and physicochemical properties. GA is used for selecting a near-optimal subset of informative features that is most relevant for the classification. Jackknife cross-validation is applied to test the predictive capability of the proposed method on 317 apoptosis proteins. Our method achieved 85.80% accuracy using all 119 features and 89.91% accuracy for 25 features selected by GA. Our models were examined on a test dataset of 98 apoptosis proteins and obtained an overall accuracy of 90.34%. The results show that the proposed approach is promising; it is able to select small subsets of features and still improve the classification accuracy. Our model can contribute to the understanding of programmed cell death and drug discovery. The software and dataset are available at http://www.inb.uni-luebeck.de/tools-demos/apoptosis/GASVM.
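The jackknife (leave-one-out) test mentioned above holds out each protein in turn, trains on the rest, and counts how often the held-out protein is classified correctly. The sketch below shows only that evaluation loop. A nearest-centroid classifier stands in for the SVM, the genetic algorithm is omitted entirely, and the feature vectors and location labels are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, query):
    """Assign `query` to the class whose mean feature vector is closest (stand-in for an SVM)."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(column) / len(rows) for column in zip(*rows)]

    def squared_distance(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return min(centroids, key=lambda label: squared_distance(centroids[label], query))


def jackknife_accuracy(features, labels):
    """Leave-one-out accuracy: each sample is predicted by a model trained on all the others."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_centroid_predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)


# Hypothetical two-feature vectors for six proteins in two locations.
features = [[0.1, 0.2], [0.2, 0.1], [0.0, 0.3], [0.9, 0.8], [1.0, 0.7], [0.8, 1.0]]
labels = ["cytoplasm", "cytoplasm", "cytoplasm",
          "mitochondrion", "mitochondrion", "mitochondrion"]
accuracy = jackknife_accuracy(features, labels)
```

In a GA-based selection scheme, a score of this kind would typically serve as the fitness of each candidate feature subset, with `features` restricted to the columns that subset keeps.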
Rosnik, Andreana M; Curutchet, Carles
2015-12-08
Over the past decade, both experimentalists and theorists have worked to develop methods to describe pigment-protein coupling in photosynthetic light-harvesting complexes in order to understand the molecular basis of quantum coherence effects observed in photosynthesis. Here we present an improved strategy based on the combination of quantum mechanics/molecular mechanics (QM/MM) molecular dynamics (MD) simulations and excited-state calculations to predict the spectral density of electronic-vibrational coupling. We study the water-soluble chlorophyll-binding protein (WSCP) reconstituted with Chl a or Chl b pigments as the system of interest and compare our work with data obtained by Pieper and co-workers from differential fluorescence line-narrowing spectra (Pieper et al. J. Phys. Chem. B 2011, 115 (14), 4042-4052). Our results demonstrate that the use of QM/MM MD simulations where the nuclear positions are still propagated at the classical level leads to a striking improvement of the predicted spectral densities in the middle- and high-frequency regions, where they nearly reach quantitative accuracy. This demonstrates that the so-called "geometry mismatch" problem related to the use of low-quality structures in QM calculations, not the quantum features of pigments high-frequency motions, causes the failure of previous studies relying on similar protocols. Thus, this work paves the way toward quantitative predictions of pigment-protein coupling and the comprehension of quantum coherence effects in photosynthesis.
Scott, Milcah C.; Sarver, Aaron L.; Gavin, Katherine J.; Thayanithy, Venugopal; Getzy, David M.; Newman, Robert A.; Cutter, Gary R.; Lindblad-Toh, Kerstin; Kisseberth, William C.; Hunter, Lawrence E.; Subramanian, Subbaya; Breen, Matthew; Modiano, Jaime F.
2011-01-01
The heterogeneous and chaotic nature of osteosarcoma has confounded accurate molecular classification, prognosis, and prediction for this tumor. The occurrence of spontaneous osteosarcoma is largely confined to humans and dogs. While the clinical features are remarkably similar in both species, the organization of dogs into defined breeds provides a more homogeneous genetic background that may increase the likelihood to uncover molecular subtypes for this complex disease. We thus hypothesized that molecular profiles derived from canine osteosarcoma would aid in molecular subclassification of this disease when applied to humans. To test the hypothesis, we performed genome wide gene expression profiling in a cohort of dogs with osteosarcoma, primarily from high-risk breeds. To further reduce inter-sample heterogeneity, we assessed tumor-intrinsic properties through use of an extensive panel of osteosarcoma-derived cell lines. We observed strong differential gene expression that segregated samples into two groups with differential survival probabilities. Groupings were characterized by the inversely correlated expression of genes associated with G2/M transition and DNA damage checkpoint and microenvironment-interaction categories. This signature was preserved in data from whole tumor samples of three independent dog osteosarcoma cohorts, with stratification into the two expected groups. Significantly, this restricted signature partially overlapped a previously defined, predictive signature for soft tissue sarcomas, and it unmasked orthologous molecular subtypes and their corresponding natural histories in five independent data sets from human patients with osteosarcoma. Our results indicate that the narrower genetic diversity of dogs can be utilized to group complex human osteosarcoma into biologically and clinically relevant molecular subtypes. This in turn may enhance prognosis and prediction, and identify relevant therapeutic targets. PMID:21621658
Machine learning for epigenetics and future medical applications.
Holder, Lawrence B; Haque, M Muksitul; Skinner, Michael K
2017-07-03
Understanding epigenetic processes holds immense promise for medical applications. Advances in Machine Learning (ML) are critical to realize this promise. Previous studies used epigenetic data sets associated with the germline transmission of epigenetic transgenerational inheritance of disease and novel ML approaches to predict genome-wide locations of critical epimutations. A combination of Active Learning (ACL) and Imbalanced Class Learning (ICL) was used to address past problems with ML to develop a more efficient feature selection process and address the imbalance problem in all genomic data sets. The power of this novel ML approach and our ability to predict epigenetic phenomena and associated disease is suggested. The current approach requires extensive computation of features over the genome. A promising new approach is to introduce Deep Learning (DL) for the generation and simultaneous computation of novel genomic features tuned to the classification task. This approach can be used with any genomic or biological data set applied to medicine. The application of molecular epigenetic data in advanced machine learning analysis to medicine is the focus of this review.
Molecular genetics of childhood, adolescent and young adult non-Hodgkin lymphoma.
Miles, Rodney R; Shah, Rikin K; Frazer, J Kimble
2016-05-01
Molecular genetic abnormalities are ubiquitous in non-Hodgkin lymphoma (NHL), but genetic changes are not yet used to define specific lymphoma subtypes. Certain recurrent molecular genetic abnormalities in NHL underlie molecular pathogenesis and/or are associated with prognosis or represent potential therapeutic targets. Most molecular genetic studies of B- and T-NHL have been performed on adult patient samples, and the relevance of many of these findings for childhood, adolescent and young adult NHL remains to be demonstrated. In this review, we focus on NHL subtypes that are most common in young patients and emphasize features actually studied in younger NHL patients. This approach highlights what is known about NHL genetics in young patients but also points to gaps that remain, which will require cooperative efforts to collect and share biological specimens for genomic and genetic analyses in order to help predict outcomes and guide therapy in the future. © 2016 John Wiley & Sons Ltd.
KBG syndrome involving a single-nucleotide duplication in ANKRD11
Kleyner, Robert; Malcolmson, Janet; Tegay, David; Ward, Kenneth; Maughan, Annette; Maughan, Glenn; Nelson, Lesa; Wang, Kai; Robison, Reid; Lyon, Gholson J.
2016-01-01
KBG syndrome is a rare autosomal dominant genetic condition characterized by neurological involvement and distinct facial, hand, and skeletal features. More than 70 cases have been reported; however, it is likely that KBG syndrome is underdiagnosed because of lack of comprehensive characterization of the heterogeneous phenotypic features. We describe the clinical manifestations in a male currently 13 years of age, who exhibited symptoms including epilepsy, severe developmental delay, distinct facial features, and hand anomalies, without a positive genetic diagnosis. Subsequent exome sequencing identified a novel de novo heterozygous single base pair duplication (c.6015dupA) in ANKRD11, which was validated by Sanger sequencing. This single-nucleotide duplication is predicted to lead to a premature stop codon and loss of function in ANKRD11, thereby implicating it as contributing to the proband's symptoms and yielding a molecular diagnosis of KBG syndrome. Before molecular diagnosis, this syndrome was not recognized in the proband, as several key features of the disorder were mild and were not recognized by clinicians, further supporting the concept of variable expressivity in many disorders. Although a diagnosis of cerebral folate deficiency has also been given, its significance for the proband's condition remains uncertain. PMID:27900361
Chaudhari, Mangesh I; Holleran, Sinead A; Ashbaugh, Henry S; Pratt, Lawrence R
2013-12-17
The osmotic second virial coefficients, B2, for atomic-sized hard spheres in water are attractive (B2 < 0) and become more attractive with increasing temperature (ΔB2/ΔT < 0) in the temperature range 300 K ≤ T ≤ 360 K. Thus, these hydrophobic interactions are attractive and endothermic at moderate temperatures. Hydrophobic interactions between atomic-sized hard spheres in water are more attractive than predicted by the available statistical mechanical theory. These results constitute an initial step toward detailed molecular theory of additional intermolecular interaction features, specifically, attractive interactions associated with hydrophobic solutes.
Grasso, Ernesto J.; Sottile, Adolfo E.; Coronel, Carlos E.
2016-01-01
It is known that caltrin (calcium transport inhibitor) protein binds to sperm cells during ejaculation and inhibits extracellular Ca2+ uptake. Although the sequence and some biological features of mouse caltrin I and bovine caltrin are known, their physicochemical properties and tertiary structure are mainly unknown. We predicted the 3D structures of mouse caltrin I and bovine caltrin by molecular homology modeling and threading. Surface electrostatic potentials and electric fields were calculated using the Poisson–Boltzmann equation. Several different bioinformatics tools and available web servers were used to thoroughly analyze the physicochemical characteristics of both proteins, such as their Kyte and Doolittle hydropathy scores and helical wheel projections. The results presented in this work significantly aid further understanding of the molecular mechanisms of caltrin proteins modulating physiological processes associated with fertilization. PMID:27812283
Predicting clinical outcome of neuroblastoma patients using an integrative network-based approach.
Tranchevent, Léon-Charles; Nazarov, Petr V; Kaoma, Tony; Schmartz, Georges P; Muller, Arnaud; Kim, Sang-Yoon; Rajapakse, Jagath C; Azuaje, Francisco
2018-06-07
One of the main current challenges in computational biology is to make sense of the huge amounts of multidimensional experimental data that are being produced. For instance, large cohorts of patients are often screened using different high-throughput technologies, effectively producing multiple patient-specific molecular profiles for hundreds or thousands of patients. We propose and implement a network-based method that integrates such patient omics data into Patient Similarity Networks. Topological features derived from these networks were then used to predict relevant clinical features. As part of the 2017 CAMDA challenge, we have successfully applied this strategy to a neuroblastoma dataset, consisting of genomic and transcriptomic data. In particular, we observe that models built on our network-based approach perform at least as well as state of the art models. We furthermore explore the effectiveness of various topological features and observe, for instance, that redundant centrality metrics can be combined to build more powerful models. We demonstrate that the networks inferred from omics data contain clinically relevant information and that patient clinical outcomes can be predicted using only network topological data. This article was reviewed by Yang-Yu Liu, Tomislav Smuc and Isabel Nepomuceno.
EffectorP: predicting fungal effector proteins from secretomes using machine learning.
Sperschneider, Jana; Gardiner, Donald M; Dodds, Peter N; Tini, Francesco; Covarelli, Lorenzo; Singh, Karam B; Manners, John M; Taylor, Jennifer M
2016-04-01
Eukaryotic filamentous plant pathogens secrete effector proteins that modulate the host cell to facilitate infection. Computational effector candidate identification and subsequent functional characterization deliver valuable insights into plant-pathogen interactions. However, effector prediction in fungi has been challenging due to a lack of unifying sequence features such as conserved N-terminal sequence motifs. Fungal effectors are commonly predicted from secretomes based on criteria such as small size and high cysteine content, an approach that suffers from poor accuracy. We present EffectorP, which pioneers the application of machine learning to fungal effector prediction. EffectorP improves fungal effector prediction from secretomes based on a robust signal of sequence-derived properties, achieving sensitivity and specificity of over 80%. Features that discriminate fungal effectors from secreted noneffectors are predominantly sequence length, molecular weight and protein net charge, as well as cysteine, serine and tryptophan content. We demonstrate that EffectorP is powerful when combined with in planta expression data for predicting high-priority effector candidates. EffectorP is the first prediction program for fungal effectors based on machine learning. Our findings will facilitate functional fungal effector studies and improve our understanding of effectors in plant-pathogen interactions. EffectorP is available at http://effectorp.csiro.au. © 2015 CSIRO New Phytologist © 2015 New Phytologist Trust.
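As a rough illustration of the sequence-derived properties named in the abstract above, the following sketch computes length, cysteine, serine and tryptophan content, and a crude net charge for a protein sequence. It is not EffectorP's feature code: the function name `sequence_features` is hypothetical, the charge estimate simply counts K/R against D/E (ignoring histidine, termini and pH), and molecular weight is omitted because it needs a residue mass table.

```python
def sequence_features(seq):
    """Return simple sequence-derived features for one protein sequence.

    Assumption: a crude net charge that counts Lys/Arg as +1 and
    Asp/Glu as -1; real tools use pKa-based estimates.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    return {
        "length": n,
        "cysteine_fraction": seq.count("C") / n,
        "serine_fraction": seq.count("S") / n,
        "tryptophan_fraction": seq.count("W") / n,
        "crude_net_charge": seq.count("K") + seq.count("R")
        - seq.count("D") - seq.count("E"),
    }


# Hypothetical toy sequence, not a real effector.
features = sequence_features("MKCCSWDE")
print(features)
```

A feature dictionary of this kind could be fed to any standard classifier; which classifier and which additional features a given tool uses is not stated here.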
Docking and multivariate methods to explore HIV-1 drug-resistance: a comparative analysis
NASA Astrophysics Data System (ADS)
Almerico, Anna Maria; Tutone, Marco; Lauria, Antonino
2008-05-01
In this paper we describe a comparative analysis between multivariate and docking methods in the study of the drug resistance to the reverse transcriptase and the protease inhibitors. In our early papers we developed a simple but efficient method to evaluate the features of compounds that are less likely to trigger resistance or are effective against mutant HIV strains, using the multivariate statistical procedures PCA and DA. In the attempt to create a more solid background for the prediction of susceptibility or resistance, we carried out a comparative analysis between our previous multivariate approach and molecular docking study. The intent of this paper is not only to find further support to the results obtained by the combined use of PCA and DA, but also to evidence the structural features, in terms of molecular descriptors, similarity, and energetic contributions, derived from docking, which can account for the arising of drug-resistance against mutant strains.
Gharakhanian, Eric G; Deming, Timothy J
2016-07-07
A series of thermoresponsive polypeptides has been synthesized using a methodology that allowed facile adjustment of side-chain functional groups. The lower critical solution temperature (LCST) properties of these polymers in water were then evaluated relative to systematic molecular modifications in their side-chains. It was found that in addition to the number of ethylene glycol repeats in the side-chains, terminal and linker groups also have substantial and predictable effects on cloud point temperatures (Tcp). In particular, we found that the structure of these polypeptides allowed for inclusion of polar hydroxyl groups, which significantly increased their hydrophilicity and decreased the need to use long oligoethylene glycol repeats to obtain LCSTs. The thioether linkages in these polypeptides were found to provide an additional structural feature for reversible switching of both polypeptide conformation and thermoresponsive properties.
Yan, Mingquan; Han, Xuze; Zhang, Chenyang
2017-11-01
In this study, seven model compounds containing typical functional groups (phenolic and carboxylic groups) present in natural organic matter (NOM) were used to ascertain the nature of the characteristic bands in differential absorbance spectra (DAS) of NOM that are induced by metal ion binding. Some similarities were found between the DAS of the examined model compounds (caffeic acid, ferulic acid, sinapic acid, terephthalic acid, isophthalic acid, esculetin and myricetin) and those of NOM. The binding of Cu(II) with carboxylic group might produce two peaks, A1 and A2, while the binding of Cu(II) with phenolic group might produce all four Gaussian peaks, from A1 to A4, displayed in the DAS of NOM. The UV-visible spectra predicted using time-dependent density functional theory (TD-DFT)-based methods agreed well with the experimental DAS of model compounds at different stages of Cu(II) binding. This demonstrates that the features in absorbance spectra are chiefly caused by HOMO (Highest Occupied Molecular Orbital) - LUMO (Lowest Unoccupied Molecular Orbital) transitions in the molecule and that the appearance of peaks in DAS reflects the changes of the molecular orbitals around reactive functional groups in a molecule before and after metal ion binding. The basis of the DAS features of NOM that are induced by metal ion binding could be identified primarily by the frontier molecular orbital theory. Copyright © 2017 Elsevier Ltd. All rights reserved.
Molecular structure of bottlebrush polymers in melts
Paturej, Jarosław; Sheiko, Sergei S.; Panyukov, Sergey; Rubinstein, Michael
2016-01-01
Bottlebrushes are fascinating macromolecules that display an intriguing combination of molecular and particulate features having vital implications in both living and synthetic systems, such as cartilage and ultrasoft elastomers. However, the progress in practical applications is impeded by the lack of knowledge about the hierarchic organization of both individual bottlebrushes and their assemblies. We delineate fundamental correlations between molecular architecture, mesoscopic conformation, and macroscopic properties of polymer melts. Numerical simulations corroborate theoretical predictions for the effect of grafting density and side-chain length on the dimensions and rigidity of bottlebrushes, which effectively behave as a melt of flexible filaments. These findings provide quantitative guidelines for the design of novel materials that allow architectural tuning of their properties in a broad range without changing chemical composition. PMID:28861466
Integration of multimodal RNA-seq data for prediction of kidney cancer survival
Schwartz, Matt; Park, Martin; Phan, John H.; Wang, May D.
2016-01-01
Kidney cancer is of prominent concern in modern medicine. Predicting patient survival is critical to patient awareness and to developing proper treatment regimens. Previous prediction models built upon molecular feature analysis are limited to just gene expression data. In this study we investigate the difference in predicting five-year survival between unimodal and multimodal analysis of RNA-seq data from gene, exon, junction, and isoform modalities. Our preliminary findings report higher predictive accuracy, as measured by area under the ROC curve (AUC), for multimodal learning when compared to unimodal learning with both support vector machine (SVM) and k-nearest neighbor (KNN) methods. The results of this study justify further research on the use of multimodal RNA-seq data to predict survival for other cancer types using a larger sample size and additional machine learning methods. PMID:27532026
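The comparison above rests on the area under the ROC curve. A minimal sketch of that metric follows, using the pairwise (Mann-Whitney) formulation: the AUC is the fraction of positive/negative pairs in which the positive sample receives the higher score, with ties counted as one half. The labels and scores are hypothetical toy values, not data from the study, and the function name `roc_auc` is an assumption for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: 1 = survived five years, 0 = did not.
labels = [1, 1, 0, 1, 0, 0]
unimodal_scores = [0.9, 0.4, 0.5, 0.7, 0.3, 0.6]
multimodal_scores = [0.9, 0.6, 0.5, 0.8, 0.2, 0.4]

print(roc_auc(labels, unimodal_scores))
print(roc_auc(labels, multimodal_scores))
```

The quadratic pairwise loop is chosen for readability; a rank-based implementation gives the same value and scales better on large cohorts.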
Güssregen, Stefan; Matter, Hans; Hessler, Gerhard; Müller, Marco; Schmidt, Friedemann; Clark, Timothy
2012-09-24
Current 3D-QSAR methods such as CoMFA or CoMSIA make use of classical force-field approaches for calculating molecular fields. Thus, they cannot adequately account for noncovalent interactions involving halogen atoms like halogen bonds or halogen-π interactions. These deficiencies in the underlying force fields result from the lack of treatment of the anisotropy of the electron density distribution of those atoms, known as the "σ-hole", although recent developments have begun to take specific interactions such as halogen bonding into account. We have now replaced classical force field derived molecular fields by local properties such as the local ionization energy, local electron affinity, or local polarizability, calculated using quantum-mechanical (QM) techniques that do not suffer from the above limitation for 3D-QSAR. We first investigate the characteristics of QM-based local property fields to show that they are suitable for statistical analyses after suitable pretreatment. We then analyze these property fields with partial least-squares (PLS) regression to predict biological affinities of two data sets comprising factor Xa and GABA-A/benzodiazepine receptor ligands. While the resulting models perform equally well or even slightly better in terms of consistency and predictivity than the classical CoMFA fields, the most important aspect of these augmented field-types is that the chemical interpretation of resulting QM-based property field models reveals unique SAR trends driven by electrostatic and polarizability effects, which cannot be extracted directly from CoMFA electrostatic maps. Within the factor Xa set, the interaction of chlorine and bromine atoms with a tyrosine side chain in the protease S1 pocket is correctly predicted. 
Within the GABA-A/benzodiazepine ligand data set, PLS models of high predictivity resulted for our QM-based property fields, providing novel insights into key features of the SAR for two receptor subtypes and cross-receptor selectivity of the ligands. The detailed interpretation of regression models derived using improved QM-derived property fields thus provides a significant advantage by revealing chemically meaningful correlations with biological activity and helps in understanding novel structure-activity relationship features. This will allow such knowledge to be used to design novel molecules on the basis of interactions additional to steric and hydrogen-bonding features.
Development of companion diagnostics
Mankoff, David A.; Edmonds, Christine E.; Farwell, Michael D.; ...
2015-12-12
The goal of individualized and targeted treatment and precision medicine requires the assessment of potential therapeutic targets to direct treatment selection. The biomarkers used to direct precision medicine, often termed companion diagnostics, for highly targeted drugs have thus far been almost entirely based on in vitro assay of biopsy material. Molecular imaging companion diagnostics offer a number of features complementary to those from in vitro assay, including the ability to measure the heterogeneity of each patient’s cancer across the entire disease burden and to measure early changes in response to treatment. We discuss the use of molecular imaging methods as companion diagnostics for cancer therapy with the goal of predicting response to targeted therapy and measuring early (pharmacodynamic) response as an indication of whether the treatment has “hit” the target. We also discuss considerations for probe development for molecular imaging companion diagnostics, including both small-molecule probes and larger molecules such as labeled antibodies and related constructs. We then describe two examples where both predictive and pharmacodynamic molecular imaging markers have been tested in humans: endocrine therapy for breast cancer and human epidermal growth factor receptor type 2–targeted therapy. Lastly, the review closes with a summary of the items needed to move molecular imaging companion diagnostics from early studies into multicenter trials and into the clinic.
Development of companion diagnostics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mankoff, David A.; Edmonds, Christine E.; Farwell, Michael D.
The goal of individualized and targeted treatment and precision medicine requires the assessment of potential therapeutic targets to direct treatment selection. The biomarkers used to direct precision medicine, often termed companion diagnostics, for highly targeted drugs have thus far been almost entirely based on in vitro assay of biopsy material. Molecular imaging companion diagnostics offer a number of features complementary to those from in vitro assay, including the ability to measure the heterogeneity of each patient’s cancer across the entire disease burden and to measure early changes in response to treatment. We discuss the use of molecular imaging methods as companion diagnostics for cancer therapy with the goal of predicting response to targeted therapy and measuring early (pharmacodynamic) response as an indication of whether the treatment has “hit” the target. We also discuss considerations for probe development for molecular imaging companion diagnostics, including both small-molecule probes and larger molecules such as labeled antibodies and related constructs. We then describe two examples where both predictive and pharmacodynamic molecular imaging markers have been tested in humans: endocrine therapy for breast cancer and human epidermal growth factor receptor type 2–targeted therapy. Lastly, the review closes with a summary of the items needed to move molecular imaging companion diagnostics from early studies into multicenter trials and into the clinic.
Development of Companion Diagnostics
Mankoff, David A.; Edmonds, Christine E.; Farwell, Michael D.; Pryma, Daniel A.
2016-01-01
The goal of individualized and targeted treatment and precision medicine requires the assessment of potential therapeutic targets to direct treatment selection. The biomarkers used to direct precision medicine, often termed companion diagnostics, for highly targeted drugs have thus far been almost entirely based on in vitro assay of biopsy material. Molecular imaging companion diagnostics offer a number of features complementary to those from in vitro assay, including the ability to measure the heterogeneity of each patient’s cancer across the entire disease burden and to measure early changes in response to treatment. We discuss the use of molecular imaging methods as companion diagnostics for cancer therapy with the goal of predicting response to targeted therapy and measuring early (pharmacodynamic) response as an indication of whether the treatment has “hit” the target. We also discuss considerations for probe development for molecular imaging companion diagnostics, including both small-molecule probes and larger molecules such as labeled antibodies and related constructs. We then describe two examples where both predictive and pharmacodynamic molecular imaging markers have been tested in humans: endocrine therapy for breast cancer and human epidermal growth factor receptor type 2–targeted therapy. The review closes with a summary of the items needed to move molecular imaging companion diagnostics from early studies into multicenter trials and into the clinic. PMID:26687857
Oh, Min; Ahn, Jaegyoon; Yoon, Youngmi
2014-01-01
The growing number and variety of genetic network datasets increases the feasibility of understanding how drugs and diseases are associated at the molecular level. Properly selected features of the network representations of existing drug-disease associations can be used to infer novel indications of existing drugs. To find new drug-disease associations, we generated an integrative genetic network using combinations of interactions, including protein-protein interactions and gene regulatory network datasets. Within this network, drug-drug and disease-disease adjacencies were quantified using a scored path between their target sets. Furthermore, the common topological module of drugs or diseases was extracted, and thereby the distance between topological drug-module and disease (or disease-module and drug) was quantified. These quantified scores were used as features for the prediction of novel drug-disease associations. Our classifiers using Random Forest, Multilayer Perceptron and C4.5 showed a high specificity and sensitivity (AUC score of 0.855, 0.828 and 0.797 respectively) in predicting novel drug indications, and displayed a better performance than other methods with limited drug and disease properties. Our predictions and current clinical trials overlap significantly across the different phases of drug development. We also identified and visualized the topological modules of predicted drug indications for certain types of cancers, and for Alzheimer’s disease. Within the network, those modules show potential pathways that illustrate the mechanisms of new drug indications, including propranolol as a potential anticancer agent and telmisartan as treatment for Alzheimer’s disease. PMID:25356910
NASA Astrophysics Data System (ADS)
De, Biplab; Adhikari, Indrani; Nandy, Ashis; Saha, Achintya; Goswami, Binoy Behari
2017-06-01
Design and development of antioxidant supplements constitute an essential aspect of research in order to derive molecules that would help to combat the free radical invasion to the human body and curb oxidative stress related diseases. The present work deals with the development of in silico models for a series of thiazolidine derivatives having antioxidant potential. The objective of the work is to obtain models that would help to design new thiazolidine derivatives based on substituent modification and thereby predict their activity profile. The QSAR model thus developed helps in quantification of the extent of contribution of the various molecular fragments towards the activity of the molecules, while the 3D pharmacophore model provides a brief idea of the essential molecular features that help the molecules to interact with the neighbouring free radicals. Both the models have been extensively validated, which ensures their predictive ability as well as the potential to search molecular databases for selection of thiazolidine derivatives with potent antioxidant activity. The models can thus be utilised effectively for database searching with the aim to isolate active antioxidants belonging to the thiazolidine group.
NASA Astrophysics Data System (ADS)
Aouidate, Adnane; Ghaleb, Adib; Ghamali, Mounir; Chtita, Samir; Choukrad, M'barek; Sbai, Abdelouahid; Bouachrine, Mohammed; Lakhlifi, Tahar
2017-07-01
A series of nineteen DHFR inhibitors was studied based on the combination of two computational techniques namely, three-dimensional quantitative structure activity relationship (3D-QSAR) and molecular docking. The comparative molecular field analysis (CoMFA) and comparative molecular similarity index analysis (CoMSIA) were developed using 19 molecules having pIC50 ranging from 9.244 to 5.839. The best CoMFA and CoMSIA models show conventional determination coefficients R2 of 0.96 and 0.93 as well as the Leave One Out cross-validation determination coefficients Q2 of 0.64 and 0.72, respectively. The predictive ability of those models was evaluated by the external validation using a test set of five compounds with predicted determination coefficients R2test of 0.92 and 0.94, respectively. The binding mode between this kind of compounds and the DHFR enzyme in addition to the key amino acid residues were explored by molecular docking simulation. Contour maps and molecular docking identified that the R1 and R2 natures at the pyrazole moiety are the important features for the optimization of the binding affinity to the DHFR receptor. According to the good concordance between the CoMFA/CoMSIA contour maps and docking results, the obtained information was explored to design novel molecules.
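The abstract above reports a conventional determination coefficient R2 together with a leave-one-out cross-validated Q2. The sketch below shows how the two differ, under a loud assumption: a one-descriptor ordinary least-squares line stands in for the PLS models used in CoMFA/CoMSIA, and Q2 is taken as 1 - PRESS/SS_total with the total sum of squares computed around the full-data mean (some packages use the per-fold training mean instead). The descriptor values and activities are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def total_sum_of_squares(ys):
    my = sum(ys) / len(ys)
    return sum((y - my) ** 2 for y in ys)


def r_squared(xs, ys):
    """Fitted R2: residuals come from a model trained on all compounds."""
    slope, intercept = fit_line(xs, ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - ss_res / total_sum_of_squares(ys)


def q_squared_loo(xs, ys):
    """Leave-one-out Q2: each compound is predicted by a model that never saw it."""
    press = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    return 1.0 - press / total_sum_of_squares(ys)


# Hypothetical descriptor values and pIC50-like activities.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
activity = [5.9, 6.4, 7.3, 7.6, 8.5, 9.1]

print(r_squared(descriptor, activity), q_squared_loo(descriptor, activity))
```

For least-squares models Q2 cannot exceed the fitted R2, which is why a large gap between the two is commonly read as a sign of overfitting.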
Role of Macrophage-Induced Inflammation in Mesothelioma
2010-07-01
in human mesothelioma tumors and correlate immune cell infiltration with histopathologic subtype (months 1-6). Using tumor tissue microarrays of... histopathologic subtype (months 1-6). • Acquired 71 fixed and paraffin-embedded mesothelioma tumor samples • Prepared mesothelioma tumor tissue...Biol., 2008. 84: p. 1-8. 5. Dave, S.S., et al., Prediction of survival in follicular lymphoma based on molecular features of tumor-infiltrating
Wang, Yong-Cui; Wang, Yong; Yang, Zhi-Xia; Deng, Nai-Yang
2011-06-20
Enzymes are known as the largest class of proteins and their functions are usually annotated by the Enzyme Commission (EC), which uses a hierarchy structure, i.e., four numbers separated by periods, to classify the function of enzymes. Automatically categorizing an enzyme into the EC hierarchy is crucial to understand its specific molecular mechanism. In this paper, we introduce two key improvements in predicting enzyme function within the machine learning framework. One is to introduce the efficient sequence encoding methods for representing given proteins. The second one is to develop a structure-based prediction method with low computational complexity. In particular, we propose to use the conjoint triad feature (CTF) to represent the given protein sequences by considering not only the composition of amino acids but also the neighbor relationships in the sequence. Then we develop a support vector machine (SVM)-based method, named as SVMHL (SVM for hierarchy labels), to output enzyme function by fully considering the hierarchical structure of EC. The experimental results show that our SVMHL with the CTF outperforms SVMHL with the amino acid composition (AAC) feature both in predictive accuracy and Matthews correlation coefficient (MCC). In addition, SVMHL with the CTF obtains the accuracy and MCC ranging from 81% to 98% and 0.82 to 0.98 when predicting the first three EC digits on a low-homologous enzyme dataset. We further demonstrate that our method outperforms the methods which do not take account of hierarchical relationship among enzyme categories and alternative methods which incorporate prior knowledge about inter-class relationships. Our structure-based prediction model, SVMHL with the CTF, reduces the computational complexity and outperforms the alternative approaches in enzyme function prediction. Therefore our new method will be a useful tool for enzyme function prediction community.
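The conjoint triad feature mentioned above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the seven residue classes follow the grouping commonly cited for the conjoint triad method, residues outside the 20 standard amino acids are skipped, and the min-max scaling at the end is one plausible normalization. Each window of three consecutive residues is mapped to a triple of class indices, giving a 7 x 7 x 7 = 343-dimensional count vector that captures neighbor relationships as well as composition.

```python
from itertools import product

# Seven residue classes (assumed grouping, based on side-chain properties).
CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
RESIDUE_TO_CLASS = {aa: idx for idx, group in enumerate(CLASSES) for aa in group}
TRIADS = list(product(range(len(CLASSES)), repeat=3))  # 343 class triples


def conjoint_triad(sequence):
    """Return a 343-dimensional, min-max scaled triad count vector."""
    counts = dict.fromkeys(TRIADS, 0)
    classes = [RESIDUE_TO_CLASS.get(aa) for aa in sequence.upper()]
    # Slide a window of three consecutive residues along the sequence.
    for window in zip(classes, classes[1:], classes[2:]):
        if None not in window:  # skip windows with non-standard residues
            counts[window] += 1
    raw = [counts[t] for t in TRIADS]
    lo, hi = min(raw), max(raw)
    if hi == lo:
        return [0.0] * len(raw)
    return [(c - lo) / (hi - lo) for c in raw]


# Hypothetical toy sequence.
vector = conjoint_triad("MKVLAAGIVDECRK")
print(len(vector), max(vector))
```

The resulting fixed-length vector can be passed to an SVM regardless of sequence length, which is the practical appeal of this kind of encoding.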
Image analysis and machine learning in digital pathology: Challenges and opportunities.
Madabhushi, Anant; Lee, George
2016-10-01
With the rise in whole slide scanner technology, large numbers of tissue slides are being scanned and represented and archived digitally. While digital pathology has substantial implications for telepathology, second opinions, and education there are also huge research opportunities in image computing with this new source of "big data". It is well known that there is fundamental prognostic data embedded in pathology images. The ability to mine "sub-visual" image features from digital pathology slide images, features that may not be visually discernible by a pathologist, offers the opportunity for better quantitative modeling of disease appearance and hence possibly improved prediction of disease aggressiveness and patient outcome. However the compelling opportunities in precision medicine offered by big digital pathology data come with their own set of computational challenges. Image analysis and computer assisted detection and diagnosis tools previously developed in the context of radiographic images are woefully inadequate to deal with the data density in high resolution digitized whole slide images. Additionally there has been recent substantial interest in combining and fusing radiologic imaging and proteomics and genomics based measurements with features extracted from digital pathology images for better prognostic prediction of disease aggressiveness and patient outcome. Again there is a paucity of powerful tools for combining disease specific features that manifest across multiple different length scales. The purpose of this review is to discuss developments in computational image analysis tools for predictive modeling of digital pathology images from a detection, segmentation, feature extraction, and tissue classification perspective. We discuss the emergence of new handcrafted feature approaches for improved predictive modeling of tissue appearance and also review the emergence of deep learning schemes for both object detection and tissue classification. 
We also briefly review some of the state of the art in fusion of radiology and pathology images and also combining digital pathology derived image measurements with molecular "omics" features for better predictive modeling. The review ends with a brief discussion of some of the technical and computational challenges to be overcome and reflects on future opportunities for the quantitation of histopathology. Copyright © 2016 Elsevier B.V. All rights reserved.
Predicting residue-wise contact orders in proteins by support vector regression.
Song, Jiangning; Burrage, Kevin
2006-10-03
The residue-wise contact order (RWCO) describes the sequence separations between the residue of interest and its contacting residues in a protein sequence. It is a new kind of one-dimensional protein structure that represents the extent of long-range contacts and is considered as a generalization of contact order. Together with secondary structure, accessible surface area, the B factor, and contact number, RWCO provides comprehensive and indispensable information for reconstructing the protein three-dimensional structure from a set of one-dimensional structural properties. Accurately predicting RWCO values could have many important applications in protein three-dimensional structure prediction and protein folding rate prediction, and give deep insights into protein sequence-structure relationships. We developed a novel approach to predict residue-wise contact order values in proteins based on support vector regression (SVR), starting from primary amino acid sequences. We explored seven different sequence encoding schemes to examine their effects on the prediction performance, including local sequence in the form of PSI-BLAST profiles, local sequence plus amino acid composition, local sequence plus molecular weight, local sequence plus secondary structure predicted by PSIPRED, local sequence plus molecular weight and amino acid composition, local sequence plus molecular weight and predicted secondary structure, and local sequence plus molecular weight, amino acid composition and predicted secondary structure. When using local sequences with multiple sequence alignments in the form of PSI-BLAST profiles, we could predict the RWCO distribution with a Pearson correlation coefficient (CC) between the predicted and observed RWCO values of 0.55, and root mean square error (RMSE) of 0.82, based on a well-defined dataset with 680 protein sequences. 
Moreover, by incorporating global features such as molecular weight and amino acid composition we could further improve the prediction performance with the CC to 0.57 and an RMSE of 0.79. In addition, combining the predicted secondary structure by PSIPRED was found to significantly improve the prediction performance and could yield the best prediction accuracy with a CC of 0.60 and RMSE of 0.78, which provided at least comparable performance compared with the other existing methods. The SVR method shows a prediction performance competitive with or at least comparable to the previously developed linear regression-based methods for predicting RWCO values. In contrast to support vector classification (SVC), SVR is very good at estimating the raw value profiles of the samples. The successful application of the SVR approach in this study reinforces the fact that support vector regression is a powerful tool in extracting the protein sequence-structure relationship and in estimating the protein structural profiles from amino acid sequences.
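To make the quantity being predicted concrete, the sketch below computes a residue-wise contact order from per-residue coordinates: for each residue, the sequence separations to all residues in spatial contact are summed. The 8 Å cutoff, the minimum sequence separation of 3, and the normalization by chain length are assumptions chosen for illustration and may differ from the definition used in the study; the coordinates are a hypothetical four-residue toy structure.

```python
import math


def residue_wise_contact_order(coords, cutoff=8.0, min_separation=3):
    """Sum of sequence separations to contacting residues, per residue.

    coords: one (x, y, z) point per residue, e.g. C-alpha or C-beta atoms.
    Assumptions: contact means distance < cutoff; near neighbors along the
    chain (separation < min_separation) are ignored; values are divided
    by the chain length.
    """
    n = len(coords)
    rwco = []
    for i in range(n):
        total = 0
        for j in range(n):
            separation = abs(i - j)
            if separation >= min_separation and math.dist(coords[i], coords[j]) < cutoff:
                total += separation
        rwco.append(total / n)
    return rwco


# Hypothetical toy chain folded into a square, so residues 0 and 3 touch.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]
print(residue_wise_contact_order(toy_coords))
```

Values computed this way from known structures would serve as the regression targets, with the sequence encodings described above as inputs.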
Nottingham Prognostic Index Plus (NPI+): a modern clinical decision making tool in breast cancer.
Rakha, E A; Soria, D; Green, A R; Lemetre, C; Powe, D G; Nolan, C C; Garibaldi, J M; Ball, G; Ellis, I O
2014-04-02
Current management of breast cancer (BC) relies on risk stratification based on well-defined clinicopathologic factors. Global gene expression profiling studies have demonstrated that BC comprises distinct molecular classes with clinical relevance. In this study, we hypothesised that molecular features of BC are a key driver of tumour behaviour and when coupled with a novel and bespoke application of established clinicopathologic prognostic variables can predict both clinical outcome and relevant therapeutic options more accurately than existing methods. In the current study, a comprehensive panel of biomarkers with relevance to BC was applied to a large and well-characterised series of BC, using immunohistochemistry and different multivariate clustering techniques, to identify the key molecular classes. Subsequently, each class was further stratified using a set of well-defined prognostic clinicopathologic variables. These variables were combined in formulae to prognostically stratify different molecular classes, collectively known as the Nottingham Prognostic Index Plus (NPI+). The NPI+ was then used to predict outcome in the different molecular classes. Seven core molecular classes were identified using a selective panel of 10 biomarkers. Incorporation of clinicopathologic variables in a second-stage analysis resulted in identification of distinct prognostic groups within each molecular class (NPI+). Outcome analysis showed that using the bespoke NPI formulae for each biological BC class provides improved patient outcome stratification superior to the traditional NPI. This study provides proof-of-principle evidence for the use of NPI+ in supporting improved individualised clinical decision making.
Molecular markers in bladder cancer: Novel research frontiers.
Sanguedolce, Francesca; Cormio, Antonella; Bufo, Pantaleo; Carrieri, Giuseppe; Cormio, Luigi
2015-01-01
Bladder cancer (BC) is a heterogeneous disease encompassing distinct biologic features that lead to extremely different clinical behaviors. In the last 20 years, great efforts have been made to predict disease outcome and response to treatment by developing risk assessment calculators based on multiple standard clinical-pathological factors, as well as by testing several molecular markers. Unfortunately, risk assessment calculators alone fail to accurately assess a single patient's prognosis and response to different treatment options. Several molecular markers easily assessable by routine immunohistochemical techniques hold promise for becoming widely available and cost-effective tools for a more reliable risk assessment, but none have yet entered routine clinical practice. Current research is therefore moving towards (i) identifying novel molecular markers; (ii) testing old and new markers in homogeneous patient populations receiving homogeneous treatments; (iii) generating a multimarker panel that could be easily, and thus routinely, used in clinical practice; (iv) developing novel risk assessment tools, possibly combining standard clinical-pathological factors with molecular markers. This review analyses the emerging body of literature concerning novel biomarkers, ranging from genetic changes to altered expression of a huge variety of molecules, potentially involved in BC outcome and response to treatment. Findings suggest that some of these indicators, such as serum circulating tumor cells and tissue mitochondrial DNA, seem to be easily assessable and provide reliable information. Other markers, such as the phosphoinositide-3-kinase (PI3K)/AKT (serine-threonine kinase)/mTOR (mammalian target of rapamycin) pathway and epigenetic changes in DNA methylation, seem not only to have prognostic/predictive value but also, most importantly, to represent valuable therapeutic targets. 
Finally, there is increasing evidence that the development of novel risk assessment tools combining standard clinical-pathological factors with molecular markers represents a major quest in managing this poorly predictable disease.
Beukinga, Roelof J; Hulshoff, Jan B; van Dijk, Lisanne V; Muijs, Christina T; Burgerhof, Johannes G M; Kats-Ugurlu, Gursah; Slart, Riemer H J A; Slump, Cornelis H; Mul, Véronique E M; Plukker, John Th M
2017-05-01
Adequate prediction of tumor response to neoadjuvant chemoradiotherapy (nCRT) in esophageal cancer (EC) patients is important for more personalized treatment. The current best clinical method to predict pathologic complete response is SUVmax in 18F-FDG PET/CT imaging. To improve the prediction of response, we constructed a model to predict complete response to nCRT in EC based on pretreatment clinical parameters and 18F-FDG PET/CT-derived textural features. Methods: From a prospectively maintained single-institution database, we reviewed 97 consecutive patients with locally advanced EC and a pretreatment 18F-FDG PET/CT scan between 2009 and 2015. All patients were treated with nCRT (carboplatin/paclitaxel/41.4 Gy) followed by esophagectomy. We analyzed clinical, geometric, and pretreatment textural features extracted from both 18F-FDG PET and CT. The current most accurate prediction model with SUVmax as a predictor variable was compared with 6 different response prediction models constructed using least absolute shrinkage and selection operator regularized logistic regression. Internal validation was performed to estimate the models' performance. Pathologic response was defined as complete versus incomplete response (Mandard tumor regression grade system 1 vs. 2-5). Results: Pathologic examination revealed 19 (19.6%) complete and 78 (80.4%) incomplete responders. Least absolute shrinkage and selection operator regularization selected the clinical parameters histologic type and clinical T stage, the 18F-FDG PET-derived textural feature long run low gray level emphasis, and the CT-derived textural feature run percentage. Introducing these variables to a logistic regression analysis showed areas under the receiver-operating-characteristic curve (AUCs) of 0.78 compared with 0.58 in the SUVmax model. The discrimination slopes were 0.17 compared with 0.01, respectively. After internal validation, the AUCs decreased to 0.74 and 0.54, respectively. 
Conclusion: The predictive values of the constructed models were superior to the standard method (SUVmax). These results can be considered as an initial step in predicting tumor response to nCRT in locally advanced EC. Further research in refining the predictive value of these models is needed to justify omission of surgery. © 2017 by the Society of Nuclear Medicine and Molecular Imaging.
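The two performance measures reported above, AUC and discrimination slope, can be computed from predicted probabilities as in the sketch below. It uses the common definitions (AUC as the probability that a responder is ranked above a non-responder, with ties counted as one half; discrimination slope as the difference in mean predicted probability between the two groups). The function names and the six toy predictions are invented for illustration and are not the study's data or code.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count half
    return wins / (len(pos) * len(neg))

def discrimination_slope(scores, labels):
    """Mean predicted probability in positives minus that in negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    return sum(pos) / len(pos) - sum(neg) / len(neg)

# Invented predicted probabilities of complete response (1 = complete responder).
probs = [0.9, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 0, 1, 0, 0, 0]
toy_auc = auc(probs, labels)
toy_slope = discrimination_slope(probs, labels)
```

A discrimination slope near zero, as reported for the SUVmax model, means the model assigns almost the same average probability to responders and non-responders even if its ranking is slightly better than chance.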
NASA Astrophysics Data System (ADS)
Temi, Pasquale; Amblard, Alexandre; Gitti, Myriam; Brighenti, Fabrizio; Gaspari, Massimo; Mathews, William G.; David, Laurence
2018-05-01
We present new ALMA CO(2–1) observations of two well-studied group-centered elliptical galaxies: NGC 4636 and NGC 5846. In addition, we include a revised analysis of Cycle 0 ALMA observations of the central galaxy in the NGC 5044 group. We find evidence that molecular gas is a common presence in bright group-centered galaxies (BGG). CO line widths are broader than Galactic molecular clouds, and using the reference Milky Way $X_{\rm CO}$, the total molecular mass ranges from $2.6 \times 10^{5}\,M_{\odot}$ in NGC 4636 to $6.1 \times 10^{7}\,M_{\odot}$ in NGC 5044. Complementary observations using the ALMA Compact Array do not exhibit any detection of a CO diffuse component at the sensitivity level achieved by current exposures. The origin of the detected molecular features is still uncertain, but these ALMA observations suggest that they are the end product of the hot gas cooling process and not the result of merger events. Some of the molecular clouds are associated with dust features as revealed by HST dust extinction maps, suggesting that these clouds formed from dust-enhanced cooling. The global nonlinear condensation may be triggered via the chaotic turbulent field or buoyant uplift. The large virial parameter of the molecular structures and correlation with the warm ($10^{3}$–$10^{5}$ K)/hot ($\geq 10^{6}$) phase velocity dispersion provide evidence that they are unbound giant molecular associations drifting in the turbulent field, consistent with numerical predictions of the chaotic cold accretion process. Alternatively, the observed large CO line widths may be generated by molecular gas flowing out from cloud surfaces due to heating by the local hot gas atmosphere.
Can, Nuray; Celik, Mehmet; Sezer, Yavuz Atakan; Ozyilmaz, Filiz; Ayturk, Semra; Tastekin, Ebru; Sut, Necdet; Gurkan, Hakan; Ustun, Funda; Bulbul, Buket Yilmaz; Guldiken, Sibel; Puyan, Fulya Oz
2017-08-20
The newly proposed nomenclature and diagnostic criteria for encapsulated follicular variant of papillary thyroid carcinoma (EFVPTC), the noninvasive follicular thyroid neoplasm with papillary-like nuclear features (NIFTP), could improve the consistency and accuracy of diagnosing this entity. Diagnosis of NIFTP requires evaluation of the complete tumor border or capsule. The presence of tumor invasion in follicular thyroid neoplasms with papillary-like nuclear features has been recently discussed by many authors. In this study, we examined the predictive value and association of follicular morphological characteristics with the tumor invasion. In addition, we analyzed the association between tumor encapsulation and molecular profile in EFVPTC/NIFTP cases. A total of 106 cases of FVPTC were included in the study. The tumors were grouped based on the presence of tumor capsule and characteristics of tumor border, as 1) completely encapsulated tumors without invasion, 2) encapsulated tumors with invasion, 3) infiltrative tumors without a capsule. Clinicopathological features, histomorphological features [nuclear criteria, minor diagnostic features, follicles oriented perpendicular to tumor border/capsule (FOPBC)] and molecular alterations in BRAF, NRAS, and KRAS genes were evaluated. FOPBC were significantly more frequently seen in encapsulated tumors with invasion (p = 0.008). The nuclear features were not associated with the presence of encapsulation and characteristics of tumor border. BRAF mutation was more frequent in infiltrative tumors, while NRAS mutation was more frequent in encapsulated tumors, but the results were not statistically significant (p = 0.917). In conclusion, FOPBC histomorphological feature may be associated with tumor invasion in EFVPTC/NIFTP. Additionally, BRAF/KRAS/NRAS mutation analysis may prevent inadequate treatment in these patients.
Vibrational spectroscopic study of terbutaline hemisulphate
NASA Astrophysics Data System (ADS)
Ali, H. R. H.; Edwards, H. G. M.; Kendrick, J.; Scowen, I. J.
2009-05-01
The Raman spectrum of terbutaline hemisulphate is reported for the first time, and molecular assignments are proposed on the basis of ab initio BLYP DFT calculations with a 6-31G* basis set and vibrational frequencies predicted within the quasi-harmonic approximation; these predictions compare favourably with the observed vibrational spectra. Comparison with previously published infrared data explains several spectral features. The results from this study provide data that can be used for the preparative process monitoring of terbutaline hemisulphate, an important β2 agonist drug, in various dosage forms and of its interaction with excipients and other components.
Carbohydrate-protein interactions: molecular modeling insights.
Pérez, Serge; Tvaroška, Igor
2014-01-01
The article reviews the significant contributions to, and the present status of, applications of computational methods for the characterization and prediction of protein-carbohydrate interactions. After a presentation of the specific features of carbohydrate modeling, along with a brief description of the experimental data and general features of carbohydrate-protein interactions, the survey provides a thorough coverage of the available computational methods and tools. At the quantum-mechanical level, the use of both molecular orbitals and density-functional theory is critically assessed. These are followed by a presentation and critical evaluation of the applications of semiempirical and empirical methods: QM/MM, molecular dynamics, free-energy calculations, metadynamics, molecular robotics, and others. The usefulness of molecular docking in structural glycobiology is evaluated by considering recent docking-validation studies on a range of protein targets. The range of applications of these theoretical methods provides insights into the structural, energetic, and mechanistic facets that occur in the course of the recognition processes. Selected examples are provided to exemplify the usefulness and the present limitations of these computational methods in their ability to assist in elucidation of the structural basis underlying the diverse function and biological roles of carbohydrates in their dialogue with proteins. These test cases cover the field of both carbohydrate biosynthesis and glycosyltransferases, as well as glycoside hydrolases. The phenomenon of (macro)molecular recognition is illustrated for the interactions of carbohydrates with such proteins as lectins, monoclonal antibodies, GAG-binding proteins, porins, and viruses. © 2014 Elsevier Inc. All rights reserved.
Uncertainty Quantification in Alchemical Free Energy Methods.
Bhati, Agastya P; Wan, Shunzhou; Hu, Yuan; Sherborne, Brad; Coveney, Peter V
2018-06-12
Alchemical free energy methods have gained much importance recently from several reports of improved ligand-protein binding affinity predictions based on their implementation using molecular dynamics simulations. A large number of variants of such methods implementing different accelerated sampling techniques and free energy estimators are available, each claimed to be better than the others in its own way. However, the key features of reproducibility and quantification of associated uncertainties in such methods have barely been discussed. Here, we apply a systematic protocol for uncertainty quantification to a number of popular alchemical free energy methods, covering both absolute and relative free energy predictions. We show that a reliable measure of error estimation is provided by ensemble simulation (an ensemble of independent MD simulations), which applies irrespective of the free energy method. The need to use ensemble methods is fundamental and holds regardless of the duration of the molecular dynamics simulations performed.
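The idea of ensemble-based error estimation can be sketched as follows: each independent replica yields one free energy estimate, the ensemble mean is reported as the prediction, and the spread across replicas gives the uncertainty. The abstract does not specify the statistical procedure, so the bootstrap used here, the function name, and the five replica values are assumptions for illustration only.

```python
import random
import statistics

def ensemble_estimate(replica_values, n_boot=2000, seed=0):
    """Return (ensemble mean, bootstrap standard error) for per-replica estimates.

    replica_values: one free energy estimate per independent simulation.
    The bootstrap over replicas is an assumed, illustrative choice.
    """
    rng = random.Random(seed)
    n = len(replica_values)
    mean = statistics.fmean(replica_values)
    boot_means = [
        statistics.fmean(rng.choices(replica_values, k=n))  # resample replicas with replacement
        for _ in range(n_boot)
    ]
    return mean, statistics.stdev(boot_means)

# Invented binding free energies (kcal/mol) from five independent replicas.
replicas = [-8.1, -7.6, -8.4, -7.9, -8.0]
dg_mean, dg_err = ensemble_estimate(replicas)
```

A single long simulation would supply only one entry of `replicas`, which leaves no way to estimate the replica-to-replica spread that the abstract identifies as the reliable error measure.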
How to Compute Electron Ionization Mass Spectra from First Principles.
Bauer, Christoph Alexander; Grimme, Stefan
2016-06-02
The prediction of electron ionization (EI) mass spectra (MS) from first principles has been a major challenge for quantum chemistry (QC). The unimolecular reaction space grows rapidly with increasing molecular size. On the one hand, statistical models like Eyring's quasi-equilibrium theory and Rice-Ramsperger-Kassel-Marcus theory have provided valuable insight, and some predictions and quantitative results can be obtained from such calculations. On the other hand, molecular dynamics-based methods are able to explore automatically the energetically available regions of phase space and thus yield reaction paths in an unbiased way. We describe in this feature article the status of both methodologies in relation to mass spectrometry for small to medium sized molecules. We further present results obtained with the QCEIMS program developed in our laboratory. Our method, which incorporates stochastic and dynamic elements, has been a significant step toward the reliable routine calculation of EI mass spectra.
System among the corticosteroids: specificity and molecular dynamics
Brookes, Jennifer C.; Galigniana, Mario D.; Harker, Anthony H.; Stoneham, A. Marshall; Vinson, Gavin P.
2012-01-01
Understanding how structural features determine specific biological activities has often proved elusive. With over 161 000 steroid structures described, an algorithm able to predict activity from structural attributes would provide manifest benefits. Molecular simulations of a range of 35 corticosteroids show striking correlations between conformational mobility and biological specificity. Thus steroid ring A is important for glucocorticoid action, and is rigid in the most specific (and potent) examples, such as dexamethasone. By contrast, ring C conformation is important for the mineralocorticoids, and is rigid in aldosterone. Other steroids that are less specific, or have mixed functions, or none at all, are more flexible. One unexpected example is 11-deoxycorticosterone, which the methods predict (and our activity studies confirm) is not only a specific mineralocorticoid, but also has significant glucocorticoid activity. These methods may guide the design of new corticosteroid agonists and antagonists. They will also have application in other examples of ligand–receptor interactions. PMID:21613285
Vibron and phonon hybridization in dielectric nanostructures.
Preston, Thomas C; Signorell, Ruth
2011-04-05
Plasmon hybridization theory has been an invaluable tool in advancing our understanding of the optical properties of metallic nanostructures. Through the prism of molecular orbital theory, it allows one to interpret complex structures as "plasmonic molecules" and easily predict and engineer their electromagnetic response. However, this formalism is limited to conducting particles. Here, we present a hybridization scheme for the external and internal vibrations of dielectric nanostructures that provides a straightforward understanding of the infrared signatures of these particles through analogy to existing hybridization models of both molecular orbitals and plasmons, extending the range of applications far beyond metallic nanostructures. This method not only provides a qualitative understanding, but also allows for the quantitative prediction of vibrational spectra of complex nanoobjects from well-known spectra of their primitive building blocks. The examples of nanoshells illustrate how spectral features can be understood in terms of symmetry, number of nodal planes, and scale parameters.
[Prediction of ETA oligopeptides antagonists from Glycine max based on in silico proteolysis].
Qiao, Lian-Sheng; Jiang, Lu-di; Luo, Gang-Gang; Lu, Fang; Chen, Yan-Kun; Wang, Ling-Zhi; Li, Gong-Yu; Zhang, Yan-Ling
2017-02-01
Oligopeptides are one of the key pharmaceutically effective constituents of traditional Chinese medicine (TCM). Systematic study of the composition and efficacy of TCM oligopeptides is essential for the analysis of the material basis and mechanism of TCM. In this study, the potential anti-hypertensive oligopeptides from Glycine max and their endothelin receptor A (ETA) antagonistic activity were discovered and predicted based on in silico technologies. Main protein sequences of G. max were collected and oligopeptides were obtained using in silico gastrointestinal tract proteolysis. Then, the pharmacophore of ETA antagonistic peptides was constructed; it included one hydrophobic feature, one ionizable negative feature, one ring aromatic feature and five excluded volumes. Meanwhile, the three-dimensional structure of ETA was developed by homology modeling methods for further docking studies. According to docking analysis and consensus score, the key amino acid GLN165 was identified for ETA antagonistic activity, and 27 oligopeptides from G. max were predicted as potential ETA antagonists by pharmacophore and docking studies. In silico proteolysis could be used to analyze the protein sequences from TCM. By combining in silico proteolysis and molecular simulation, the biological activities of oligopeptides could be predicted rapidly from known TCM protein sequences. This might provide a methodological basis for rapidly and efficiently carrying out the mechanism analysis of TCM oligopeptides. Copyright © by the Chinese Pharmaceutical Association.
NASA Astrophysics Data System (ADS)
Hansen, U.; Rodgers, S.; Jensen, K. F.
2000-07-01
A general method for modeling ionized physical vapor deposition is presented. As an example, the method is applied to growth of an aluminum film in the presence of an ionized argon flux. Molecular dynamics techniques are used to examine the surface adsorption, reflection, and sputter reactions taking place during ionized physical vapor deposition. We predict their relative probabilities and discuss their dependence on energy and incident angle. Subsequently, we combine the information obtained from molecular dynamics with a line of sight transport model in a two-dimensional feature, incorporating all effects of reemission and resputtering. This provides a complete growth rate model that allows inclusion of energy- and angular-dependent reaction rates. Finally, a level-set approach is used to describe the morphology of the growing film. We thus arrive at a computationally highly efficient and accurate scheme to model the growth of thin films. We demonstrate the capabilities of the model by predicting the major differences in Al film topographies between conventional and ionized sputter deposition techniques and by studying thin film growth under ionized physical vapor deposition conditions with different Ar fluxes.
Varela, Miguel A; Curtis, Helen J; Douglas, Andrew GL; Hammond, Suzan M; O'Loughlin, Aisling J; Sobrido, Maria J; Scholefield, Janine; Wood, Matthew JA
2016-01-01
Allele-specific gene therapy aims to silence expression of mutant alleles through targeting of disease-linked single-nucleotide polymorphisms (SNPs). However, SNP linkage to disease varies between populations, making such molecular therapies applicable only to a subset of patients. Moreover, not all SNPs have the molecular features necessary for potent gene silencing. Here we provide knowledge to allow the maximisation of patient coverage by building a comprehensive understanding of SNPs ranked according to their predicted suitability toward allele-specific silencing in 14 repeat expansion diseases: amyotrophic lateral sclerosis and frontotemporal dementia, dentatorubral-pallidoluysian atrophy, myotonic dystrophy 1, myotonic dystrophy 2, Huntington's disease and several spinocerebellar ataxias. Our systematic analysis of DNA sequence variation shows that most annotated SNPs are not suitable for potent allele-specific silencing across populations because of suboptimal sequence features and low variability (>97% in HD). We suggest maximising patient coverage by selecting SNPs with high heterozygosity across populations, and preferentially targeting SNPs that lead to purine:purine mismatches in wild-type alleles to obtain potent allele-specific silencing. We therefore provide fundamental knowledge on strategies for optimising patient coverage of therapeutics for microsatellite expansion disorders by linking analysis of population genetic variation to the selection of molecular targets. PMID:25990798
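The selection criterion suggested in this abstract, SNPs with high heterozygosity across populations, can be sketched with the standard expected heterozygosity of a biallelic site, 2p(1 − p). Ranking by the lowest heterozygosity seen in any population is an assumption made here to favour SNPs that remain useful everywhere; the SNP identifiers and allele frequencies are hypothetical.

```python
def heterozygosity(p):
    """Expected heterozygosity of a biallelic SNP with allele frequency p."""
    return 2.0 * p * (1.0 - p)

def rank_snps(freqs_by_snp):
    """Rank SNP ids by their worst-case heterozygosity across populations.

    freqs_by_snp: {snp_id: {population: allele_frequency}}.
    Worst-case (minimum) scoring is an illustrative assumption.
    """
    worst_case = {
        snp: min(heterozygosity(p) for p in pops.values())
        for snp, pops in freqs_by_snp.items()
    }
    return sorted(worst_case, key=worst_case.get, reverse=True)

# Hypothetical SNPs and allele frequencies in two hypothetical populations.
toy_freqs = {
    "snp_A": {"pop1": 0.50, "pop2": 0.40},
    "snp_B": {"pop1": 0.50, "pop2": 0.02},
    "snp_C": {"pop1": 0.30, "pop2": 0.30},
}
ranking = rank_snps(toy_freqs)
```

Here `snp_B` looks ideal in one population but is almost monomorphic in the other, so few patients there would carry distinguishable alleles; that is the kind of population dependence the abstract warns about.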
Molecular Imaging and Precision Medicine in Breast Cancer.
Chudgar, Amy V; Mankoff, David A
2017-01-01
Precision medicine, basing treatment approaches on patient traits and specific molecular features of disease processes, has an important role in the management of patients with breast cancer as targeted therapies continue to improve. PET imaging offers noninvasive information that is complementary to traditional tissue biomarkers, including information about tumor burden, tumor metabolism, receptor status, and proliferation. Several PET agents that image breast cancer receptors can visually demonstrate the extent and heterogeneity of receptor-positive disease and help predict which tumors are likely to respond to targeted treatments. This review presents applications of PET imaging in the targeted treatment of breast cancer. Copyright © 2016 Elsevier Inc. All rights reserved.
Toropova, Alla P; Toropov, Andrey A
2013-11-01
The increasing use of nanomaterials incorporated into consumer products leads to the need for developing approaches to establish "quantitative structure-activity relationships" (QSARs) for various nanomaterials. However, the molecular structure, at least in its classic meaning, is as a rule not available for nanomaterials. A possible alternative to classic QSAR (based on the molecular structure) is the use of data on physicochemical features of TiO₂ nanoparticles. The damage to cellular membranes (units L⁻¹) caused by various TiO₂ nanoparticles is examined as the endpoint. Copyright © 2013 Elsevier Ltd. All rights reserved.
Endogenous Molecular-Cellular Network Cancer Theory: A Systems Biology Approach.
Wang, Gaowei; Yuan, Ruoshi; Zhu, Xiaomei; Ao, Ping
2018-01-01
In light of the ever more apparent limitations of the currently dominant cancer mutation theory, a quantitative hypothesis for cancer genesis and progression, the endogenous molecular-cellular network hypothesis, was proposed from the systems biology perspective more than 10 years ago. It was intended to include both the genetic and epigenetic causes to understand cancer. Its development has entered the stage of meaningful interaction with experimental and clinical data, and the limitations of the traditional cancer mutation theory have become more evident. Under this endogenous network hypothesis, we established a core working network of hepatocellular carcinoma (HCC) and quantified the working network by a nonlinear dynamical system. We showed that the two stable states of the working network reproduce the main known features of normal liver and HCC at both the modular and molecular levels. Using the endogenous network hypothesis and the validated working network, we explored genetic mutation patterns in cancer and potential strategies to cure or relieve HCC from a totally new perspective. Patterns of genetic mutations have traditionally been analyzed by a posteriori statistical association approaches in light of traditional cancer mutation theory. One may wonder about the possibility of an a priori determination of any mutation regularity. Here, we found that, based on the endogenous network theory, the features of genetic mutations in cancers may be predicted without any prior knowledge of mutation propensities. Normal hepatocyte and cancerous hepatocyte stable states, specified by distinct patterns of expressions or activities of proteins in the network, provide means to directly identify a set of most probable genetic mutations and their effects in HCC. As the key proteins and main interactions in the network are conserved across cell types in an organism, similar mutational features may also be found in other cancers. 
This analysis yielded straightforward and testable predictions on an accumulated and preferred mutation spectrum in normal tissue. The validation of predicted cancer state mutation patterns demonstrates the usefulness and potential of a causal dynamical framework to understand and predict genetic mutations in cancer. We also obtained the following implications related to HCC therapy: (1) specific positive feedback loops are responsible for the maintenance of normal liver and HCC; (2) inhibiting proliferation- and inflammation-related positive feedback loops while simultaneously inducing the liver-specific positive feedback loop is predicted to be a potential strategy to cure or relieve HCC; (3) the genesis and regression of HCC are asymmetric. In light of the characteristic properties of the nonlinear dynamical system, we demonstrate that positive feedback loops must exist as a simple and general molecular basis for the maintenance of phenotypes such as normal liver and HCC, and that regulating the positive feedback loops directly or indirectly provides potential strategies to cure or relieve HCC.
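How a positive feedback loop can maintain two distinct stable states in a nonlinear dynamical system can be seen in a deliberately minimal, one-variable toy model. This is not the authors' HCC working network: the self-activating Hill-type rate law, all parameter values, and the function names are assumptions chosen only to show bistability.

```python
def positive_feedback_rate(x, basal=0.05, vmax=1.0, k=0.5, hill=4, decay=1.0):
    """dx/dt for a self-activating species: basal production + Hill feedback - decay."""
    activation = x ** hill / (k ** hill + x ** hill)
    return basal + vmax * activation - decay * x

def relax(x0, dt=0.01, steps=5000):
    """Integrate with forward Euler from x0 and return the final level."""
    x = x0
    for _ in range(steps):
        x += dt * positive_feedback_rate(x)
    return x

# Two starting levels on either side of the unstable threshold (about 0.45 here).
low_state = relax(0.2)    # settles near the basal level
high_state = relax(0.8)   # settles near the self-sustained high level
```

With identical parameters the system settles into either a low or a high level depending only on where it starts, which is the sense in which a feedback loop can hold a phenotype in place; weakening the loop (for example lowering `vmax`) removes the high state in this toy model.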
Molecular subgroups of adult medulloblastoma: a long-term single-institution study.
Zhao, Fu; Ohgaki, Hiroko; Xu, Lei; Giangaspero, Felice; Li, Chunde; Li, Peng; Yang, Zhijun; Wang, Bo; Wang, Xingchao; Wang, Zhenmin; Ai, Lin; Zhang, Jing; Luo, Lin; Liu, Pinan
2016-07-01
Recent transcriptomic approaches have demonstrated that there are at least 4 distinct subgroups in medulloblastoma (MB); however, survival studies of molecular subgroups in adult MB have been inconclusive because of small sample sizes. The aim of this study is to investigate the molecular subgroups in adult MB and identify their clinical and prognostic implications in a large, single-institution cohort. We determined gene expression profiles for 13 primary adult MBs. Bioinformatics tools were used to establish distinct molecular subgroups based on the most informative genes in the dataset. Immunohistochemistry with subgroup-specific antibodies was then used for validation within an independent cohort of 201 formalin-fixed MB tumors, in conjunction with a systematic analysis of clinical and histological characteristics. Three distinct molecular variants of adult MB were identified: the SHH, WNT, and group 4 subgroups. Validation of these subgroups in the 201-tumor cohort by immunohistochemistry identified significant differences in subgroup-specific demographics, histology, and metastatic status. The SHH subgroup accounted for the majority of the tumors (62%), followed by the group 4 subgroup (28%) and the WNT subgroup (10%). Group 4 tumors had significantly worse progression-free and overall survival compared with tumors of the other molecular subtypes. We have identified 3 subgroups of adult MB, characterized by distinct expression profiles, clinical features, pathological features, and prognosis. Clinical variables incorporated with molecular subgroup are more significantly informative for predicting adult patient outcome. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Neuro-Oncology. All rights reserved.
Hathout, Rania M; Metwally, Abdelkader A
2016-11-01
This study is one of a series applying computer-oriented processes and tools to dig for information, analyse data and finally extract correlations and meaningful outcomes. In this context, binding energies could be used to model and predict the mass of loaded drugs in solid lipid nanoparticles after molecular docking of literature-gathered drugs, using the MOE® software package, on tripalmitin matrices molecularly simulated using GROMACS®. Gaussian processes, a supervised machine learning technique, were then used to correlate the drugs' descriptors (e.g. M.W., xLogP, TPSA and fragment complexity) with their molecular docking binding energies. A lower percentage bias was obtained than in previous studies, which allows accurate estimation of the loaded mass of any drug in the investigated solid lipid nanoparticles simply by projecting its chemical structure onto its main features (descriptors). Copyright © 2016 Elsevier B.V. All rights reserved.
Optimal Design of Experiments by Combining Coarse and Fine Measurements
NASA Astrophysics Data System (ADS)
Lee, Alpha A.; Brenner, Michael P.; Colwell, Lucy J.
2017-11-01
In many contexts, it is extremely costly to perform enough high-quality experimental measurements to accurately parametrize a predictive quantitative model. However, it is often much easier to carry out large numbers of experiments that indicate whether each sample is above or below a given threshold. Can many such categorical or "coarse" measurements be combined with a much smaller number of high-resolution or "fine" measurements to yield accurate models? Here, we demonstrate an intuitive strategy, inspired by statistical physics, wherein the coarse measurements are used to identify the salient features of the data, while the fine measurements determine the relative importance of these features. A linear model is inferred from the fine measurements, augmented by a quadratic term that captures the correlation structure of the coarse data. We illustrate our strategy by considering the problems of predicting the antimalarial potency and aqueous solubility of small organic molecules from their 2D molecular structure.
Linking high resolution mass spectrometry data with exposure ...
There is a growing need in the field of exposure science for monitoring methods that rapidly screen environmental media for suspect contaminants. Measurement and analysis platforms, based on high resolution mass spectrometry (HRMS), now exist to meet this need. Here we describe results of a study that links HRMS data with exposure predictions from the U.S. EPA's ExpoCast™ program and in vitro bioassay data from the U.S. interagency Tox21 consortium. Vacuum dust samples were collected from 56 households across the U.S. as part of the American Healthy Homes Survey (AHHS). Sample extracts were analyzed using liquid chromatography time-of-flight mass spectrometry (LC–TOF/MS) with electrospray ionization. On average, approximately 2000 molecular features were identified per sample (based on accurate mass) in negative ion mode, and 3000 in positive ion mode. Exact mass, isotope distribution, and isotope spacing were used to match molecular features with a unique listing of chemical formulas extracted from EPA's Distributed Structure-Searchable Toxicity (DSSTox) database. A total of 978 DSSTox formulas were consistent with the dust LC–TOF/molecular feature data (match score ≥ 90); these formulas mapped to 3228 possible chemicals in the database. Correct assignment of a unique chemical to a given formula required additional validation steps. Each suspect chemical was prioritized for follow-up confirmation using abundance and detection frequency results, along wi
Scheer, A; Fanelli, F; Costa, T; De Benedetti, P G; Cotecchia, S
1996-01-01
Site-directed mutagenesis and molecular dynamics simulations of the alpha 1B-adrenergic receptor (AR) were combined to explore the potential molecular changes correlated with the transition from R (inactive state) to R* (active state). Using molecular dynamics analysis we compared the structural/dynamic features of constitutively active mutants with those of the wild type and of an inactive alpha 1B-AR to build a theoretical model which defines the essential features of R and R*. The results of site-directed mutagenesis were in striking agreement with the predictions of the model, supporting the following hypotheses. (i) The equilibrium between R and R* depends on the equilibrium between the deprotonated and protonated forms, respectively, of D142 of the DRY motif. In fact, replacement of D142 with alanine confers high constitutive activity to the alpha 1B-AR. (ii) The shift of R143 of the DRY sequence out of a conserved 'polar pocket' formed by N63, D91, N344 and Y348 is a feature common to all the active structures, suggesting that the role of R143 is fundamental for mediating receptor activation. Disruption of these intramolecular interactions by replacing N63 with alanine constitutively activates the alpha 1B-AR. Our findings might provide interesting generalities about the activation process of G protein-coupled receptors. PMID:8670860
Sakkal, Leon A; Rajkowski, Kyle Z; Armen, Roger S
2017-06-05
Following insights from recent crystal structures of the muscarinic acetylcholine receptor, binding modes of Positive Allosteric Modulators (PAMs) were predicted under the assumption that PAMs should bind to the extracellular surface of the active state. A series of well-characterized PAMs for adenosine (A1R, A2AR, A3R) and muscarinic acetylcholine (M1R, M5R) receptors were modeled using both rigid and flexible receptor CHARMM-based molecular docking. Studies of adenosine receptors investigated the molecular basis of the probe-dependence of PAM activity by modeling in complex with specific agonist radioligands. Consensus binding modes map common pharmacophore features of several chemical series to specific binding interactions. These models provide a rationalization of how PAM binding slows agonist radioligand dissociation kinetics. M1R PAMs were predicted to bind in the analogous M2R PAM LY2119620 binding site. The M5R NAM (ML-375) was predicted to bind in the PAM (ML-380) binding site with a unique induced-fit receptor conformation. © 2017 Wiley Periodicals, Inc.
Tamilvanan, Thangaraju; Hopper, Waheeta
2014-01-01
Yersinia pestis, a Gram negative bacillus, spreads via lymphatic to lymph nodes and to all organs through the bloodstream, causing plague. Yersinia outer protein H (YopH) is one of the important effector proteins, which paralyzes lymphocytes and macrophages by dephosphorylating critical tyrosine kinases and signal transduction molecules. The purpose of the study is to generate a three-dimensional (3D) pharmacophore model by using diverse sets of YopH inhibitors, which would be useful for designing of potential antitoxin. In this study, we have selected 60 biologically active inhibitors of YopH to perform Ligand based pharmacophore study to elucidate the important structural features responsible for biological activity. Pharmacophore model demonstrated the importance of two acceptors, one hydrophobic and two aromatic features toward the biological activity. Based on these features, different databases were screened to identify novel compounds and these ligands were subjected for docking, ADME properties and Binding energy prediction. Post docking validation was performed using molecular dynamics simulation for selected ligands to calculate the Root Mean Square Deviation (RMSD) and Root Mean Square Fluctuation (RMSF). The ligands, ASN03270114, Mol_252138, Mol_31073 and ZINC04237078 may act as inhibitors against YopH of Y. pestis.
NASA Astrophysics Data System (ADS)
Fang, Jun
Thermotropic liquid crystalline polymers (TLCPs) are a class of promising engineering materials for high-demanding structural applications. Their excellent mechanical properties are highly correlated to the underlying molecular orientation states, which may be affected by complex flow fields during melt processing. Thus, understanding and eventually predicting how processing flows impact molecular orientation is a critical step towards rational design work in order to achieve favorable, balanced physical properties in finished products. This thesis aims to develop deeper understanding of orientation development in commercial TLCPs during processing by coordinating extensive experimental measurements with numerical computations. In situ measurements of orientation development of LCPs during processing are a focal point of this thesis. An x-ray capable injection molding apparatus is enhanced and utilized for time-resolved measurements of orientation development in multiple commercial TLCPs during injection molding. Ex situ wide angle x-ray scattering is also employed for more thorough characterization of molecular orientation distributions in molded plaques. Incompletely injection molded plaques ("short shots") are studied to gain further insights into the intermediate orientation states during mold filling. Finally, two surface orientation characterization techniques, near edge x-ray absorption fine structure (NEXAFS) and infrared attenuated total reflectance (FTIR-ATR) are combined to investigate the surface orientation distribution of injection molded plaques. Surface orientation states are found to be vastly different from their bulk counterparts due to different kinematics involved in mold filling. In general, complex distributions of orientation in molded plaques reflect the spatially varying competition between shear and extension during mold filling. 
To complement these experimental measurements, numerical calculations based on the Larson-Doi polydomain model are performed. The implementation of the Larson-Doi model in complex processing flows is performed using a commercial process modeling software suite (MOLDFLOW®), exploiting a nearly exact analogy between the Larson-Doi model and a fiber orientation model that has been widely used in composites processing simulations. The modeling scheme is first verified by predicting many qualitative and quantitative features of molecular orientation distributions in isothermal extrusion-fed channel flows. In coordination with experiments, the model predictions are found to capture many qualitative features observed in injection molded plaques (including short shots). The final, stringent test of Larson-Doi model performance is prediction of in situ transient orientation data collected during mold filling. The model yields satisfactory results, though certain numerical approximations limit performance near the mold front.
Machine learning for epigenetics and future medical applications
Holder, Lawrence B.; Haque, M. Muksitul; Skinner, Michael K.
2017-01-01
Understanding epigenetic processes holds immense promise for medical applications. Advances in Machine Learning (ML) are critical to realize this promise. Previous studies used epigenetic data sets associated with the germline transmission of epigenetic transgenerational inheritance of disease and novel ML approaches to predict genome-wide locations of critical epimutations. A combination of Active Learning (ACL) and Imbalanced Class Learning (ICL) was used to address past problems with ML to develop a more efficient feature selection process and address the imbalance problem in all genomic data sets. The power of this novel ML approach and our ability to predict epigenetic phenomena and associated disease is suggested. The current approach requires extensive computation of features over the genome. A promising new approach is to introduce Deep Learning (DL) for the generation and simultaneous computation of novel genomic features tuned to the classification task. This approach can be used with any genomic or biological data set applied to medicine. The application of molecular epigenetic data in advanced machine learning analysis to medicine is the focus of this review. PMID:28524769
Naturally-Occurring Canine Invasive Urothelial Carcinoma: A Model for Emerging Therapies
Sommer, Breann C.; Dhawan, Deepika; Ratliff, Timothy L.; Knapp, Deborah W.
2018-01-01
The development of targeted therapies and the resurgence of immunotherapy offer enormous potential to dramatically improve the outlook for patients with invasive urothelial carcinoma (InvUC). Optimization of these therapies, however, is crucial as only a minority of patients achieve dramatic remission, and toxicities are common. With the complexities of the therapies, and the growing list of possible drug combinations to test, highly relevant animal models are needed to assess and select the most promising approaches to carry forward into human trials. The animal model(s) should possess key features that dictate success or failure of cancer drugs in humans including tumor heterogeneity, genetic-epigenetic crosstalk, immune cell responsiveness, invasive and metastatic behavior, and molecular subtypes (e.g., luminal, basal). While it may not be possible to create these collective features in experimental models, these features are present in naturally-occurring InvUC in pet dogs. Naturally occurring canine InvUC closely mimics muscle-invasive bladder cancer in humans in regards to cellular and molecular features, molecular subtypes, biological behavior (sites and frequency of metastasis), and response to therapy. Clinical treatment trials in pet dogs with InvUC are considered a win-win scenario; the individual dog benefits from effective treatment, the results are expected to help other dogs, and the findings are expected to translate to better treatment outcomes in humans. This review will provide an overview of canine InvUC, the similarities to the human condition, and the potential for dogs with InvUC to serve as a model to predict the outcomes of targeted therapy and immunotherapy in humans. PMID:29732386
Quantitative imaging features: extension of the oncology medical image database
NASA Astrophysics Data System (ADS)
Patel, M. N.; Looney, P. T.; Young, K. C.; Halling-Brown, M. D.
2015-03-01
Radiological imaging is fundamental within the healthcare industry and has become routinely adopted for diagnosis, disease monitoring and treatment planning. With the advent of digital imaging modalities and the rapid growth in both diagnostic and therapeutic imaging, the ability to harness this large influx of data is of paramount importance. The Oncology Medical Image Database (OMI-DB) was created to provide a centralized, fully annotated dataset for research. The database contains both processed and unprocessed images, associated data, and annotations and, where applicable, expert-determined ground truths describing features of interest. Medical imaging provides the ability to detect and localize many changes that are important to determine whether a disease is present or a therapy is effective by depicting alterations in anatomic, physiologic, biochemical or molecular processes. Quantitative imaging features are sensitive, specific, accurate and reproducible imaging measures of these changes. Here, we describe an extension to the OMI-DB whereby a range of imaging features and descriptors are pre-calculated using a high throughput approach. The ability to calculate multiple imaging features and data from the acquired images would be valuable and facilitate further research applications investigating detection, prognosis, and classification. The resultant data store contains more than 10 million quantitative features as well as features derived from CAD predictions. These data can be used to build predictive models to aid image classification, treatment response assessment as well as to identify prognostic imaging biomarkers.
Zhang, Daqing; Xiao, Jianfeng; Zhou, Nannan; Luo, Xiaomin; Jiang, Hualiang; Chen, Kaixian
2015-01-01
The blood-brain barrier (BBB) is a highly complex physical barrier determining what substances are allowed to enter the brain. Support vector machine (SVM) is a kernel-based machine learning method that is widely used in QSAR studies. For a successful SVM model, the kernel parameters for SVM and feature subset selection are the most important factors affecting prediction accuracy. In most studies, they are treated as two independent problems, but it has been proven that they could affect each other. We designed and implemented a genetic algorithm (GA) to optimize kernel parameters and feature subset selection for SVM regression and applied it to BBB penetration prediction. The results show that our GA/SVM model is more accurate than other currently available log BB models. Therefore, optimizing both SVM parameters and the feature subset simultaneously with a genetic algorithm is a better approach than other methods that treat the two problems separately. Analysis of our log BB model suggests that carboxylic acid group, polar surface area (PSA)/hydrogen-bonding ability, lipophilicity, and molecular charge play important roles in BBB penetration. Among those properties relevant to BBB penetration, lipophilicity could enhance BBB penetration while all the others are negatively correlated with BBB penetration. PMID:26504797
Impact of genetic features on treatment decisions in AML.
Döhner, Hartmut; Gaidzik, Verena I
2011-01-01
In recent years, research in molecular genetics has been instrumental in deciphering the molecular pathogenesis of acute myeloid leukemia (AML). With the advent of the novel genomics technologies such as next-generation sequencing, it is expected that virtually all genetic lesions in AML will soon be identified. Gene mutations or deregulated expression of genes or sets of genes now allow us to explore the enormous diversity among cytogenetically defined subsets of AML, in particular the large subset of cytogenetically normal AML. Nonetheless, there are several challenges, such as discriminating driver from passenger mutations, evaluating the prognostic and predictive value of a specific mutation in the concert of the various concurrent mutations, or translating findings from molecular disease pathogenesis into novel therapies. Progress is unlikely to be fast in developing molecular targeted therapies. Contrary to the initial assumption, the development of molecular targeted therapies is slow and the various reports of promising new compounds will need to be put into perspective because many of these drugs did not show the expected effects.
Li, Hongzhi; Zhong, Ziyan; Li, Lin; Gao, Rui; Cui, Jingxia; Gao, Ting; Hu, Li Hong; Lu, Yinghua; Su, Zhong-Min; Li, Hui
2015-05-30
A cascaded model is proposed to establish the quantitative structure-activity relationship (QSAR) between the overall power conversion efficiency (PCE) and quantum chemical molecular descriptors of all-organic dye sensitizers. The cascaded model is a two-level network in which the outputs of the first level (JSC, VOC, and FF) are the inputs of the second level, and the ultimate end-point is the overall PCE of dye-sensitized solar cells (DSSCs). The model combines quantum chemical methods and machine learning methods, further including quantum chemical calculations, data division, feature selection, regression, and validation steps. To improve the efficiency of the model and reduce the redundancy and noise of the molecular descriptors, six feature selection methods (multiple linear regression, genetic algorithms, mean impact value, forward selection, backward elimination, and +n-m algorithm) are used with the support vector machine. The best established cascaded model predicts the PCE values of DSSCs with a MAE of 0.57 (%), which is about 10% of the mean value PCE (5.62%). The validation parameters according to the OECD principles are R(2) (0.75), Q(2) (0.77), and Qcv2 (0.76), which demonstrate the great goodness-of-fit, predictivity, and robustness of the model. Additionally, the applicability domain of the cascaded QSAR model is defined for further application. This study demonstrates that the established cascaded model is able to effectively predict the PCE for organic dye sensitizers with very low cost and relatively high accuracy, providing a useful tool for the design of dye sensitizers with high PCE. © 2015 Wiley Periodicals, Inc.
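The relationship between the first-level outputs (JSC, VOC, FF) and the end-point (PCE) can be illustrated with the standard photovoltaic relation PCE = JSC × VOC × FF / P_in. The sketch below is not the authors' second-level model, which the abstract describes as learned; it only shows how the three quantities combine physically. The function name, the example values, and the incident power of 100 mW/cm² (the usual AM1.5G assumption) are illustrative assumptions.

```python
def power_conversion_efficiency(jsc_ma_cm2, voc_v, ff, p_in_mw_cm2=100.0):
    """Return PCE in percent from J_SC (mA/cm^2), V_OC (V) and fill factor (0 to 1).

    Textbook relation, not the cascaded QSAR model itself.
    p_in_mw_cm2 defaults to 100 mW/cm^2, an assumed AM1.5G illumination.
    """
    if not 0.0 <= ff <= 1.0:
        raise ValueError("fill factor must be between 0 and 1")
    p_max = jsc_ma_cm2 * voc_v * ff  # maximum power density in mW/cm^2
    return 100.0 * p_max / p_in_mw_cm2


# Hypothetical first-level outputs for one dye (values invented for illustration).
pce = power_conversion_efficiency(jsc_ma_cm2=12.0, voc_v=0.70, ff=0.67)

# Relative size of the reported MAE (0.57) against the reported mean PCE (5.62).
relative_mae = 0.57 / 5.62

print(f"PCE = {pce:.2f} %, MAE / mean PCE = {relative_mae:.1%}")
```

The last line also reproduces the abstract's statement that the MAE is about 10% of the mean PCE.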
The vibrational properties of the bee-killer imidacloprid insecticide: A molecular description
NASA Astrophysics Data System (ADS)
Moreira, Antônio A. G.; De Lima-Neto, Pedro; Caetano, Ewerton W. S.; Barroso-Neto, Ito L.; Freire, Valder N.
2017-10-01
The chemical imidacloprid belongs to the neonicotinoid insecticide class, widely used for insect pest control mainly for crop protection. However, imidacloprid is a non-selective agrochemical to the insects and it is able to kill the most important pollinators, the bees. The high toxicity of imidacloprid requires controlled release and continuous monitoring. For this purpose, high performance liquid chromatography (HPLC) is usually employed; infrared and Raman spectroscopy, however, are simple and viable techniques that can be adapted to portable devices for field application. In this communication, state-of-the-art quantum level simulations were used to predict the infrared and Raman spectra of the most stable conformer of imidacloprid. Four molecular geometries were investigated in vacuum and solvated within the Density Functional Theory (DFT) approach employing the hybrid meta functional M06-2X and the hybrid functional B3LYP. The M06-2X/PCM model proved to be the best to predict structural features, while the values of harmonic vibrational frequencies were predicted more accurately using the B3LYP functional.
Kinetic rate constant prediction supports the conformational selection mechanism of protein binding.
Moal, Iain H; Bates, Paul A
2012-01-01
The prediction of protein-protein kinetic rate constants provides a fundamental test of our understanding of molecular recognition, and will play an important role in the modeling of complex biological systems. In this paper, a feature selection and regression algorithm is applied to mine a large set of molecular descriptors and construct simple models for association and dissociation rate constants using empirical data. Using separate test data for validation, the predicted rate constants can be combined to calculate binding affinity with accuracy matching that of state of the art empirical free energy functions. The models show that the rate of association is linearly related to the proportion of unbound proteins in the bound conformational ensemble relative to the unbound conformational ensemble, indicating that the binding partners must adopt a geometry near to that of the bound prior to binding. Mirroring the conformational selection and population shift mechanism of protein binding, the models provide a strong separate line of evidence for the preponderance of this mechanism in protein-protein binding, complementing structural and theoretical studies.
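As a worked illustration of how predicted rate constants can be combined into a binding affinity, the sketch below uses the standard relations K_d = k_off / k_on and ΔG = RT ln(K_d), with a 1 M standard state. The function name and the example rate constants are assumptions chosen for illustration; they are not values from the paper.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(k_on, k_off, temperature=298.15):
    """Combine rate constants into (K_d in M, delta G in kcal/mol).

    k_on is the association rate constant in M^-1 s^-1 and k_off the
    dissociation rate constant in s^-1. A 1 M standard state is assumed.
    """
    if k_on <= 0 or k_off <= 0:
        raise ValueError("rate constants must be positive")
    k_d = k_off / k_on
    delta_g = R_KCAL * temperature * math.log(k_d)
    return k_d, delta_g


# Hypothetical predicted rate constants for one complex.
k_d, delta_g = binding_free_energy(k_on=1.0e6, k_off=1.0e-3)
print(f"K_d = {k_d:.1e} M, delta G = {delta_g:.2f} kcal/mol")
```

A tighter complex (smaller K_d) gives a more negative ΔG, so errors in either predicted rate constant propagate logarithmically into the affinity.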
Macromolecular target prediction by self-organizing feature maps.
Schneider, Gisbert; Schneider, Petra
2017-03-01
Rational drug discovery would greatly benefit from a more nuanced appreciation of the activity of pharmacologically active compounds against a diverse panel of macromolecular targets. Already, computational target-prediction models assist medicinal chemists in library screening, de novo molecular design, optimization of active chemical agents, drug re-purposing, in the spotting of potential undesired off-target activities, and in the 'de-orphaning' of phenotypic screening hits. The self-organizing map (SOM) algorithm has been employed successfully for these and other purposes. Areas covered: The authors recapitulate contemporary artificial neural network methods for macromolecular target prediction, and present the basic SOM algorithm at a conceptual level. Specifically, they highlight consensus target-scoring by the employment of multiple SOMs, and discuss the opportunities and limitations of this technique. Expert opinion: Self-organizing feature maps represent a straightforward approach to ligand clustering and classification. Some of the appeal lies in their conceptual simplicity and broad applicability domain. Despite known algorithmic shortcomings, this computational target prediction concept has been proven to work in prospective settings with high success rates. It represents a prototypic technique for future advances in the in silico identification of the modes of action and macromolecular targets of bioactive molecules.
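The basic SOM algorithm mentioned above can be sketched in a few lines: each grid node holds a weight vector, the best-matching unit (BMU) for a sample is the node with the closest weights, and the BMU and its grid neighbours are pulled toward the sample with a learning rate and neighbourhood width that shrink over time. This is a minimal sketch under stated assumptions, not the authors' implementation: the grid size, linear decay schedules, Gaussian neighbourhood, and the two toy descriptor clusters are all illustrative choices.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid node whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_som(data, rows=2, cols=2, epochs=100, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = max(rows, cols) / 2.0  # initial neighbourhood width (assumed)
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        samples = list(data)
        rng.shuffle(samples)
        for x in samples:
            decay = 1.0 - step / total_steps  # linear decay (assumed schedule)
            lr = lr0 * decay
            sigma = sigma0 * decay + 1e-3
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


# Two toy clusters of 2-D "descriptor" vectors (invented data).
cluster_a = [(0.10, 0.10), (0.15, 0.10), (0.10, 0.15), (0.12, 0.12)]
cluster_b = [(0.90, 0.90), (0.85, 0.90), (0.90, 0.85), (0.88, 0.88)]
som = train_som(cluster_a + cluster_b)
bmu_a = best_matching_unit(som, (0.12, 0.12))
bmu_b = best_matching_unit(som, (0.88, 0.88))
print("cluster A maps to node", bmu_a, "and cluster B to node", bmu_b)
```

In a target-prediction setting, compounds with known targets would be mapped to nodes in the same way, and a query compound would inherit the target annotations of the node it falls on. Consensus scoring over several independently trained maps, as the abstract describes, could be approximated by repeating the training with different seeds.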
Predicting lysine glycation sites using bi-profile bayes feature extraction.
Ju, Zhe; Sun, Juhe; Li, Yanjie; Wang, Li
2017-12-01
Glycation is a nonenzymatic post-translational modification which has been found to be involved in various biological processes and closely associated with many metabolic diseases. The accurate identification of glycation sites is important to understand the underlying molecular mechanisms of glycation. As the traditional experimental methods are often labor-intensive and time-consuming, it is desirable to develop computational methods to predict glycation sites. In this study, a novel predictor named BPB_GlySite is proposed to predict lysine glycation sites by using bi-profile Bayes feature extraction and a support vector machine algorithm. As illustrated by 10-fold cross-validation, BPB_GlySite achieves a satisfactory performance with a Sensitivity of 63.68%, a Specificity of 72.60%, an Accuracy of 69.63% and a Matthews correlation coefficient of 0.3499. Experimental results also indicate that BPB_GlySite significantly outperforms three existing glycation site predictors: NetGlycate, PreGly and Gly-PseAAC. Therefore, BPB_GlySite can be a useful bioinformatics tool for the prediction of glycation sites. A user-friendly web-server for BPB_GlySite is established at 123.206.31.171/BPB_GlySite/. Copyright © 2017 Elsevier Ltd. All rights reserved.
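Bi-profile Bayes feature extraction, as it is commonly described, encodes a peptide window by the probability of each of its residues at each position, estimated once from the positive training peptides and once from the negative ones, and concatenates the two profiles. The sketch below illustrates that idea under the assumption that the probabilities are plain position-specific frequencies. The function names and the three-residue toy peptides are hypothetical, and the resulting vector would then be passed to a classifier such as an SVM.

```python
from collections import Counter


def position_profiles(peptides):
    """Per-position residue frequencies from equal-length peptides."""
    length = len(peptides[0])
    profiles = []
    for i in range(length):
        counts = Counter(p[i] for p in peptides)
        total = sum(counts.values())
        profiles.append({aa: n / total for aa, n in counts.items()})
    return profiles


def bpb_features(peptide, pos_profiles, neg_profiles):
    """Concatenate positive-set and negative-set probabilities per position."""
    pos = [pos_profiles[i].get(aa, 0.0) for i, aa in enumerate(peptide)]
    neg = [neg_profiles[i].get(aa, 0.0) for i, aa in enumerate(peptide)]
    return pos + neg


# Toy windows centred on lysine (K); sequences are invented for illustration.
positives = ["AKG", "AKS", "GKG"]
negatives = ["LKV", "LKG", "VKV"]
pos_prof = position_profiles(positives)
neg_prof = position_profiles(negatives)
features = bpb_features("AKG", pos_prof, neg_prof)
print([round(f, 3) for f in features])
```

A window of length n yields a 2n-dimensional vector, so the feature size grows only linearly with the window length.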
Bain, L J; McLachlan, J B; LeBlanc, G A
1997-01-01
The multixenobiotic resistance phenotype is characterized by the reduced accumulation of xenobiotics by cells or organisms due to increased efflux of the compounds by P-glycoprotein (P-gp) or related transporters. An extensive xenobiotic database, consisting primarily of pesticides, was utilized in this study to identify molecular characteristics that render a xenobiotic susceptible to transport by or inhibition of P-gp. Transport substrates were differentiated by several molecular size/shape parameters, lipophilicity, and hydrogen bonding potential. Electrostatic features differentiated inhibitory ligands from compounds not categorized as transport substrates and that did not interact with P-gp. A two-tiered system was developed using the derived structure-activity relationships to identify P-gp transport substrates and inhibitory ligands. Prediction accuracy of the approach was 82%. We then validated the system using six additional pesticides of which two were predicted to be P-gp inhibitors and four were predicted to be noninteractors, based upon the structure-activity analyses. Experimental determinations using cells transfected with the human MDR1 gene demonstrated that five of the six pesticides were properly categorized by the structure-activity analyses (83% accuracy). Finally, structure-activity analyses revealed that among P-gp inhibitors, relative inhibitory potency can be predicted based upon the surface area or volume of the compound. These results demonstrate that P-gp transport substrates and inhibitory ligands can be distinguished using molecular characteristics. Molecular characteristics of transport substrates suggest that P-gp may function in the elimination of hydroxylated metabolites of xenobiotics. PMID:9347896
Molecular biomarkers for chronological age in animal ecology.
Jarman, Simon N; Polanowski, Andrea M; Faux, Cassandra E; Robbins, Jooke; De Paoli-Iseppi, Ricardo; Bravington, Mark; Deagle, Bruce E
2015-10-01
The chronological age of an individual animal predicts many of its biological characteristics, and these in turn influence population-level ecological processes. Animal age information can therefore be valuable in ecological research, but many species have no external features that allow age to be reliably determined. Molecular age biomarkers provide a potential solution to this problem. Research in this area of molecular ecology has so far focused on a limited range of age biomarkers. The most commonly tested molecular age biomarker is change in average telomere length, which predicts age well in a small number of species and tissues, but performs poorly in many other situations. Epigenetic regulation of gene expression has recently been shown to cause age-related modifications to DNA and to cause changes in abundance of several RNA types throughout animal lifespans. Age biomarkers based on these epigenetic changes, and other new DNA-based assays, have already been applied to model organisms, humans and a limited number of wild animals. There is clear potential to apply these marker types more widely in ecological studies. For many species, these new approaches will produce age estimates where this was previously impractical. They will also enable age information to be gathered in cross-sectional studies and expand the range of demographic characteristics that can be quantified with molecular methods. We describe the range of molecular age biomarkers that have been investigated to date and suggest approaches for developing the newer marker types as age assays in nonmodel animal species. © 2015 John Wiley & Sons Ltd.
MODTRAN cloud and multiple scattering upgrades with application to AVIRIS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Berk, A.; Bernstein, L.S.; Acharya, P.K.
1998-09-01
Recent upgrades to the MODTRAN atmospheric radiation code improve the accuracy of its radiance predictions, especially in the presence of clouds and thick aerosols, and for multiple scattering in regions of strong molecular line absorption. The current public-released version of MODTRAN (MODTRAN3.7) features a generalized specification of cloud properties, while the current research version of MODTRAN (MODTRAN4) implements a correlated-k (CK) approach for more accurate calculation of multiple scattered radiance. Comparisons to cloud measurements demonstrate the viability of the CK approach. The impact of these upgrades on predictions for AVIRIS viewing scenarios is discussed for both clear and clouded skies; the CK approach provides refined predictions for AVIRIS nadir and near-nadir viewing.
Jia, Cang-Zhi; He, Wen-Ying; Yao, Yu-Hua
2017-03-01
Hydroxylation of proline or lysine residues in proteins is a common post-translational modification event, and such modifications are found in many physiological and pathological processes. Nonetheless, the exact molecular mechanism of hydroxylation remains under investigation. Because experimental identification of hydroxylation is time-consuming and expensive, bioinformatics tools with high accuracy represent desirable alternatives for large-scale rapid identification of protein hydroxylation sites. In view of this, we developed a support vector machine-based tool, OH-PRED, for the prediction of protein hydroxylation sites using the adapted normal distribution bi-profile Bayes feature extraction in combination with the physicochemical property indexes of the amino acids. In a jackknife cross validation, OH-PRED yields an accuracy of 91.88% and a Matthews correlation coefficient (MCC) of 0.838 for the prediction of hydroxyproline sites, and yields an accuracy of 97.42% and an MCC of 0.949 for the prediction of hydroxylysine sites. These results demonstrate that OH-PRED significantly increased the prediction accuracy of hydroxyproline and hydroxylysine sites by 7.37 and 14.09%, respectively, when compared with the latest predictor PredHydroxy. In independent tests, OH-PRED also outperforms previously published methods.
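The two figures of merit quoted above, accuracy and MCC, are both functions of the four confusion-matrix counts. The sketch below shows the standard definitions; the counts are hypothetical and are not taken from the OH-PRED study, and the function names are chosen here for illustration.

```python
import math

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from the four confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0.0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0

# Hypothetical counts for illustration only (not from the abstract).
tp, tn, fp, fn = 90, 95, 5, 10
print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}")
print(f"MCC      = {matthews_corrcoef(tp, tn, fp, fn):.3f}")
```

Unlike accuracy, the MCC stays informative when positive and negative sites are unevenly represented, which is presumably why both are reported.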
[Activities of Harvard College Observatory
NASA Technical Reports Server (NTRS)
Dalgarno, A.; Smith, Peter L.; Stark, G.; Yoshino, K.
2002-01-01
With support from this grant, we have: 1) Developed techniques for improving wavelengths and f-values for singly and doubly charged ions of the iron group and have improved the accuracy of Fe III wavelengths by an order of magnitude. New Fe II f-values have also resulted from this work. 2) Measured line oscillator strengths and photoabsorption cross sections for UV molecular spectral features that have been, or could be, used for searches for and detection of molecules in diffuse and translucent interstellar clouds and for determination of molecular column densities there. In addition, we have determined other molecular parameters -- line assignments, wavelengths, and line widths -- that are essential for theoretical descriptions of the abundance, fractionation, and excitation of interstellar molecules and for comparison of predictions with observations. 3) Measured A-values for spin-changing and other weak lines in low-Z ions. When A-values are available, these spectral features are useful for astrophysical plasma density and temperature diagnostics. Such lines are also used in interstellar abundance determinations in cases where the stronger allowed lines are saturated in astronomical spectra. 4) Taken an activist approach to ensuring that, (i), astronomers have ready access to our data, and, (ii), avenues of communication between data users and producers are strengthened.
Jaber, Mohammed; Wölfer, Johannes; Ewelt, Christian; Holling, Markus; Hasselblatt, Martin; Niederstadt, Thomas; Zoubi, Tarek; Weckesser, Matthias
2015-01-01
BACKGROUND: Approximately 20% of grade II and most grade III gliomas fluoresce after 5-aminolevulinic acid (5-ALA) application. Conversely, approximately 30% of nonenhancing gliomas are actually high grade. OBJECTIVE: The aim of this study was to identify preoperative factors (ie, age, enhancement, 18F-fluoroethyl tyrosine positron emission tomography [18F-FET PET] uptake ratios) for predicting fluorescence in gliomas without typical glioblastoma imaging features and to determine whether fluorescence will allow prediction of tumor grade or molecular characteristics. METHODS: Patients harboring gliomas without typical glioblastoma imaging features were given 5-ALA. Fluorescence was recorded intraoperatively, and biopsy specimens were collected from fluorescing tissue. World Health Organization (WHO) grade, Ki-67/MIB-1 index, IDH1 (R132H) mutation status, O6-methylguanine DNA methyltransferase (MGMT) promoter methylation status, and 1p/19q co-deletion status were assessed. Predictive factors for fluorescence were derived from preoperative magnetic resonance imaging and 18F-FET PET. Classification and regression tree analysis and receiver-operating-characteristic curves were generated for defining predictors. RESULTS: Of 166 tumors, 82 were diagnosed as WHO grade II, 76 as grade III, and 8 as grade IV glioblastomas. Contrast enhancement, tumor volume, and 18F-FET PET uptake ratio >1.85 predicted fluorescence. Fluorescence correlated with WHO grade (P < .001) and Ki-67/MIB-1 index (P < .001), but not with MGMT promoter methylation status, IDH1 mutation status, or 1p/19q co-deletion status. The Ki-67/MIB-1 index in fluorescing grade III gliomas was higher than in nonfluorescing tumors, whereas in fluorescing and nonfluorescing grade II tumors, no differences were noted. CONCLUSION: Age, tumor volume, and 18F-FET PET uptake are factors predicting 5-ALA-induced fluorescence in gliomas without typical glioblastoma imaging features.
Fluorescence was associated with an increased Ki-67/MIB-1 index and high-grade pathology. Whether fluorescence in grade II gliomas identifies a subtype with worse prognosis remains to be determined. ABBREVIATIONS: 5-ALA, 5-aminolevulinic acid; CRT, classification and regression tree; 18F-FET PET, 18F-fluoroethyl tyrosine positron emission tomography; FLAIR, fluid-attenuated inversion recovery; GBM, glioblastoma multiforme; O6-MGMT, methylguanine DNA methyltransferase; ROC, receiver-operating characteristic; SUV, standardized uptake value; WHO, World Health Organization. PMID:26366972
Evolution of Local Mutation Rate and Its Determinants.
Terekhanova, Nadezhda V; Seplyarskiy, Vladimir B; Soldatov, Ruslan A; Bazykin, Georgii A
2017-05-01
Mutation rate varies along the human genome, and part of this variation is explainable by measurable local properties of the DNA molecule. Moreover, mutation rates differ between orthologous genomic regions of different species, but the drivers of this change are unclear. Here, we use data on human divergence from chimpanzee, human rare polymorphism, and human de novo mutations to predict the substitution rate at orthologous regions of non-human mammals. We show that the local mutation rates are very similar between human and apes, implying that their variation has a strong underlying cryptic component not explainable by the known genomic features. Mutation rates become progressively less similar in more distant species, and these changes are partially explainable by changes in the local genomic features of orthologous regions, most importantly, in the recombination rate. However, they are much more rapid, implying that the cryptic component underlying the mutation rate is more ephemeral than the known genomic features. These findings shed light on the determinants of mutation rate evolution. Keywords: local mutation rate, molecular evolution, recombination rate. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Molecular machinery of signal transduction and cell cycle regulation in Plasmodium.
Koyama, Fernanda C; Chakrabarti, Debopam; Garcia, Célia R S
2009-05-01
The regulation of the Plasmodium cell cycle is not understood. Although the Plasmodium falciparum genome is completely sequenced, about 60% of the predicted proteins share little or no sequence similarity with other eukaryotes. This feature impairs the identification of important proteins participating in the regulation of the cell cycle. There are several open questions that concern cell cycle progression in malaria parasites, including the mechanism by which multiple nuclear divisions are controlled and how the cell cycle is managed in all phases of their complex life cycle. Cell cycle synchrony of the parasite population within the host, as well as the circadian rhythm of proliferation, are striking features of some Plasmodium species, the molecular basis of which remains to be elucidated. In this review we discuss the role of indole-related molecules as signals that modulate the cell cycle in Plasmodium and other eukaryotes, and we also consider the possible role of kinases in the signal transduction and in the responses it triggers.
Toropova, Alla P; Toropov, Andrey A; Benfenati, Emilio; Puzyn, Tomasz; Leszczynska, Danuta; Leszczynski, Jerzy
2014-10-01
The development of quantitative structure-activity relationships for nanomaterials needs representation of molecular structure of extremely complex molecular systems. Obviously, various characteristics of a nanomaterial could impact associated biochemical endpoints. The following features of TiO2 and ZnO nanoparticles (n=42) are considered here: (i) engineered size (nm); (ii) size in water suspension (nm); (iii) size in phosphate buffered saline (PBS, nm); (iv) concentration (mg/L); and (v) zeta potential (mV). The damage to cellular membranes (units/L) is selected as an endpoint. Quantitative features-activity relationships (QFARs) are calculated by the Monte Carlo technique for three distributions of data representing values associated with membrane damage into the training and validation sets. The obtained models are characterized by the following average statistics: 0.78
Singhi, Aatur D; Zeh, Herbert J; Brand, Randall E; Nikiforova, Marina N; Chennat, Jennifer S; Fasanella, Kenneth E; Khalid, Asif; Papachristou, Georgios I; Slivka, Adam; Hogg, Melissa; Lee, Kenneth K; Tsung, Allan; Zureikat, Amer H; McGrath, Kevin
2016-06-01
The American Gastroenterological Association (AGA) recently reported evidence-based guidelines for the management of asymptomatic neoplastic pancreatic cysts. These guidelines advocate a higher threshold for surgical resection than prior guidelines and imaging surveillance for a considerable number of patients with pancreatic cysts. The aims of this study were to assess the accuracy of the AGA guidelines in detecting advanced neoplasia and present an alternative approach to pancreatic cysts. The study population consisted of 225 patients who underwent EUS-guided FNA for pancreatic cysts between January 2014 and May 2015. For each patient, clinical findings, EUS features, cytopathology results, carcinoembryonic antigen analysis, and molecular testing of pancreatic cyst fluid were reviewed. Molecular testing included the assessment of hotspot mutations and deletions for KRAS, GNAS, VHL, TP53, PIK3CA, and PTEN. Diagnostic pathology results were available for 41 patients (18%), with 13 (6%) harboring advanced neoplasia. Among these cases, the AGA guidelines identified advanced neoplasia with 62% sensitivity, 79% specificity, 57% positive predictive value, and 82% negative predictive value. Moreover, the AGA guidelines missed 45% of intraductal papillary mucinous neoplasms with adenocarcinoma or high-grade dysplasia. For cases without confirmatory pathology, 27 of 184 patients (15%) with serous cystadenomas (SCAs) based on EUS findings and/or VHL alterations would continue magnetic resonance imaging (MRI) surveillance. In comparison, a novel algorithmic pathway using molecular testing of pancreatic cyst fluid detected advanced neoplasias with 100% sensitivity, 90% specificity, 79% positive predictive value, and 100% negative predictive value. The AGA guidelines were inaccurate in detecting pancreatic cysts with advanced neoplasia. 
Furthermore, because the AGA guidelines manage all neoplastic cysts similarly, patients with SCAs will continue to undergo unnecessary MRI surveillance. The results of an alternative approach with integrative molecular testing are encouraging but require further validation. Copyright © 2016 American Society for Gastrointestinal Endoscopy. Published by Elsevier Inc. All rights reserved.
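The four statistics used above to compare the AGA guidelines with the molecular pathway (sensitivity, specificity, positive and negative predictive value) all derive from one 2x2 table. The sketch below uses the standard definitions; the counts are hypothetical, sized to a cohort of 41 cases with 13 positives purely for illustration, and are not the study's actual table.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard 2x2 diagnostic-accuracy statistics as fractions in [0, 1]."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among all diseased
        "specificity": tn / (tn + fp),  # true negatives among all non-diseased
        "ppv": tp / (tp + fp),          # diseased among all test-positives
        "npv": tn / (tn + fn),          # non-diseased among all test-negatives
    }

# Hypothetical counts (assumption): 13 cases with advanced neoplasia, 28 without.
metrics = diagnostic_metrics(tp=8, fp=6, tn=22, fn=5)
for name, value in metrics.items():
    print(f"{name:12s} {value:.0%}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common advanced neoplasia is in the cohort, so they should not be carried over to populations with a different prevalence.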
Shen, Mingyun; Zhou, Shunye; Li, Youyong; Li, Dan; Hou, Tingjun
2013-10-01
LIM kinases (LIMKs), downstream of Rho-associated protein kinases (ROCKs) and p21-activated protein kinases (PAKs), are shown to be promising targets for the treatment of cancers. In this study, the inhibition mechanism of 41 pyrrolopyrimidine derivatives as LIMK2 inhibitors was explored through a series of theoretical approaches. First, a model of LIMK2 was generated through molecular homology modeling, and the studied inhibitors were docked into the binding active site of LIMK2 by the docking protocol, taking into consideration the flexibility of the protein. The binding poses predicted by molecular docking for 17 selected inhibitors with different bioactivities complexed with LIMK2 underwent molecular dynamics (MD) simulations, and the binding free energies for the complexes were predicted by using the molecular mechanics/generalized Born surface area (MM/GBSA) method. The predicted binding free energies correlated well with the experimental bioactivities (r² = 0.63 or 0.62). Next, the free energy decomposition analysis was utilized to highlight the following key structural features related to biological activity: (1) the important H-bond between Ile408 and pyrrolopyrimidine, (2) the H-bonds between the inhibitors and Asp469 and Gly471 which maintain the stability of the DFG-out conformation, and (3) the hydrophobic interactions between the inhibitors and several key residues (Leu337, Phe342, Ala345, Val358, Lys360, Leu389, Ile408, Leu458 and Leu472). Finally, a variety of LIMK2 inhibitors with a pyrrolopyrimidine scaffold were designed, some of which showed improved potency according to the predictions. Our studies suggest that the use of molecular docking with MD simulations and free energy calculations could be a powerful tool for understanding the binding mechanism of LIMK2 inhibitors and for the design of more potent LIMK2 inhibitors.
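The agreement between predicted binding free energies and experimental bioactivities is summarised above by a squared correlation coefficient. As a minimal sketch, assuming this is the squared Pearson correlation (the abstract does not say), it can be computed as follows; the two data series are invented for illustration.

```python
def pearson_r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)

# Hypothetical values: predicted scores vs. measured activities for four compounds.
predicted = [1.0, 2.0, 3.0, 4.0]
measured = [1.0, 3.0, 2.0, 4.0]
print(f"r^2 = {pearson_r_squared(predicted, measured):.2f}")
```

Because the correlation is squared, its sign is lost: a more negative binding free energy tracking a higher potency gives the same value as a positive trend.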
Predicting multicellular function through multi-layer tissue networks
Zitnik, Marinka; Leskovec, Jure
2017-01-01
Abstract Motivation: Understanding functions of proteins in specific human tissues is essential for insights into disease diagnostics and therapeutics, yet prediction of tissue-specific cellular function remains a critical challenge for biomedicine. Results: Here, we present OhmNet, a hierarchy-aware unsupervised node feature learning approach for multi-layer networks. We build a multi-layer network, where each layer represents molecular interactions in a different human tissue. OhmNet then automatically learns a mapping of proteins, represented as nodes, to a neural embedding-based low-dimensional space of features. OhmNet encourages sharing of similar features among proteins with similar network neighborhoods and among proteins activated in similar tissues. The algorithm generalizes prior work, which generally ignores relationships between tissues, by modeling tissue organization with a rich multiscale tissue hierarchy. We use OhmNet to study multicellular function in a multi-layer protein interaction network of 107 human tissues. In 48 tissues with known tissue-specific cellular functions, OhmNet provides more accurate predictions of cellular function than alternative approaches, and also generates more accurate hypotheses about tissue-specific protein actions. We show that taking into account the tissue hierarchy leads to improved predictive power. Remarkably, we also demonstrate that it is possible to leverage the tissue hierarchy in order to effectively transfer cellular functions to a functionally uncharacterized tissue. Overall, OhmNet moves from flat networks to multiscale models able to predict a range of phenotypes spanning cellular subsystems. Availability and implementation: Source code and datasets are available at http://snap.stanford.edu/ohmnet. Contact: jure@cs.stanford.edu PMID:28881986
Kros, Johan M; Huizer, Karin; Hernández-Laín, Aurelio; Marucci, Gianluca; Michotte, Alex; Pollo, Bianca; Rushing, Elisabeth J; Ribalta, Teresa; French, Pim; Jaminé, David; Bekka, Nawal; Lacombe, Denis; van den Bent, Martin J; Gorlia, Thierry
2015-06-10
With the rapid discovery of prognostic and predictive molecular parameters for glioma, the status of histopathology in the diagnostic process should be scrutinized. Our project aimed to construct a diagnostic algorithm for gliomas based on molecular and histologic parameters with independent prognostic values. The pathology slides of 636 patients with gliomas who had been included in EORTC 26951 and 26882 trials were reviewed using virtual microscopy by a panel of six neuropathologists who independently scored 18 histologic features and provided an overall diagnosis. The molecular data for IDH1, 1p/19q loss, EGFR amplification, loss of chromosome 10 and chromosome arm 10q, gain of chromosome 7, and hypermethylation of the promoter of MGMT were available for some of the cases. The slides were divided in discovery (n = 426) and validation sets (n = 210). The diagnostic algorithm resulting from analysis of the discovery set was validated in the latter. In 66% of cases, consensus of overall diagnosis was present. A diagnostic algorithm consisting of two molecular markers and one consensus histologic feature was created by conditional inference tree analysis. The order of prognostic significance was: 1p/19q loss, EGFR amplification, and astrocytic morphology, which resulted in the identification of four diagnostic nodes. Validation of the nodes in the validation set confirmed the prognostic value (P < .001). We succeeded in the creation of a timely diagnostic algorithm for anaplastic glioma based on multivariable analysis of consensus histopathology and molecular parameters. © 2015 by American Society of Clinical Oncology.
Extending Halogen-based Medicinal Chemistry to Proteins: IODO-INSULIN AS A CASE STUDY.
El Hage, Krystel; Pandyarajan, Vijay; Phillips, Nelson B; Smith, Brian J; Menting, John G; Whittaker, Jonathan; Lawrence, Michael C; Meuwly, Markus; Weiss, Michael A
2016-12-30
Insulin, a protein critical for metabolic homeostasis, provides a classical model for protein design with application to human health. Recent efforts to improve its pharmaceutical formulation demonstrated that iodination of a conserved tyrosine (TyrB26) enhances key properties of a rapid-acting clinical analog. Moreover, the broad utility of halogens in medicinal chemistry has motivated the use of hybrid quantum- and molecular-mechanical methods to study proteins. Here, we (i) undertook quantitative atomistic simulations of 3-[iodo-TyrB26]insulin to predict its structural features, and (ii) tested these predictions by X-ray crystallography. Using an electrostatic model of the modified aromatic ring based on quantum chemistry, the calculations suggested that the analog, as a dimer and hexamer, exhibits subtle differences in aromatic-aromatic interactions at the dimer interface. Aromatic rings (TyrB16, PheB24, PheB25, 3-I-TyrB26, and their symmetry-related mates) at this interface adjust to enable packing of the hydrophobic iodine atoms within the core of each monomer. Strikingly, these features were observed in the crystal structure of a 3-[iodo-TyrB26]insulin analog (determined as an R6 zinc hexamer). Given that residues B24-B30 detach from the core on receptor binding, the environment of 3-I-TyrB26 in a receptor complex must differ from that in the free hormone. Based on the recent structure of a "micro-receptor" complex, we predict that 3-I-TyrB26 engages the receptor via directional halogen bonding and halogen-directed hydrogen bonding as follows: favorable electrostatic interactions exploiting, respectively, the halogen's electron-deficient σ-hole and electronegative equatorial band. Inspired by quantum chemistry and molecular dynamics, such "halogen engineering" promises to extend principles of medicinal chemistry to proteins. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
Informatics Approaches for Predicting, Understanding, and Testing Cancer Drug Combinations.
Tang, Jing
2017-01-01
Making cancer treatment more effective is one of the grand challenges in our health care system. However, many drugs have entered clinical trials but so far showed limited efficacy or induced rapid development of resistance. We urgently need multi-targeted drug combinations, which shall selectively inhibit the cancer cells and block the emergence of drug resistance. The book chapter focuses on mathematical and computational tools to facilitate the discovery of the most promising drug combinations to improve efficacy and prevent resistance. Data integration approaches that leverage drug-target interactions, cancer molecular features, and signaling pathways for predicting, understanding, and testing drug combinations are critically reviewed.
Schwalbe, Edward C; Lindsey, Janet C; Nakjang, Sirintra; Crosier, Stephen; Smith, Amanda J; Hicks, Debbie; Rafiee, Gholamreza; Hill, Rebecca M; Iliasova, Alice; Stone, Thomas; Pizer, Barry; Michalski, Antony; Joshi, Abhijit; Wharton, Stephen B; Jacques, Thomas S; Bailey, Simon; Williamson, Daniel; Clifford, Steven C
2017-07-01
International consensus recognises four medulloblastoma molecular subgroups: WNT (MB WNT), SHH (MB SHH), group 3 (MB Grp3), and group 4 (MB Grp4), each defined by their characteristic genome-wide transcriptomic and DNA methylomic profiles. These subgroups have distinct clinicopathological and molecular features, and underpin current disease subclassification and initial subgroup-directed therapies that are underway in clinical trials. However, substantial biological heterogeneity and differences in survival are apparent within each subgroup, which remain to be resolved. We aimed to investigate whether additional molecular subgroups exist within childhood medulloblastoma and whether these could be used to improve disease subclassification and prognosis predictions. In this retrospective cohort study, we assessed 428 primary medulloblastoma samples collected from UK Children's Cancer and Leukaemia Group (CCLG) treatment centres (UK), collaborating European institutions, and the UKCCSG-SIOP-PNET3 European clinical trial. An independent validation cohort (n=276) of archival tumour samples was also analysed. We analysed samples from patients with childhood medulloblastoma who were aged 0-16 years at diagnosis, and had central review of pathology and comprehensive clinical data. We did comprehensive molecular profiling, including DNA methylation microarray analysis, and did unsupervised class discovery of test and validation cohorts to identify consensus primary molecular subgroups and characterise their clinical and biological significance. We modelled survival of patients aged 3-16 years in patients (n=215) who had craniospinal irradiation and had been treated with a curative intent. Seven robust and reproducible primary molecular subgroups of childhood medulloblastoma were identified. MB WNT remained unchanged and each remaining consensus subgroup was split in two.
MB SHH was split into age-dependent subgroups corresponding to infant (<4·3 years; MB SHH-Infant; n=65) and childhood patients (≥4·3 years; MB SHH-Child; n=38). MB Grp3 and MB Grp4 were each split into high-risk (MB Grp3-HR [n=65] and MB Grp4-HR [n=85]) and low-risk (MB Grp3-LR [n=50] and MB Grp4-LR [n=73]) subgroups. These biological subgroups were validated in the independent cohort. We identified features of the seven subgroups that were predictive of outcome. Cross-validated subgroup-dependent survival models, incorporating these novel subgroups along with secondary clinicopathological and molecular features and established disease risk-factors, outperformed existing disease risk-stratification schemes. These subgroup-dependent models stratified patients into four clinical risk groups for 5-year progression-free survival: favourable risk (54 [25%] of 215 patients; 91% survival [95% CI 82-100]); standard risk (50 [23%] patients; 81% survival [70-94]); high-risk (82 [38%] patients; 42% survival [31-56]); and very high-risk (29 [13%] patients; 28% survival [14-56]). The discovery of seven novel, clinically significant subgroups improves disease risk-stratification and could inform treatment decisions. These data provide a new foundation for future research and clinical investigations. Funding: Cancer Research UK, The Tom Grahame Trust, Star for Harris, Action Medical Research, SPARKS, The JGW Patterson Foundation, The INSTINCT network (co-funded by The Brain Tumour Charity, Great Ormond Street Children's Charity, and Children with Cancer UK). Copyright © 2017 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY 4.0 license.
HMPAS: Human Membrane Protein Analysis System
2013-01-01
Background Membrane proteins perform essential roles in diverse cellular functions and are regarded as major pharmaceutical targets. The significance of membrane proteins has led to the development of dozens of resources related to membrane proteins. However, most of these resources are built for specific well-known membrane protein groups, making it difficult to find common and specific features of various membrane protein groups. Methods We collected human membrane proteins from the dispersed resources and predicted novel membrane protein candidates by using ortholog information and our membrane protein classifiers. The membrane proteins were classified according to the type of interaction with the membrane, subcellular localization, and molecular function. We also built a new feature dataset to characterize the membrane proteins in various aspects including membrane protein topology, domain, biological process, disease, and drug. Moreover, protein structure and ICD-10-CM based integrated disease and drug information was newly included. To analyze the comprehensive information of membrane proteins, we implemented analysis tools to identify novel sequence and functional features of the classified membrane protein groups and to extract features from protein sequences. Results We constructed HMPAS with 28,509 collected known membrane proteins and 8,076 newly predicted candidates. This system provides integrated information of human membrane proteins individually and in groups organized by 45 subcellular locations and 1,401 molecular functions. As a case study, we identified associations between the membrane proteins and diseases and show that membrane proteins are promising targets for diseases related to the nervous system and circulatory system. A web-based interface of this system was constructed to help researchers not only retrieve organized information on individual proteins but also use the tools to analyze the membrane proteins.
Conclusions HMPAS provides comprehensive information about human membrane proteins including specific features of certain membrane protein groups. In this system, users can obtain information on individual proteins and on specified groups, focused on their conserved sequence features, the cellular processes they are involved in, and associated diseases. HMPAS may serve as a valuable resource for the inference of novel cellular mechanisms and pharmaceutical targets associated with the human membrane proteins. HMPAS is freely available at http://fcode.kaist.ac.kr/hmpas. PMID:24564858
On the Certain Topological Indices of Titania Nanotube TiO2[m, n]
NASA Astrophysics Data System (ADS)
Javaid, M.; Liu, Jia-Bao; Rehman, M. A.; Wang, Shaohui
2017-07-01
A numeric quantity that characterises the whole structure of a molecular graph is called a topological index; it predicts the physical features, chemical reactivities, and boiling activities of the chemical compound represented by the molecular graph. In this article, we give new mathematical expressions for the multiple Zagreb indices, the generalised Zagreb index, the fourth version of atom-bond connectivity (ABC4) index, and the fifth version of geometric-arithmetic (GA5) index of TiO2[m, n]. In addition, we compute the most recently developed topological index, called the Sanskruti index. At the end, a comparison is also included to estimate the efficiency of the computed indices. Our results extend some known conclusions.
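To make the idea of a degree-based topological index concrete, the sketch below computes the classical first and second Zagreb indices, M1 = sum over vertices of deg(v)^2 and M2 = sum over edges uv of deg(u)·deg(v), for a small graph given as an edge list. These are the basic indices from which the multiple and generalised Zagreb variants named above are derived; the closed-form expressions for TiO2[m, n] in the article are not reproduced here, and the example graphs are simple test cases, not nanotube fragments.

```python
from collections import Counter

def zagreb_indices(edges):
    """Return (M1, M2) for an undirected simple graph given as (u, v) pairs."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    m1 = sum(d * d for d in degree.values())          # sum of squared degrees
    m2 = sum(degree[u] * degree[v] for u, v in edges)  # sum of degree products
    return m1, m2

# Path on four vertices (carbon skeleton of n-butane): degrees 1, 2, 2, 1.
path4 = [(0, 1), (1, 2), (2, 3)]
# Six-membered ring (carbon skeleton of cyclohexane): every degree is 2.
ring6 = [(i, (i + 1) % 6) for i in range(6)]

print("path:", zagreb_indices(path4))
print("ring:", zagreb_indices(ring6))
```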
Kulik, Natallia; Slámová, Kristýna; Ettrich, Rüdiger; Křen, Vladimír
2015-01-28
β-N-Acetylhexosaminidase (GH20) from the filamentous fungus Talaromyces flavus, previously identified as a prominent enzyme in the biosynthesis of modified glycosides, lacks a high resolution three-dimensional structure so far. Despite high sequence identity to previously reported Aspergillus oryzae and Penicillium oxalicum β-N-acetylhexosaminidases, this enzyme tolerates substrate modification significantly better. Understanding of key structural features, prediction of effective mutants and potential substrate characteristics prior to their synthesis are of general interest. Computational methods including homology modeling and molecular dynamics simulations were applied to shed light on the structure-activity relationship in the enzyme. Primary sequence analysis revealed some variable regions able to influence differences in substrate affinity of hexosaminidases. Moreover, docking in combination with subsequent molecular dynamics simulations of C-6 modified glycosides enabled us to identify the structural features required for accommodation and processing of these bulky substrates in the active site of hexosaminidase from T. flavus. To assess the reliability of predictions on the basis of the reported model, all results were compared with available experimental data, which demonstrated the principal correctness of the predictions as well as the model. The main variable regions in β-N-acetylhexosaminidases determining differences in modified substrate affinity are located close to the active site entrance and engage two loops. Differences in primary sequence and the spatial arrangement of these loops and their interplay with active site amino acids, reflected by interaction energies and dynamics, account for the different catalytic activity and substrate specificity of the various fungal and bacterial β-N-acetylhexosaminidases.
Yao, Zhi-Jiang; Dong, Jie; Che, Yu-Jing; Zhu, Min-Feng; Wen, Ming; Wang, Ning-Ning; Wang, Shan; Lu, Ai-Ping; Cao, Dong-Sheng
2016-05-01
Drug-target interactions (DTIs) are central to current drug discovery processes and public health fields. Analyzing the DTI profile of a drug helps to infer drug indications, adverse drug reactions, drug-drug interactions, and drug modes of action. Therefore, it is highly important to predict DTI profiles of drugs reliably and quickly on a genome-scale level. Here, we develop the TargetNet server, which can make real-time DTI predictions based only on molecular structures, following the spirit of multi-target SAR methodology. Naïve Bayes models together with various molecular fingerprints were employed to construct prediction models. Ensemble learning from these fingerprints was also provided to improve the prediction ability. When the user submits a molecule, the server will predict the activity of the user's molecule across 623 human proteins using the established high-quality SAR models, thus generating a DTI profile that can be used as a feature vector of chemicals for wide applications. The 623 SAR models related to 623 human proteins were strictly evaluated and validated by several model validation strategies, resulting in AUC scores of 75-100%. We applied the generated DTI profiles to successfully predict potential targets, toxicity classification, drug-drug interactions, and drug mode of action, which sufficiently demonstrated the wide application value of the DTI profile. The TargetNet webserver is designed based on the Django framework in Python, and is freely accessible at http://targetnet.scbdd.com.
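The per-target modelling idea described above (one naïve Bayes model per protein, applied to a fingerprint of the query molecule, with the outputs collected into a profile vector) can be sketched as follows. This is an illustration under stated assumptions, not TargetNet's implementation: it assumes binary fingerprints and a Bernoulli naïve Bayes with Laplace smoothing, uses two hypothetical targets and four made-up training fingerprints, and omits the ensemble over several fingerprint types.

```python
import math

class FingerprintNaiveBayes:
    """Bernoulli naive Bayes for binary fingerprints, with Laplace smoothing."""

    def fit(self, fingerprints, labels):
        n_bits = len(fingerprints[0])
        self.log_prior, self.log_on, self.log_off = {}, {}, {}
        for c in (0, 1):  # 0 = inactive, 1 = active
            rows = [fp for fp, y in zip(fingerprints, labels) if y == c]
            self.log_prior[c] = math.log((len(rows) + 1) / (len(fingerprints) + 2))
            # Smoothed probability that each bit is set within class c.
            p_on = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                    for j in range(n_bits)]
            self.log_on[c] = [math.log(p) for p in p_on]
            self.log_off[c] = [math.log(1.0 - p) for p in p_on]
        return self

    def prob_active(self, fp):
        """Posterior probability that a molecule with fingerprint fp is active."""
        score = {}
        for c in (0, 1):
            score[c] = self.log_prior[c] + sum(
                on if bit else off
                for bit, on, off in zip(fp, self.log_on[c], self.log_off[c]))
        top = max(score.values())  # subtract the maximum for numerical stability
        e0, e1 = math.exp(score[0] - top), math.exp(score[1] - top)
        return e1 / (e0 + e1)

def dti_profile(models, fp):
    """One activity probability per target, in sorted target-name order."""
    return [models[name].prob_active(fp) for name in sorted(models)]

# Toy data (assumption): four 4-bit fingerprints, two hypothetical targets.
train_fps = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 1], [0, 1, 1, 1]]
models = {
    "target_A": FingerprintNaiveBayes().fit(train_fps, [1, 1, 0, 0]),
    "target_B": FingerprintNaiveBayes().fit(train_fps, [0, 0, 1, 1]),
}
profile = dti_profile(models, [1, 1, 0, 0])
print(dict(zip(sorted(models), profile)))
```

With 623 such models the resulting list would be a 623-dimensional vector, which is the sense in which the abstract treats a DTI profile as a feature vector for downstream tasks.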
NASA Astrophysics Data System (ADS)
Yao, Zhi-Jiang; Dong, Jie; Che, Yu-Jing; Zhu, Min-Feng; Wen, Ming; Wang, Ning-Ning; Wang, Shan; Lu, Ai-Ping; Cao, Dong-Sheng
2016-05-01
Drug-target interactions (DTIs) are central to current drug discovery processes and public health fields. Analyzing the DTI profiling of the drugs helps to infer drug indications, adverse drug reactions, drug-drug interactions, and drug modes of action. Therefore, it is of high importance to predict the DTI profiling of drugs reliably and quickly on a genome-scale level. Here, we develop the TargetNet server, which can make real-time DTI predictions based only on molecular structures, following the spirit of multi-target SAR methodology. Naïve Bayes models together with various molecular fingerprints were employed to construct prediction models. Ensemble learning from these fingerprints was also provided to improve the prediction ability. When the user submits a molecule, the server will predict the activity of the user's molecule across 623 human proteins by the established high quality SAR model, thus generating a DTI profiling that can be used as a feature vector of chemicals for wide applications. The 623 SAR models related to 623 human proteins were strictly evaluated and validated by several model validation strategies, resulting in AUC scores of 75-100%. We applied the generated DTI profiling to successfully predict potential targets, toxicity classification, drug-drug interactions, and drug mode of action, which sufficiently demonstrated the wide application value of the potential DTI profiling. The TargetNet webserver is designed based on the Django framework in Python, and is freely accessible at http://targetnet.scbdd.com.
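The modelling idea in this abstract (a naïve Bayes classifier per target on binary molecular fingerprints, with predictions from several fingerprint types combined) can be illustrated with a minimal sketch. This is not the TargetNet implementation: the function names, the Laplace smoothing constant, the toy 4-bit fingerprints and the simple averaging used for the ensemble are all assumptions made for illustration.

```python
import math

def train_bernoulli_nb(fingerprints, labels, alpha=1.0):
    """Fit a Bernoulli naive Bayes model for one target.

    fingerprints: list of equal-length 0/1 lists (hypothetical toy data).
    labels: list of 0/1 activity labels.
    alpha: Laplace smoothing constant (an assumed default).
    """
    n_bits = len(fingerprints[0])
    model = {}
    for cls in (0, 1):
        rows = [fp for fp, y in zip(fingerprints, labels) if y == cls]
        prior = (len(rows) + alpha) / (len(labels) + 2 * alpha)
        # Smoothed probability that each bit is set within this class.
        p_bit = [(sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
                 for j in range(n_bits)]
        model[cls] = (math.log(prior), p_bit)
    return model

def predict_active_probability(model, fp):
    """Return P(active | fingerprint) under the naive independence assumption."""
    log_scores = {}
    for cls, (log_prior, p_bit) in model.items():
        score = log_prior
        for bit, p in zip(fp, p_bit):
            score += math.log(p if bit else 1.0 - p)
        log_scores[cls] = score
    # Normalise in log space for numerical stability.
    top = max(log_scores.values())
    z = sum(math.exp(v - top) for v in log_scores.values())
    return math.exp(log_scores[1] - top) / z

def ensemble_probability(models, fps):
    """Average the probabilities of models built on different fingerprint types."""
    probs = [predict_active_probability(m, fp) for m, fp in zip(models, fps)]
    return sum(probs) / len(probs)

# Toy example: two actives and two inactives described by 4-bit fingerprints.
X = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 1], [0, 1, 1, 1]]
y = [1, 1, 0, 0]
nb = train_bernoulli_nb(X, y)
p_active = predict_active_probability(nb, [1, 1, 0, 0])
p_inactive = predict_active_probability(nb, [0, 0, 1, 1])
```

Repeating the prediction over one model per protein would give a vector of probabilities, which is the kind of profile the abstract describes using as a feature vector.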
NASA Astrophysics Data System (ADS)
Ennis, C.; Auchettl, R.; Appadoo, D. R. T.; Robertson, E. G.
2017-11-01
Solid-state density functional theory code has been implemented for the structure optimization of crystalline methanol, acetaldehyde and acetic acid and for the calculation of infrared frequencies. The results are compared to thin film spectra obtained from low-temperature experiments performed at the Australian Synchrotron. Harmonic frequency calculations of the internal modes at the B3LYP-D3/m-6-311G(d) level show higher deviation from infrared experiment than more advanced theory applied to the gas phase. Importantly for the solid state, the simulation of low-frequency molecular lattice modes closely resembles the observed far-infrared features after application of a 0.92 scaling factor. This allowed experimental peaks to be assigned to specific translation and libration modes, including acetaldehyde and acetic acid lattice features for the first time. These frequency calculations have been performed without the need for supercomputing resources that are required for large molecular clusters using comparable levels of theory. This new theoretical approach will find use for the rapid characterization of intermolecular interactions and bonding in crystals, and the assignment of far-infrared spectra for crystalline samples such as pharmaceuticals and molecular ices. One interesting application may be for the detection of species of prebiotic interest on the surfaces of Kuiper-Belt and Trans-Neptunian Objects. At such locations, the three small organic molecules studied here could reside in their crystalline phase. The far-infrared spectra for their low-temperature solid phases are collected under planetary conditions, allowing us to compile and assign their most intense spectral features to assist future far-infrared surveys of icy Solar system surfaces.
NASA Astrophysics Data System (ADS)
Kadoura, Ahmad; Sun, Shuyu; Salama, Amgad
2014-08-01
Accurate determination of thermodynamic properties of petroleum reservoir fluids is of great interest to many applications, especially in petroleum engineering and chemical engineering. Molecular simulation has many appealing features, especially its requirement of fewer tuned parameters but yet better predicting capability; however it is well known that molecular simulation is very CPU expensive, as compared to equation of state approaches. We have recently introduced an efficient thermodynamically consistent technique to regenerate rapidly Monte Carlo Markov Chains (MCMCs) at different thermodynamic conditions from the existing data points that have been pre-computed with expensive classical simulation. This technique can speed up the simulation more than a million times, making the regenerated molecular simulation almost as fast as equation of state approaches. In this paper, this technique is first briefly reviewed and then numerically investigated in its capability of predicting ensemble averages of primary quantities at different neighboring thermodynamic conditions to the original simulated MCMCs. Moreover, this extrapolation technique is extended to predict second derivative properties (e.g. heat capacity and fluid compressibility). The method works by reweighting and reconstructing generated MCMCs in canonical ensemble for Lennard-Jones particles. In this paper, system's potential energy, pressure, isochoric heat capacity and isothermal compressibility along isochors, isotherms and paths of changing temperature and density from the original simulated points were extrapolated. Finally, an optimized set of Lennard-Jones parameters (ε, σ) for single site models were proposed for methane, nitrogen and carbon monoxide.
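The reweighting idea mentioned in this abstract can be sketched with the standard canonical-ensemble formula, in which each stored configuration i receives the weight exp(-(β_new - β_old) U_i). This is a generic textbook illustration for a change of temperature at fixed density, not the authors' scheme; the function names are hypothetical, and the estimate is only reliable close to the originally simulated state point.

```python
import math

def reweight_average(observable, potential_energy, beta_old, beta_new):
    """Estimate <A> at beta_new from samples generated at beta_old (NVT).

    observable and potential_energy are per-configuration values taken from
    an existing Markov chain.
    """
    d_beta = beta_new - beta_old
    exponents = [-d_beta * u for u in potential_energy]
    # Shift by the largest exponent to avoid overflow.
    shift = max(exponents)
    weights = [math.exp(e - shift) for e in exponents]
    norm = sum(weights)
    return sum(w * a for w, a in zip(weights, observable)) / norm

def reweighted_excess_heat_capacity(potential_energy, beta_old, beta_new, k_b=1.0):
    """Configurational part of C_v from energy fluctuations at beta_new."""
    u_mean = reweight_average(potential_energy, potential_energy,
                              beta_old, beta_new)
    u_sq_mean = reweight_average([u * u for u in potential_energy],
                                 potential_energy, beta_old, beta_new)
    # C_v(excess) = k_B * beta^2 * (<U^2> - <U>^2)
    return k_b * beta_new ** 2 * (u_sq_mean - u_mean ** 2)

# Toy chain of potential energies in reduced units (made-up numbers).
energies = [1.0, 2.0, 3.0]
same_state = reweight_average(energies, energies, 1.0, 1.0)
colder = reweight_average(energies, energies, 1.0, 1.5)
cv_excess = reweighted_excess_heat_capacity(energies, 1.0, 1.0)
```

Moving to a larger β (lower temperature) shifts weight towards low-energy configurations, so the reweighted mean energy drops below the plain sample mean.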
Disfani, Fatemeh Miri; Hsu, Wei-Lun; Mizianty, Marcin J.; Oldfield, Christopher J.; Xue, Bin; Dunker, A. Keith; Uversky, Vladimir N.; Kurgan, Lukasz
2012-01-01
Motivation: Molecular recognition features (MoRFs) are short binding regions located within longer intrinsically disordered regions that bind to protein partners via disorder-to-order transitions. MoRFs are implicated in important processes including signaling and regulation. However, only a limited number of experimentally validated MoRFs are known, which motivates development of computational methods that predict MoRFs from protein chains. Results: We introduce a new MoRF predictor, MoRFpred, which identifies all MoRF types (α, β, coil and complex). We develop a comprehensive dataset of annotated MoRFs to build and empirically compare our method. MoRFpred utilizes a novel design in which annotations generated by sequence alignment are fused with predictions generated by a Support Vector Machine (SVM), which uses a custom designed set of sequence-derived features. The features provide information about evolutionary profiles, selected physicochemical properties of amino acids, and predicted disorder, solvent accessibility and B-factors. Empirical evaluation on several datasets shows that MoRFpred outperforms related methods: α-MoRF-Pred that predicts α-MoRFs and ANCHOR which finds disordered regions that become ordered when bound to a globular partner. We show that our predicted (new) MoRF regions have non-random sequence similarity with native MoRFs. We use this observation along with the fact that predictions with higher probability are more accurate to identify putative MoRF regions. We also identify a few sequence-derived hallmarks of MoRFs. They are characterized by dips in the disorder predictions and higher hydrophobicity and stability when compared to adjacent (in the chain) residues. Availability: http://biomine.ece.ualberta.ca/MoRFpred/; http://biomine.ece.ualberta.ca/MoRFpred/Supplement.pdf Contact: lkurgan@ece.ualberta.ca Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22689782
Cheng, Peng; Li, Jiaojiao; Wang, Juan; Zhang, Xiaoyun; Zhai, Honglin
2018-05-01
Focal adhesion kinase (FAK) is a tyrosine kinase that modulates integrin and growth factor signaling pathways and is a promising therapeutic target because of its involvement in cancer cell migration, proliferation, and survival. To investigate the mechanism of interaction between FAK and triazinic inhibitors and to design high-activity inhibitors, a molecular modeling study integrating 3D-QSAR, molecular docking, molecular dynamics simulations, and binding free energy calculations was performed. The optimum CoMFA and CoMSIA models showed good reliability and satisfactory predictability (with Q² = 0.663, R² = 0.987, [Formula: see text] = 0.921 and Q² = 0.670, R² = 0.981, [Formula: see text] = 0.953). Their contour maps could provide structural features to improve inhibitory activity. Furthermore, the good consistency between contour maps, docking, and molecular dynamics simulations strongly indicates that the molecular modeling is reliable. Based on it, we designed several new compounds, and their inhibitory activities were validated by the molecular models. We expect that our studies could bring new ideas to promote the development of novel inhibitors with higher inhibitory activity for FAK.
Advances in the molecular genetics of gliomas - implications for classification and therapy.
Reifenberger, Guido; Wirsching, Hans-Georg; Knobbe-Thomsen, Christiane B; Weller, Michael
2017-07-01
Genome-wide molecular-profiling studies have revealed the characteristic genetic alterations and epigenetic profiles associated with different types of gliomas. These molecular characteristics can be used to refine glioma classification, to improve prediction of patient outcomes, and to guide individualized treatment. Thus, the WHO Classification of Tumours of the Central Nervous System was revised in 2016 to incorporate molecular biomarkers - together with classic histological features - in an integrated diagnosis, in order to define distinct glioma entities as precisely as possible. This paradigm shift is markedly changing how glioma is diagnosed, and has important implications for future clinical trials and patient management in daily practice. Herein, we highlight the developments in our understanding of the molecular genetics of gliomas, and review the current landscape of clinically relevant molecular biomarkers for use in classification of the disease subtypes. Novel approaches to the genetic characterization of gliomas based on large-scale DNA-methylation profiling and next-generation sequencing are also discussed. In addition, we illustrate how advances in the molecular genetics of gliomas can promote the development and clinical translation of novel pathogenesis-based therapeutic approaches, thereby paving the way towards precision medicine in neuro-oncology.
The Spike-and-Slab Lasso Generalized Linear Models for Prediction and Associated Genes Detection.
Tang, Zaixiang; Shen, Yueping; Zhang, Xinyan; Yi, Nengjun
2017-01-01
Large-scale "omics" data have been increasingly used as an important resource for prognostic prediction of diseases and detection of associated genes. However, there are considerable challenges in analyzing high-dimensional molecular data, including the large number of potential molecular predictors, limited number of samples, and small effect of each predictor. We propose new Bayesian hierarchical generalized linear models, called spike-and-slab lasso GLMs, for prognostic prediction and detection of associated genes using large-scale molecular data. The proposed model employs a spike-and-slab mixture double-exponential prior for coefficients that can induce weak shrinkage on large coefficients, and strong shrinkage on irrelevant coefficients. We have developed a fast and stable algorithm to fit large-scale hierarchical GLMs by incorporating expectation-maximization (EM) steps into the fast cyclic coordinate descent algorithm. The proposed approach integrates nice features of two popular methods, i.e., penalized lasso and Bayesian spike-and-slab variable selection. The performance of the proposed method is assessed via extensive simulation studies. The results show that the proposed approach can provide not only more accurate estimates of the parameters, but also better prediction. We demonstrate the proposed procedure on two cancer data sets: a well-known breast cancer data set consisting of 295 tumors, and expression data of 4919 genes; and the ovarian cancer data set from TCGA with 362 tumors, and expression data of 5336 genes. Our analyses show that the proposed procedure can generate powerful models for predicting outcomes and detecting associated genes. The methods have been implemented in a freely available R package BhGLM (http://www.ssg.uab.edu/bhglm/). Copyright © 2017 by the Genetics Society of America.
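A spike-and-slab mixture double-exponential prior of the kind described in this abstract is commonly written as below. The notation is an assumption introduced here for illustration (the abstract defines no symbols): γ_j is a binary inclusion indicator, s_0 < s_1 are the spike and slab scales, and θ is the prior inclusion probability.

```latex
\beta_j \mid \gamma_j \sim \mathrm{DE}(0, S_j), \qquad
p(\beta_j \mid S_j) = \frac{1}{2 S_j} \exp\!\left(-\frac{|\beta_j|}{S_j}\right), \qquad
S_j = (1-\gamma_j)\, s_0 + \gamma_j\, s_1, \quad 0 < s_0 < s_1,
\qquad
\gamma_j \mid \theta \sim \mathrm{Bernoulli}(\theta)
```

With γ_j = 0 the small scale s_0 concentrates the prior near zero (strong shrinkage of irrelevant coefficients), while γ_j = 1 gives the wider scale s_1 (weak shrinkage of large coefficients), which matches the behaviour the abstract attributes to the prior.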
Crossover from equilibration to aging: Nonequilibrium theory versus simulations.
Mendoza-Méndez, P; Lázaro-Lázaro, E; Sánchez-Díaz, L E; Ramírez-González, P E; Pérez-Ángel, G; Medina-Noyola, M
2017-08-01
Understanding glasses and the glass transition requires comprehending the nature of the crossover from the ergodic (or equilibrium) regime, in which the stationary properties of the system have no history dependence, to the mysterious glass transition region, where the measured properties are nonstationary and depend on the protocol of preparation. In this work we use nonequilibrium molecular dynamics simulations to test the main features of the crossover predicted by the molecular version of the recently developed multicomponent nonequilibrium self-consistent generalized Langevin equation theory. According to this theory, the glass transition involves the abrupt passage from the ordinary pattern of full equilibration to the aging scenario characteristic of glass-forming liquids. The same theory explains that this abrupt transition will always be observed as a blurred crossover due to the unavoidable finiteness of the time window of any experimental observation. We find that within their finite waiting-time window, the simulations confirm the general trends predicted by the theory.
Rodrigo, Guillermo; Jaramillo, Alfonso; Blázquez, Miguel A
2011-08-17
The interplay between hormone signaling and gene regulatory networks is instrumental in promoting the development of living organisms. In particular, plants have evolved mechanisms to sense gravity and orient themselves accordingly. Here, we present a mathematical model that reproduces plant gravitropic responses based on known molecular genetic interactions for auxin signaling coupled with a physical description of plant reorientation. The model allows one to analyze the spatiotemporal dynamics of the system, triggered by an auxin gradient that induces differential growth of the plant with respect to the gravity vector. Our model predicts two important features with strong biological implications: 1), robustness of the regulatory circuit as a consequence of integral control; and 2), a higher degree of plasticity generated by the molecular interplay between two classes of hormones. Our model also predicts the ability of gibberellins to modulate the tropic response and supports the integration of the hormonal role at the level of gene regulation. Copyright © 2011 Biophysical Society. Published by Elsevier Inc. All rights reserved.
Modelling morphology evolution during solidification of IPP in processing conditions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pantani, R.; De Santis, F.; Speranza, V. (E-mail: rpantani@unisa.it, fedesantis@unisa.it, vsperanza@unisa.it, gtitomanlio@unisa.it)
During polymer processing, crystallization takes place during or soon after flow. In most cases, the flow field dramatically influences both the crystallization kinetics and the crystal morphology. In turn, crystallinity and morphology affect product properties. Consequently, in the last decade, researchers have tried to identify the main parameters determining crystallinity and morphology evolution during solidification in processing conditions. In this work, we present an approach to model flow-induced crystallization (FIC) with the aim of predicting the morphology after processing. The approach is based on: interpretation of the FIC as the effect of molecular stretch on the thermodynamic crystallization temperature; modeling the molecular stretch evolution by means of a model that is simple and easy to implement in polymer processing simulation codes; identification of the effect of flow on nucleation density and spherulite growth rate by means of simple experiments; determination of the condition under which fibers form instead of spherulites. Model predictions reproduce most of the features of final morphology observed in the samples after solidification.
Choi, Ickwon; Kattan, Michael W; Wells, Brian J; Yu, Changhong
2012-01-01
In the medical community, prognostic models that use clinicopathologic features to predict prognosis after a certain treatment have been externally validated and used in practice. In recent years, most research has focused on high-dimensional genomic data with small sample sizes. Since clinically similar but molecularly heterogeneous tumors may produce different clinical outcomes, the combination of clinical and genomic information, which may be complementary, is crucial to improve the quality of prognostic predictions. However, there is a lack of an integrating scheme for clinico-genomic models due to the P ≥ N problem, in particular for a parsimonious model. We propose a methodology to build a reduced yet accurate integrative model using a hybrid approach based on the Cox regression model, which uses several dimension reduction techniques, L₂ penalized maximum likelihood estimation (PMLE), and resampling methods to tackle the problem. The predictive accuracy of the modeling approach is assessed by several metrics via an independent and thorough scheme to compare competing methods. In breast cancer data studies on a metastasis and death event, we show that the proposed methodology can improve prediction accuracy and build a final model with a hybrid signature that is parsimonious when integrating both types of variables.
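For orientation, L₂ penalized estimation in a Cox model is usually expressed as maximizing the log partial likelihood minus a ridge term. The form below is the standard textbook expression, not one quoted from the paper, and the symbols are assumptions: δ_i is the event indicator, R(t_i) is the risk set at time t_i, x_i is the covariate vector (clinical and genomic variables), and λ ≥ 0 is the penalty weight.

```latex
\ell(\beta) = \sum_{i:\,\delta_i = 1} \Big[ x_i^{\top}\beta
  - \log \sum_{k \in R(t_i)} \exp\!\big(x_k^{\top}\beta\big) \Big],
\qquad
\ell_{\lambda}(\beta) = \ell(\beta) - \frac{\lambda}{2}\,\lVert \beta \rVert_2^{2}
```

The penalty keeps the estimate well defined when the number of predictors P is at least the number of samples N, which is the situation the abstract calls the P ≥ N problem.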
A comparison of biophysical characterization techniques in predicting monoclonal antibody stability.
Thiagarajan, Geetha; Semple, Andrew; James, Jose K; Cheung, Jason K; Shameem, Mohammed
2016-01-01
With the rapid growth of biopharmaceutical product development, knowledge of therapeutic protein stability has become increasingly important. We evaluated assays that measure solution-mediated interactions and key molecular characteristics of 9 formulated monoclonal antibody (mAb) therapeutics, to predict their stability behavior. Colloidal interactions, self-association propensity and conformational stability were measured using effective surface charge via zeta potential, diffusion interaction parameter (kD) and differential scanning calorimetry (DSC), respectively. The molecular features of all 9 mAbs were compared to their stability at accelerated (25°C and 40°C) and long-term storage conditions (2-8°C) as measured by size exclusion chromatography. At accelerated storage conditions, the majority of the mAbs in this study degraded via fragmentation rather than aggregation. Our results show that colloidal stability, self-association propensity and conformational characteristics (exposed tryptophan) provide reasonable prediction of accelerated stability, with limited predictive value at 2-8°C stability. While no correlations to stability behavior were observed with onset-of-melting temperatures or domain unfolding temperatures, by DSC, melting of the Fab domain with the CH2 domain suggests lower stability at stressed conditions. The relevance of identifying appropriate biophysical assays based on the primary degradation pathways is discussed.
Next generation diagnostic molecular pathology: critical appraisal of quality assurance in Europe.
Dubbink, Hendrikus J; Deans, Zandra C; Tops, Bastiaan B J; van Kemenade, Folkert J; Koljenović, S; van Krieken, Han J M; Blokx, Willeke A M; Dinjens, Winand N M; Groenen, Patricia J T A
2014-06-01
Tumor evaluation in pathology is more and more based on a combination of traditional histopathology and molecular analysis. Due to the rapid development of new cancer treatments that specifically target aberrant proteins present in tumor cells, treatment decisions are increasingly based on the molecular features of the tumor. Not only the number of patients eligible for targeted precision medicine, but also the number of molecular targets per patient and tumor type is rising. Diagnostic molecular pathology, the discipline that determines the molecular aberrations present in tumors for diagnostic, prognostic or predictive purposes, is faced with true challenges. The laboratories have to meet the need of comprehensive molecular testing using only limited amount of tumor tissue, mostly fixed in formalin and embedded in paraffin (FFPE), in short turnaround time. Choices must be made for analytical methods that provide accurate, reliable and cost-effective results. Validation of the test procedures and results is essential. In addition, participation and good performance in internal (IQA) and external quality assurance (EQA) schemes is mandatory. In this review, we critically evaluate the validation procedure for comprehensive molecular tests as well as the organization of quality assurance and assessment of competence of diagnostic molecular pathology laboratories within Europe. Copyright © 2014 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.
Ward, Keith W; Erhardt, Paul; Bachmann, Kenneth
2005-01-01
Previous publications from GlaxoSmithKline and University of Toledo laboratories convey our independent attempts to predict the half-lives of xenobiotics in humans using data obtained from rats. The present investigation was conducted to compare the performance of our published models against a common dataset obtained by merging the two sets of rat versus human half-life (hHL) data previously used by each laboratory. After combining data, mathematical analyses were undertaken by deploying both of our previous models, namely the use of an empirical algorithm based on a best-fit model and the use of rat-to-human liver blood flow ratios as a half-life correction factor. Both qualitative and quantitative analyses were performed, as well as evaluation of the impact of molecular properties on predictability. The merged dataset was remarkably diverse with respect to physicochemical and pharmacokinetic (PK) properties. Application of both models revealed similar predictability, depending upon the measure of stipulated accuracy. Certain molecular features, particularly rotatable bond count and pK(a), appeared to influence the accuracy of prediction. This collaborative effort has resulted in an improved understanding and appreciation of the value of rats to serve as a surrogate for the prediction of xenobiotic half-lives in humans when clinical pharmacokinetic studies are not possible or practicable.
Jiang, Tao; Li, Xuefei; Wang, Jianfei; Su, Chunxia; Han, Wenbo; Zhao, Chao; Wu, Fengying; Gao, Guanghui; Li, Wei; Chen, Xiaoxia; Li, Jiayu; Zhou, Fei; Zhao, Jing; Cai, Weijing; Zhang, Henghui; Du, Bo; Zhang, Jun; Ren, Shengxiang; Zhou, Caicun; Yu, Hui; Hirsch, Fred R.
2017-01-01
Rationale: To investigate whether the mutational landscape of circulating cell-free DNA (cfDNA) could predict and dynamically monitor the response to first-line platinum-based chemotherapy in patients with advanced non-small-cell lung cancer (NSCLC). Methods: Eligible patients were included and blood samples were collected from a phase III trial. Both cfDNA fragments and fragmented genomic DNA were extracted for enrichment in a 1.15M size panel covering exon regions of 1,086 genes. Molecular mutational burden (MMB) was calculated to investigate the relationship between molecular features of cfDNA and response to chemotherapy. Results: In total, 52 eligible cases were enrolled and their blood samples were prospectively collected at baseline, every cycle of chemotherapy and time of disease progression. At baseline, alterations of 17 genes were found. Patients with partial response (PR) had significantly lower baseline MMB of these genes than those patients with either stable disease (SD) (P = 0.0006) or progressive disease (PD) (P = 0.0074). Further analysis revealed that the mutational landscape of cfDNA from pretreatment blood samples was distinctly different among patients with PR vs. SD/PD. For patients with baseline TP53 mutation, those with PR experienced a significant reduction in MMB whereas patients with SD or PD experienced an increase after two, three or four cycles of chemotherapy. Furthermore, patients with low MMB had superior response rate and significantly longer progression-free survival than those with high MMB. Conclusion: This study indicated that the mutational landscape of cfDNA has potential clinical value to predict the therapeutic response to first-line platinum-based doublet chemotherapy in NSCLC patients. At the single gene level, dynamic change of molecular mutational burden of TP53 is valuable to monitor efficacy (and, therefore, might aid in early recognition of resistance and relapse) in patients harboring this mutation at baseline.
PMID:29187901
Dai, Hanjun; Umarov, Ramzan; Kuwahara, Hiroyuki; Li, Yu; Song, Le; Gao, Xin
2017-11-15
An accurate characterization of transcription factor (TF)-DNA affinity landscape is crucial to a quantitative understanding of the molecular mechanisms underpinning endogenous gene regulation. While recent advances in biotechnology have brought the opportunity for building binding affinity prediction methods, the accurate characterization of TF-DNA binding affinity landscape still remains a challenging problem. Here we propose a novel sequence embedding approach for modeling the transcription factor binding affinity landscape. Our method represents DNA binding sequences as a hidden Markov model which captures both position specific information and long-range dependency in the sequence. A cornerstone of our method is a novel message passing-like embedding algorithm, called Sequence2Vec, which maps these hidden Markov models into a common nonlinear feature space and uses these embedded features to build a predictive model. Our method is a novel combination of the strength of probabilistic graphical models, feature space embedding and deep learning. We conducted comprehensive experiments on over 90 large-scale TF-DNA datasets which were measured by different high-throughput experimental technologies. Sequence2Vec outperforms alternative machine learning methods as well as the state-of-the-art binding affinity prediction methods. Our program is freely available at https://github.com/ramzan1990/sequence2vec. Contact: xin.gao@kaust.edu.sa or lsong@cc.gatech.edu. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.
Hu, Wen-Qing; Fang, Min; Zhao, Hao-Liang; Yan, Shu-Guang; Yuan, Jing-Ping; Peng, Chun-Wei; Yang, Gui-Fang; Li, Yan; Li, Jian-Ding
2014-04-01
In tumor tissues, cancer cells, tumor infiltrating macrophages and tumor neo-vessels in close spatial vicinity with one another form tumor invasion unit, which is a biologically important tumor microenvironment of metastasis to facilitate cancer invasion and metastasis. Establishing an in situ molecular imaging technology to simultaneously reveal these three components is essential for the in-depth investigation of tumor invasion unit. In this report, we have developed a computer-aided algorithm by quantum dots (QDs)-based multiplexed molecular imaging technique for such purpose. A series of studies on gastric cancer tumor tissues demonstrated that the tumor invasion unit was correlated with major unfavorable pathological features and worse clinical outcomes, which illustrated the significantly negative impacts and predictive power of tumor invasion unit on patient overall survival. This study confirmed the technical advantages of QDs-based in situ and simultaneous molecular imaging of key cancer molecules to gain deeper insights into the biology of cancer invasion. Copyright © 2014 Elsevier Ltd. All rights reserved.
Nagarajan, Ramanathan
2017-06-01
Low molecular weight surfactants and high molecular weight block copolymers display analogous self-assembly behavior in solutions and at interfaces, generating nanoscale structures of different shapes. Understanding the link between the molecular structure of these amphiphiles and their self-assembly behavior has been the goal of theoretical studies. Despite the analogies between surfactants and block copolymers, models predicting their self-assembly behavior have evolved independent of one another, each overlooking the molecular feature considered critical to the other. In this review, we focus on the interplay of ideas pertaining to surfactants and block copolymers in three areas of self-assembly. First, we show how improved free energy models have evolved by applying ideas from surfactants to block copolymers and vice versa, giving rise to a unitary theoretical framework and better predictive capabilities for both classes of amphiphiles. Second we show that even though molecular packing arguments are often used to explain aggregate shape transitions resulting from self-assembly, the molecular packing considerations are more relevant in the case of surfactants whereas free energy criteria are relevant for block copolymers. Third, we show that even though the surfactant and block copolymer aggregates are small nanostructures, the size differences between them is significant enough to make the interfacial effects control the solubilization of molecules in surfactant micelles while the bulk interactions control the solubilization in block copolymer micelles. Finally, we conclude by identifying recent theoretical progress in adapting the micelle model to a wide variety of self-assembly phenomena and the challenges to modeling posed by emerging novel classes of amphiphiles with complex biological, inorganic or nanoparticle moieties. Published by Elsevier B.V.
Protein features as determinants of wild-type glycoside hydrolase thermostability.
Geertz-Hansen, Henrik Marcus; Kiemer, Lars; Nielsen, Morten; Stanchev, Kiril; Blom, Nikolaj; Brunak, Søren; Petersen, Thomas Nordahl
2017-11-01
Thermostable enzymes for conversion of lignocellulosic biomass into biofuels have significant advantages over enzymes with more moderate thermostability due to the challenging application conditions. Experimental discovery of thermostable enzymes is highly cost intensive, and the development of in-silico methods guiding the discovery process would be of high value. To develop such an in-silico method and provide the data foundation for it, we determined the melting temperatures of 602 fungal glycoside hydrolases from the families GH5, 6, 7, 10, 11, 43, and AA9 (formerly GH61). We then used sequence and homology modeled structure information of these enzymes to develop the ThermoP melting temperature prediction method. Furthermore, in the context of thermostability, we determined the relative importance of 160 molecular features, such as amino acid frequencies and spatial interactions, and exemplified their biological significance. The presented prediction method is made publicly available at http://www.cbs.dtu.dk/services/ThermoP. © 2017 Wiley Periodicals, Inc.
2014-01-01
Background: KRAS mutations in codons 12 and 13 are established predictive biomarkers for anti-EGFR therapy in colorectal cancer. Previous studies suggest that KRAS codon 61 and 146 mutations may also predict resistance to anti-EGFR therapy in colorectal cancer. However, clinicopathological, molecular, and prognostic features of colorectal carcinoma with KRAS codon 61 or 146 mutation remain unclear. Methods: We utilized a molecular pathological epidemiology database of 1267 colon and rectal cancers in the Nurses' Health Study and the Health Professionals Follow-up Study. We examined KRAS mutations in codons 12, 13, 61 and 146 (assessed by pyrosequencing), in relation to clinicopathological features, and tumor molecular markers, including BRAF and PIK3CA mutations, CpG island methylator phenotype (CIMP), LINE-1 methylation, and microsatellite instability (MSI). Survival analyses were performed in 1067 BRAF-wild-type cancers to avoid confounding by BRAF mutation. Cox proportional hazards models were used to compute mortality hazard ratio, adjusting for potential confounders, including disease stage, PIK3CA mutation, CIMP, LINE-1 hypomethylation, and MSI. Results: KRAS codon 61 mutations were detected in 19 cases (1.5%), and codon 146 mutations in 40 cases (3.2%). Overall KRAS mutation prevalence in colorectal cancers was 40% (=505/1267). Of interest, compared to KRAS-wild-type, overall, KRAS-mutated cancers more frequently exhibited cecal location (24% vs. 12% in KRAS-wild-type; P < 0.0001), CIMP-low (49% vs. 32% in KRAS-wild-type; P < 0.0001), and PIK3CA mutations (24% vs. 11% in KRAS-wild-type; P < 0.0001). These trends were evident irrespective of mutated codon, though statistical power was limited for codon 61 mutants.
Neither KRAS codon 61 nor codon 146 mutation was significantly associated with clinical outcome or prognosis in univariate or multivariate analysis [colorectal cancer-specific mortality hazard ratio (HR) = 0.81, 95% confidence interval (CI) = 0.29-2.26 for codon 61 mutation; colorectal cancer-specific mortality HR = 0.86, 95% CI = 0.42-1.78 for codon 146 mutation]. Conclusions: Tumors with KRAS mutations in codons 61 and 146 account for an appreciable proportion (approximately 5%) of colorectal cancers, and their clinicopathological and molecular features appear generally similar to KRAS codon 12 or 13 mutated cancers. To further assess clinical utility of KRAS codon 61 and 146 testing, large-scale trials are warranted. PMID:24885062
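The prevalence figures quoted in the abstract above can be re-derived from the reported counts. The following is a minimal arithmetic sketch, not part of the original study; the variable names are illustrative, and the combined figure assumes that no tumor carries both a codon 61 and a codon 146 mutation, which the abstract does not state.

```python
# Counts as reported in the abstract (1267 colorectal cancers in total).
TOTAL_CASES = 1267
counts = {
    "codon 61": 19,
    "codon 146": 40,
    "any KRAS mutation": 505,
}


def percent(count: int, total: int) -> float:
    """Return count/total expressed as a percentage."""
    return 100.0 * count / total


prevalence = {name: percent(n, TOTAL_CASES) for name, n in counts.items()}

# Assumption: codon 61 and codon 146 mutations do not co-occur.
combined_61_146 = percent(counts["codon 61"] + counts["codon 146"], TOTAL_CASES)

for name, value in prevalence.items():
    print(f"{name}: {value:.1f}%")
print(f"codon 61 or 146 (assuming no overlap): {combined_61_146:.1f}%")
```

Under that no-overlap assumption the combined share comes out slightly below 5%, consistent with the "approximately 5%" wording in the conclusions.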
Papagiorgis, Petros Christakis
2016-05-01
Proximal and distal colorectal cancers (CRCs) are regarded as distinct disease entities, evolving through different genetic pathways and showing multiple clinicopathological and molecular differences. Segmental distribution of some common markers (e.g., KRAS, EGFR, Ki-67, Bcl-2, COX-2) is clinically important, potentially affecting their prognostic or predictive value. However, this distribution is influenced by a variety of factors such as the anatomical overlap of tumorigenic molecular events, associations of some markers with other clinicopathological features (stage and/or grade), and wide methodological variability in markers' assessment. All these factors represent principal influences followed by intratumoral heterogeneity and geographic variation in the frequency of detection of particular markers, whereas the role of other potential influences (e.g., pre-adjuvant treatment, interaction between markers) remains rather unclear. Better understanding and elucidation of the various influences may provide a more accurate picture of the segmental distribution of molecular markers in CRC, potentially allowing the application of a novel patient stratification for treatment, based on particular molecular profiles in combination with tumor location.
Predicting ecological roles in the rhizosphere using metabolome and transportome modeling
Larsen, Peter E.; Collart, Frank R.; Dai, Yang; ...
2015-09-02
The ability to obtain complete genome sequences from bacteria in environmental samples, such as soil samples from the rhizosphere, has highlighted the microbial diversity and complexity of environmental communities. New algorithms to analyze genome sequence information in the context of community structure are needed to enhance our understanding of the specific ecological roles of these organisms in soil environments. We present a machine learning approach using sequenced Pseudomonad genomes coupled with outputs of metabolic and transportomic computational models for identifying the most predictive molecular mechanisms indicative of a Pseudomonad’s ecological role in the rhizosphere: a biofilm, biocontrol agent, promoter of plant growth, or plant pathogen. Computational predictions of ecological niche were highly accurate overall with models trained on transportomic model output being the most accurate (Leave One Out Validation F-scores between 0.82 and 0.89). The strongest predictive molecular mechanism features for rhizosphere ecological niche overlap with many previously reported analyses of Pseudomonad interactions in the rhizosphere, suggesting that this approach successfully informs a system-scale level understanding of how Pseudomonads sense and interact with their environments. The observation that an organism’s transportome is highly predictive of its ecological niche is a novel discovery and may have implications for our understanding of microbial ecology. The framework developed here can be generalized to the analysis of any bacteria across a wide range of environments and ecological niches, making this approach a powerful tool for providing insights into functional predictions from bacterial genomic data.
Predicting Ecological Roles in the Rhizosphere Using Metabolome and Transportome Modeling
DOE Office of Scientific and Technical Information (OSTI.GOV)
Larsen, Peter E.; Collart, Frank R.; Dai, Yang
2015-09-02
The ability to obtain complete genome sequences from bacteria in environmental samples, such as soil samples from the rhizosphere, has highlighted the microbial diversity and complexity of environmental communities. However, new algorithms to analyze genome sequence information in the context of community structure are needed to enhance our understanding of the specific ecological roles of these organisms in soil environments. We present a machine learning approach using sequenced Pseudomonad genomes coupled with outputs of metabolic and transportomic computational models for identifying the most predictive molecular mechanisms indicative of a Pseudomonad's ecological role in the rhizosphere: a biofilm, biocontrol agent, promoter of plant growth, or plant pathogen. Computational predictions of ecological niche were highly accurate overall with models trained on transportomic model output being the most accurate (Leave One Out Validation F-scores between 0.82 and 0.89). The strongest predictive molecular mechanism features for rhizosphere ecological niche overlap with many previously reported analyses of Pseudomonad interactions in the rhizosphere, suggesting that this approach successfully informs a system-scale level understanding of how Pseudomonads sense and interact with their environments. The observation that an organism's transportome is highly predictive of its ecological niche is a novel discovery and may have implications for our understanding of microbial ecology. The framework developed here can be generalized to the analysis of any bacteria across a wide range of environments and ecological niches, making this approach a powerful tool for providing insights into functional predictions from bacterial genomic data.
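The abstract above reports leave-one-out validation F-scores. As a rough illustration of what that evaluation involves, the sketch below holds out each sample in turn, trains on the rest, and computes an F-score from the pooled predictions. The nearest-centroid classifier, the two-feature toy data, and all function names are assumptions made for this example; the abstract does not say which learner or features the authors used.

```python
from collections import defaultdict


def nearest_centroid_predict(train_x, train_y, query):
    """Predict the label whose class centroid lies closest to `query`."""
    sums = {}
    counts = defaultdict(int)
    for x, y in zip(train_x, train_y):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    best_label, best_dist = None, float("inf")
    for label, total in sums.items():
        centroid = [s / counts[label] for s in total]
        dist = sum((c - q) ** 2 for c, q in zip(centroid, query))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_predictions(xs, ys):
    """Hold out each sample once, train on the remainder, predict the held-out one."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        preds.append(nearest_centroid_predict(train_x, train_y, xs[i]))
    return preds


def f_score(y_true, y_pred, positive):
    """F1 score: harmonic mean of precision and recall for one class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: two made-up features per genome, two niche labels.
xs = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.2, 0.8], [0.1, 0.9], [0.3, 0.6]]
ys = ["pathogen"] * 3 + ["biocontrol"] * 3

loo_preds = leave_one_out_predictions(xs, ys)
loo_f = f_score(ys, loo_preds, positive="pathogen")
print(loo_preds, loo_f)
```

Leave-one-out is a common choice when, as with sequenced genomes of a single genus, the number of labeled samples is small and a separate test set would be too costly.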
NASA Astrophysics Data System (ADS)
Guha, Rajarshi; Schürer, Stephan C.
2008-06-01
Computational toxicology is emerging as an encouraging alternative to experimental testing. The Molecular Libraries Screening Center Network (MLSCN) as part of the NIH Molecular Libraries Roadmap has recently started generating large and diverse screening datasets, which are publicly available in PubChem. In this report, we investigate various aspects of developing computational models to predict cell toxicity based on cell proliferation screening data generated in the MLSCN. By capturing feature-based information in those datasets, such predictive models would be useful in evaluating cell-based screening results in general (for example from reporter assays) and could be used as an aid to identify and eliminate potentially undesired compounds. Specifically we present the results of random forest ensemble models developed using different cell proliferation datasets and highlight protocols to take into account their extremely imbalanced nature. Depending on the nature of the datasets and the descriptors employed we were able to achieve percentage correct classification rates between 70% and 85% on the prediction set, though the accuracy rate dropped significantly when the models were applied to in vivo data. In this context we also compare the MLSCN cell proliferation results with animal acute toxicity data to investigate to what extent animal toxicity can be correlated and potentially predicted by proliferation results. Finally, we present a visualization technique that allows one to compare a new dataset to the training set of the models to decide whether the new dataset may be reliably predicted.
van Rossum, Peter S N; Fried, David V; Zhang, Lifei; Hofstetter, Wayne L; van Vulpen, Marco; Meijer, Gert J; Court, Laurence E; Lin, Steven H
2016-05-01
A reliable prediction of a pathologic complete response (pathCR) to chemoradiotherapy before surgery for esophageal cancer would enable investigators to study the feasibility and outcome of an organ-preserving strategy after chemoradiotherapy. So far no clinical parameters or diagnostic studies are able to accurately predict which patients will achieve a pathCR. The aim of this study was to determine whether subjective and quantitative assessment of baseline and postchemoradiation 18F-FDG PET can improve the accuracy of predicting pathCR to preoperative chemoradiotherapy in esophageal cancer beyond clinical predictors. This retrospective study was approved by the institutional review board, and the need for written informed consent was waived. Clinical parameters along with subjective and quantitative parameters from baseline and postchemoradiation 18F-FDG PET were derived from 217 esophageal adenocarcinoma patients who underwent chemoradiotherapy followed by surgery. The associations between these parameters and pathCR were studied in univariable and multivariable logistic regression analysis. Four prediction models were constructed and internally validated using bootstrapping to study the incremental predictive values of subjective assessment of 18F-FDG PET, conventional quantitative metabolic features, and comprehensive 18F-FDG PET texture/geometry features, respectively. The clinical benefit of 18F-FDG PET was determined using decision-curve analysis. A pathCR was found in 59 (27%) patients. A clinical prediction model (corrected c-index, 0.67) was improved by adding 18F-FDG PET-based subjective assessment of response (corrected c-index, 0.72). This latter model was slightly improved by the addition of 1 conventional quantitative metabolic feature only (i.e., postchemoradiation total lesion glycolysis; corrected c-index, 0.73), and even more by subsequently adding 4 comprehensive 18F-FDG PET texture/geometry features (corrected c-index, 0.77).
However, at a decision threshold of 0.9 or higher, representing a clinically relevant predictive value for pathCR at which one may be willing to omit surgery, there was no clear incremental value. Subjective and quantitative assessment of 18F-FDG PET provides statistical incremental value for predicting pathCR after preoperative chemoradiotherapy in esophageal cancer. However, the discriminatory improvement beyond clinical predictors does not translate into a clinically relevant benefit that could change decision making. © 2016 by the Society of Nuclear Medicine and Molecular Imaging, Inc.
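The abstract above contrasts discrimination (c-index) with clinical benefit at a decision threshold of 0.9. A minimal sketch of the standard net-benefit quantity used in decision-curve analysis is given below: net benefit = TP/n - (FP/n) * pt / (1 - pt), where pt is the threshold probability. The toy outcomes and predicted risks are invented for illustration and are not study data; the function name is an assumption.

```python
def net_benefit(y_true, risk, threshold):
    """Net benefit of acting on every patient whose predicted risk >= threshold.

    y_true: 1 if the event (here, hypothetically, pathCR) occurred, else 0.
    risk: predicted probabilities of the event.
    threshold: decision threshold probability, strictly between 0 and 1.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 1)
    fp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Hypothetical toy cohort of 10 patients (not taken from the study).
outcomes = [1, 1, 0, 0, 0, 1, 0, 0, 0, 0]
predicted = [0.95, 0.6, 0.92, 0.1, 0.2, 0.4, 0.3, 0.05, 0.5, 0.15]

nb_at_half = net_benefit(outcomes, predicted, 0.5)
nb_at_high = net_benefit(outcomes, predicted, 0.9)
print(nb_at_half, nb_at_high)
```

Because the false-positive weight pt / (1 - pt) grows to 9 at pt = 0.9, even a single false positive above the threshold can cancel several true positives, which is one way a model can gain in c-index without gaining in net benefit at high thresholds.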
Realistic molecular model of kerogen's nanostructure
NASA Astrophysics Data System (ADS)
Bousige, Colin; Ghimbeu, Camélia Matei; Vix-Guterl, Cathie; Pomerantz, Andrew E.; Suleimenova, Assiya; Vaughan, Gavin; Garbarino, Gaston; Feygenson, Mikhail; Wildgruber, Christoph; Ulm, Franz-Josef; Pellenq, Roland J.-M.; Coasne, Benoit
2016-05-01
Despite kerogen's importance as the organic backbone for hydrocarbon production from source rocks such as gas shale, the interplay between kerogen's chemistry, morphology and mechanics remains unexplored. As the environmental impact of shale gas rises, identifying functional relations between its geochemical, transport, elastic and fracture properties from realistic molecular models of kerogens becomes all the more important. Here, by using a hybrid experimental-simulation method, we propose a panel of realistic molecular models of mature and immature kerogens that provide a detailed picture of kerogen's nanostructure without considering the presence of clays and other minerals in shales. We probe the models' strengths and limitations, and show that they predict essential features amenable to experimental validation, including pore distribution, vibrational density of states and stiffness. We also show that kerogen's maturation, which manifests itself as an increase in the sp2/sp3 hybridization ratio, entails a crossover from plastic-to-brittle rupture mechanisms.
Molecular Origin of the Vibrational Structure of Ice Ih.
Moberg, Daniel R; Straight, Shelby C; Knight, Christopher; Paesani, Francesco
2017-06-15
An unambiguous assignment of the vibrational spectra of ice Ih remains a matter of debate. This study demonstrates that an accurate representation of many-body interactions between water molecules, combined with an explicit treatment of nuclear quantum effects through many-body molecular dynamics (MB-MD), leads to a unified interpretation of the vibrational spectra of ice Ih in terms of the structure and dynamics of the underlying hydrogen-bond network. All features of the infrared and Raman spectra in the OH stretching region can be unambiguously assigned by taking into account both the symmetry and the delocalized nature of the lattice vibrations as well as the local electrostatic environment experienced by each water molecule within the crystal. The high level of agreement with experiment raises prospects for predictive MB-MD simulations that, complementing analogous measurements, will provide molecular-level insights into fundamental processes taking place in bulk ice and on ice surfaces under different thermodynamic conditions.
The interstellar N2 abundance towards HD 124314 from far-ultraviolet observations.
Knauth, David C; Andersson, B-G; McCandliss, Stephan R; Moos, H Warren
2004-06-10
The abundance of interstellar molecular nitrogen (N2) is of considerable importance: models of steady-state gas-phase interstellar chemistry, together with millimetre-wavelength observations of interstellar N2H+ in dense molecular clouds predict that N2 should be the most abundant nitrogen-bearing molecule in the interstellar medium. Previous attempts to detect N2 absorption in the far-ultraviolet or infrared (ice features) have hitherto been unsuccessful. Here we report the detection of interstellar N2 at far-ultraviolet wavelengths towards the moderately reddened star HD 124314 in the constellation of Centaurus. The N2 column density is larger than expected from models of diffuse clouds and significantly smaller than expected for dense molecular clouds. Moreover, the N2 abundance does not explain the observed variations in the abundance of atomic nitrogen (N I) towards high-column-density sightlines, implying that the models of nitrogen chemistry in the interstellar medium are incomplete.
Realistic molecular model of kerogen's nanostructure.
Bousige, Colin; Ghimbeu, Camélia Matei; Vix-Guterl, Cathie; Pomerantz, Andrew E; Suleimenova, Assiya; Vaughan, Gavin; Garbarino, Gaston; Feygenson, Mikhail; Wildgruber, Christoph; Ulm, Franz-Josef; Pellenq, Roland J-M; Coasne, Benoit
2016-05-01
Despite kerogen's importance as the organic backbone for hydrocarbon production from source rocks such as gas shale, the interplay between kerogen's chemistry, morphology and mechanics remains unexplored. As the environmental impact of shale gas rises, identifying functional relations between its geochemical, transport, elastic and fracture properties from realistic molecular models of kerogens becomes all the more important. Here, by using a hybrid experimental-simulation method, we propose a panel of realistic molecular models of mature and immature kerogens that provide a detailed picture of kerogen's nanostructure without considering the presence of clays and other minerals in shales. We probe the models' strengths and limitations, and show that they predict essential features amenable to experimental validation, including pore distribution, vibrational density of states and stiffness. We also show that kerogen's maturation, which manifests itself as an increase in the sp2/sp3 hybridization ratio, entails a crossover from plastic-to-brittle rupture mechanisms.
NCCN Guidelines Insights: Central Nervous System Cancers, Version 1.2017.
Nabors, Louis Burt; Portnow, Jana; Ammirati, Mario; Baehring, Joachim; Brem, Henry; Butowski, Nicholas; Fenstermaker, Robert A; Forsyth, Peter; Hattangadi-Gluth, Jona; Holdhoff, Matthias; Howard, Steven; Junck, Larry; Kaley, Thomas; Kumthekar, Priya; Loeffler, Jay S; Moots, Paul L; Mrugala, Maciej M; Nagpal, Seema; Pandey, Manjari; Parney, Ian; Peters, Katherine; Puduvalli, Vinay K; Ragsdale, John; Rockhill, Jason; Rogers, Lisa; Rusthoven, Chad; Shonka, Nicole; Shrieve, Dennis C; Sills, Allen K; Swinnen, Lode J; Tsien, Christina; Weiss, Stephanie; Wen, Patrick Yung; Willmarth, Nicole; Bergman, Mary Anne; Engh, Anita
2017-11-01
For many years, the diagnosis and classification of gliomas have been based on histology. Although studies including large populations of patients demonstrated the prognostic value of histologic phenotype, variability in outcomes within histologic groups limited the utility of this system. Nonetheless, histology was the only proven and widely accessible tool available at the time, thus it was used for clinical trial entry criteria, and therefore determined the recommended treatment options. Research to identify molecular changes that underlie glioma progression has led to the discovery of molecular features that have greater diagnostic and prognostic value than histology. Analyses of these molecular markers across populations from randomized clinical trials have shown that some of these markers are also predictive of response to specific types of treatment, which has prompted significant changes to the recommended treatment options for grade III (anaplastic) gliomas. Copyright © 2017 by the National Comprehensive Cancer Network.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kim, Jehoon; Wu, Jianzhong, E-mail: jwu@engr.ucr.edu
Self-assembly of capsid proteins and genome encapsidation are two critical steps in the life cycle of most plant and animal viruses. A theoretical description of such processes from a physicochemical perspective may help better understand viral replication and morphogenesis and thus provide fresh insights into the experimental studies of antiviral strategies. In this work, we propose a molecular thermodynamic model for predicting the stability of Hepatitis B virus (HBV) capsids either with or without loading nucleic materials. With the key components represented by coarse-grained thermodynamic models, the theoretical predictions are in excellent agreement with experimental data for the formation free energies of empty T4 capsids over a broad range of temperature and ion concentrations. The theoretical model predicts T3/T4 dimorphism also in good agreement with the capsid formation at in vivo and in vitro conditions. In addition, we have studied the stability of the viral particles in response to physiological cellular conditions with the explicit consideration of the hydrophobic association of capsid subunits, electrostatic interactions, molecular excluded volume effects, entropy of mixing, and conformational changes of the biomolecular species. The coarse-grained model captures the essential features of the HBV nucleocapsid stability revealed by recent experiments.
Three-Dimensional Molecular Modeling of a Diverse Range of SC Clan Serine Proteases
Laskar, Aparna; Chatterjee, Aniruddha; Chatterjee, Somnath; Rodger, Euan J.
2012-01-01
Serine proteases are involved in a variety of biological processes and are classified into clans sharing structural homology. Although various three-dimensional structures of SC clan proteases have been experimentally determined, they are mostly bacterial and animal proteases, with some from archaea, plants, and fungi, and as yet no structures have been determined for protozoa. To bridge this gap, we have used molecular modeling techniques to investigate the structural properties of different SC clan serine proteases from a diverse range of taxa. Either SWISS-MODEL was used for homology-based structure prediction or the LOOPP server was used for threading-based structure prediction. The predicted models were refined using Insight II and SCWRL and validated against experimental structures. Investigation of secondary structures and electrostatic surface potential was performed using MOLMOL. The structural geometry of the catalytic core shows clear deviations between taxa, but the relative positions of the catalytic triad residues were conserved. Evolutionary divergence was also exhibited by large variation in secondary structure features outside the core, differences in overall amino acid distribution, and unique surface electrostatic potential patterns between species. Encompassing a wide range of taxa, our structural analysis provides an evolutionary perspective on SC clan serine proteases. PMID:23213528
Fluids density functional theory and initializing molecular dynamics simulations of block copolymers
NASA Astrophysics Data System (ADS)
Brown, Jonathan R.; Seo, Youngmi; Maula, Tiara Ann D.; Hall, Lisa M.
2016-03-01
Classical, fluids density functional theory (fDFT), which can predict the equilibrium density profiles of polymeric systems, and coarse-grained molecular dynamics (MD) simulations, which are often used to show both structure and dynamics of soft materials, can be implemented using very similar bead-based polymer models. We aim to use fDFT and MD in tandem to examine the same system from these two points of view and take advantage of the different features of each methodology. Additionally, the density profiles resulting from fDFT calculations can be used to initialize the MD simulations in a close to equilibrated structure, speeding up the simulations. Here, we show how this method can be applied to study microphase separated states of both typical diblock and tapered diblock copolymers in which there is a region with a gradient in composition placed between the pure blocks. Both methods, applied at constant pressure, predict a decrease in total density as segregation strength or the length of the tapered region is increased. The predictions for the density profiles from fDFT and MD are similar across materials with a wide range of interfacial widths.
Kim, Dokyoon; Joung, Je-Gun; Sohn, Kyung-Ah; Shin, Hyunjung; Park, Yu Rang; Ritchie, Marylyn D; Kim, Ju Han
2015-01-01
Objective: Cancer can involve gene dysregulation via multiple mechanisms, so no single level of genomic data fully elucidates tumor behavior due to the presence of numerous genomic variations within or between levels in a biological system. We have previously proposed a graph-based integration approach that combines multi-omics data including copy number alteration, methylation, miRNA, and gene expression data for predicting clinical outcome in cancer. However, genomic features likely interact with other genomic features in complex signaling or regulatory networks, since cancer is caused by alterations in pathways or complete processes. Methods: Here we propose a new graph-based framework for integrating multi-omics data and genomic knowledge to improve power in predicting clinical outcomes and elucidate interplay between different levels. To highlight the validity of our proposed framework, we used an ovarian cancer dataset from The Cancer Genome Atlas for predicting stage, grade, and survival outcomes. Results: Integrating multi-omics data with genomic knowledge to construct pre-defined features resulted in higher performance in clinical outcome prediction and higher stability. For the grade outcome, the model with gene expression data produced an area under the receiver operating characteristic curve (AUC) of 0.7866. However, models of the integration with pathway, Gene Ontology, chromosomal gene set, and motif gene set consistently outperformed the model with genomic data only, attaining AUCs of 0.7873, 0.8433, 0.8254, and 0.8179, respectively. Conclusions: Integrating multi-omics data and genomic knowledge to improve understanding of molecular pathogenesis and underlying biology in cancer should improve diagnostic and prognostic indicators and the effectiveness of therapies. PMID:25002459
Kim, Dokyoon; Joung, Je-Gun; Sohn, Kyung-Ah; Shin, Hyunjung; Park, Yu Rang; Ritchie, Marylyn D; Kim, Ju Han
2015-01-01
Cancer can involve gene dysregulation via multiple mechanisms, so no single level of genomic data fully elucidates tumor behavior due to the presence of numerous genomic variations within or between levels in a biological system. We have previously proposed a graph-based integration approach that combines multi-omics data including copy number alteration, methylation, miRNA, and gene expression data for predicting clinical outcome in cancer. However, genomic features likely interact with other genomic features in complex signaling or regulatory networks, since cancer is caused by alterations in pathways or complete processes. Here we propose a new graph-based framework for integrating multi-omics data and genomic knowledge to improve power in predicting clinical outcomes and elucidate interplay between different levels. To highlight the validity of our proposed framework, we used an ovarian cancer dataset from The Cancer Genome Atlas for predicting stage, grade, and survival outcomes. Integrating multi-omics data with genomic knowledge to construct pre-defined features resulted in higher performance in clinical outcome prediction and higher stability. For the grade outcome, the model with gene expression data produced an area under the receiver operating characteristic curve (AUC) of 0.7866. However, models of the integration with pathway, Gene Ontology, chromosomal gene set, and motif gene set consistently outperformed the model with genomic data only, attaining AUCs of 0.7873, 0.8433, 0.8254, and 0.8179, respectively. Integrating multi-omics data and genomic knowledge to improve understanding of molecular pathogenesis and underlying biology in cancer should improve diagnostic and prognostic indicators and the effectiveness of therapies. © The Author 2014. Published by Oxford University Press on behalf of the American Medical Informatics Association.
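The models in the abstract above are compared by area under the ROC curve (AUC). As a reminder of what that number measures, the sketch below computes AUC as the fraction of (positive, negative) pairs in which the positive sample receives the higher score, with ties counted as one half. The labels and scores are toy values and the function name is illustrative; this is not the authors' evaluation code.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: 3 positives, 3 negatives.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
print(round(toy_auc, 4))
```

The pairwise form is quadratic in the sample count, which is acceptable for a sketch; rank-based formulas give the same value more efficiently on large datasets.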
Dolezal, Rafael; Korabecny, Jan; Malinak, David; Honegr, Jan; Musilek, Kamil; Kuca, Kamil
2015-03-01
To predict unknown reactivation potencies of 12 mono- and bis-pyridinium aldoximes for VX-inhibited rat acetylcholinesterase (rAChE), three-dimensional quantitative structure-activity relationship (3D QSAR) analysis has been carried out. Utilizing molecular interaction fields (MIFs) calculated by molecular mechanical (MMFF94) and quantum chemical (B3LYP/6-31G*) methods, two satisfactory ligand-based CoMFA models have been developed: 1. R^2 = 0.9989, Q^2(LOO) = 0.9090, Q^2(LTO) = 0.8921, Q^2(LMO, 20%) = 0.8853, R^2(ext) = 0.9259, SDEP(ext) = 6.8938; 2. R^2 = 0.9962, Q^2(LOO) = 0.9368, Q^2(LTO) = 0.9298, Q^2(LMO, 20%) = 0.9248, R^2(ext) = 0.8905, SDEP(ext) = 6.6756. High statistical significance of the 3D QSAR models has been achieved through the application of several data noise reduction techniques (i.e. smart region definition SRD, fractional factor design FFD, uninformative/iterative variable elimination UVE/IVE) on the original MIFs. Besides the ligand-based CoMFA models, an alignment molecular set constructed by flexible molecular docking has also been studied. The contour maps as well as the predicted reactivation potencies resulting from 3D QSAR analyses help better understand which structural features are associated with increased reactivation potency of studied compounds. Copyright © 2014 Elsevier Inc. All rights reserved.
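The abstract above reports both a fitted R^2 and a leave-one-out cross-validated Q^2. The sketch below illustrates the difference on a one-descriptor least-squares line, using the common convention Q^2 = 1 - PRESS / SS_tot with SS_tot taken about the overall mean of the responses. The single-descriptor model, the toy numbers, and the function names are assumptions for illustration; the CoMFA models in the abstract use many field descriptors and partial least squares, which this sketch does not reproduce.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r2(xs, ys):
    """Fitted R^2: 1 - (residual sum of squares) / (total sum of squares)."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    rss = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - rss / ss_tot


def q2_loo(xs, ys):
    """Leave-one-out Q^2: each point is predicted by a model fitted without it."""
    mean_y = sum(ys) / len(ys)
    press = 0.0  # predictive residual sum of squares
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot


# Hypothetical toy data: one descriptor value and one activity per compound.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
activity = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]

fitted_r2 = r2(descriptor, activity)
cross_validated_q2 = q2_loo(descriptor, activity)
print(fitted_r2, cross_validated_q2)
```

For least-squares models evaluated this way, Q^2 cannot exceed R^2, so a large gap between the two is usually read as a sign of overfitting.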
Pietsch, Torsten; Schmidt, Rene; Remke, Marc; Korshunov, Andrey; Hovestadt, Volker; Jones, David TW; Felsberg, Jörg; Kaulich, Kerstin; Goschzik, Tobias; Kool, Marcel; Northcott, Paul A.; von Hoff, Katja; von Bueren, André O.; Friedrich, Carsten; Skladny, Heyko; Fleischhack, Gudrun; Taylor, Michael D.; Cremer, Friedrich; Lichter, Peter; Faldum, Andreas; Reifenberger, Guido; Rutkowski, Stefan; Pfister, Stefan M.
2014-01-01
BACKGROUND: This study aimed to prospectively evaluate clinical, histopathological and molecular variables for outcome prediction in medulloblastoma patients. METHODS: Patients from the HIT2000 cooperative clinical trial were prospectively enrolled based on the availability of sufficient tumor material and complete clinical information. This revealed a cohort of 184 patients (median age 7.6 years), which was randomly split at a 2:1 ratio into a training (n = 127), and a validation (n = 57) dataset. All samples were subjected to thorough histopathological investigation, CTNNB1 mutation analysis, quantitative PCR, MLPA and FISH analyses for cytogenetic variables, and methylome analysis. RESULTS: By univariable analysis, clinical factors (M-stage), histopathological variables (large cell component, endothelial proliferation, synaptophysin pattern), and molecular features (chromosome 6q status, MYC amplification, TOP2A copy-number, subgrouping) were found to be prognostic. Molecular consensus subgrouping (WNT, SHH, Group 3, Group 4) was validated as an independent feature to stratify patients into different risk groups. When comparing methods for the identification of WNT-driven medulloblastoma, this study identified CTNNB1 sequencing and methylation profiling to most reliably identify these patients. After removing patients with particularly favorable (CTNNB1 mutation, extensive nodularity) or unfavorable (MYC amplification) markers, a risk score for the remaining “intermediate molecular risk” population dependent on age, M-stage, pattern of synaptophysin expression, and MYCN copy-number status was identified and validated, with speckled synaptophysin expression indicating worse outcome. CONCLUSIONS: Methylation subgrouping and CTNNB1 mutation status represent robust tools for the risk-stratification of medulloblastoma. A simple clinico-pathological risk score for “intermediate molecular risk” patients was identified, which deserves further validation. 
SECONDARY CATEGORY: Pediatrics.
Molecular properties of food allergens.
Breiteneder, Heimo; Mills, E N Clare
2005-01-01
Plant food allergens belong to a rather limited number of protein families and are also characterized by a number of biochemical and physicochemical properties, many of which are also shared by food allergens of animal origin. These include thermal stability and resistance to proteolysis, which are enhanced by an ability to bind ligands, such as metal ions, lipids, or steroids. Other types of lipid interaction, including membranes or other lipid structures, represent another feature that might promote the allergenic properties of certain food proteins. A structural feature clearly related to stability is intramolecular disulfide bonds alongside posttranslational modifications, such as N-glycosylation. Some plant food allergens, such as the cereal seed storage prolamins, are rheomorphic proteins with polypeptide chains that adopt an ensemble of secondary structures resembling unfolded or partially folded proteins. Other plant food allergens are characterized by the presence of repetitive structures, the ability to form oligomers, and the tendency to aggregate. A summary of our current knowledge regarding the molecular properties of food allergens is presented. Although we cannot as yet predict the allergenicity of a given food protein, understanding of the molecular properties that might predispose them to becoming allergens is an important first step and will undoubtedly contribute to the integrative allergenic risk assessment process being adopted by regulators.
Predictive Toxicology and Computer Simulation of Male ...
The reproductive tract is a complex, integrated organ system with diverse embryology and unique sensitivity to prenatal environmental exposures that disrupt morphoregulatory processes and endocrine signaling. U.S. EPA’s in vitro high-throughput screening (HTS) database (ToxCastDB) was used to profile the bioactivity of 54 chemicals with male developmental consequences across ~800 molecular and cellular features. The in vitro bioactivity on molecular targets could be condensed into 156 gene annotations in a bipartite network. These results highlighted the role of estrogen and androgen signaling pathways in male reproductive tract development, and importantly, broadened the list of molecular targets to include GPCRs, cytochrome-P450s, vascular remodeling proteins, and retinoic acid signaling. A multicellular agent-based model was used to simulate the complex interactions between morphoregulatory, endocrine, and environmental influences during genital tubercle (GT) development. Spatially dynamic signals (e.g., SHH, FGF10, and androgen) were implemented in the model to address differential adhesion, cell motility, proliferation, and apoptosis. Under control of androgen signaling, urethral tube closure was an emergent feature of the model that was linked to gender-specific rates of ventral mesenchymal proliferation and urethral plate endodermal apoptosis. A systemic parameter sweep was used to examine the sensitivity of crosstalk between genetic deficiency and envi
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kremer, Manuel; Fischer, Bettina; Feuerstein, Bernold
2009-11-20
Fully differential data for H₂ dissociation in ultrashort (6 fs, 760 nm), linearly polarized, intense (0.44 PW/cm²) laser pulses with a stabilized carrier-envelope phase (CEP) were recorded with a reaction microscope. Depending on the CEP, the molecular orientation, and the kinetic energy release (KER), we find asymmetric proton emission at low KERs (0-3 eV), basically predicted by Roudnev and Esry, and much stronger than reported by Kling et al. Wave packet propagation calculations reproduce the salient features and, together with the observed KER-independent electron asymmetry, rule out the first ionization step as the reason for the asymmetric proton emission.
Ryan, J E; Warrier, S K; Lynch, A C; Ramsay, R G; Phillips, W A; Heriot, A G
2016-03-01
Approximately 20% of patients treated with neoadjuvant chemoradiotherapy (nCRT) for locally advanced rectal cancer achieve a pathological complete response (pCR) while the remainder derive the benefit of improved local control and downstaging and a small proportion show a minimal response. The ability to predict which patients will benefit would allow for improved patient stratification directing therapy to those who are likely to achieve a good response, thereby avoiding ineffective treatment in those unlikely to benefit. A systematic review of the English language literature was conducted to identify pathological factors, imaging modalities and molecular factors that predict pCR following chemoradiotherapy. PubMed, MEDLINE and Cochrane Database searches were conducted with the following keywords and MeSH search terms: 'rectal neoplasm', 'response', 'neoadjuvant', 'preoperative chemoradiation', 'tumor response'. After review of title and abstracts, 85 articles addressing the prediction of pCR were selected. Clear methods to predict pCR before chemoradiotherapy have not been defined. Clinical and radiological features of the primary cancer have limited ability to predict response. Molecular profiling holds the greatest potential to predict pCR but adoption of this technology will require greater concordance between cohorts for the biomarkers currently under investigation. At present no robust markers of the prediction of pCR have been identified and the topic remains an area for future research. This review critically evaluates existing literature providing an overview of the methods currently available to predict pCR to nCRT for locally advanced rectal cancer. The review also provides a comprehensive comparison of the accuracy of each modality. Colorectal Disease © 2015 The Association of Coloproctology of Great Britain and Ireland.
Vyas, V K; Gupta, N; Ghate, M; Patel, S
2014-01-01
In this study we designed novel substituted benzimidazole derivatives and predicted their absorption, distribution, metabolism, excretion and toxicity (ADMET) properties, based on a predictive 3D QSAR study on 132 substituted benzimidazoles as AngII-AT1 receptor antagonists. The two best predicted compounds were synthesized and evaluated for AngII-AT1 receptor antagonism. Three different alignment tools for comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) were used. The best 3D QSAR models were obtained using the rigid body (Distill) alignment method. CoMFA and CoMSIA models were found to be statistically significant with leave-one-out correlation coefficients (q²) of 0.630 and 0.623, respectively, cross-validated coefficients (r²cv) of 0.651 and 0.630, respectively, and conventional coefficients of determination (r²) of 0.848 and 0.843, respectively. 3D QSAR models were validated using a test set of 24 compounds, giving satisfactory predicted results (r²pred) of 0.727 and 0.689 for the CoMFA and CoMSIA models, respectively. We have identified some key features in substituted benzimidazole derivatives, such as lipophilicity and H-bonding at the 2- and 5-positions of the benzimidazole nucleus, respectively, for AT1 receptor antagonistic activity. We designed 20 novel substituted benzimidazole derivatives and predicted their activity. In silico ADMET properties were also predicted for these designed molecules. Finally, the compounds with best predicted activity were synthesized and evaluated for in vitro angiotensin II-AT1 receptor antagonism.
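The leave-one-out cross-validated coefficient reported above (often written q²) is conventionally computed as 1 − PRESS/SS, where PRESS sums the squared errors of predictions made for each compound by a model fitted without it. The sketch below illustrates that calculation only. It assumes a single-descriptor least-squares model and invented data, whereas the study used CoMFA/CoMSIA field models; the function names are hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x for a single descriptor."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def loo_q2(xs, ys):
    """Leave-one-out q^2 = 1 - PRESS / SS_tot (SS_tot about the overall mean)."""
    press = 0.0
    for i in range(len(xs)):
        # Refit the model with compound i held out, then predict compound i.
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    y_mean = mean(ys)
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    return 1.0 - press / ss_tot

# Invented descriptor values and activities, for illustration only.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
activity = [5.1, 5.9, 7.2, 7.8, 9.1, 9.9]
q2 = loo_q2(descriptor, activity)
```

Some authors take the total sum of squares about the training-set mean of each fold instead of the overall mean, which changes the value slightly.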
Cario, Gunnar; Stanulla, Martin; Fine, Bernard M; Teuffel, Oliver; Neuhoff, Nils V; Schrauder, André; Flohr, Thomas; Schäfer, Beat W; Bartram, Claus R; Welte, Karl; Schlegelberger, Brigitte; Schrappe, Martin
2005-01-15
Treatment resistance, as indicated by the presence of high levels of minimal residual disease (MRD) after induction therapy and induction consolidation, is associated with a poor prognosis in childhood acute lymphoblastic leukemia (ALL). We hypothesized that treatment resistance is an intrinsic feature of ALL cells reflected in the gene expression pattern and that resistance to chemotherapy can be predicted before treatment. To test these hypotheses, gene expression signatures of ALL samples with high MRD load were compared with those of samples without measurable MRD during treatment. We identified 54 genes that clearly distinguished resistant from sensitive ALL samples. Genes with low expression in resistant samples were predominantly associated with cell-cycle progression and apoptosis, suggesting that impaired cell proliferation and apoptosis are involved in treatment resistance. Prediction analysis using randomly selected samples as a training set and the remaining samples as a test set revealed an accuracy of 84%. We conclude that resistance to chemotherapy seems at least in part to be an intrinsic feature of ALL cells. Because treatment response could be predicted with high accuracy, gene expression profiling could become a clinically relevant tool for treatment stratification in the early course of childhood ALL.
Mechanism of voltage-gated channel formation in lipid membranes.
Guidelli, Rolando; Becucci, Lucia
2016-04-01
Although several molecular models for voltage-gated ion channels in lipid membranes have been proposed, a detailed mechanism accounting for the salient features of experimental data is lacking. A general treatment accounting for peptide dipole orientation in the electric field and their nucleation and growth kinetics with ion channel formation is provided. This is the first treatment that explains all the main features of the experimental current-voltage curves of peptides forming voltage-gated channels available in the literature. It predicts a regime of weakly voltage-dependent conductance, followed by one of strong voltage-dependent conductance at higher voltages. It also predicts values of the parameters expressing the exponential dependence of conductance upon voltage and peptide bulk concentration for both regimes, in good agreement with those reported in the literature. Most importantly, the only two adjustable parameters involved in the kinetics of nucleation and growth of ion channels can be varied over broad ranges without affecting the above predictions to a significant extent. Thus, the fitting of experimental current-voltage curves stems naturally from the treatment and depends only slightly upon the choice of the kinetic parameters. Copyright © 2015 Elsevier B.V. All rights reserved.
VarMod: modelling the functional effects of non-synonymous variants
Pappalardo, Morena; Wass, Mark N.
2014-01-01
Unravelling the genotype–phenotype relationship in humans remains a challenging task in genomics studies. Recent advances in sequencing technologies mean there are now thousands of sequenced human genomes, revealing millions of single nucleotide variants (SNVs). For non-synonymous SNVs present in proteins the difficulties of the problem lie in first identifying those nsSNVs that result in a functional change in the protein among the many non-functional variants and in turn linking this functional change to phenotype. Here we present VarMod (Variant Modeller) a method that utilises both protein sequence and structural features to predict nsSNVs that alter protein function. VarMod develops recent observations that functional nsSNVs are enriched at protein–protein interfaces and protein–ligand binding sites and uses these characteristics to make predictions. In benchmarking on a set of nearly 3000 nsSNVs VarMod performance is comparable to an existing state of the art method. The VarMod web server provides extensive resources to investigate the sequence and structural features associated with the predictions including visualisation of protein models and complexes via an interactive JSmol molecular viewer. VarMod is available for use at http://www.wasslab.org/varmod. PMID:24906884
Molecular markers in pediatric neuro-oncology.
Ichimura, Koichi; Nishikawa, Ryo; Matsutani, Masao
2012-09-01
Pediatric molecular neuro-oncology is a fast developing field. A multitude of molecular profiling studies in recent years has unveiled a number of genetic abnormalities unique to pediatric brain tumors. It has now become clear that brain tumors that arise in children have distinct pathogenesis and biology, compared with their adult counterparts, even for those with indistinguishable histopathology. Some of the molecular features are so specific to a particular type of tumors, such as the presence of the KIAA1549-BRAF fusion gene for pilocytic astrocytomas or SMARCB1 mutations for atypical teratoid/rhabdoid tumors, that they could practically serve as a diagnostic marker on their own. Expression profiling has resolved the existence of 4 molecular subgroups in medulloblastomas, which positively translated into improved prognostication for the patients. The currently available molecular markers, however, do not cover all tumors even within a single tumor entity. The molecular pathogenesis of a large number of pediatric brain tumors is still unaccounted for, and the hierarchy of tumors is likely to be more complex and intricate than currently acknowledged. One of the main tasks of future molecular analyses in pediatric neuro-oncology, including the ongoing genome sequencing efforts, is to elucidate the biological basis of those orphan tumors. The ultimate goal of molecular diagnostics is to accurately predict the clinical and biological behavior of any tumor by means of their molecular characteristics, which is hoped to eventually pave the way for individualized treatment.
Extending Halogen-based Medicinal Chemistry to Proteins
El Hage, Krystel; Pandyarajan, Vijay; Phillips, Nelson B.; Smith, Brian J.; Menting, John G.; Whittaker, Jonathan; Lawrence, Michael C.; Meuwly, Markus; Weiss, Michael A.
2016-01-01
Insulin, a protein critical for metabolic homeostasis, provides a classical model for protein design with application to human health. Recent efforts to improve its pharmaceutical formulation demonstrated that iodination of a conserved tyrosine (TyrB26) enhances key properties of a rapid-acting clinical analog. Moreover, the broad utility of halogens in medicinal chemistry has motivated the use of hybrid quantum- and molecular-mechanical methods to study proteins. Here, we (i) undertook quantitative atomistic simulations of 3-[iodo-TyrB26]insulin to predict its structural features, and (ii) tested these predictions by X-ray crystallography. Using an electrostatic model of the modified aromatic ring based on quantum chemistry, the calculations suggested that the analog, as a dimer and hexamer, exhibits subtle differences in aromatic-aromatic interactions at the dimer interface. Aromatic rings (TyrB16, PheB24, PheB25, 3-I-TyrB26, and their symmetry-related mates) at this interface adjust to enable packing of the hydrophobic iodine atoms within the core of each monomer. Strikingly, these features were observed in the crystal structure of a 3-[iodo-TyrB26]insulin analog (determined as an R6 zinc hexamer). Given that residues B24–B30 detach from the core on receptor binding, the environment of 3-I-TyrB26 in a receptor complex must differ from that in the free hormone. Based on the recent structure of a “micro-receptor” complex, we predict that 3-I-TyrB26 engages the receptor via directional halogen bonding and halogen-directed hydrogen bonding as follows: favorable electrostatic interactions exploiting, respectively, the halogen's electron-deficient σ-hole and electronegative equatorial band. Inspired by quantum chemistry and molecular dynamics, such “halogen engineering” promises to extend principles of medicinal chemistry to proteins. PMID:27875310
Alignment-independent technique for 3D QSAR analysis
NASA Astrophysics Data System (ADS)
Wilkes, Jon G.; Stoyanova-Slavova, Iva B.; Buzatu, Dan A.
2016-04-01
Molecular biochemistry is controlled by 3D phenomena but structure-activity models based on 3D descriptors are infrequently used for large data sets because of the computational overhead for determining molecular conformations. A diverse dataset of 146 androgen receptor binders was used to investigate how different methods for defining molecular conformations affect the performance of 3D-quantitative spectral data activity relationship models. Molecular conformations tested: (1) global minimum of molecules' potential energy surface; (2) alignment-to-templates using equal electronic and steric force field contributions; (3) alignment using contributions "Best-for-Each" template; (4) non-energy optimized, non-aligned (2D > 3D). Aggregate predictions from models were compared. Highest average coefficients of determination ranged from R²Test = 0.56 to 0.61. The best model using 2D > 3D (imported directly from ChemSpider) produced R²Test = 0.61. It was superior to energy-minimized and conformation-aligned models and was achieved in only 3-7% of the time required using the other conformation strategies. Predictions averaged from models built on different conformations achieved a consensus R²Test = 0.65. The best 2D > 3D model was analyzed for underlying structure-activity relationships. For the compound binding most strongly to the androgen receptor, 10 substructural features contributing to binding were flagged. Utility of 2D > 3D was compared for two other activity endpoints, each modeling a medium sized data set. Results suggested that large scale, accurate predictions using 2D > 3D SDAR descriptors may be produced for interactions involving endocrine system nuclear receptors and other data sets in which strongest activities are produced by fairly inflexible substrates.
Development of a clinical diagnostic matrix for characterizing inherited epidermolysis bullosa.
Yenamandra, V K; Moss, C; Sreenivas, V; Khan, M; Sivasubbu, S; Sharma, V K; Sethuraman, G
2017-06-01
Accurately diagnosing the subtype of epidermolysis bullosa (EB) is critical for management and genetic counselling. Modern laboratory techniques are largely inaccessible in developing countries, where the diagnosis remains clinical and often inaccurate. To develop a simple clinical diagnostic tool to aid in the diagnosis and subtyping of EB. We developed a matrix indicating presence or absence of a set of distinctive clinical features (as rows) for the nine most prevalent EB subtypes (as columns). To test an individual patient, presence or absence of these features was compared with the findings expected in each of the nine subtypes to see which corresponded best. If two or more diagnoses scored equally, the diagnosis with the greatest number of specific features was selected. The matrix was tested using findings from 74 genetically characterized patients with EB aged > 6 months by an investigator blinded to molecular diagnosis. For concordance, matrix diagnoses were compared with molecular diagnoses. Overall, concordance between the matrix and molecular diagnoses for the four major types of EB was 91·9%, with a kappa coefficient of 0·88 [95% confidence interval (CI) 0·81-0·95; P < 0·001]. The matrix achieved a 75·7% agreement in classifying EB into its nine subtypes, with a kappa coefficient of 0·73 (95% CI 0·69-0·77; P < 0·001). The matrix appears to be simple, valid and useful in predicting the type and subtype of EB. An electronic version will facilitate further testing. © 2016 British Association of Dermatologists.
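The matching procedure described above can be sketched as a lookup against a presence/absence table. This is a minimal illustration, not the published matrix: the subtype and feature names are placeholders, the table values are invented, and the tie-breaker (counting features that are expected present and also observed) is an assumed reading of "the greatest number of specific features".

```python
# Placeholder matrix: subtype -> expected presence (True) or absence (False)
# of each clinical feature. Names and values are invented, not clinical data.
MATRIX = {
    "subtype_1": {"feature_a": True, "feature_b": False, "feature_c": False},
    "subtype_2": {"feature_a": True, "feature_b": True, "feature_c": False},
    "subtype_3": {"feature_a": False, "feature_b": True, "feature_c": True},
}

def score(patient, expected):
    """Count features whose observed presence/absence matches the expectation."""
    return sum(patient[f] == v for f, v in expected.items())

def present_matches(patient, expected):
    """Assumed tie-breaker: features expected present and observed in the patient."""
    return sum(patient[f] and v for f, v in expected.items())

def diagnose(patient, matrix=MATRIX):
    """Return the best-matching subtype; ties fall back to present_matches."""
    return max(
        matrix,
        key=lambda s: (score(patient, matrix[s]), present_matches(patient, matrix[s])),
    )

# Hypothetical patient findings.
patient = {"feature_a": True, "feature_b": True, "feature_c": False}
result = diagnose(patient)
```

Keeping the table as plain data separate from the scoring functions would let an electronic version swap in the real matrix without touching the logic.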
Deep learning of mutation-gene-drug relations from the literature.
Lee, Kyubum; Kim, Byounggun; Choi, Yonghwa; Kim, Sunkyu; Shin, Wonho; Lee, Sunwon; Park, Sungjoon; Kim, Seongsoon; Tan, Aik Choon; Kang, Jaewoo
2018-01-25
Molecular biomarkers that can predict drug efficacy in cancer patients are crucial components for the advancement of precision medicine. However, identifying these molecular biomarkers remains a laborious and challenging task. Next-generation sequencing of patients and preclinical models have increasingly led to the identification of novel gene-mutation-drug relations, and these results have been reported and published in the scientific literature. Here, we present two new computational methods that utilize all the PubMed articles as domain specific background knowledge to assist in the extraction and curation of gene-mutation-drug relations from the literature. The first method uses the Biomedical Entity Search Tool (BEST) scoring results as some of the features to train the machine learning classifiers. The second method uses not only the BEST scoring results, but also word vectors in a deep convolutional neural network model that are constructed from and trained on numerous documents such as PubMed abstracts and Google News articles. Using the features obtained from both the BEST search engine scores and word vectors, we extract mutation-gene and mutation-drug relations from the literature using machine learning classifiers such as random forest and deep convolutional neural networks. Our methods achieved better results compared with the state-of-the-art methods. We used our proposed features in a simple machine learning model, and obtained F1-scores of 0.96 and 0.82 for mutation-gene and mutation-drug relation classification, respectively. We also developed a deep learning classification model using convolutional neural networks, BEST scores, and the word embeddings that are pre-trained on PubMed or Google News data. Using deep learning, the classification accuracy improved, and F1-scores of 0.96 and 0.86 were obtained for the mutation-gene and mutation-drug relations, respectively. 
We believe that our computational methods described in this research could be used as an important tool in identifying molecular biomarkers that predict drug responses in cancer patients. We also built a database of these mutation-gene-drug relations that were extracted from all the PubMed abstracts. We believe that our database can prove to be a valuable resource for precision medicine researchers.
Beyond precision surgery: Molecularly motivated precision care for gastric cancer.
Choi, Y Y; Cheong, J-H
2017-05-01
Gastric cancer is one of the leading causes of cancer-related deaths worldwide. Despite the high disease prevalence, gastric cancer research has not gained much attention. Recently, genome-scale technology has made it possible to explore the characteristics of gastric cancer at the molecular level. Accordingly, gastric cancer can be classified into molecular subtypes that convey more detailed information of tumor than histopathological characteristics, and these subtypes are associated with clinical outcomes. Furthermore, this molecular knowledge helps to identify new actionable targets and develop novel therapeutic strategies. To advance the concept of precision patient care in the clinic, patient-derived xenograft (PDX) models have recently been developed. PDX models not only represent histology and genomic features, but also predict responsiveness to investigational drugs in patient tumors. Molecularly curated PDX cohorts will be instrumental in hypothesis generation, biomarker discovery, and drug screening and testing in proof-of-concept preclinical trials for precision therapy. In the era of precision medicine, molecularly tailored therapeutic strategies should be individualized for cancer patients. To improve the overall clinical outcome, a multimodal approach is indispensable for advanced cancer patients. Careful, oncological principle-based surgery, combined with a molecularly guided multidisciplinary approach, will open new horizons in surgical oncology. Copyright © 2017. Published by Elsevier Ltd.
Star formation in evolving molecular clouds
NASA Astrophysics Data System (ADS)
Völschow, M.; Banerjee, R.; Körtgen, B.
2017-09-01
Molecular clouds are the principal stellar nurseries of our universe; they thus remain a focus of both observational and theoretical studies. From observations, some of the key properties of molecular clouds are well known but many questions regarding their evolution and star formation activity remain open. While numerical simulations feature a large number and complexity of involved physical processes, this plethora of effects may hide the fundamentals that determine the evolution of molecular clouds and enable the formation of stars. Purely analytical models, on the other hand, tend to suffer from rough approximations or a lack of completeness, limiting their predictive power. In this paper, we present a model that incorporates central concepts of astrophysics as well as reliable results from recent simulations of molecular clouds and their evolutionary paths. Based on that, we construct a self-consistent semi-analytical framework that describes the formation, evolution, and star formation activity of molecular clouds, including a number of feedback effects to account for the complex processes inside those objects. The final equation system is solved numerically but at much lower computational expense than, for example, hydrodynamical descriptions of comparable systems. The model presented in this paper agrees well with a broad range of observational results, showing that molecular cloud evolution can be understood as an interplay between accretion, global collapse, star formation, and stellar feedback.
Liu, Genyan; Wang, Wenjie; Wan, Youlan; Ju, Xiulian; Gu, Shuangxi
2018-05-11
Diarylpyrimidines (DAPYs), acting as HIV-1 nonnucleoside reverse transcriptase inhibitors (NNRTIs), have been considered to be one of the most potent drug families in the fight against acquired immunodeficiency syndrome (AIDS). To better understand the structural requirements of HIV-1 NNRTIs, three-dimensional quantitative structure–activity relationship (3D-QSAR), pharmacophore, and molecular docking studies were performed on 52 DAPY analogues that were synthesized in our previous studies. The internal and external validation parameters indicated that the generated 3D-QSAR models, including comparative molecular field analysis (CoMFA, q² = 0.679, R² = 0.983, and r²pred = 0.884) and comparative molecular similarity indices analysis (CoMSIA, q² = 0.734, R² = 0.985, and r²pred = 0.891), exhibited good predictive abilities and significant statistical reliability. The docking results demonstrated that the phenyl ring at the C₄-position of the pyrimidine ring was better than the cycloalkanes for the activity, as the phenyl group was able to participate in π–π stacking interactions with the aromatic residues of the binding site, whereas the cycloalkanes were not. The pharmacophore model and 3D-QSAR contour maps provided significant insights into the key structural features of DAPYs that were responsible for the activity. On the basis of the obtained information, a series of novel DAPY analogues of HIV-1 NNRTIs with potentially higher predicted activity was designed. This work might provide useful information for guiding the rational design of potential HIV-1 NNRTI DAPYs.
Modulated structure and molecular dissociation of solid chlorine at high pressures
NASA Astrophysics Data System (ADS)
Li, Peifang; Gao, Guoying; Ma, Yanming
2012-08-01
Among diatomic molecular halogen solids, the high-pressure structures of solid chlorine (Cl2) remain elusive and the least studied. We report here a first-principles structural search on solid Cl2 at high pressures using our particle-swarm optimization algorithm. We successfully reproduced the known molecular Cmca phase (phase I) at low pressure and found that it remains stable up to a high pressure of 142 GPa. At 150 GPa, our structural searches identified several energetically competitive, structurally similar, modulated structures. Analysis of these structures and their similarity to those in solid Br2 and I2 suggested that solid Cl2 adopts an incommensurate modulated structure with a modulation wave close to 2/7 in a narrow pressure range of 142-157 GPa. Eventually, our simulations at >157 GPa were able to predict the molecular dissociation of solid Cl2 into monatomic phases having body-centered orthorhombic (bco) and face-centered cubic (fcc) structures, respectively. One unique monatomic structural feature of solid Cl2 is the absence of an intermediate body-centered tetragonal (bct) structure during the bco → fcc transition, which has, however, been observed or theoretically predicted in solid Br2 and I2. Electron-phonon coupling calculations revealed that solid Cl2 becomes a superconductor in the bco and fcc phases, with a highest superconducting temperature of 13.03 K at 380 GPa. We further probed the mechanism of the molecular Cmca → incommensurate phase transition and found that the softening of the Ag vibrational (rotational) Raman mode in the Cmca phase might be the driving force that initiates the transition.
Novel molecular subtypes of serous and endometrioid ovarian cancer linked to clinical outcome.
Tothill, Richard W; Tinker, Anna V; George, Joshy; Brown, Robert; Fox, Stephen B; Lade, Stephen; Johnson, Daryl S; Trivett, Melanie K; Etemadmoghadam, Dariush; Locandro, Bianca; Traficante, Nadia; Fereday, Sian; Hung, Jillian A; Chiew, Yoke-Eng; Haviv, Izhak; Gertig, Dorota; DeFazio, Anna; Bowtell, David D L
2008-08-15
The study aimed to identify novel molecular subtypes of ovarian cancer by gene expression profiling with linkage to clinical and pathologic features. Microarray gene expression profiling was done on 285 serous and endometrioid tumors of the ovary, peritoneum, and fallopian tube. K-means clustering was applied to identify robust molecular subtypes. Statistical analysis identified differentially expressed genes, pathways, and gene ontologies. Laser capture microdissection, pathology review, and immunohistochemistry validated the array-based findings. Patient survival within k-means groups was evaluated using Cox proportional hazards models. Class prediction validated k-means groups in an independent dataset. A semisupervised survival analysis of the array data was used to compare against unsupervised clustering results. Optimal clustering of array data identified six molecular subtypes. Two subtypes represented predominantly serous low malignant potential and low-grade endometrioid subtypes, respectively. The remaining four subtypes represented higher grade and advanced stage cancers of serous and endometrioid morphology. A novel subtype of high-grade serous cancers reflected a mesenchymal cell type, characterized by overexpression of N-cadherin and P-cadherin and low expression of differentiation markers, including CA125 and MUC1. A poor prognosis subtype was defined by a reactive stroma gene expression signature, correlating with extensive desmoplasia in such samples. A similar poor prognosis signature could be found using a semisupervised analysis. Each subtype displayed distinct levels and patterns of immune cell infiltration. Class prediction identified similar subtypes in an independent ovarian dataset with similar prognostic trends. Gene expression profiling identified molecular subtypes of ovarian cancer of biological and clinical importance.
Quantitative imaging as cancer biomarker
NASA Astrophysics Data System (ADS)
Mankoff, David A.
2015-03-01
The ability to assay tumor biologic features and the impact of drugs on tumor biology is fundamental to drug development. Advances in our ability to measure genomics, gene expression, protein expression, and cellular biology have led to a host of new targets for anticancer drug therapy. In translating new drugs into clinical trials and clinical practice, these same assays serve to identify patients most likely to benefit from specific anticancer treatments. As cancer therapy becomes more individualized and targeted, there is an increasing need to characterize tumors and identify therapeutic targets to select therapy most likely to be successful in treating the individual patient's cancer. Thus far assays to identify cancer therapeutic targets or anticancer drug pharmacodynamics have been based upon in vitro assay of tissue or blood samples. Advances in molecular imaging, particularly PET, have led to the ability to perform quantitative non-invasive molecular assays. Imaging has traditionally relied on structural and anatomic features to detect cancer and determine its extent. More recently, imaging has expanded to include the ability to image regional biochemistry and molecular biology, often termed molecular imaging. Molecular imaging can be considered an in vivo assay technique, capable of measuring regional tumor biology without perturbing it. This makes molecular imaging a unique tool for cancer drug development, complementary to traditional assay methods, and a potentially powerful method for guiding targeted therapy in clinical trials and clinical practice. The ability to quantify, in absolute measures, regional in vivo biologic parameters strongly supports the use of molecular imaging as a tool to guide therapy. 
This review summarizes current and future applications of quantitative molecular imaging as a biomarker for cancer therapy, including the use of imaging to (1) identify patients whose tumors express a specific therapeutic target; (2) determine whether the drug reaches the target; (3) identify an early response to treatment; and (4) predict the impact of therapy on long-term outcomes such as survival. The manuscript reviews basic concepts important in the application of molecular imaging to cancer drug therapy, in general, and will discuss specific examples of studies in humans, and highlight future directions, including ongoing multi-center clinical trials using molecular imaging as a cancer biomarker.
Toxicity prediction of ionic liquids based on Daphnia magna by using density functional theory
NASA Astrophysics Data System (ADS)
Nu’aim, M. N.; Bustam, M. A.
2018-04-01
The toxicity of ionic liquids can be predicted using density functional theory (DFT). DFT gives researchers a substantial tool for computing the quantum state of atoms, molecules and solids, and for molecular dynamics, which is also known as a computer simulation method. The prediction is made using quantum chemical reactivity descriptors based on structural features. The ionic liquids and their Log[EC50] data were taken from the literature, namely Ismail Hossain's thesis entitled “Synthesis, Characterization and Quantitative Structure Toxicity Relationship of Imidazolium, Pyridinium and Ammonium Based Ionic Liquids”. Each cation and anion of the ionic liquids was optimized and calculated. The geometry optimization and calculation in the software produce the values of the highest occupied molecular orbital (HOMO) and lowest unoccupied molecular orbital (LUMO). From the HOMO and LUMO values, the other toxicity descriptors were obtained according to their formulas. The toxicity descriptors involved are the electrophilicity index, HOMO, LUMO, energy gap, chemical potential, hardness and electronegativity. The interrelation between the descriptors was determined using multiple linear regression (MLR). From this MLR, all descriptors were analyzed and the significant ones were chosen. The selected significant descriptors were then used to develop the best model equation for toxicity prediction of ionic liquids. The model equation was validated against the Log[EC50] data from the literature, and the final model equation was developed. A wider range of ionic liquids, nearly 108, can be predicted from this model equation.
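The abstract does not give the formulas used to derive the descriptors from the HOMO and LUMO values. As a hedged illustration, the sketch below applies the common conceptual-DFT definitions under a Koopmans-type approximation: chemical potential μ = (E_HOMO + E_LUMO)/2, electronegativity χ = −μ, hardness η = (E_LUMO − E_HOMO)/2, and electrophilicity index ω = μ²/(2η). The function name and the orbital energies are hypothetical, and some authors define hardness without the factor of 1/2.

```python
def reactivity_descriptors(e_homo, e_lumo):
    """Conceptual-DFT descriptors from frontier orbital energies (same unit, e.g. eV).

    Assumes the Koopmans-type convention with hardness = gap / 2.
    """
    gap = e_lumo - e_homo              # HOMO-LUMO energy gap
    mu = (e_homo + e_lumo) / 2.0       # chemical potential
    chi = -mu                          # electronegativity
    eta = gap / 2.0                    # chemical hardness
    omega = mu ** 2 / (2.0 * eta)      # electrophilicity index
    return {
        "gap": gap,
        "chemical_potential": mu,
        "electronegativity": chi,
        "hardness": eta,
        "electrophilicity": omega,
    }

# Hypothetical orbital energies in eV, chosen only to make the arithmetic easy.
d = reactivity_descriptors(e_homo=-6.0, e_lumo=-2.0)
```

With these inputs the gap is 4.0 eV, μ is −4.0 eV, η is 2.0 eV and ω is 4.0 eV; descriptor columns of this kind would then be the regressors in an MLR against Log[EC50].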
Alignment-Based Prediction of Sites of Metabolism.
de Bruyn Kops, Christina; Friedrich, Nils-Ole; Kirchmair, Johannes
2017-06-26
Prediction of metabolically labile atom positions in a molecule (sites of metabolism) is a key component of the simulation of xenobiotic metabolism as a whole, providing crucial information for the development of safe and effective drugs. In 2008, an exploratory study was published in which sites of metabolism were derived based on molecular shape- and chemical feature-based alignment to a molecule whose site of metabolism (SoM) had been determined by experiments. We present a detailed analysis of the breadth of applicability of alignment-based SoM prediction, including transfer of the approach from a structure- to ligand-based method and extension of the applicability of the models from cytochrome P450 2C9 to all cytochrome P450 isozymes involved in drug metabolism. We evaluate the effect of molecular similarity of the query and reference molecules on the ability of this approach to accurately predict SoMs. In addition, we combine the alignment-based method with a leading chemical reactivity model to take reactivity into account. The combined model yielded superior performance in comparison to the alignment-based approach and the reactivity models with an average area under the receiver operating characteristic curve of 0.85 in cross-validation experiments. In particular, early enrichment was improved, as evidenced by higher BEDROC scores (mean BEDROC = 0.59 for α = 20.0, mean BEDROC = 0.73 for α = 80.5).
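The area under the receiver operating characteristic curve reported above can be read as the probability that a true site of metabolism is ranked above a non-site. The following is a minimal sketch of that calculation for one molecule, assuming each atom has a predicted score and a binary SoM label; the function name and the example scores are hypothetical, and the BEDROC metric is not covered here.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    scores: predicted score per atom (higher means more likely a SoM).
    labels: truthy for experimentally determined SoMs, falsy otherwise.
    Ties between a positive and a negative count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative atom")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up atom scores for a four-atom example; atoms 2 and 3 are labeled as SoMs.
atom_scores = [0.9, 0.8, 0.3, 0.1]
atom_labels = [0, 1, 1, 0]
print(roc_auc(atom_scores, atom_labels))
```

Averaging this value over all molecules in a cross-validation fold gives a per-fold mean AUC of the kind quoted in the abstract.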
Rusyn, Ivan; Sedykh, Alexander; Guyton, Kathryn Z.; Tropsha, Alexander
2012-01-01
Quantitative structure-activity relationship (QSAR) models are widely used for in silico prediction of in vivo toxicity of drug candidates or environmental chemicals, adding value to candidate selection in drug development or in a search for less hazardous and more sustainable alternatives for chemicals in commerce. The development of traditional QSAR models is enabled by numerical descriptors representing the inherent chemical properties that can be easily defined for any number of molecules; however, traditional QSAR models often have limited predictive power due to the lack of data and complexity of in vivo endpoints. Although it has been indeed difficult to obtain experimentally derived toxicity data on a large number of chemicals in the past, the results of quantitative in vitro screening of thousands of environmental chemicals in hundreds of experimental systems are now available and continue to accumulate. In addition, publicly accessible toxicogenomics data collected on hundreds of chemicals provide another dimension of molecular information that is potentially useful for predictive toxicity modeling. These new characteristics of molecular bioactivity arising from short-term biological assays, i.e., in vitro screening and/or in vivo toxicogenomics data can now be exploited in combination with chemical structural information to generate hybrid QSAR–like quantitative models to predict human toxicity and carcinogenicity. Using several case studies, we illustrate the benefits of a hybrid modeling approach, namely improvements in the accuracy of models, enhanced interpretation of the most predictive features, and expanded applicability domain for wider chemical space coverage. PMID:22387746
First principles molecular dynamics of molten NaCl
NASA Astrophysics Data System (ADS)
Galamba, N.; Costa Cabral, B. J.
2007-03-01
First principles Hellmann-Feynman molecular dynamics (HFMD) results for molten NaCl at a single state point are reported. The effect of induction forces on the structure and dynamics of the system is studied by comparison of the partial radial distribution functions and the velocity and force autocorrelation functions with those calculated from classical MD based on rigid-ion and shell-model potentials. The first principles results reproduce the main structural features of the molten salt observed experimentally, whereas they are incorrectly described by both rigid-ion and shell-model potentials. Moreover, HFMD Green-Kubo self-diffusion coefficients are in closer agreement with experimental data than those predicted by classical MD. A comprehensive discussion of MD results for molten NaCl based on different ab initio parametrized polarizable interionic potentials is also given.
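A Green-Kubo self-diffusion coefficient is obtained by integrating the velocity autocorrelation function, D = (1/3) ∫ <v(0)·v(t)> dt. The sketch below applies the trapezoidal rule to a sampled autocorrelation function; it assumes evenly spaced samples and a function that has decayed to zero by the last sample, and the function name and the test signal are illustrative rather than taken from the paper.

```python
import math


def green_kubo_diffusion(vacf, dt):
    """Self-diffusion coefficient from a sampled velocity autocorrelation function.

    vacf: values of <v(0).v(t)> at t = 0, dt, 2*dt, ... (three-dimensional dot product).
    dt:   spacing between samples.
    Returns (1/3) times the trapezoidal integral of vacf over the sampled window.
    """
    if len(vacf) < 2:
        raise ValueError("need at least two samples to integrate")
    if dt <= 0:
        raise ValueError("dt must be positive")
    integral = sum(0.5 * (a + b) * dt for a, b in zip(vacf, vacf[1:]))
    return integral / 3.0


# Synthetic exponentially decaying VACF, <v^2> * exp(-t / tau), in reduced units.
mean_square_velocity = 3.0
tau = 1.0
dt = 0.001
vacf = [mean_square_velocity * math.exp(-i * dt / tau) for i in range(20001)]
print(green_kubo_diffusion(vacf, dt))  # analytic limit is mean_square_velocity * tau / 3
```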
QSAR and 3D QSAR of inhibitors of the epidermal growth factor receptor
NASA Astrophysics Data System (ADS)
Pinto-Bazurco, Mariano; Tsakovska, Ivanka; Pajeva, Ilza
This article reports quantitative structure-activity relationships (QSAR) and 3D QSAR models of 134 structurally diverse inhibitors of the epidermal growth factor receptor (EGFR) tyrosine kinase. Free-Wilson analysis was used to derive the QSAR model. It identified the substituents in aniline, the polycyclic system, and the substituents at the 6- and 7-positions of the polycyclic system as the most important structural features. Comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) were used in the 3D QSAR modeling. The steric and electrostatic interactions proved the most important for the inhibitory effect. Both QSAR and 3D QSAR models led to consistent results. On the basis of the statistically significant models, new structures were proposed and their inhibitory activities were predicted.
Environmental Pressure May Change the Composition Protein Disorder in Prokaryotes
Vicedo, Esmeralda; Schlessinger, Avner; Rost, Burkhard
2015-01-01
Many prokaryotic organisms have adapted to incredibly extreme habitats. The genomes of such extremophiles differ from their non-extremophile relatives. For example, some proteins in thermophiles sustain high temperatures by being more compact than homologs in non-extremophiles. Conversely, some proteins have increased volumes to compensate for freezing effects in psychrophiles that survive in the cold. Here, we revealed that some differences in organisms surviving in extreme habitats correlate with a simple single feature, namely the fraction of proteins predicted to have long disordered regions. We predicted disorder with different methods for 46 completely sequenced organisms from diverse habitats and found a correlation between protein disorder and the extremity of the environment. More specifically, the overall percentage of proteins with long disordered regions tended to be more similar between organisms of similar habitats than between organisms of similar taxonomy. For example, predictions tended to detect substantially more proteins with long disordered regions in prokaryotic halophiles (survive high salt) than in their taxonomic neighbors. Another peculiar environment is that of high radiation survived, e.g. by Deinococcus radiodurans. The relatively high fraction of disorder predicted in this extremophile might provide a shield against mutations. Although our analysis fails to establish causation, the observed correlation between such a simplistic, coarse-grained, microscopic molecular feature (disorder content) and a macroscopic variable (habitat) remains stunning. PMID:26252577
Rheology modification with ring polymers
NASA Astrophysics Data System (ADS)
Vlassopoulos, Dimitris
It is now established that experimental unconcatenated ring polymers can be purified effectively by means of fractionation at the critical condition. For molecular weights well above the entanglement threshold, purified rings relax stress via a power law (with an exponent of about -0.4), sharply departing from their linear counterparts. Experimental results are in harmony with modeling predictions and simulations. Here, we present results from recent interdisciplinary efforts and discuss two challenges: (i) The nonlinear shear rheology of purified ring melts is also very different from that of unlinked chains. Whereas the latter exhibit features that can be explained, to a first approximation, in the framework of the tube model, the former behave akin to unentangled chains with finite extensibility and exhibit much smaller deformation at steady state. (ii) Blends of rings and linear polymers exhibit unique features in different regimes: the addition of minute amounts of linear chains drastically affects ring dynamics. This relates to ring purity and the ability of unlinked linear chains to thread rings. With the help of simulations, it is possible to rationalize the observed surprisingly slow viscoelastic relaxation, which is attributed to ring-linear and ring-ring penetrations. On the other hand, adding small amounts of rings to linear polymers of different molecular weights influences their linear and nonlinear rheology in an unprecedented way. The blend viscosity exceeds that of the slower component (linear) in this non-interacting mixture, and its dependencies on composition and molecular weight ratio are examined, whereas the role of molecular architecture is also addressed. Consequently, closing the ends of a linear chain can serve as a powerful means for molecular manipulation of its rheology. This presentation reflects collaborative efforts with S. Costanzo, Z-C. Yan, R. Pasquino, M. Kaliva, S. Kamble, Y. Jeong, P. Lutz, J. Allgaier, T. Chang, D. Talikis, V. Mavrantzas and M. Rubinstein.
Structural relaxation in supercooled orthoterphenyl.
Chong, S-H; Sciortino, F
2004-05-01
We report molecular-dynamics simulation results for a model of the molecular liquid orthoterphenyl in supercooled states, which we then compare with both experimental data and mode-coupling-theory (MCT) predictions, aiming at a better understanding of structural relaxation in orthoterphenyl. We pay special attention to the wave number dependence of the collective dynamics. It is shown that the simulation results for the model share many features with experimental data for the real system, and that MCT captures the simulation results at the semiquantitative level except for intermediate wave numbers connected to the overall size of the molecule. Theoretical results in the intermediate wave number region are found to be improved by taking into account the spatial correlation of the molecule's geometrical center. This supports the idea that unusual dynamical properties at the intermediate wave numbers, reported previously in simulation studies for the model and discernible in coherent neutron-scattering experimental data, are basically due to the coupling of the rotational motion to the geometrical-center dynamics. However, there still remain qualitative as well as quantitative discrepancies between theoretical predictions and the corresponding simulation results at the intermediate wave numbers, which call for further theoretical investigation.
Liu, Shubin; Rong, Chunying; Lu, Tian
2017-01-04
One of the main tasks of theoretical chemistry is to rationalize computational results with chemical insights. Key concepts of such nature include nucleophilicity, electrophilicity, regioselectivity, and stereoselectivity. While computational tools are available to predict barrier heights and other reactivity properties with acceptable accuracy, a conceptual framework to appreciate the above quantities is still lacking. In this work, we introduce the electronic force as the fundamental driving force of chemical processes to understand and predict molecular reactivity. It has three components, but only two are independent. These forces, electrostatic and steric, can be employed as reliable descriptors for nucleophilic and electrophilic regioselectivity and stereoselectivity. The advantages of using these forces to evaluate molecular reactivity are that electrophilic and nucleophilic attacks show distinct characteristics in the electrostatic force and that no knowledge of the quantum effects included in the kinetic and exchange-correlation energies is required. Examples are provided to highlight the validity and general applicability of these reactivity descriptors. Possible applications in ambident reactivity, σ and π holes, frustrated Lewis pairs, and stereoselective reactions are also included in this work.
El Hage Chehade, Hiba; Wazir, Umar; Mokbel, Kinan; Kasem, Abdul; Mokbel, Kefah
2018-01-01
Decision-making regarding adjuvant chemotherapy has been based on clinical and pathological features. However, such decisions are seldom consistent. Web-based predictive models have been developed using data from cancer registries to help determine the need for adjuvant therapy. More recently, with the recognition of the heterogenous nature of breast cancer, genomic assays have been developed to aid in the therapeutic decision-making. We have carried out a comprehensive literature review regarding online prognostication tools and genomic assays to assess whether online tools could be used as valid alternatives to genomic profiling in decision-making regarding adjuvant therapy in early breast cancer. Breast cancer has been recently recognized as a heterogenous disease based on variations in molecular characteristics. Online tools are valuable in guiding adjuvant treatment, especially in resource constrained countries. However, in the era of personalized therapy, molecular profiling appears to be superior in predicting clinical outcome and guiding therapy. Copyright © 2017 Elsevier Inc. All rights reserved.
Avanzini, Francesco; Moro, Giorgio J
2018-03-15
The quantum molecular trajectory is the deterministic trajectory, arising from the Bohm theory, that describes the instantaneous positions of the nuclei of molecules by assuring the agreement with the predictions of quantum mechanics. Therefore, it provides the suitable framework for representing the geometry and the motions of molecules without neglecting their quantum nature. However, the quantum molecular trajectory is extremely demanding from the computational point of view, and this strongly limits its applications. To overcome such a drawback, we derive a stochastic representation of the quantum molecular trajectory, through projection operator techniques, for the degrees of freedom of an open quantum system. The resulting Fokker-Planck operator is parametrically dependent upon the reduced density matrix of the open system. Because of the pilot role played by the reduced density matrix, this stochastic approach is able to represent accurately the main features of the open system motions both at equilibrium and out of equilibrium with the environment. To verify this procedure, the predictions of the stochastic and deterministic representation are compared for a model system of six interacting harmonic oscillators, where one oscillator is taken as the open quantum system of interest. The undeniable advantage of the stochastic approach is that of providing a simplified and self-contained representation of the dynamics of the open system coordinates. Furthermore, it can be employed to study the out of equilibrium dynamics and the relaxation of quantum molecular motions during photoinduced processes, like photoinduced conformational changes and proton transfers.
DNA-mediated nanoparticle crystallization into Wulff polyhedra
NASA Astrophysics Data System (ADS)
Auyeung, Evelyn; Li, Ting I. N. G.; Senesi, Andrew J.; Schmucker, Abrin L.; Pals, Bridget C.; de La Cruz, Monica Olvera; Mirkin, Chad A.
2014-01-01
Crystallization is a fundamental and ubiquitous process much studied over the centuries. But although the crystallization of atoms is fairly well understood, it remains challenging to predict reliably the outcome of molecular crystallization processes that are complicated by various molecular interactions and solvent involvement. This difficulty also applies to nanoparticles: high-quality three-dimensional crystals are mostly produced using drying and sedimentation techniques that are often impossible to rationalize and control to give a desired crystal symmetry, lattice spacing and habit (crystal shape). In principle, DNA-mediated assembly of nanoparticles offers an ideal opportunity for studying nanoparticle crystallization: a well-defined set of rules have been developed to target desired lattice symmetries and lattice constants, and the occurrence of features such as grain boundaries and twinning in DNA superlattices and traditional crystals comprised of molecular or atomic building blocks suggests that similar principles govern their crystallization. But the presence of charged biomolecules, interparticle spacings of tens of nanometres, and the realization so far of only polycrystalline DNA-interconnected nanoparticle superlattices, all suggest that DNA-guided crystallization may differ from traditional crystal growth. Here we show that very slow cooling, over several days, of solutions of complementary-DNA-modified nanoparticles through the melting temperature of the system gives the thermodynamic product with a specific and uniform crystal habit. We find that our nanoparticle assemblies have the Wulff equilibrium crystal structure that is predicted from theoretical considerations and molecular dynamics simulations, thus establishing that DNA hybridization can direct nanoparticle assembly along a pathway that mimics atomic crystallization.
Tumor Heterogeneity in Breast Cancer
Turashvili, Gulisa; Brogi, Edi
2017-01-01
Breast cancer is a heterogeneous disease and differs greatly among different patients (intertumor heterogeneity) and even within each individual tumor (intratumor heterogeneity). Clinical and morphologic intertumor heterogeneity is reflected by staging systems and histopathologic classification of breast cancer. Heterogeneity in the expression of established prognostic and predictive biomarkers, hormone receptors, and human epidermal growth factor receptor 2 oncoprotein is the basis for targeted treatment. Molecular classifications are indicators of genetic tumor heterogeneity, which is probed with multigene assays and can lead to improved stratification into low- and high-risk groups for personalized therapy. Intratumor heterogeneity occurs at the morphologic, genomic, transcriptomic, and proteomic levels, creating diagnostic and therapeutic challenges. Understanding the molecular and cellular mechanisms of tumor heterogeneity that are relevant to the development of treatment resistance is a major area of research. Despite the improved knowledge of the complex genetic and phenotypic features underpinning tumor heterogeneity, there has been only limited advancement in diagnostic, prognostic, or predictive strategies for breast cancer. The current guidelines for reporting of biomarkers aim to maximize patient eligibility for targeted therapy, but do not take into account intratumor heterogeneity. The molecular classification of breast cancer is not implemented in routine clinical practice. Additional studies and in-depth analysis are required to understand the clinical significance of rapidly accumulating data. This review highlights inter- and intratumor heterogeneity of breast carcinoma with special emphasis on pathologic findings, and provides insights into the clinical significance of molecular and cellular mechanisms of heterogeneity. PMID:29276709
NASA Astrophysics Data System (ADS)
Spirina, L. V.; Usynin, Y. A.; Kondakova, I. V.; Yurmazov, Z. A.; Slonimskaya, E. M.; Pikalova, L. V.
2016-08-01
Investigation of the molecular mechanisms of tumor cell behavior in small renal masses is required to improve cancer survival. The aim of the study is to find molecular markers associated with the outcome of patients with kidney tumors 7 cm or less. A homogenous group of 20 patients T1N0M0-1 (mean age 57.6 ± 2.2 years) with kidney cancer was selected for the present analysis. The content of transcription and growth factors was determined by ELISA. The levels of AKT-mTOR signaling pathway components were measured by Western blotting analysis. The molecular markers associated with an unfavorable outcome of patients with kidney tumors 7 cm or less were high levels of NF-kB p50, NF-kB p65, HIF-1, HIF-2, VEGF and CAIX. AKT activation with PTEN loss also correlated with the unfavorable outcome of kidney cancer patients with tumor size 7 cm or less. It is observed that the biological features of kidney cancer could predict the outcome of patients.
Molecular approach to genetic and epigenetic pathogenesis of early-onset colorectal cancer
Tezcan, Gulcin; Tunca, Berrin; Ak, Secil; Cecener, Gulsah; Egeli, Unal
2016-01-01
Colorectal cancer (CRC) is the third most frequent cancer type, and its incidence is increasing gradually each year in individuals younger than 50 years old. The current knowledge is that early-onset CRC (EOCRC) cases are a heterogeneous population that includes both hereditary and sporadic forms of CRC. Although EOCRC cases have some clinical and pathological features that distinguish them from CRC at older ages, the molecular mechanism underlying EOCRC is poorly clarified. Given the significance of CRC in the world of medicine, the present review focuses on recent knowledge of the molecular basis of the genetic and epigenetic mechanisms of the hereditary forms of EOCRC, which include Lynch syndrome, Familial CRC type X, Familial adenomatous polyposis, MutYH-associated polyposis, Juvenile polyposis syndrome and Peutz-Jeghers Syndrome, and of the sporadic forms of EOCRC. Recent findings about the molecular genetic and epigenetic basis of EOCRC have given rise to new alternative therapy protocols. Although exact diagnosis of these cases still remains complicated, the present review paves the way for better predictions and contributes to more accurate diagnostic and therapeutic strategies in the clinical approach. PMID:26798439
Bourg, Ian C; Sposito, Garrison
2010-03-15
In this paper, we address the manner in which the continuum-scale diffusive properties of smectite-rich porous media arise from their molecular- and pore-scale features. Our starting point is a successful model of the continuum-scale apparent diffusion coefficient for water tracers and cations, which decomposes it as a sum of pore-scale terms describing diffusion in macropore and interlayer "compartments." We then apply molecular dynamics (MD) simulations to determine molecular-scale diffusion coefficients D_interlayer of water tracers and representative cations (Na+, Cs+, Sr2+) in Na-smectite interlayers. We find that a remarkably simple expression relates D_interlayer to the pore-scale parameter δ_nanopore ≤ 1, a constrictivity factor that accounts for the lower mobility in interlayers as compared to macropores: δ_nanopore = D_interlayer/D_0, where D_0 is the diffusion coefficient in bulk liquid water. Using this scaling expression, we can accurately predict the apparent diffusion coefficients of tracers H2O, Na+, Sr2+, and Cs+ in compacted Na-smectite-rich materials.
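The scaling expression in this abstract is a simple ratio of the interlayer diffusion coefficient to the bulk-water value. The sketch below computes that constrictivity factor and inverts it; the numerical values are illustrative placeholders rather than results from the paper, and the function names are hypothetical. The full continuum-scale model (the sum over macropore and interlayer terms) is not given in the abstract and is not reproduced here.

```python
def constrictivity(d_interlayer: float, d_bulk: float) -> float:
    """Constrictivity factor: interlayer diffusion coefficient divided by the bulk-water value."""
    if d_bulk <= 0:
        raise ValueError("bulk diffusion coefficient must be positive")
    if d_interlayer < 0:
        raise ValueError("interlayer diffusion coefficient cannot be negative")
    return d_interlayer / d_bulk


def interlayer_diffusion(delta_nanopore: float, d_bulk: float) -> float:
    """Invert the ratio: interlayer coefficient = constrictivity * bulk-water coefficient."""
    if not 0.0 <= delta_nanopore <= 1.0:
        raise ValueError("the abstract restricts the constrictivity factor to values <= 1")
    return delta_nanopore * d_bulk


# Illustrative numbers in m^2/s: a bulk value of the order reported for water near
# room temperature and a made-up interlayer value from a hypothetical MD run.
d_bulk = 2.3e-9
d_interlayer = 0.69e-9
delta = constrictivity(d_interlayer, d_bulk)
print(delta)
print(interlayer_diffusion(delta, d_bulk))
```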
Martyanov, Viktor; Whitfield, Michael L
2016-01-01
The goal of this review is to summarize recent advances into the pathogenesis and treatment of systemic sclerosis (SSc) from genomic and proteomic studies. Intrinsic gene expression-driven molecular subtypes of SSc are reproducible across three independent datasets. These subsets are a consistent feature of SSc and are found in multiple end-target tissues, such as skin and esophagus. Intrinsic subsets as well as baseline levels of molecular target pathways are potentially predictive of clinical response to specific therapeutics, based on three recent clinical trials. A gene expression-based biomarker of modified Rodnan skin score, a measure of SSc skin severity, can be used as a surrogate outcome metric and has been validated in a recent trial. Proteome analyses have identified novel biomarkers of SSc that correlate with SSc clinical phenotypes. Integrating intrinsic gene expression subset data, baseline molecular pathway information, and serum biomarkers along with surrogate measures of modified Rodnan skin score provides molecular context in SSc clinical trials. With validation, these approaches could be used to match patients with the therapies from which they are most likely to benefit and thus increase the likelihood of clinical improvement.
NASA Astrophysics Data System (ADS)
Polwaththe-Gallage, Hasitha-Nayanajith; Sauret, Emilie; Nguyen, Nam-Trung; Saha, Suvash C.; Gu, YuanTong
2018-01-01
Liquid marbles are liquid droplets coated with superhydrophobic powders whose morphology is governed by the gravitational and surface tension forces. Small liquid marbles take spherical shapes, while larger liquid marbles exhibit puddle shapes due to the dominance of gravitational forces. Liquid marbles coated with hydrophobic magnetic powders respond to an external magnetic field. This unique feature of magnetic liquid marbles is very attractive for digital microfluidics and drug delivery systems. Several experimental studies have reported the behavior of the liquid marbles. However, the complete behavior of liquid marbles under various environmental conditions is yet to be understood. Modeling techniques can be used to predict the properties and the behavior of the liquid marbles effectively and efficiently. A robust liquid marble model will inspire new experiments and provide new insights. This paper presents a novel numerical modeling technique to predict the morphology of magnetic liquid marbles based on coarse grained molecular dynamics concepts. The proposed model is employed to predict the changes in height of a magnetic liquid marble against its width and compared with the experimental data. The model predictions agree well with the experimental findings. Subsequently, the relationship between the morphology of a liquid marble with the properties of the liquid is investigated. Furthermore, the developed model is capable of simulating the reversible process of opening and closing of the magnetic liquid marble under the action of a magnetic force. The scaling analysis shows that the model predictions are consistent with the scaling laws. Finally, the proposed model is used to assess the compressibility of the liquid marbles. The proposed modeling approach has the potential to be a powerful tool to predict the behavior of magnetic liquid marbles serving as bioreactors.
Application of Functional Use Predictions to Aid in Structure ...
Humans are potentially exposed to thousands of anthropogenic chemicals in commerce. Recent work has shown that the bulk of this exposure may occur in near-field indoor environments (e.g., home, school, work, etc.). Advances in suspect screening analyses (SSA) now allow an improved understanding of the chemicals present in these environments. However, due to the nature of suspect screening techniques, investigators are often left with chemical formula predictions, with the possibility of many chemical structures matching to each formula. Here, newly developed quantitative structure-use relationship (QSUR) models are used to identify potential exposure sources for candidate structures. Previously, a suspect screening workflow was introduced and applied to house dust samples collected from the U.S. Department of Housing and Urban Development’s American Healthy Homes Survey (AHHS) [Rager, et al., Env. Int. 88 (2016)]. This workflow utilized the US EPA’s Distributed Structure-Searchable Toxicity (DSSTox) Database to link identified molecular features to molecular formulas, and ultimately chemical structures. Multiple QSUR models were applied to support the evaluation of candidate structures. These QSURs predict the likelihood of a chemical having a functional use commonly associated with consumer products having near-field use. For 3,228 structures identified as possible chemicals in AHHS house dust samples, we were able to obtain the required descriptors to appl
Pérez-Garrido, Alfonso; Helguera, Aliuska Morales; Rodríguez, Francisco Girón; Cordeiro, M Natália D S
2010-05-01
The purpose of this study is to develop a quantitative structure-activity relationship (QSAR) model that can distinguish mutagenic from non-mutagenic species with alpha,beta-unsaturated carbonyl moiety using two endpoints for this activity - Ames test and mammalian cell gene mutation test - and also to gather information about the molecular features that most contribute to eliminate the mutagenic effects of these chemicals. Two data sets were used for modeling the two mutagenicity endpoints: (1) Ames test and (2) mammalian cells mutagenesis. The first one comprised 220 molecules, while the second one 48 substances, ranging from acrylates, methacrylates to alpha,beta-unsaturated carbonyl compounds. The QSAR models were developed by applying linear discriminant analysis (LDA) along with different sets of descriptors computed using the DRAGON software. For both endpoints, there was a concordance of 89% in the prediction and 97% confidentiality by combining the three models for the Ames test mutagenicity. We have also identified several structural alerts to assist the design of new monomers. These individual models and especially their combination are attractive from the point of view of molecular modeling and could be used for the prediction and design of new monomers that do not pose a human health risk. 2010 Academy of Dental Materials. Published by Elsevier Ltd. All rights reserved.
Molecular markers in pediatric neuro-oncology
Ichimura, Koichi; Nishikawa, Ryo; Matsutani, Masao
2012-01-01
Pediatric molecular neuro-oncology is a fast developing field. A multitude of molecular profiling studies in recent years has unveiled a number of genetic abnormalities unique to pediatric brain tumors. It has now become clear that brain tumors that arise in children have distinct pathogenesis and biology, compared with their adult counterparts, even for those with indistinguishable histopathology. Some of the molecular features are so specific to a particular type of tumors, such as the presence of the KIAA1549-BRAF fusion gene for pilocytic astrocytomas or SMARCB1 mutations for atypical teratoid/rhabdoid tumors, that they could practically serve as a diagnostic marker on their own. Expression profiling has resolved the existence of 4 molecular subgroups in medulloblastomas, which positively translated into improved prognostication for the patients. The currently available molecular markers, however, do not cover all tumors even within a single tumor entity. The molecular pathogenesis of a large number of pediatric brain tumors is still unaccounted for, and the hierarchy of tumors is likely to be more complex and intricate than currently acknowledged. One of the main tasks of future molecular analyses in pediatric neuro-oncology, including the ongoing genome sequencing efforts, is to elucidate the biological basis of those orphan tumors. The ultimate goal of molecular diagnostics is to accurately predict the clinical and biological behavior of any tumor by means of their molecular characteristics, which is hoped to eventually pave the way for individualized treatment. PMID:23095836
Compact structure and non-Gaussian dynamics of ring polymer melts.
Brás, Ana R; Goossen, Sebastian; Krutyeva, Margarita; Radulescu, Aurel; Farago, Bela; Allgaier, Jürgen; Pyckhout-Hintzen, Wim; Wischnewski, Andreas; Richter, Dieter
2014-05-28
We present a neutron scattering analysis of the structure and dynamics of PEO polymer rings with a molecular weight 2.5 times higher than the entanglement mass. The melt structure was found to be more compact than a Gaussian model would suggest. With increasing time the center of mass (c.o.m.) diffusion undergoes a transition from sub-diffusive to diffusive behavior. The transition time agrees well with the decorrelation time predicted by a mode coupling approach. As a novel feature well pronounced non-Gaussian behavior of the c.o.m. diffusion was found that shows surprising analogies to the cage effect known from glassy systems. Finally, the longest wavelength Rouse modes are suppressed possibly as a consequence of an onset of lattice animal features as hypothesized in theoretical approaches.
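Deviations from Gaussian diffusion are commonly quantified with the non-Gaussian parameter α2 = 3<Δr⁴> / (5<Δr²>²) − 1, which is zero for Gaussian displacement statistics in three dimensions. The sketch below evaluates it from a list of displacement vectors, as one would have from a simulation trajectory; this is an assumption for illustration, since the study itself infers the behavior from neutron scattering data, and the function name is hypothetical.

```python
import random


def non_gaussian_parameter(displacements):
    """alpha_2 = 3 <r^4> / (5 <r^2>^2) - 1 for three-dimensional displacement vectors.

    displacements: iterable of (dx, dy, dz) tuples taken over the same time interval.
    """
    squared = [dx * dx + dy * dy + dz * dz for dx, dy, dz in displacements]
    if not squared:
        raise ValueError("need at least one displacement")
    mean_r2 = sum(squared) / len(squared)
    if mean_r2 == 0.0:
        raise ValueError("all displacements are zero")
    mean_r4 = sum(v * v for v in squared) / len(squared)
    return 3.0 * mean_r4 / (5.0 * mean_r2 * mean_r2) - 1.0


# Gaussian displacements should give a value close to zero (up to sampling noise).
random.seed(0)
gaussian = [(random.gauss(0, 1), random.gauss(0, 1), random.gauss(0, 1))
            for _ in range(50000)]
print(non_gaussian_parameter(gaussian))
```

A clearly positive value at intermediate times, decaying toward zero at long times, is the signature usually associated with cage-like dynamics.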
Relational Network for Knowledge Discovery through Heterogeneous Biomedical and Clinical Features
Chen, Huaidong; Chen, Wei; Liu, Chenglin; Zhang, Le; Su, Jing; Zhou, Xiaobo
2016-01-01
Biomedical big data, as a whole, covers numerous features, while each dataset specifically delineates part of them. “Full feature spectrum” knowledge discovery across heterogeneous data sources remains a major challenge. We developed a method called bootstrapping for unified feature association measurement (BUFAM) for pairwise association analysis, and relational dependency network (RDN) modeling for global module detection on features across breast cancer cohorts. Discovered knowledge was cross-validated using data from Wake Forest Baptist Medical Center’s electronic medical records and annotated with BioCarta signaling signatures. The clinical potential of the discovered modules was exhibited by stratifying patients for drug responses. A series of discovered associations provided new insights into breast cancer, such as the effects of patient’s cultural background on preferences for surgical procedure. We also discovered two groups of highly associated features, the HER2 and the ER modules, each of which described how phenotypes were associated with molecular signatures, diagnostic features, and clinical decisions. The discovered “ER module”, which was dominated by cancer immunity, was used as an example for patient stratification and prediction of drug responses to tamoxifen and chemotherapy. BUFAM-derived RDN modeling demonstrated unique ability to discover clinically meaningful and actionable knowledge across highly heterogeneous biomedical big data sets. PMID:27427091
Interfacial charge transfer absorption: Application to metal molecule assemblies
NASA Astrophysics Data System (ADS)
Creutz, Carol; Brunschwig, Bruce S.; Sutin, Norman
2006-05-01
Optically induced charge transfer between adsorbed molecules and a metal electrode was predicted by Hush to lead to new electronic absorption features, but has been only rarely observed experimentally. Interfacial charge transfer absorption (IFCTA) provides information concerning the barriers to charge transfer between molecules and the metal/semiconductor and the magnitude of the electronic coupling and could thus provide a powerful tool for understanding interfacial charge-transfer kinetics. Here, we utilize a previously published model [C. Creutz, B.S. Brunschwig, N. Sutin, J. Phys. Chem. B 109 (2005) 10251] to predict IFCTA spectra of metal-molecule assemblies and compare the literature observations to these predictions. We conclude that, in general, the electronic coupling between molecular adsorbates and the metal levels is so small that IFCTA is not detectable. However, few experiments designed to detect IFCTA have been done. We suggest approaches to optimizing the conditions for observing the process.
Defining a Cancer Dependency Map.
Tsherniak, Aviad; Vazquez, Francisca; Montgomery, Phil G; Weir, Barbara A; Kryukov, Gregory; Cowley, Glenn S; Gill, Stanley; Harrington, William F; Pantel, Sasha; Krill-Burger, John M; Meyers, Robin M; Ali, Levi; Goodale, Amy; Lee, Yenarae; Jiang, Guozhi; Hsiao, Jessica; Gerath, William F J; Howell, Sara; Merkel, Erin; Ghandi, Mahmoud; Garraway, Levi A; Root, David E; Golub, Todd R; Boehm, Jesse S; Hahn, William C
2017-07-27
Most human epithelial tumors harbor numerous alterations, making it difficult to predict which genes are required for tumor survival. To systematically identify cancer dependencies, we analyzed 501 genome-scale loss-of-function screens performed in diverse human cancer cell lines. We developed DEMETER, an analytical framework that segregates on- from off-target effects of RNAi. 769 genes were differentially required in subsets of these cell lines at a threshold of six SDs from the mean. We found predictive models for 426 dependencies (55%) by nonlinear regression modeling considering 66,646 molecular features. Many dependencies fall into a limited number of classes, and unexpectedly, in 82% of models, the top biomarkers were expression based. We demonstrated the basis behind one such predictive model linking hypermethylation of the UBB ubiquitin gene to a dependency on UBC. Together, these observations provide a foundation for a cancer dependency map that facilitates the prioritization of therapeutic targets. Copyright © 2017 Elsevier Inc. All rights reserved.
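The two numerical statements in this abstract (a six-SD threshold, and 426 of 769 dependencies) can be illustrated with a short sketch. This is not the DEMETER procedure, which the abstract does not describe; the function name `differentially_required`, the use of the population standard deviation, and the toy scores are assumptions made purely for illustration.

```python
from statistics import mean, pstdev

def differentially_required(scores, n_sd=6.0):
    """Return indices of cell lines whose score lies at least n_sd SDs from the mean.

    `scores` is a hypothetical list of per-cell-line dependency scores for one gene.
    """
    mu = mean(scores)
    sd = pstdev(scores)  # population SD; an assumption, the abstract does not say which
    if sd == 0:
        return []
    return [i for i, s in enumerate(scores) if abs(s - mu) >= n_sd * sd]

# Toy data (not from the study): 100 unaffected lines and one strongly dependent line.
scores = [0.0] * 100 + [-10.0]
hits = differentially_required(scores)

# Share of the 769 differential dependencies that had a predictive model.
fraction = 426 / 769
print(hits, f"{fraction:.1%}")
```

With a population SD, a single outlier among n values can be at most sqrt(n - 1) SDs from the mean, so a six-SD cut-off is only meaningful for panels of several dozen cell lines or more.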
Sonographic features of invasive ductal breast carcinomas predictive of malignancy grade.
Gupta, Kanika; Kumaresan, Meenakshisundaram; Venkatesan, Bhuvaneswari; Chandra, Tushar; Patil, Aruna; Menon, Maya
2018-01-01
Assessment of individual sonographic features provides vital clues about the biological behavior of breast masses and can assist in determining histological grade of malignancy and thereby prognosis. The aim was to assess individual sonographic features of biopsy-proven invasive ductal breast carcinomas as predictors of malignancy grade. This was a retrospective analysis of the sonographic findings of 103 biopsy-proven invasive ductal breast carcinomas. Tumor characteristics on gray-scale ultrasound and color flow were assessed using the American College of Radiology (ACR) Breast Imaging Reporting and Data System (BI-RADS) Atlas, Fifth Edition. The sonographic findings of the masses were individually correlated with their histopathologic grades. Statistical analysis used the chi-square test, ordinal regression, and the Goodman and Kruskal tau test. A breast mass showing reversal or lack of diastolic flow has a high probability of being a histologically high-grade tumor (β 1.566, P 0.0001). Masses with an abrupt interface boundary are more likely to be grade 3 (β 1.524, P 0.001) than masses with echogenic halos. Suspicious calcifications present in and outside the mass are associated with histologically high-grade tumors. Invasive ductal carcinomas (IDCs) with complex solid and cystic echotexture are more likely to be of high histological grade (β 1.146, P 0.04) than masses with hypoechoic echotexture. Certain ultrasound features are associated with tumor grade on histopathology. If the radiologist is cognizant of these sonographic features, ultrasound can be a potent modality for predicting the histopathological grade of IDCs of the breast, especially in settings where advanced tests such as receptor and molecular analyses are limited.
Figarella-Branger, Dominique; Labrousse, François; Mohktari, Karima
2012-10-01
Pathological diagnosis plays a major role in the therapeutic management of adult diffuse gliomas. It is based on the histopathological analysis of a representative specimen. Pathologists should therefore be aware of the neuroradiological features of the lesions. Pathologists also play a major role in the management of biological resources. They should classify adult gliomas according to the WHO 2007 classification (histological subtype and grade). In addition, to provide the histomolecular classification of adult gliomas, molecular markers with diagnostic, prognostic, or predictive value for therapeutic response must be sought using appropriate and validated immunohistochemical and molecular techniques. In all diffuse gliomas, whatever their grade, testing for IDH1 R132H and P53 expression is required. Testing for IDH1 minor mutations and IDH2 mutations is required in grade II and III IDH1 R132H-negative gliomas, whereas 1p19q codeletion should be sought in grade II and III gliomas with an oligodendroglial component. Testing for EGFR amplification and MGMT promoter methylation is recommended. It is strongly recommended to complete the standardized form for pathology and molecular features (validated by the French Society of Neuropathology) for all adult diffuse gliomas. Copyright © 2012. Published by Elsevier Masson SAS.
Lam, Lun Tak; Sun, Yi; Davey, Neil; Adams, Rod; Prapopoulou, Maria; Brown, Marc B; Moss, Gary P
2010-06-01
The aim was to employ Gaussian processes to assess mathematically the nature of a skin permeability dataset and to employ these methods, particularly feature selection, to determine the key physicochemical descriptors which exert the most significant influence on percutaneous absorption, and to compare such models with established existing models. Gaussian processes, including automatic relevance detection (GPRARD) methods, were employed to develop models of percutaneous absorption that identified key physicochemical descriptors of percutaneous absorption. Using MatLab software, the statistical performance of these models was compared with single linear networks (SLN) and quantitative structure-permeability relationships (QSPRs). Feature selection methods were used to examine in more detail the physicochemical parameters used in this study. A range of statistical measures to determine model quality were used. The inherently nonlinear nature of the skin data set was confirmed. The Gaussian process regression (GPR) methods yielded predictive models that offered statistically significant improvements over SLN and QSPR models with regard to predictivity (where the rank order was: GPR > SLN > QSPR). Feature selection analysis determined that the best GPR models were those that contained log P, melting point and the number of hydrogen bond donor groups as significant descriptors. Further statistical analysis also found that great synergy existed between certain parameters. It suggested that a number of the descriptors employed were effectively interchangeable, thus questioning the use of models where discrete variables are output, usually in the form of an equation. The use of a nonlinear GPR method produced models with significantly improved predictivity, compared with SLN or QSPR models. Feature selection methods were able to provide important mechanistic information. 
However, it was also shown that significant synergy existed between certain parameters, and as such it was possible to interchange certain descriptors (i.e. molecular weight and melting point) without incurring a loss of model quality. Such synergy suggested that a model constructed from discrete terms in an equation may not be the most appropriate way of representing mechanistic understandings of skin absorption.
Cao, Qi; Leung, K M
2014-09-22
Reliable computer models for the prediction of chemical biodegradability from molecular descriptors and fingerprints are very important for making health and environmental decisions. Coupling of the differential evolution (DE) algorithm with the support vector classifier (SVC) in order to optimize the main parameters of the classifier resulted in an improved classifier called the DE-SVC, which is introduced in this paper for use in chemical biodegradability studies. The DE-SVC was applied to predict the biodegradation of chemicals on the basis of extensive sample data sets and known structural features of molecules. Our optimization experiments showed that DE can efficiently find the proper parameters of the SVC. The resulting classifier possesses strong robustness and reliability compared with grid search, genetic algorithm, and particle swarm optimization methods. The classification experiments conducted here showed that the DE-SVC exhibits better classification performance than models previously used for such studies. It is a more effective and efficient prediction model for chemical biodegradability.
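The idea of letting differential evolution search a classifier's parameter space can be sketched in a few lines. The code below is a generic DE/rand/1/bin loop written with the standard library only; it is not the authors' DE-SVC. The objective `toy_error` is a hypothetical stand-in for a cross-validated SVC error over (log10 C, log10 gamma), and the population size, F, CR, and bounds are illustrative assumptions.

```python
import random

def differential_evolution(objective, bounds, pop_size=20, f=0.8, cr=0.9,
                           generations=100, seed=0):
    """Minimise `objective` over the box `bounds` with the DE/rand/1/bin scheme."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fitness = [objective(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated coordinate
            trial = []
            for j, (lo, hi) in enumerate(bounds):
                if rng.random() < cr or j == j_rand:
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    v = min(max(v, lo), hi)  # clip to the search box
                else:
                    v = pop[i][j]
                trial.append(v)
            trial_fit = objective(trial)
            if trial_fit <= fitness[i]:  # greedy one-to-one selection
                pop[i], fitness[i] = trial, trial_fit
    best = min(range(pop_size), key=fitness.__getitem__)
    return pop[best], fitness[best]

def toy_error(params):
    """Hypothetical error surface with its minimum at log10 C = 1, log10 gamma = -2."""
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2

best_params, best_error = differential_evolution(
    toy_error, bounds=[(-3.0, 3.0), (-5.0, 1.0)]
)
```

In a real setting the objective would train and cross-validate a classifier for each candidate, which makes every evaluation expensive; that cost is the usual reason to prefer a population-based search over an exhaustive grid.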
Vibrational spectroscopic study of fluticasone propionate
NASA Astrophysics Data System (ADS)
Ali, H. R. H.; Edwards, H. G. M.; Kendrick, J.; Scowen, I. J.
2009-03-01
Fluticasone propionate is a synthetic glucocorticoid with potent anti-inflammatory activity that has been used effectively in the treatment of chronic asthma. The present work reports a vibrational spectroscopic study of fluticasone propionate and gives proposed molecular assignments on the basis of ab initio calculations using BLYP density functional theory with a 6-31G* basis set and vibrational frequencies predicted within the quasi-harmonic approximation. Several spectral features and band intensities are explained. This study generated a library of information that can be employed to aid the process monitoring of fluticasone propionate.
NASA Astrophysics Data System (ADS)
Green, M. A.; Teubner, P. J. O.; Brunger, M. J.; Cartwright, D. C.; Campbell, L.
2001-03-01
We report integral cross sections (ICSs) for electron impact excitation of the sum ($c\,^1\Sigma_u^- + A'\,^3\Delta_u + A\,^3\Sigma_u^+$) of the three states that constitute the Herzberg pseudocontinuum in O2. These ICSs were measured at seven incident electron energies in the range 9-20 eV in order to test for the existence of the strong resonance feature predicted by earlier R-matrix calculations. No such structure was observed in this letter.
Colorectal tumors: the histology report.
Lanza, Giovanni; Messerini, Luca; Gafà, Roberta; Risio, Mauro
2011-03-01
Epithelial colorectal tumors are common pathologic entities. Their histology report should be comprehensive of a series of pathologic parameters essential for the correct clinical management of the patients. Diagnostic histologic criteria of adenomatous, serrated, inflammatory, and hamartomatous polyps and of polyposis syndromes are discussed. In addition, the pathologic features of early and advanced colorectal cancer are described and a checklist is given. Finally, molecular prognostic and predictive factors currently employed in the treatment of colorectal cancer are discussed. Copyright © 2011 Editrice Gastroenterologica Italiana S.r.l. Published by Elsevier Ltd.. All rights reserved.
NASA Astrophysics Data System (ADS)
Mendoza-Figueroa, Humberto; Martínez-Gudiño, Gelacio; Villanueva-Luna, Jorge E.; Trujillo-Serrato, Joel J.; Morales-Ríos, Martha S.
2017-04-01
In this work, 2-(N-acylaminoalkyl)indoles 1a-1d, which incorporate a pMeOBn group at the 3-position of the indole ring, were virtually screened as potential melatoninergic ligands by an analog-based design study using pharmacophore modeling. Pharmacophore models for melatoninergic agonist and antagonist activity were developed in order to identify the molecular constraints that define the geometric relationship among chemical features in each model. The best hypothesis consisted of six features for agonists and eight features for antagonists. The models suggest that the agonists and antagonists can share the same 3D arrangement for the six common pharmacophoric elements identified: two hydrogen bond acceptors (HBA), one hydrogen bond donor (HBD), one hydrophobic area (H), and two aromatic rings (AR). The extra hydrophobic interaction might be used as a criterion for identifying the pharmacological antagonist profile. Based on the pharmacophore fit, it was found that structures 1c and 1d show a good structural overlay that meets the requirements for the antagonistic pharmacophore hypothesis. Molecular modeling studies using the PCM solvation model predicted that the most stable conformers of 1a-1d match the antagonist pharmacophore hypothesis, in contrast to those in the gas phase. Structures 1a-1c were synthesized, but their activities were not tested.
Comparison of transcriptomic signature of post-Chernobyl and postradiotherapy thyroid tumors.
Ory, Catherine; Ugolin, Nicolas; Hofman, Paul; Schlumberger, Martin; Likhtarev, Illya A; Chevillard, Sylvie
2013-11-01
We previously identified two highly discriminating and predictive radiation-induced transcriptomic signatures by comparing series of sporadic and postradiotherapy thyroid tumors (322-gene signature), and by reanalyzing a previously published data set of sporadic and post-Chernobyl thyroid tumors (106-gene signature). The aim of the present work was (i) to compare the two signatures in terms of gene expression deregulations and molecular features/pathways, and (ii) to test the capacity of the postradiotherapy signature in classifying the post-Chernobyl series of tumors and reciprocally of the post-Chernobyl signature in classifying the postradiotherapy-induced tumors. We now explored if postradiotherapy and post-Chernobyl papillary thyroid carcinomas (PTC) display common molecular features by comparing molecular pathways deregulated in the two tumor series, and tested the potential of gene subsets of the postradiotherapy signature to classify the post-Chernobyl series (14 sporadic and 12 post-Chernobyl PTC), and reciprocally of gene subsets of the post-Chernobyl signature to classify the postradiotherapy series (15 sporadic and 12 postradiotherapy PTC), by using conventional principal component analysis. We found that the five genes common to the two signatures classified the learning/training tumors (used to search these signatures) of both the postradiotherapy (seven PTC) and the post-Chernobyl (six PTC) thyroid tumor series as compared with the sporadic tumors (seven sporadic PTC in each series). Importantly, these five genes were also effective for classifying independent series of postradiotherapy (five PTC) and post-Chernobyl (six PTC) tumors compared to independent series of sporadic tumors (eight PTC and six PTC respectively; testing tumors). Moreover, part of each postradiotherapy (32 genes) and post-Chernobyl signature (16 genes) cross-classified the respective series of thyroid tumors. 
Finally, several molecular pathways deregulated in post-Chernobyl tumors matched those found to be deregulated in postradiotherapy tumors. Overall, our data suggest that thyroid tumors that developed following either external exposure or internal (131)I contamination shared common molecular features, related to DNA repair, oxidative and endoplasmic reticulum stresses, allowing their classification as radiation-induced tumors in comparison with sporadic counterparts, independently of doses and dose rates, which suggests there may be a "general" radiation-induced signature of thyroid tumors.
Sparse regressions for predicting and interpreting subcellular localization of multi-label proteins.
Wan, Shibiao; Mak, Man-Wai; Kung, Sun-Yuan
2016-02-24
Predicting protein subcellular localization is indispensable for inferring protein functions. Recent studies have been focusing on predicting not only single-location proteins, but also multi-location proteins. Almost all of the high performing predictors proposed recently use gene ontology (GO) terms to construct feature vectors for classification. Despite their high performance, their prediction decisions are difficult to interpret because of the large number of GO terms involved. This paper proposes using sparse regressions to exploit GO information for both predicting and interpreting subcellular localization of single- and multi-location proteins. Specifically, we compared two multi-label sparse regression algorithms, namely multi-label LASSO (mLASSO) and multi-label elastic net (mEN), for large-scale predictions of protein subcellular localization. Both algorithms can yield sparse and interpretable solutions. By using the one-vs-rest strategy, mLASSO and mEN identified 87 and 429 out of more than 8,000 GO terms, respectively, which play essential roles in determining subcellular localization. More interestingly, many of the GO terms selected by mEN are from the biological process and molecular function categories, suggesting that the GO terms of these categories also play vital roles in the prediction. With these essential GO terms, not only where a protein locates can be decided, but also why it resides there can be revealed. Experimental results show that the output of both mEN and mLASSO are interpretable and they perform significantly better than existing state-of-the-art predictors. Moreover, mEN selects more features and performs better than mLASSO on a stringent human benchmark dataset. For readers' convenience, an online server called SpaPredictor for both mLASSO and mEN is available at http://bioinfo.eie.polyu.edu.hk/SpaPredictorServer/.
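Why LASSO-type penalties yield sparse, interpretable feature sets can be seen from the coordinate-wise update used in standard coordinate-descent solvers. The sketch below shows that textbook update for a single standardized feature; it is not the mLASSO or mEN implementation from the paper, and the function names and penalty values are illustrative assumptions.

```python
def soft_threshold(z, lam):
    """Soft-thresholding operator: shrink z toward zero and clip to exactly 0 inside [-lam, lam]."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_update(z, lam):
    """Coordinate update for LASSO, where z is the univariate least-squares coefficient."""
    return soft_threshold(z, lam)

def elastic_net_update(z, lam1, lam2):
    """Coordinate update for the elastic net: L1 soft-thresholding followed by L2 shrinkage."""
    return soft_threshold(z, lam1) / (1.0 + lam2)

# A weak feature is removed entirely; a strong one is kept but shrunk.
weak = lasso_update(0.3, 0.5)
strong = lasso_update(1.5, 0.5)
strong_en = elastic_net_update(1.5, 0.5, 1.0)
print(weak, strong, strong_en)
```

The L1 term is what sets coefficients to exactly zero, which is how a few dozen or a few hundred GO terms can be singled out from thousands. The additional L2 term in the elastic net shrinks the surviving coefficients and tends to retain groups of correlated features, which is consistent with mEN selecting more terms than mLASSO.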
Pharmacophore modeling, virtual screening and molecular docking of ATPase inhibitors of HSP70.
Sangeetha, K; Sasikala, R P; Meena, K S
2017-10-01
Heat shock protein 70 is an effective anticancer target as it influences many signaling pathways. Hence, this study investigated the important pharmacophore features required for ATPase inhibitors of HSP70 by generating a ligand-based pharmacophore model, followed by virtual screening and subsequent validation by molecular docking in Discovery Studio V4.0. The most extrapolative pharmacophore model (hypothesis 8) consisted of four hydrogen bond acceptors. Further validation by external test set prediction identified 200 hits from Mini Maybridge, Drug Diverse, SCPDB compounds and phytochemicals. Consequently, the screened compounds were refined by the rule of five, ADMET and molecular docking to retain the best competitive hits. Finally, the phytochemical compounds Muricatetrocin B, Diacetylphiladelphicalactone C, Eleutheroside B and 5-(3-{[1-(benzylsulfonyl)piperidin-4-yl]amino}phenyl)-4-bromo-3-(carboxymethoxy)thiophene-2-carboxylic acid were obtained as leads to inhibit the ATPase activity of HSP70 and can thus be proposed for further in vitro and in vivo evaluation. Copyright © 2017 Elsevier Ltd. All rights reserved.
Molecular origin of the vibrational structure of ice I h
Moberg, Daniel R.; Straight, Shelby C.; Knight, Christopher; ...
2017-05-25
Here, an unambiguous assignment of the vibrational spectra of ice I h remains a matter of debate. This study demonstrates that an accurate representation of many-body interactions between water molecules, combined with an explicit treatment of nuclear quantum effects through many-body molecular dynamics (MB-MD), leads to a unified interpretation of the vibrational spectra of ice I h in terms of the structure and dynamics of the underlying hydrogen-bond network. All features of the infrared and Raman spectra in the OH stretching region can be unambiguously assigned by taking into account both the symmetry and the delocalized nature of the lattice vibrations as well as the local electrostatic environment experienced by each water molecule within the crystal. The high level of agreement with experiment raises prospects for predictive MB-MD simulations that, complementing analogous measurements, will provide molecular-level insights into fundamental processes taking place in bulk ice and on ice surfaces under different thermodynamic conditions.
Isocitrate dehydrogenase-mutant glioma: Evolving clinical and therapeutic implications.
Miller, Julie J; Shih, Helen A; Andronesi, Ovidiu C; Cahill, Daniel P
2017-12-01
The metabolic genes isocitrate dehydrogenase 1 (IDH1) and IDH2 are commonly mutated in low-grade glioma and in a subset of glioblastoma. These mutations co-occur with other recurrent molecular alterations, including 1p/19q codeletions and tumor suppressor protein 53 (TP53) and alpha thalassemia/mental retardation (ATRX) mutations, which together help to define a molecular signature that aids in the classification of gliomas and helps to better predict clinical behavior. A confluence of research suggests that glioma development in IDH-mutant and IDH wild-type tumors is driven by different oncogenic processes and responds differently to current treatment paradigms. Herein, the authors discuss the discovery of IDH mutations and associated molecular alterations in glioma, review clinical features common to patients with IDH-mutant glioma, and highlight current understanding of IDH mutation-driven gliomagenesis with implications for emerging treatment strategies. Cancer 2017;123:4535-4546. © 2017 American Cancer Society.
VarMod: modelling the functional effects of non-synonymous variants.
Pappalardo, Morena; Wass, Mark N
2014-07-01
Unravelling the genotype-phenotype relationship in humans remains a challenging task in genomics studies. Recent advances in sequencing technologies mean there are now thousands of sequenced human genomes, revealing millions of single nucleotide variants (SNVs). For non-synonymous SNVs present in proteins the difficulties of the problem lie in first identifying those nsSNVs that result in a functional change in the protein among the many non-functional variants and in turn linking this functional change to phenotype. Here we present VarMod (Variant Modeller) a method that utilises both protein sequence and structural features to predict nsSNVs that alter protein function. VarMod develops recent observations that functional nsSNVs are enriched at protein-protein interfaces and protein-ligand binding sites and uses these characteristics to make predictions. In benchmarking on a set of nearly 3000 nsSNVs VarMod performance is comparable to an existing state of the art method. The VarMod web server provides extensive resources to investigate the sequence and structural features associated with the predictions including visualisation of protein models and complexes via an interactive JSmol molecular viewer. VarMod is available for use at http://www.wasslab.org/varmod. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Lefkowith, J B; Di Valerio, R; Norris, J; Glick, G D; Alexander, A L; Jackson, L; Gilkeson, G S
1996-08-01
We recently produced a panel of seven glomerular-binding mAbs from a nephritic MRL-lpr mouse that bind to histones/nucleosomes (group I) or DNA (group II) adherent to glomerular basement membrane. To elucidate the molecular basis of their binding and ontogeny, we sequenced their variable (V) regions, analyzed the apparent somatic mutations, and predicted their three-dimensional structures. There were two clonally related sets (3 of 4 in group I, 3 of 3 in group II), both of the VHJ1558 family, and one mAb of the VH 7183 family. V region somatic mutations within clonally related sets had little effect on glomerular binding and did not appear to be selected for based on glomerular binding. The VH regions were most homologous with those from autoantibodies to histones, DNA, or IgG (i.e., rheumatoid factors); the Vkappa regions, with those from autoantibodies to small nuclear ribonucleoproteins (snRNP). The VH regions also exhibited an unusual VD junction (in the group I clonally related set) and an overall high content of charged amino acids (arginine, aspartic acid) in complementarity-determining regions (CDRs), particularly in CDR3. Molecular modeling studies suggested that the Fv regions of these mAbs converge to form a flat, open surface with a net positive charge. The CDR arginines in group I mAbs appear to be located in Ag contact regions of the binding cleft. In sum, these data suggest that glomerulotropic mAbs are a highly restricted set of Abs with distinctive molecular features that may mediate their binding to glomeruli.
Ali, Mehreen; Khan, Suleiman A; Wennerberg, Krister; Aittokallio, Tero
2018-04-15
Proteomics profiling is increasingly being used for molecular stratification of cancer patients and cell-line panels. However, systematic assessment of the predictive power of large-scale proteomic technologies across various drug classes and cancer types is currently lacking. To that end, we carried out the first pan-cancer, multi-omics comparative analysis of the relative performance of two proteomic technologies, targeted reverse phase protein array (RPPA) and global mass spectrometry (MS), in terms of their accuracy for predicting the sensitivity of cancer cells to both cytotoxic chemotherapeutics and molecularly targeted anticancer compounds. Our results in two cell-line panels demonstrate how MS profiling improves drug response predictions beyond that of the RPPA or the other omics profiles when used alone. However, frequent missing MS data values complicate its use in predictive modeling and required additional filtering, such as focusing on completely measured or known oncoproteins, to obtain maximal predictive performance. Rather strikingly, the two proteomics profiles provided complementary predictive signal both for the cytotoxic and targeted compounds. Further, information about the cellular-abundance of primary target proteins was found critical for predicting the response of targeted compounds, although the non-target features also contributed significantly to the predictive power. The clinical relevance of the selected protein markers was confirmed in cancer patient data. These results provide novel insights into the relative performance and optimal use of the widely applied proteomic technologies, MS and RPPA, which should prove useful in translational applications, such as defining the best combination of omics technologies and marker panels for understanding and predicting drug sensitivities in cancer patients. Processed datasets, R as well as Matlab implementations of the methods are available at https://github.com/mehr-een/bemkl-rbps. 
Contact: mehreen.ali@helsinki.fi or tero.aittokallio@fimm.fi. Supplementary data are available at Bioinformatics online.
Solvent-accessible surface area: How well can be applied to hot-spot detection?
Martins, João M; Ramos, Rui M; Pimenta, António C; Moreira, Irina S
2014-03-01
A detailed comprehension of protein-based interfaces is essential for the rational drug development. One of the key features of these interfaces is their solvent accessible surface area profile. With that in mind, we tested a group of 12 SASA-based features for their ability to correlate and differentiate hot- and null-spots. These were tested in three different data sets, explicit water MD, implicit water MD, and static PDB structure. We found no discernible improvement with the use of more comprehensive data sets obtained from molecular dynamics. The features tested were shown to be capable of discerning between hot- and null-spots, while presenting low correlations. Residue standardization such as rel SASAi or rel/res SASAi , improved the features as a tool to predict ΔΔGbinding values. A new method using support machine learning algorithms was developed: SBHD (Sasa-Based Hot-spot Detection). This method presents a precision, recall, and F1 score of 0.72, 0.81, and 0.76 for the training set and 0.91, 0.73, and 0.81 for an independent test set. Copyright © 2013 Wiley Periodicals, Inc.
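The reported F1 scores can be checked against the reported precision and recall, since F1 is their harmonic mean. The short sketch below does only that arithmetic; the helper name `f1_score` is introduced here for illustration and is not part of SBHD.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Values as reported in the abstract.
train_f1 = f1_score(0.72, 0.81)  # training set
test_f1 = f1_score(0.91, 0.73)   # independent test set

print(round(train_f1, 2), round(test_f1, 2))  # prints: 0.76 0.81
```

Both results agree with the abstract to two decimal places. On the test set precision is higher and recall lower than on the training set, so the similar F1 values hide a different balance between false positives and false negatives.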
Protein-protein interaction inference based on semantic similarity of Gene Ontology terms.
Zhang, Shu-Bo; Tang, Qiang-Rong
2016-07-21
Identifying protein-protein interactions is important in molecular biology. Experimental methods for this task have their limitations, and computational approaches have attracted increasing attention from the biological community. The semantic similarity derived from Gene Ontology (GO) annotation has been regarded as one of the most powerful indicators of protein interaction. However, conventional methods based on GO similarity fail to take advantage of the specificity of GO terms in the ontology graph. We proposed a GO-based method to predict protein-protein interactions by integrating different kinds of similarity measures derived from the intrinsic structure of the GO graph. We extended five existing methods to derive semantic similarity measures from the descending part of two GO terms in the GO graph, then adopted a feature integration strategy that combines both the ascending and the descending similarity scores derived from the three sub-ontologies to construct various kinds of features to characterize each protein pair. Support vector machines (SVM) were employed as discriminative classifiers, and five-fold cross-validation experiments were conducted on both human and yeast protein-protein interaction datasets to evaluate the performance of the different kinds of integrated features. The experimental results suggest that the best performance is achieved by the feature that combines information from both the ascending and the descending parts of the three ontologies. Our method is appealing for effective prediction of protein-protein interactions. Copyright © 2016 Elsevier Ltd. All rights reserved.
Leigh, Margaret W; Ferkol, Thomas W; Davis, Stephanie D; Lee, Hye-Seung; Rosenfeld, Margaret; Dell, Sharon D; Sagel, Scott D; Milla, Carlos; Olivier, Kenneth N; Sullivan, Kelli M; Zariwala, Maimoona A; Pittman, Jessica E; Shapiro, Adam J; Carson, Johnny L; Krischer, Jeffrey; Hazucha, Milan J; Knowles, Michael R
2016-08-01
Primary ciliary dyskinesia (PCD), a genetically heterogeneous, recessive disorder of motile cilia, is associated with distinct clinical features. Diagnostic tests, including ultrastructural analysis of cilia, nasal nitric oxide measurements, and molecular testing for mutations in PCD genes, have inherent limitations. To define a statistically valid combination of systematically defined clinical features that strongly associates with PCD in children and adolescents. Investigators at seven North American sites in the Genetic Disorders of Mucociliary Clearance Consortium prospectively and systematically assessed individuals (aged 0-18 yr) referred due to high suspicion for PCD. The investigators defined specific clinical questions for the clinical report form based on expert opinion. Diagnostic testing was performed using standardized protocols and included nasal nitric oxide measurement, ciliary biopsy for ultrastructural analysis of cilia, and molecular genetic testing for PCD-associated genes. Final diagnoses were assigned as "definite PCD" (hallmark ultrastructural defects and/or two mutations in a PCD-associated gene), "probable/possible PCD" (no ultrastructural defect or genetic diagnosis, but compatible clinical features and nasal nitric oxide level in PCD range), and "other diagnosis or undefined." Criteria were developed to define early childhood clinical features on the basis of responses to multiple specific queries. Each defined feature was tested by logistic regression. Sensitivity and specificity analyses were conducted to define the most robust set of clinical features associated with PCD. From 534 participants 18 years of age and younger, 205 were identified as having "definite PCD" (including 164 with two mutations in a PCD-associated gene), 187 were categorized as "other diagnosis or undefined," and 142 were defined as having "probable/possible PCD." Participants with "definite PCD" were compared with the "other diagnosis or undefined" group. 
Four criteria-defined clinical features were statistically predictive of PCD: laterality defect; unexplained neonatal respiratory distress; early-onset, year-round nasal congestion; and early-onset, year-round wet cough (adjusted odds ratios of 7.7, 6.6, 3.4, and 3.1, respectively). The sensitivity and specificity based on the number of criteria-defined clinical features were four features, 0.21 and 0.99, respectively; three features, 0.50 and 0.96, respectively; and two features, 0.80 and 0.72, respectively. Systematically defined early clinical features could help identify children, including infants, likely to have PCD. Clinical trial registered with ClinicalTrials.gov (NCT00323167).
Impact of experimental design on PET radiomics in predicting somatic mutation status.
Yip, Stephen S F; Parmar, Chintan; Kim, John; Huynh, Elizabeth; Mak, Raymond H; Aerts, Hugo J W L
2017-12-01
PET-based radiomic features have demonstrated great promise in predicting genetic data. However, various experimental parameters can influence the feature extraction pipeline. Here, we investigated how experimental settings affect the performance of radiomic features in predicting somatic mutation status in non-small cell lung cancer (NSCLC) patients. 348 NSCLC patients with somatic mutation testing and diagnostic PET images were included in our analysis. Radiomic feature extractions were analyzed for varying voxel sizes, filters and bin widths. 66 radiomic features were evaluated. The performance of features in predicting mutation status was assessed using the area under the receiver-operating-characteristic curve (AUC). The influence of experimental parameters on feature predictability was quantified as the relative difference between the minimum and maximum AUC (δ). The large majority of features (n=56, 85%) were significantly predictive for EGFR mutation status (AUC≥0.61). 29 radiomic features significantly predicted EGFR mutations and were robust to experimental settings with δ Overall <5%. The overall influence (δ Overall ) of the voxel size, filter and bin width for all features ranged from 5% to 15%, respectively. For all features, none of the experimental designs was predictive of KRAS+ from KRAS- (AUC≤0.56). The predictability of 29 radiomic features was robust to the choice of experimental settings; however, these settings need to be carefully chosen for all other features. The combined effect of the investigated processing methods could be substantial and must be considered. Optimized settings that will maximize the predictive performance of individual radiomic features should be investigated in the future. Copyright © 2017 Elsevier B.V. All rights reserved.
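The robustness measure δ in the abstract above is described only as the relative difference between the minimum and maximum AUC. A minimal sketch of one way to compute it follows; normalising by the maximum AUC is an assumption (the abstract does not state the denominator), the 5% cut-off is taken from the abstract, and the AUC values in the example are made-up inputs, not results from the study.

```python
def relative_auc_spread(aucs):
    """Relative difference between the largest and smallest AUC, in percent.

    Assumption: the spread is normalised by the maximum AUC; the abstract
    does not say which denominator the authors used.
    """
    lowest, highest = min(aucs), max(aucs)
    return 100.0 * (highest - lowest) / highest


def is_robust(aucs, threshold_percent=5.0):
    """True if the feature's AUC varies by less than the threshold."""
    return relative_auc_spread(aucs) < threshold_percent


# Made-up AUCs of one feature under three extraction settings.
example_aucs = [0.60, 0.62, 0.63]
print(round(relative_auc_spread(example_aucs), 2), is_robust(example_aucs))
```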
Tabu search and binary particle swarm optimization for feature selection using microarray data.
Chuang, Li-Yeh; Yang, Cheng-Huei; Yang, Cheng-Hong
2009-12-01
Gene expression profiles have great potential as a medical diagnosis tool because they represent the state of a cell at the molecular level. In the classification of cancer type research, available training datasets generally have a fairly small sample size compared to the number of genes involved. This fact poses an unprecedented challenge to some classification methodologies due to training data limitations. Therefore, a good selection method for genes relevant for sample classification is needed to improve the predictive accuracy, and to avoid incomprehensibility due to the large number of genes investigated. In this article, we propose to combine tabu search (TS) and binary particle swarm optimization (BPSO) for feature selection. BPSO acts as a local optimizer each time the TS has been run for a single generation. The K-nearest neighbor method with leave-one-out cross-validation and support vector machine with one-versus-rest serve as evaluators of the TS and BPSO. The proposed method is applied and compared to the 11 classification problems taken from the literature. Experimental results show that our method simplifies features effectively and either obtains higher classification accuracy or uses fewer features compared to other feature selection methods.
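For readers unfamiliar with binary particle swarm optimization, the sketch below shows one update step of a single particle in the standard binary PSO formulation, where each bit marks whether a gene is selected. It is not the authors' implementation: the tabu search layer and the K-nearest neighbor and SVM evaluators are omitted, and the inertia, acceleration and velocity-limit values are assumed defaults.

```python
import math
import random


def bpso_step(position, velocity, personal_best, global_best, rng,
              inertia=0.9, c1=2.0, c2=2.0, v_max=6.0):
    """One binary PSO update for a single particle.

    position, personal_best and global_best are lists of 0/1 feature masks;
    velocity is a list of floats. Parameter defaults are assumptions.
    """
    new_position, new_velocity = [], []
    for x, v, p, g in zip(position, velocity, personal_best, global_best):
        # Standard velocity update pulled toward personal and global bests.
        v = inertia * v + c1 * rng.random() * (p - x) + c2 * rng.random() * (g - x)
        v = max(-v_max, min(v_max, v))          # clamp the velocity
        prob_one = 1.0 / (1.0 + math.exp(-v))   # sigmoid gives P(bit = 1)
        new_velocity.append(v)
        new_position.append(1 if rng.random() < prob_one else 0)
    return new_position, new_velocity


rng = random.Random(0)                # seeded for repeatability
mask = [1, 0, 0, 1, 0]                # hypothetical 5-gene selection mask
vel = [0.0] * len(mask)
new_mask, new_vel = bpso_step(mask, vel, [1, 1, 0, 1, 0], [0, 1, 0, 1, 1], rng)
print(new_mask)
```

In a full feature-selection loop, each new mask would be scored by a classifier (for example with leave-one-out cross-validation) and the personal and global bests updated accordingly.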
Tabei, Yasuo; Pauwels, Edouard; Stoven, Véronique; Takemoto, Kazuhiro; Yamanishi, Yoshihiro
2012-01-01
Motivation: Drug effects are mainly caused by the interactions between drug molecules and their target proteins, including primary targets and off-targets. Identification of the molecular mechanisms behind overall drug–target interactions is crucial in the drug design process. Results: We develop a classifier-based approach to identify chemogenomic features (the underlying associations between drug chemical substructures and protein domains) that are involved in drug–target interaction networks. We propose a novel algorithm for extracting informative chemogenomic features by using L1 regularized classifiers over the tensor product space of possible drug–target pairs. It is shown that the proposed method can extract a very limited number of chemogenomic features without losing performance in predicting drug–target interactions, and that the extracted features are biologically meaningful. The extracted substructure–domain association network enables us to suggest ligand chemical fragments specific for each protein domain and ligand core substructures important for a wide range of protein families. Availability: Software is available at the supplemental website. Contact: yamanishi@bioreg.kyushu-u.ac.jp Supplementary Information: Datasets and all results are available at http://cbio.ensmp.fr/~yyamanishi/l1binary/ . PMID:22962471
NASA Technical Reports Server (NTRS)
Burton, Michael G.; Moorhouse, Alan; Brand, P. W. J. L.; Roche, Patrick F.; Geballe, T. R.
1989-01-01
Images were obtained of the (fluorescent) molecular hydrogen 1-0 S(1) line, and of the 3.3 micron emission feature, in Orion's Bar and three reflection nebulae. The emission from these species appears to come from the same spatial locations in all sources observed. This suggests that the 3.3 micron feature is excited by the same energetic UV-photons which cause the molecular hydrogen to fluoresce.
Tabouret, E; Bequet, C; Denicolaï, E; Barrié, M; Nanni, I; Metellus, P; Dufour, Henri; Chinot, O; Figarella-Branger, D
2015-12-01
Pleomorphic xanthoastrocytoma (PXA) is a rare, low-grade glioma that frequently occurs in pediatric patients. To analyze adult patients diagnosed with PXA and to search for pathological and molecular markers of diagnosis and prognosis. We retrospectively included patients older than 16 years with PXA who were referred to our institution between October 2003 and September 2013. All pathological diagnoses were reviewed by a neuropathologist. Histological characteristics and immunostaining of GFAP, OLIG2, neurofilament, CD34, Ki67, p53, p16, and IDH1 R132H were analyzed. The following molecular alterations were analyzed: mutations of IDH1/2, BRAF, and histone H3.3, and EGFR amplification. Clinical data, treatment modalities, and patient outcome were recorded. We identified 16 adult patients with reviewed PXA diagnosis. Neither IDH nor histone H3.3 mutations were found; BRAF V600E mutation was recorded in six patients. Ten patients presented with anaplastic features. BRAF mutations were associated with lower Ki67, OLIG2 expression, and lack of p16 expression. Median PFS and OS were 41.5 months (95% CI: 11.4-71.6) and 71.4 months (95% CI: 15.5-127.3), respectively. BRAF mutation tended to be associated with greater PFS (p = 0.051), whereas anaplastic features were associated with minimal PFS (p = 0.042). PXA in adults may present features distinct from pediatric PXA. Anaplastic features and BRAF mutation may potentially identify specific subgroups with distinct prognoses. Copyright © 2015 Elsevier Ltd. All rights reserved.
Monte Carlo modeling of single-molecule cytoplasmic dynein.
Singh, Manoranjan P; Mallik, Roop; Gross, Steven P; Yu, Clare C
2005-08-23
Molecular motors are responsible for active transport and organization in the cell, underlying an enormous number of crucial biological processes. Dynein is more complicated in its structure and function than other motors. Recent experiments have found that, unlike other motors, dynein can take different size steps along microtubules depending on load and ATP concentration. We use Monte Carlo simulations to model the molecular motor function of cytoplasmic dynein at the single-molecule level. The theory relates dynein's enzymatic properties to its mechanical force production. Our simulations reproduce the main features of recent single-molecule experiments that found a discrete distribution of dynein step sizes, depending on load and ATP concentration. The model reproduces the large steps found experimentally under high ATP and no load by assuming that the ATP binding affinities at the secondary sites decrease as the number of ATP bound to these sites increases. Additionally, to capture the essential features of the step-size distribution at very low ATP concentration and no load, the ATP hydrolysis of the primary site must be dramatically reduced when none of the secondary sites have ATP bound to them. We make testable predictions that should guide future experiments related to dynein function.
Paluch, Andrew S; Parameswaran, Sreeja; Liu, Shuai; Kolavennu, Anasuya; Mobley, David L
2015-01-28
We present a general framework to predict the excess solubility of small molecular solids (such as pharmaceutical solids) in binary solvents via molecular simulation free energy calculations at infinite dilution with conventional molecular models. The present study used molecular dynamics with the General AMBER Force Field to predict the excess solubility of acetanilide, acetaminophen, phenacetin, benzocaine, and caffeine in binary water/ethanol solvents. The simulations are able to predict the existence of solubility enhancement and the results are in good agreement with available experimental data. The accuracy of the predictions in addition to the generality of the method suggests that molecular simulations may be a valuable design tool for solvent selection in drug development processes.
Predicted 25(OH)D score and colorectal cancer risk according to vitamin D receptor expression.
Jung, Seungyoun; Qian, Zhi Rong; Yamauchi, Mai; Bertrand, Kimberly A; Fitzgerald, Kathryn C; Inamura, Kentaro; Kim, Sun A; Mima, Kosuke; Sukawa, Yasutaka; Zhang, Xuehong; Wang, Molin; Smith-Warner, Stephanie A; Wu, Kana; Fuchs, Charles S; Chan, Andrew T; Giovannucci, Edward L; Ng, Kimmie; Cho, Eunyoung; Ogino, Shuji; Nishihara, Reiko
2014-08-01
Despite accumulating evidence for the preventive effect of vitamin D on colorectal carcinogenesis, its precise mechanisms remain unclear. We hypothesized that vitamin D was associated with a lower risk of colorectal cancer with high-level vitamin D receptor (VDR) expression, but not with risk of tumor with low-level VDR expression. Among 140,418 participants followed from 1986 through 2008 in the Nurses' Health Study and the Health Professionals' Follow-up Study, we identified 1,059 incident colorectal cancer cases with tumor molecular data. The predicted 25-hydroxyvitamin D [25(OH)D] score was developed using the known determinants of plasma 25(OH)D. We estimated the HR for cancer subtypes using the duplication method Cox proportional hazards model. A higher predicted 25(OH)D score was associated with a lower risk of colorectal cancer irrespective of VDR expression level (P(heterogeneity) for subtypes = 0.75). Multivariate HRs (95% confidence intervals) comparing the highest with the lowest quintile of predicted 25(OH)D scores were 0.48 (0.30-0.78) for VDR-negative tumor and 0.56 (0.42-0.75) for VDR-positive tumor. Similarly, the significant inverse associations of the predicted 25(OH)D score with colorectal cancer risk did not significantly differ by KRAS, BRAF, or PIK3CA status (P(heterogeneity) for subtypes ≥ 0.22). A higher predicted vitamin D score was significantly associated with a lower colorectal cancer risk, regardless of VDR status and other molecular features examined. The preventive effect of vitamin D on colorectal carcinogenesis may not totally depend on tumor factors. Host factors (such as local and systemic immunity) may need to be considered. ©2014 American Association for Cancer Research.
Behavior dynamics: One perspective
Marr, M. Jackson
1992-01-01
Behavior dynamics is a field devoted to analytic descriptions of behavior change. A principal source of both models and methods for these descriptions is found in physics. This approach is an extension of a long conceptual association between behavior analysis and physics. A theme common to both is the role of molar versus molecular events in description and prediction. Similarities and differences in how these events are treated are discussed. Two examples are presented that illustrate possible correspondence between mechanical and behavioral systems. The first demonstrates the use of a mechanical model to describe the molar properties of behavior under changing reinforcement conditions. The second, dealing with some features of concurrent schedules, focuses on the possible utility of nonlinear dynamical systems to the description of both molar and molecular behavioral events as the outcome of a deterministic, but chaotic, process. PMID:16812655
Ju, Zhe; Wang, Shi-Yun
2018-04-22
As one of the most important and common protein post-translational modifications, citrullination plays a key role in regulating various biological processes and is associated with several human diseases. The accurate identification of citrullination sites is crucial for elucidating the underlying molecular mechanisms of citrullination and designing drugs for related human diseases. In this study, a novel bioinformatics tool named CKSAAP_CitrSite is developed for the prediction of citrullination sites. The key feature of CKSAAP_CitrSite is that it uses the composition of k-spaced amino acid pairs surrounding a query site as input to a support vector machine. As illustrated by 10-fold cross-validation, CKSAAP_CitrSite achieves a satisfactory performance with a sensitivity of 77.59%, a specificity of 95.26%, an accuracy of 89.37% and a Matthews correlation coefficient of 0.7566, which is much better than that of the existing prediction method. Feature analysis shows that the N-terminal space-containing pairs may play an important role in the prediction of citrullination sites, and that arginines close to the N-terminus tend to be citrullinated. The conclusions derived from this study could offer useful information for elucidating the molecular mechanisms of citrullination and for related experimental validations. A user-friendly web server for CKSAAP_CitrSite is available at 123.206.31.171/CKSAAP_CitrSite/. Copyright © 2017. Published by Elsevier B.V.
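The "composition of k-spaced amino acid pairs" encoding mentioned in the abstract above can be sketched as follows. This is a generic illustration of the encoding, not the CKSAAP_CitrSite code: the maximum spacing `k_max`, the restriction to the 20 standard residues, and normalising by the number of valid pairs are assumptions, and the example peptide is made up.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order.
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def cksaap(window, k_max=2):
    """Composition of k-spaced amino acid pairs for k = 0..k_max.

    For each spacing k, counts residue pairs separated by k positions and
    divides by the number of valid pairs (assumption: pairs containing a
    non-standard character are skipped). Returns 400 * (k_max + 1) values.
    """
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        n_pairs = 0
        for i in range(len(window) - k - 1):
            pair = window[i] + window[i + k + 1]
            if pair in counts:
                counts[pair] += 1
                n_pairs += 1
        features.extend(counts[p] / n_pairs if n_pairs else 0.0 for p in PAIRS)
    return features


# Made-up peptide window centred on an arginine (R).
vector = cksaap("GSAKRLLAE", k_max=2)
print(len(vector))
```

The resulting fixed-length vector is the kind of input a support vector machine can consume directly.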
Fouad, Marwa A; Tolba, Enas H; El-Shal, Manal A; El Kerdawy, Ahmed M
2018-05-11
The continuing emergence of new β-lactam antibiotics creates a need for suitable analytical methods that accelerate and facilitate their analysis. A face-centered central composite experimental design was adopted, using different levels of phosphate buffer pH and of acetonitrile percentage at zero time and after 15 min in a gradient program, to obtain the optimum chromatographic conditions for the elution of 31 β-lactam antibiotics. Retention factors were used as the target property to build two QSRR models utilizing the conventional forward selection and the advanced nature-inspired firefly algorithm for descriptor selection, coupled with multiple linear regression. The obtained models showed high performance in both internal and external validation, indicating their robustness and predictive ability. The Williams-Hotelling test and Student's t-test showed that there is no statistically significant difference between the models' results. Y-randomization validation showed that the obtained models are due to significant correlation between the selected molecular descriptors and the analytes' chromatographic retention. These results indicate that the generated FS-MLR and FFA-MLR models show comparable quality on both the training and validation levels. They also gave comparable information about the molecular features that influence the retention behavior of β-lactams under the current chromatographic conditions. We can conclude that in some cases a simple conventional feature selection algorithm can be used to generate robust and predictive models comparable to those generated using advanced ones. Copyright © 2018 Elsevier B.V. All rights reserved.
Burington, Bart; Barlogie, Bart; Zhan, Fenghuang; Crowley, John; Shaughnessy, John D.
2013-01-01
Changes in global gene expression patterns in tumor cells following in vivo therapy may vary by treatment and provide added or synergistic prognostic power over pretherapy gene expression profiles (GEP). This molecular readout of drug-cell interaction may also point to mechanisms of action/resistance. In newly diagnosed patients with multiple myeloma (MM), microarray data were obtained on tumor cells prior to and 48 hours after in vivo treatment using dexamethasone (n = 45) or thalidomide (n = 42); in the case of relapsed MM, microarray data were obtained prior to (n = 36) and after (n = 19) lenalidomide administration. Dexamethasone and thalidomide induced both common and unique GEP changes in tumor cells. Combined baseline and 48-hour changes in GEP in a subset of genes, many related to oxidative stress and cytoskeletal dynamics, were predictive of outcome in newly diagnosed MM patients receiving tandem transplants. Thalidomide-altered genes also changed following lenalidomide exposure and predicted event-free and overall survival in relapsed patients receiving lenalidomide as a single agent. Combined with baseline molecular features, changes in GEP following short-term single-agent exposure may help guide treatment decisions for patients with MM. Genes whose drug-altered expression were found to be related to survival may point to molecular switches related to response and/or resistance to different classes of drugs. PMID:18676754
InterPred: A pipeline to identify and model protein-protein interactions.
Mirabello, Claudio; Wallner, Björn
2017-06-01
Protein-protein interactions (PPI) are crucial for protein function. There exist many techniques to identify PPIs experimentally, but determining the interactions in molecular detail is still difficult and very time-consuming. The fact that the number of PPIs is vastly larger than the number of individual proteins makes it practically impossible to characterize all interactions experimentally. Computational approaches that can bridge this gap and predict PPIs and model the interactions in molecular detail are greatly needed. Here we present InterPred, a fully automated pipeline that predicts and models PPIs from sequence using structural modeling combined with massive structural comparisons and molecular docking. A key component of the method is the use of a novel random forest classifier that integrates several structural features to distinguish correct from incorrect protein-protein interaction models. We show that InterPred represents a major improvement in protein-protein interaction detection, with a performance comparable to or better than experimental high-throughput techniques. We also show that our full-atom protein-protein complex modeling pipeline performs better than state-of-the-art protein docking methods on a standard benchmark set. In addition, InterPred was also one of the top predictors in the latest CAPRI37 experiment. InterPred source code can be downloaded from http://wallnerlab.org/InterPred Proteins 2017; 85:1159-1170. © 2017 Wiley Periodicals, Inc.
Inoue, Hiroshi; Okuya, Shigeru; Ohta, Yasuharu; Akiyama, Masaru; Taguchi, Akihiko; Kora, Yukari; Okayama, Naoko; Yamada, Yuichiro; Wada, Yasuhiko; Amemiya, Shin; Sugihara, Shigetaka; Nakao, Yuzo; Oka, Yoshitomo; Tanizawa, Yukio
2014-01-01
Background: Wolfram syndrome (WFS) is a recessive neurologic and endocrinologic degenerative disorder, and is also known as DIDMOAD (Diabetes Insipidus, early-onset Diabetes Mellitus, progressive Optic Atrophy and Deafness) syndrome. Most affected individuals carry recessive mutations in the Wolfram syndrome 1 gene (WFS1). However, the phenotypic pleiomorphism, rarity and molecular complexity of this disease complicate our efforts to understand WFS. To address this limitation, we aimed to describe complications and to elucidate the contributions of WFS1 mutations to clinical manifestations in Japanese patients with WFS. Methodology: The minimal ascertainment criterion for diagnosing WFS was having both early-onset diabetes mellitus and bilateral optic atrophy. Genetic analysis for WFS1 was performed by direct sequencing. Principal Findings: Sixty-seven patients were identified nationally for a prevalence of one per 710,000, with 33 patients (49%) having all 4 components of DIDMOAD. In 40 subjects who agreed to participate in this investigation from 30 unrelated families, the earliest manifestation was DM at a median age of 8.7 years, followed by OA at a median age of 15.8 years. However, either OA or DI was the first diagnosed feature in 6 subjects. In 10, features other than DM predated OA. Twenty-seven patients (67.5%) had a broad spectrum of recessive mutations in WFS1. Two patients had mutations in only one allele. Eleven patients (27.5%) had intact WFS1 alleles. Ages at onset of both DM and OA in patients with recessive WFS1 mutations were indistinguishable from those in patients without WFS1 mutations. In the patients with predicted complete loss-of-function mutations, ages at the onset of both DM and OA were significantly earlier than those in patients with predicted partial loss-of-function mutations. Conclusion/Significance: This study emphasizes the clinical and genetic heterogeneity in patients with WFS. 
Genotype-phenotype correlations may exist in patients with WFS1 mutations, as demonstrated by the disease onset. PMID:25211237
van Rhijn, Bas W G; Catto, James W; Goebell, Peter J; Knüchel, Ruth; Shariat, Shahrokh F; van der Poel, Henk G; Sanchez-Carbayo, Marta; Thalmann, George N; Schmitz-Dräger, Bernd J; Kiemeney, Lambertus A
2014-10-01
To summarize the current status of clinicopathological and molecular markers for the prediction of recurrence or progression or both in non-muscle-invasive and survival in muscle-invasive urothelial bladder cancer, to address the reproducibility of pathology and molecular markers, and to provide directions toward implementation of molecular markers in future clinical decision making. Immunohistochemistry, gene signatures, and FGFR3-based molecular grading were used as molecular examples focussing on prognostics and issues related to robustness of pathological and molecular assays. The role of molecular markers to predict recurrence is limited, as clinical variables are currently more important. The prediction of progression and survival using molecular markers holds considerable promise. Despite a plethora of prognostic (clinical and molecular) marker studies, reproducibility of pathology and molecular assays has been understudied, and lack of reproducibility is probably the main reason that individual prediction of disease outcome is currently not reliable. Molecular markers are promising to predict progression and survival, but not recurrence. However, none of these are used in the daily clinical routine because of reproducibility issues. Future studies should focus on reproducibility of marker assessment and consistency of study results by incorporating scoring systems to reduce heterogeneity of reporting. This may ultimately lead to incorporation of molecular markers in clinical practice. Copyright © 2014 Elsevier Inc. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Xiaolin; Ye, Li; Wang, Xiaoxiang
2012-12-15
Several recent reports suggested that hydroxylated polybrominated diphenyl ethers (HO-PBDEs) may disturb thyroid hormone homeostasis. To illuminate the structural features for thyroid hormone activity of HO-PBDEs and the binding mode between HO-PBDEs and thyroid hormone receptor (TR), the hormone activity of a series of HO-PBDEs to thyroid receptors β was studied based on the combination of 3D-QSAR, molecular docking, and molecular dynamics (MD) methods. The ligand- and receptor-based 3D-QSAR models were obtained using Comparative Molecular Similarity Index Analysis (CoMSIA) method. The optimum CoMSIA model with region focusing yielded satisfactory statistical results: leave-one-out cross-validation correlation coefficient (q²) was 0.571 and non-cross-validation correlation coefficient (r²) was 0.951. Furthermore, the results of internal validation such as bootstrapping, leave-many-out cross-validation, and progressive scrambling as well as external validation indicated the rationality and good predictive ability of the best model. In addition, molecular docking elucidated the conformations of compounds and key amino acid residues at the docking pocket, MD simulation further determined the binding process and validated the rationality of docking results. -- Highlights: ► The thyroid hormone activities of HO-PBDEs were studied by 3D-QSAR. ► The binding modes between HO-PBDEs and TRβ were explored. ► 3D-QSAR, molecular docking, and molecular dynamics (MD) methods were performed.
Fu, Chien-wei; Lin, Thy-Hou
2017-01-01
As an important enzyme in Phase I drug metabolism, the flavin-containing monooxygenase (FMO) also metabolizes some xenobiotics with soft nucleophiles. The site of metabolism (SOM) on a molecule is the site where the metabolic reaction is exerted by an enzyme. Accurate prediction of SOMs on drug molecules will assist the search for drug leads during the optimization process. Here, some quantum mechanics features such as the condensed Fukui function and attributes from circular fingerprints (called Molprint2D) are computed and classified using the support vector machine (SVM) for predicting some potential SOMs on a series of drugs that can be metabolized by FMO enzymes. The condensed Fukui function fA− represents the nucleophilicity of the central atom A, and the attributes from circular fingerprints account for the influence of neighbors on the central atom. The total number of FMO substrates and non-substrates collected in the study is 85, and they are equally divided into the training and test sets, with each carrying roughly the same number of potential SOMs. However, only N-oxidation and S-oxidation features were considered in the prediction since the available C-oxidation data were scarce. In the training process, the LibSVM package within WEKA and the option of 10-fold cross-validation are employed. The prediction performance on the test set, evaluated by accuracy, Matthews correlation coefficient, and area under the ROC curve, is 0.829, 0.659, and 0.877, respectively. This work reveals that the SVM model built can accurately predict the potential SOMs for drug molecules that are metabolizable by the FMO enzymes. PMID:28072829
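The Matthews correlation coefficient quoted in the abstract above is computed from the four cells of a binary confusion matrix. The sketch below shows the standard formula only; the counts in the example are made up for illustration, because the abstract does not report the confusion matrix.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Made-up counts: a perfect, a chance-level and a fully inverted classifier.
print(matthews_corrcoef(10, 10, 0, 0),
      matthews_corrcoef(5, 5, 5, 5),
      matthews_corrcoef(0, 0, 10, 10))
```

Unlike accuracy, the coefficient stays near zero for a classifier that simply predicts the majority class, which is why it is often reported alongside accuracy for unbalanced site-of-metabolism data.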
Huang, Yu-An; You, Zhu-Hong; Chen, Xing
2018-01-01
Drug-Target Interactions (DTI) play a crucial role in discovering new drug candidates and finding new proteins to target for drug development. Although the number of DTI detected by high-throughput techniques has been increasing, the number of known DTI is still limited. Moreover, the experimental methods for detecting interactions between drugs and proteins are costly and inefficient. Therefore, computational approaches for predicting DTI have drawn increasing attention in recent years. In this paper, we report a novel computational model for predicting DTI using an extremely randomized trees model and protein amino acid information. More specifically, the protein sequence is represented as a Pseudo Substitution Matrix Representation (Pseudo-SMR) descriptor in which the influence of biological evolutionary information is retained. For the representation of drug molecules, a novel fingerprint feature vector is utilized to describe their substructure information. The DTI pair is then characterized by concatenating the two vector spaces of protein sequence and drug substructure. Finally, the proposed method is explored for predicting DTI on four benchmark datasets: Enzyme, Ion Channel, GPCRs and Nuclear Receptor. The experimental results demonstrate that this method achieves promising prediction accuracies of 89.85%, 87.87%, 82.99% and 81.67%, respectively. For further evaluation, we compared the performance of the Extremely Randomized Trees model with that of the state-of-the-art Support Vector Machine classifier. We also compared the proposed model with existing computational models, and confirmed 15 potential drug-target interactions by searching existing databases. The experimental results show that the proposed method is feasible and promising for predicting drug-target interactions for new drug candidate screening based on sizeable features. Copyright© Bentham Science Publishers.
CNN-BLPred: a Convolutional neural network based predictor for β-Lactamases (BL) and their classes.
White, Clarence; Ismail, Hamid D; Saigo, Hiroto; Kc, Dukka B
2017-12-28
The β-Lactamase (BL) enzyme family is an important class of enzymes that plays a key role in bacterial resistance to antibiotics. As the number of newly identified BL enzymes is increasing daily, it is imperative to develop a computational tool to classify the newly identified BL enzymes into one of its classes. There are two types of classification of BL enzymes: Molecular Classification and Functional Classification. Existing computational methods only address Molecular Classification, and the performance of these existing methods is unsatisfactory. We addressed the unsatisfactory performance of the existing methods by implementing a Deep Learning approach called Convolutional Neural Network (CNN). We developed CNN-BLPred, an approach for the classification of BL proteins. CNN-BLPred uses Gradient Boosted Feature Selection (GBFS) in order to select the ideal feature set for each BL classification. Based on rigorous benchmarking of CNN-BLPred using both leave-one-out cross-validation and independent test sets, CNN-BLPred performed better than the other existing algorithms. Compared with other CNN architectures, Recurrent Neural Network, and Random Forest, the simple CNN architecture with only one convolutional layer performs the best. After feature extraction, we were able to remove ~95% of the 10,912 features using Gradient Boosted Trees. During 10-fold cross-validation, we increased the accuracy of the classic BL predictions by 7%. We also increased the accuracy of Class A, Class B, Class C, and Class D performance by an average of 25.64%. The independent test results followed a similar trend. We implemented a deep learning algorithm known as Convolutional Neural Network (CNN) to develop a classifier for BL classification. Combined with feature selection on an exhaustive feature set and using balancing methods such as Random Oversampling (ROS), Random Undersampling (RUS) and Synthetic Minority Oversampling Technique (SMOTE), CNN-BLPred performs significantly better than existing algorithms for BL classification.
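Two of the balancing methods named here, ROS and RUS, are simple enough to sketch directly. The version below is a generic illustration with hypothetical function names, not the CNN-BLPred implementation, and it omits SMOTE, which synthesizes new minority samples by interpolation rather than copying or discarding existing ones.

```python
import random
from collections import Counter


def _group_by_class(samples, labels):
    """Collect samples into a dict keyed by class label."""
    groups = {}
    for sample, label in zip(samples, labels):
        groups.setdefault(label, []).append(sample)
    return groups


def random_oversample(samples, labels, seed=0):
    """ROS: duplicate randomly chosen samples of each smaller class
    until every class is as large as the largest one."""
    rng = random.Random(seed)
    groups = _group_by_class(samples, labels)
    target = max(len(items) for items in groups.values())
    out_samples, out_labels = [], []
    for label, items in groups.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        out_samples.extend(items + extra)
        out_labels.extend([label] * target)
    return out_samples, out_labels


def random_undersample(samples, labels, seed=0):
    """RUS: keep a random subset of each larger class so that every
    class is as small as the smallest one."""
    rng = random.Random(seed)
    groups = _group_by_class(samples, labels)
    target = min(len(items) for items in groups.values())
    out_samples, out_labels = [], []
    for label, items in groups.items():
        out_samples.extend(rng.sample(items, target))
        out_labels.extend([label] * target)
    return out_samples, out_labels


def class_counts(labels):
    """Number of samples per class, for checking the balance."""
    return dict(Counter(labels))
```

In a cross-validation setting, resampling is usually applied to the training portion of each fold only, so that duplicated samples do not leak into the evaluation data.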
Paluch, Andrew S.; Parameswaran, Sreeja; Liu, Shuai; Kolavennu, Anasuya; Mobley, David L.
2015-01-01
We present a general framework to predict the excess solubility of small molecular solids (such as pharmaceutical solids) in binary solvents via molecular simulation free energy calculations at infinite dilution with conventional molecular models. The present study used molecular dynamics with the General AMBER Force Field to predict the excess solubility of acetanilide, acetaminophen, phenacetin, benzocaine, and caffeine in binary water/ethanol solvents. The simulations are able to predict the existence of solubility enhancement and the results are in good agreement with available experimental data. The accuracy of the predictions in addition to the generality of the method suggests that molecular simulations may be a valuable design tool for solvent selection in drug development processes. PMID:25637996
Expanding the molecular-ruler process through vapor deposition of hexadecanethiol
Patron, Alexandra M; Hooker, Timothy S; Santavicca, Daniel F
2017-01-01
The development of methods to produce nanoscale features with tailored chemical functionalities is fundamental for applications such as nanoelectronics and sensor fabrication. The molecular-ruler process shows great utility for this purpose as it combines top-down lithography for the creation of complex architectures over large areas in conjunction with molecular self-assembly, which enables precise control over the physical and chemical properties of small local features. The molecular-ruler process, which most commonly uses mercaptoalkanoic acids and metal ions to generate metal-ligated multilayers, can be employed to produce registered nanogaps between metal features. Expansion of this methodology to include molecules with other chemical functionalities could greatly expand the overall versatility, and thus the utility, of this process. Herein, we explore the use of alkanethiol molecules as the terminating layer of metal-ligated multilayers. During this study, it was discovered that the solution deposition of alkanethiol molecules resulted in low overall surface coverage with features that varied in height. Because features with varied heights are not conducive to the production of uniform nanogaps via the molecular-ruler process, the vapor-phase deposition of alkanethiol molecules was explored. Unlike the solution-phase deposition, alkanethiol islands produced by vapor-phase deposition exhibited markedly higher surface coverages of uniform heights. To illustrate the applicability of this method, metal-ligated multilayers, both with and without an alkanethiol capping layer, were utilized to create nanogaps between Au features using the molecular-ruler process. PMID:29181290
NASA Astrophysics Data System (ADS)
Rifai, Eko Aditya; van Dijk, Marc; Vermeulen, Nico P. E.; Geerke, Daan P.
2018-01-01
Computational protein binding affinity prediction can play an important role in drug research but performing efficient and accurate binding free energy calculations is still challenging. In the context of phase 2 of the Drug Design Data Resource (D3R) Grand Challenge 2 we used our automated eTOX ALLIES approach to apply the (iterative) linear interaction energy (LIE) method and we evaluated its performance in predicting binding affinities for farnesoid X receptor (FXR) agonists. Efficiency was obtained by our pre-calibrated LIE models and molecular dynamics (MD) simulations at the nanosecond scale, while predictive accuracy was obtained for a small subset of compounds. Using our recently introduced reliability estimation metrics, we could classify predictions with higher confidence by featuring an applicability domain (AD) analysis in combination with protein-ligand interaction profiling. The outcomes of and agreement between our AD and interaction-profile analyses to distinguish and rationalize the performance of our predictions highlighted the relevance of sufficiently exploring protein-ligand interactions during training and it demonstrated the possibility to quantitatively and efficiently evaluate if this is achieved by using simulation data only.
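The linear interaction energy method referred to here estimates binding free energy from ensemble-averaged ligand-surroundings interaction energies in the bound and free states, in the standard form ΔG = α·Δ⟨V_vdW⟩ + β·Δ⟨V_el⟩ + γ. The sketch below evaluates that expression only; the function name and argument layout are hypothetical, α, β and γ must come from calibration, and neither the iterative variant nor the eTOX ALLIES workflow is reproduced.

```python
from statistics import mean


def lie_binding_free_energy(vdw_bound, vdw_free, el_bound, el_free,
                            alpha, beta, gamma=0.0):
    """Standard LIE estimate of the binding free energy.

    vdw_* and el_* are per-frame ligand-surroundings van der Waals and
    electrostatic interaction energies sampled from MD of the ligand
    bound to the protein and free in solvent. alpha, beta and gamma are
    empirical parameters supplied by the caller.
    """
    d_vdw = mean(vdw_bound) - mean(vdw_free)
    d_el = mean(el_bound) - mean(el_free)
    return alpha * d_vdw + beta * d_el + gamma
```

Since only end-state averages are needed, relatively short simulations of the two states suffice, which fits the nanosecond-scale MD mentioned in the abstract.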
Sonographic features of invasive ductal breast carcinomas predictive of malignancy grade
Gupta, Kanika; Kumaresan, Meenakshisundaram; Venkatesan, Bhuvaneswari; Chandra, Tushar; Patil, Aruna; Menon, Maya
2018-01-01
Context: Assessment of individual sonographic features provides vital clues about the biological behavior of breast masses and can assist in determining histological grade of malignancy and thereby prognosis. Aims: Assessment of individual sonographic features of biopsy proven invasive ductal breast carcinomas as predictors of malignancy grade. Settings and Design: A retrospective analysis of sonographic findings of 103 biopsy proven invasive ductal breast carcinomas. Materials and Methods: Tumor characteristics on gray-scale ultrasound and color flow were assessed using American College of Radiology (ACR) Breast Imaging Reporting and Data System (BI-RADS) Atlas Fifth Edition. The sonographic findings of masses were individually correlated with their histopathologic grades. Statistical Analysis Used: Chi square test, ordinal regression, and Goodman and Kruskal tau test. Results: Breast mass showing reversal/lack of diastolic flow has a high probability of belonging to histological high grade tumor (β 1.566, P 0.0001). The masses with abrupt interface boundary are more likely grade 3 (β 1.524, P 0.001) in comparison to masses with echogenic halos. The suspicious calcifications present in and outside the mass is a finding associated with histologically high grade tumors. The invasive ductal carcinomas (IDCs) with complex solid and cystic echotexture are more likely to be of high histological grade (β 1.146, P 0.04) as compared to masses with hypoechoic echotexture. Conclusions: Certain ultrasound features are associated with tumor grade on histopathology. If the radiologist is cognizant of these sonographic features, ultrasound can be a potent modality for predicting histopathological grade of IDCs of the breast, especially in settings where advanced tests such as receptor and molecular analyses are limited. PMID:29692540
Lee, Su Hyun; Chang, Jung Min; Shin, Sung Ui; Chu, A Jung; Yi, Ann; Cho, Nariya; Moon, Woo Kyung
2017-12-01
To evaluate imaging features of breast cancers on digital breast tomosynthesis (DBT) according to molecular subtype and to determine whether the molecular subtype affects breast cancer detection on DBT. This was an institutional review board--approved study with a waiver of informed consent. DBT findings of 288 invasive breast cancers were reviewed according to Breast Imaging Reporting and Data System lexicon. Detectability of breast cancer was quantified by the number of readers (0-3) who correctly detected the cancer in an independent blinded review. DBT features and the cancer detectability score according to molecular subtype were compared using Fisher's exact test and analysis of variance. Of 288 invasive cancers, 194 were hormone receptor (HR)-positive, 48 were human epidermal growth factor receptor 2 (HER2) positive and 46 were triple negative breast cancers. The most common DBT findings were irregular spiculated masses for HR-positive cancer, fine pleomorphic or linear branching calcifications for HER2 positive cancer and irregular masses with circumscribed margins for triple negative breast cancers (p < 0.001). Cancer detectability on DBT was not significantly different according to molecular subtype (p = 0.213) but rather affected by tumour size, breast density and presence of mass or calcifications. Breast cancers showed different imaging features according to molecular subtype; however, it did not affect the cancer detectability on DBT. Advances in knowledge: DBT showed characteristic imaging features of breast cancers according to molecular subtype. However, cancer detectability on DBT was not affected by molecular subtype of breast cancers.
[Diagnostic molecular pathology of lymphatic and myeloid neoplasms].
Klapper, W; Kreipe, H
2015-03-01
Molecular pathology has been an integral part of the diagnostics of tumors of the hematopoietic system substantially longer than for solid neoplasms. In contrast to solid tumors, the primary objective of molecular pathology in hematopoietic neoplasms is not the prediction of drug efficacy but the diagnosis itself by excluding reactive proliferation and by using molecular features for tumor classification. In the case of malignant lymphomas, the most commonly applied molecular tests are those for gene rearrangements for immunoglobulin heavy chains and T-cell receptors. However, this article puts the focus on new and diagnostically relevant assays in hematopathology. Among these are mutations of MYD88 codon 265 in lymphoplasmacytic lymphomas, B-raf V600E in hairy cell leukemia and Stat3 exon 21 in indolent T-cell lymphomas. In myeloproliferative neoplasms, MPL W515, calreticulin exon 9 and the BCR-ABL and JAK2 V617F junctions are the most frequently analyzed differentiation series. In myelodysplastic and myeloproliferative neoplasms, SRSF2, SETBP1 and CSF3R mutations provide important differential diagnostic information. Genes mutated in myelodysplastic syndromes (MDS) are particularly diverse but their analysis significantly improves the differential diagnostics between reactive conditions and MDS. The most frequent changes in MDS include mutations of TET2 and various genes encoding splicing factors.
Molecular simulations of electrolyte structure and dynamics in lithium-sulfur battery solvents
NASA Astrophysics Data System (ADS)
Park, Chanbum; Kanduč, Matej; Chudoba, Richard; Ronneburg, Arne; Risse, Sebastian; Ballauff, Matthias; Dzubiella, Joachim
2018-01-01
The performance of modern lithium-sulfur (Li/S) battery systems critically depends on the electrolyte and solvent compositions. For fundamental molecular insights and rational guidance of experimental developments, efficient and sufficiently accurate molecular simulations are thus in urgent need. Here, we construct a molecular dynamics (MD) computer simulation model of representative state-of-the art electrolyte-solvent systems for Li/S batteries constituted by lithium-bis(trifluoromethane)sulfonimide (LiTFSI) and LiNO3 electrolytes in mixtures of the organic solvents 1,2-dimethoxyethane (DME) and 1,3-dioxolane (DOL). We benchmark and verify our simulations by comparing structural and dynamic features with various available experimental reference systems and demonstrate their applicability for a wide range of electrolyte-solvent compositions. For the state-of-the-art battery solvent, we finally calculate and discuss the detailed composition of the first lithium solvation shell, the temperature dependence of lithium diffusion, as well as the electrolyte conductivities and lithium transference numbers. Our model will serve as a basis for efficient future predictions of electrolyte structure and transport in complex electrode confinements for the optimization of modern Li/S batteries (and related devices).
Yu, Shuling; Yuan, Jintao; Zhang, Yi; Gao, Shufang; Gan, Ying; Han, Meng; Chen, Yuewen; Zhou, Qiaoqiao; Shi, Jiahua
2017-06-01
Sodium-glucose cotransporter 2 (SGLT2) is a promising target for diabetes therapy. We aimed to develop computational approaches to identify structural features for more potent SGLT2 inhibitors. In this work, 46 triazole derivatives as SGLT2 inhibitors were studied using a combination of several approaches, including hologram quantitative structure-activity relationships (HQSAR), topomer comparative molecular field analysis (CoMFA), homology modeling, and molecular docking. HQSAR and topomer CoMFA were used to construct models. Molecular docking was conducted to investigate the interaction between the triazole derivatives and the homology model of SGLT2, as well as to validate the results of the HQSAR and topomer CoMFA models. The most effective HQSAR and topomer CoMFA models exhibited non-cross-validated correlation coefficients of 0.928 and 0.891 for the training set, respectively. External predictions were made successfully on a test set and then compared with previously reported models. The graphical results of HQSAR and topomer CoMFA were shown to be consistent with the binding mode of the inhibitors and SGLT2 from molecular docking. The models and docking provided important insights into the design of potent inhibitors for SGLT2.
Raghi, K R; Sherin, D R; Saumya, M J; Arun, P S; Sobha, V N; Manojkumar, T K
2018-04-05
Chronic myeloid leukemia (CML), a hematological malignancy, arises from the spontaneous fusion of the BCR and ABL genes, resulting in a constitutively active tyrosine kinase (BCR-ABL). The pharmacological activity of Gallic acid and 1,3,4-Oxadiazole as potential inhibitors of ABL kinase has already been reported. The objective of this study is to evaluate the ABL kinase inhibitory activity of derivatives of Gallic acid fused with 1,3,4-Oxadiazole moieties. Attempts have been made to identify the key structural features responsible for the drug likeness of Gallic acid and the 1,3,4-Oxadiazole ring using molecular electrostatic potential maps (MESP). To investigate the inhibitory activity of Gallic acid derivatives towards the ABL receptor, we applied molecular docking and molecular dynamics (MD) simulation approaches. A comparative study was performed using Bosutinib, an approved CML drug acting on the same receptor, as the standard. Furthermore, the novel compounds designed and reported herein were evaluated for ADME properties, and the results indicate that they show acceptable pharmacokinetic properties. Accordingly, these compounds are predicted to be drug-like with low toxicity potential. Copyright © 2018 Elsevier Ltd. All rights reserved.
Signature properties of water: Their molecular electronic origins
Jones, Andrew P.; Cipcigan, Flaviu S.; Crain, Jason; Martyna, Glenn J.
2015-01-01
Water challenges our fundamental understanding of emergent materials properties from a molecular perspective. It exhibits a uniquely rich phenomenology including dramatic variations in behavior over the wide temperature range of the liquid into water’s crystalline phases and amorphous states. We show that many-body responses arising from water’s electronic structure are essential mechanisms harnessed by the molecule to encode for the distinguishing features of its condensed states. We treat the complete set of these many-body responses nonperturbatively within a coarse-grained electronic structure derived exclusively from single-molecule properties. Such a “strong coupling” approach generates interaction terms of all symmetries to all orders, thereby enabling unique transferability to diverse local environments such as those encountered along the coexistence curve. The symmetries of local motifs that can potentially emerge are not known a priori. Consequently, electronic responses unfiltered by artificial truncation are then required to embody the terms that tip the balance to the correct set of structures. Therefore, our fully responsive molecular model produces a simple, accurate, and intuitive picture of water’s complexity and its molecular origin, predicting water’s signature physical properties from ice, through liquid–vapor coexistence, to the critical point. PMID:25941394
Krall, Jacob; Jensen, Claus Hatt; Bavo, Francesco; Falk-Petersen, Christina Birkedahl; Haugaard, Anne Stæhr; Vogensen, Stine Byskov; Tian, Yongsong; Nittegaard-Nielsen, Mia; Sigurdardóttir, Sara Björk; Kehler, Jan; Kongstad, Kenneth Thermann; Gloriam, David E; Clausen, Rasmus Prætorius; Harpsøe, Kasper; Wellendorph, Petrine; Frølund, Bente
2017-11-09
γ-Hydroxybutyric acid (GHB) is a neuroactive substance with specific high-affinity binding sites. To facilitate target identification and ligand optimization, we herein report a comprehensive structure-affinity relationship study for novel ligands targeting these binding sites. A molecular hybridization strategy was used based on the conformationally restricted 3-hydroxycyclopent-1-enecarboxylic acid (HOCPCA) and the linear GHB analog trans-4-hydroxycrotonic acid (T-HCA). In general, all structural modifications performed on HOCPCA led to reduced affinity. In contrast, introduction of diaromatic substituents into the 4-position of T-HCA led to high-affinity analogs (medium nanomolar Ki) for the GHB high-affinity binding sites, the highest-affinity analogs reported to date. The SAR data formed the basis for a three-dimensional pharmacophore model for GHB ligands, which identified molecular features important for high-affinity binding, with high predictive validity. These findings will be valuable in the further processes of both target characterization and ligand identification for the high-affinity GHB binding sites.
Stargate GTM: Bridging Descriptor and Activity Spaces.
Gaspar, Héléna A; Baskin, Igor I; Marcou, Gilles; Horvath, Dragos; Varnek, Alexandre
2015-11-23
Predicting the activity profile of a molecule or discovering structures possessing a specific activity profile are two important goals in chemoinformatics, which could be achieved by bridging activity and molecular descriptor spaces. In this paper, we introduce the "Stargate" version of the Generative Topographic Mapping approach (S-GTM) in which two different multidimensional spaces (e.g., structural descriptor space and activity space) are linked through a common 2D latent space. In the S-GTM algorithm, the manifolds are trained simultaneously in two initial spaces using the probabilities in the 2D latent space calculated as a weighted geometric mean of probability distributions in both spaces. S-GTM has the following interesting features: (1) activities are involved during the training procedure; therefore, the method is supervised, unlike conventional GTM; (2) using molecular descriptors of a given compound as input, the model predicts a whole activity profile, and (3) using an activity profile as input, areas populated by relevant chemical structures can be detected. To assess the performance of S-GTM prediction models, a descriptor space (ISIDA descriptors) of a set of 1325 GPCR ligands was related to a B-dimensional (B = 1 or 8) activity space corresponding to pKi values for eight different targets. S-GTM outperforms conventional GTM for individual activities and performs similarly to the Lasso multitask learning algorithm, although it is still slightly less accurate than the Random Forest method.
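The coupling step described in this abstract, a weighted geometric mean of two probability distributions over the shared 2D latent grid, can be sketched as follows. The function name and the single weight `w` are assumptions made for illustration; the abstract does not say how S-GTM parameterizes the weights, and the rest of the GTM training loop is not shown.

```python
def combine_responsibilities(p_descriptor, p_activity, w=0.5):
    """Weighted geometric mean of two distributions over the same latent nodes.

    The combined value for node k is proportional to
    p_descriptor[k]**w * p_activity[k]**(1 - w), renormalized to sum to 1.
    With w = 1 only descriptor space counts; with w = 0 only activity space.
    """
    if len(p_descriptor) != len(p_activity):
        raise ValueError("both distributions must cover the same latent nodes")
    raw = [a ** w * b ** (1.0 - w) for a, b in zip(p_descriptor, p_activity)]
    total = sum(raw)
    if total == 0:
        raise ValueError("the two distributions share no latent node")
    return [value / total for value in raw]
```

A geometric mean rewards latent nodes that are plausible in both spaces at once: a node with near-zero probability in either space gets near-zero combined probability, which is what ties the two manifolds to a common map.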
Petersen, Bent; Lundegaard, Claus; Petersen, Thomas Nordahl
2010-01-01
β-turns are the most common type of non-repetitive structures, and constitute on average 25% of the amino acids in proteins. The formation of β-turns plays an important role in protein folding, protein stability and molecular recognition processes. In this work we present the neural network method NetTurnP, for prediction of two-class β-turns and prediction of the individual β-turn types, by use of evolutionary information and predicted protein sequence features. It has been evaluated against a commonly used dataset BT426, and achieves a Matthews correlation coefficient of 0.50, which is the highest reported performance on a two-class prediction of β-turn and not-β-turn. Furthermore, NetTurnP shows improved performance on some of the specific β-turn types. In the present work, neural network methods have been trained to predict β-turn or not and individual β-turn types from the primary amino acid sequence. The individual β-turn types I, I', II, II', VIII, VIa1, VIa2, VIba and IV have been predicted based on classifications by PROMOTIF, and the two-class prediction of β-turn or not is a superset comprising all β-turn types. The performance is evaluated using a golden set of non-homologous sequences known as BT426. Our two-class prediction method achieves a performance of: MCC = 0.50, Qtotal = 82.1%, sensitivity = 75.6%, PPV = 68.8% and AUC = 0.864. We have compared our performance to eleven other prediction methods that obtain Matthews correlation coefficients in the range of 0.17-0.47. For the type-specific β-turn predictions, only types I and II can be predicted with reasonable Matthews correlation coefficients, where we obtain performance values of 0.36 and 0.31, respectively. Conclusion: The NetTurnP method has been implemented as a webserver, which is freely available at http://www.cbs.dtu.dk/services/NetTurnP/. NetTurnP is the only available webserver that allows submission of multiple sequences. PMID:21152409
RED: a set of molecular descriptors based on Renyi entropy.
Delgado-Soler, Laura; Toral, Raul; Tomás, M Santos; Rubio-Martinez, Jaime
2009-11-01
New molecular descriptors, RED (Renyi entropy descriptors), based on the generalized entropies introduced by Renyi are presented. Topological descriptors based on molecular features have proven to be useful for describing molecular profiles. Renyi entropy is used as a variability measure to contract a feature-pair distribution composing the descriptor vector. The performance of RED descriptors was tested for the analysis of different sets of molecular distances, virtual screening, and pharmacological profiling. A free parameter of the Renyi entropy has been optimized for all the considered applications.
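The generalized entropy underlying these descriptors has the standard form H_α(p) = log(Σ p_i^α) / (1 − α), with α the free parameter mentioned in the abstract. The sketch below computes that quantity for a discrete distribution; it does not reproduce how RED builds its feature-pair distributions, which the abstract does not describe, and the function name is hypothetical.

```python
import math


def renyi_entropy(probs, alpha):
    """Renyi entropy of order alpha (natural log) of a discrete distribution.

    alpha = 1 is handled as the Shannon limit; alpha = 0 gives the log of
    the number of outcomes with non-zero probability.
    """
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    support = [p for p in probs if p > 0]
    if abs(alpha - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in support)
    return math.log(sum(p ** alpha for p in support)) / (1.0 - alpha)
```

For a fixed distribution the entropy does not increase as α grows, so larger α weights the most populated feature pairs more heavily, which is why α is a natural parameter to tune per application.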
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rahman, Md. Mostafizar; Yu, Peiqiang
Progress in ruminant feed research is no longer feasible based only on wet chemical analysis, which merely provides information on the chemical composition of feeds regardless of their digestive features and nutritive value in ruminants. Studying the internal structural make-up of functional groups/feed nutrients is often vital for understanding the digestive behaviors and nutritive values of feeds in ruminants because the intrinsic structure of feed nutrients is more related to their overall absorption. In this article, detailed information on the recent developments in molecular spectroscopic techniques to reveal microstructural information of feed nutrients and on the use of nutrition models in ruminant feed research is reviewed. The emphasis of this review is on (1) the technological progress in the use of molecular spectroscopic techniques in ruminant feed research; (2) revealing spectral analysis of functional groups of biomolecules/feed nutrients; (3) the use of advanced nutrition models for better prediction of nutrient availability in ruminant systems; and (4) the application of these molecular techniques and combination of nutrient models in cereals, co-products and pulse crop research. The information described in this article will promote better insight into the progress of research on the molecular structural make-up of feed nutrients in ruminants.
Khan, Nazir Ahmad; Booker, Helen; Yu, Peiqiang
2014-07-16
The objectives of this study were to investigate the chemical profiles; crude protein (CP) subfractions; ruminal CP degradation characteristics and intestinal digestibility of rumen undegraded protein (RUP); and protein molecular structures using molecular spectroscopy of newly developed yellow-seeded flax (Linum usitatissimum L.). Seeds from two yellow flaxseed breeding lines and two brown flaxseed varieties were evaluated. The yellow-seeded lines had higher (P < 0.001) contents of oil (44.54 vs 41.42% dry matter (DM)) and CP (24.94 vs 20.91% DM) compared to those of the brown-seeded varieties. The CP in yellow seeds contained a lower (P < 0.01) content of the true protein subfraction (81.31 vs 92.71% CP) and was more (P < 0.001) extensively degraded (70.8 vs 64.9% CP) in the rumen, resulting in a lower (P < 0.001) content of RUP (29.2 vs 35.1% CP) than that in the brown-seeded varieties. However, the total supply of digestible RUP was not significantly different between the two seed types. Regression equations based on protein molecular structural features gave relatively good estimation for the contents of CP (R² = 0.87), soluble CP (R² = 0.92), RUP (R² = 0.97), and intestinal digestibility of RUP (R² = 0.71). In conclusion, molecular spectroscopy can be used to rapidly characterize feed protein molecular structures and predict their nutritive value.
Pira, Silvain L; Wallace, Timothy W; Graham, Jonathan P
2009-04-02
(-)-5-Methyl-6,7-dihydro-5H-dibenz[c,e]azepine 4, a new secondary amine featuring an axis-center stereochemical relay, was prepared enantioselectively from 2'-acetylbiphenyl-2-carboxylic acid, using (R)-2-phenylglycinol as an auxiliary for the control of both elements of chirality. The biaryl axis in 4 preferentially adopts the aS-configuration, with the methyl substituent pseudoequatorial, but conversion into the corresponding N-Boc derivative locks the axis into the aR-configuration, as predicted on the basis of molecular mechanics calculations.
Pulitzer, Melissa
2017-06-01
Merkel cell carcinoma (MCC) encompasses neuroendocrine carcinomas primary to skin and occurs most commonly in association with clonally integrated Merkel cell polyomavirus with related retinoblastoma protein sequestration or in association with UV radiation-induced alterations involving the TP53 gene and mutations, heterozygous deletion, and hypermethylation of the Retinoblastoma gene. Molecular genetic signatures may provide therapeutic guidance. Morphologic features, although patterned, are associated with predictable diagnostic pitfalls, usually resolvable by immunohistochemistry. Therapeutic options for MCC, traditionally limited to surgical intervention and later chemotherapy and radiation, are growing, given promising early results of immunotherapeutic regimens. Copyright © 2017 Elsevier Inc. All rights reserved.
Modeling Shear Induced Von Willebrand Factor Binding to Collagen
NASA Astrophysics Data System (ADS)
Dong, Chuqiao; Wei, Wei; Morabito, Michael; Webb, Edmund; Oztekin, Alparslan; Zhang, Xiaohui; Cheng, Xuanhong
2017-11-01
Von Willebrand factor (vWF) is a blood glycoprotein that binds with platelets and collagen on injured vessel surfaces to form clots. vWF bioactivity is shear flow induced: at low shear, binding between vWF and other biological entities is suppressed; for high shear rate conditions - as are found near arterial injury sites - vWF elongates, activating its binding with platelets and collagen. Based on parameters derived from single molecule force spectroscopy experiments, we developed a coarse-grain molecular model to simulate bond formation probability as a function of shear rate. By introducing a binding criterion that depends on the conformation of a sub-monomer molecular feature of our model, the model predicts shear-induced binding, even for conditions where binding is highly energetically favorable. We further investigate the influence of various model parameters on the ability to predict shear-induced binding (vWF length, collagen site density and distribution, binding energy landscape, and slip/catch bond length) and demonstrate parameter ranges where the model provides good agreement with existing experimental data. Our results may be important for understanding vWF activity and also for achieving targeted drug therapy via biomimetic synthetic molecules. National Science Foundation (NSF), Division of Mathematical Sciences (DMS).
Arvind, Akanksha; Kumar, Vivek; Saravanan, Parameswaran; Mohan, C Gopi
2012-09-01
The cell wall of mycobacterium offers well validated targets which can be exploited for discovery of new lead compounds. MurC-MurF ligases catalyze a series of irreversible steps in the biosynthesis of the peptidoglycan precursor; for example, MurD catalyzes the ligation of D-glutamate to the nucleotide precursor UMA. The three dimensional structure of Mtb-MurD is not known and was predicted by us for the first time using a comparative homology modeling technique. The accuracy and stability of the predicted Mtb-MurD structure were validated using Procheck and molecular dynamics simulation. Key interactions in Mtb-MurD were studied using docking analysis of available transition state inhibitors of E.coli-MurD. The docking analysis revealed that analogues of both L and D forms of glutamic acid have similar interaction profiles with Mtb-MurD. Further, residues His192, Arg382, Ser463, and Tyr470 are proposed to be important for inhibitor-(Mtb-MurD) interactions. We also identified a few pharmacophoric features that are essential for Mtb-MurD ligase inhibitory activity and that can be further utilized for the discovery of putative antitubercular chemotherapy.
Anichini, Andrea; Tassi, Elena; Grazia, Giulia; Mortarini, Roberta
2018-06-01
Immunotherapy of non-small cell lung cancer (NSCLC), by immune checkpoint inhibitors, has profoundly improved the clinical management of advanced disease. However, only a fraction of patients respond and no effective predictive factors have been defined. Here, we discuss the prospects for identification of such predictors of response to immunotherapy, by fostering an in-depth analysis of the immune landscape of NSCLC. The emerging picture, from several recent studies, is that the immune contexture of NSCLC lesions is a complex and heterogeneous feature, as documented by analysis for frequency, phenotype and spatial distribution of innate and adaptive immune cells, and by characterization of functional status of inhibitory receptor+ T cells. The complexity of the immune landscape of NSCLC stems from the interaction of several factors, including tumor histology, molecular subtype, main oncogenic drivers, nonsynonymous mutational load, tumor aneuploidy, clonal heterogeneity and tumor evolution, as well as the process of epithelial-mesenchymal transition. All these factors contribute to shape NSCLC immune profiles that have clear prognostic significance. An integrated analysis of the immune and molecular profile of the neoplastic lesions may make it possible to define the potential predictive role of the immune landscape for response to immunotherapy.
Vijayaraj, Ramadoss; Devi, Mekapothula Lakshmi Vasavi; Subramanian, Venkatesan; Chattaraj, Pratim Kumar
2012-06-01
Three-dimensional quantitative structure activity relationship (3D-QSAR) study has been carried out on the Escherichia coli DHFR inhibitors 2,4-diamino-5-(substituted-benzyl)pyrimidine derivatives to understand the structural features responsible for the improved potency. To construct highly predictive 3D-QSAR models, comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) methods were used. The predicted models show statistically significant cross-validated and non-cross-validated correlation coefficient of r2 CV and r2 nCV, respectively. The final 3D-QSAR models were validated using structurally diverse test set compounds. Analysis of the contour maps generated from CoMFA and CoMSIA methods reveals that the substitution of electronegative groups at the first and second position along with electropositive group at the third position of R2 substitution significantly increases the potency of the derivatives. The results obtained from the CoMFA and CoMSIA study delineate the substituents on the trimethoprim analogues responsible for the enhanced potency and also provide valuable directions for the design of new trimethoprim analogues with improved affinity. © 2012 John Wiley & Sons A/S.
Aragon Han, Patricia; Kim, Hyun-seok; Cho, Soonweng; Fazeli, Roghayeh; Najafian, Alireza; Khawaja, Hunain; McAlexander, Melissa; Dy, Benzon; Sorensen, Meredith; Aronova, Anna; Sebo, Thomas J.; Giordano, Thomas J.; Fahey, Thomas J.; Thompson, Geoffrey B.; Gauger, Paul G.; Somervell, Helina; Bishop, Justin A.; Eshleman, James R.; Schneider, Eric B.; Witwer, Kenneth W.; Umbricht, Christopher B.
2016-01-01
Background: Studies have demonstrated an association of the BRAFV600E mutation and microRNA (miR) expression with aggressive clinicopathologic features in papillary thyroid cancer (PTC). Analysis of BRAFV600E mutations with miR expression data may improve perioperative decision making for patients with PTC, specifically in identifying patients harboring central lymph node metastases (CLNM). Methods: Between January 2012 and June 2013, 237 consecutive patients underwent total thyroidectomy and prophylactic central lymph node dissection (CLND) at four endocrine surgery centers. All tumors were tested for the presence of the BRAFV600E mutation and miR-21, miR-146b-3p, miR-146b-5p, miR-204, miR-221, miR-222, and miR-375 expression. Bivariate and multivariable analyses were performed to examine associations between molecular markers and aggressive clinicopathologic features of PTC. Results: Multivariable logistic regression analysis of all clinicopathologic features found miR-146b-3p and miR-146b-5p to be independent predictors of CLNM, while the presence of BRAFV600E almost reached significance. Multivariable logistic regression analysis limited to only predictors available preoperatively (molecular markers, age, sex, and tumor size) found miR-146b-3p, miR-146b-5p, miR-222, and BRAFV600E mutation to predict CLNM independently. While BRAFV600E was found to be associated with CLNM (48% mutated in node-positive cases vs. 28% mutated in node-negative cases), its positive and negative predictive values (48% and 72%, respectively) limit its clinical utility as a stand-alone marker. In the subgroup analysis focusing on only classical variant of PTC cases (CVPTC), undergoing prophylactic lymph node dissection, multivariable logistic regression analysis found only miR-146b-5p and miR-222 to be independent predictors of CLNM, while BRAFV600E was not significantly associated with CLNM. 
Conclusion: In the patients undergoing prophylactic CLNDs, miR-146b-3p, miR-146b-5p, and miR-222 were found to be predictive of CLNM preoperatively. However, there was significant overlap in expression of these miRs in the two outcome groups. The BRAFV600E mutation, while being a marker of CLNM when considering only preoperative variables among all histological subtypes, is likely not a useful stand-alone marker clinically because the difference between node-positive and node-negative cases was small. Furthermore, it lost significance when examining only CVPTC. Overall, our results speak to the concept and interpretation of statistical significance versus actual applicability of molecular markers, raising questions about their clinical usefulness as individual prognostic markers. PMID:26950846
NASA Astrophysics Data System (ADS)
Feng, Wei; Ma, Ning; Zhu, Dan
2015-03-01
Improved methods for predicting optical clearing agents have an important impact on the tissue optical clearing technique. Molecular dynamics simulation is one of the most convincing and simplest approaches to predicting the optical clearing potential (OCP) of agents, by analyzing the hydrogen bonds, hydrogen bridges and hydrogen bridge types formed between agents and collagen. However, these analysis methods still have problems, for example when analyzing cyclic molecules, because of their molecular conformation. In this study, a molecular effective coverage surface area based on molecular dynamics simulation was proposed to predict the potential of optical clearing agents. Two typical cyclic molecules (fructose and glucose) and two chain molecules (sorbitol and xylitol) were analyzed by calculating their molecular effective coverage surface area, hydrogen bonds, hydrogen bridges and hydrogen bridge types. To verify this analysis method, the optical clearing efficacy of in vitro skin samples was measured with a 1951 USAF resolution test target after 25 min of immersion in solutions of fructose, glucose, sorbitol and xylitol at a concentration of 3.5 M. The experimental results agree with the prediction based on molecular effective coverage surface area. A further comparison with the other parameters shows that molecular effective coverage surface area performs better in predicting the OCP of agents.
A realistic molecular model of cement hydrates.
Pellenq, Roland J-M; Kushima, Akihiro; Shahsavari, Rouzbeh; Van Vliet, Krystyn J; Buehler, Markus J; Yip, Sidney; Ulm, Franz-Josef
2009-09-22
Despite decades of studies of calcium-silicate-hydrate (C-S-H), the structurally complex binder phase of concrete, the interplay between chemical composition and density remains essentially unexplored. Together these characteristics of C-S-H define and modulate the physical and mechanical properties of this "liquid stone" gel phase. With the recent determination of the calcium/silicon (C/S = 1.7) ratio and the density of the C-S-H particle (2.6 g/cm(3)) by neutron scattering measurements, there is new urgency to the challenge of explaining these essential properties. Here we propose a molecular model of C-S-H based on a bottom-up atomistic simulation approach that considers only the chemical specificity of the system as the overriding constraint. By allowing for short silica chains distributed as monomers, dimers, and pentamers, this C-S-H archetype of a molecular description of interacting CaO, SiO2, and H2O units provides not only realistic values of the C/S ratio and the density computed by grand canonical Monte Carlo simulation of water adsorption at 300 K. The model, with a chemical composition of (CaO)(1.65)(SiO2)(H2O)(1.75), also predicts other essential structural features and fundamental physical properties amenable to experimental validation, which suggest that the C-S-H gel structure includes both glass-like short-range order and crystalline features of the mineral tobermorite. Additionally, we probe the mechanical stiffness, strength, and hydrolytic shear response of our molecular model, as compared to experimentally measured properties of C-S-H. The latter results illustrate the prospect of treating cement on equal footing with metals and ceramics in the current application of mechanism-based models and multiscale simulations to study inelastic deformation and cracking.
NASA Astrophysics Data System (ADS)
Kose, Etem; Atac, Ahmet; Bardak, Fehmi
2018-07-01
This study comprises the structural and spectroscopic evaluation of a quinoline derivative, 2-chloro-3-methylquinoline (2Cl3MQ), experimentally via UV-Vis, 1H and 13C NMR, FT-IR and FT-Raman techniques and theoretically via DFT and TD-DFT quantum chemical calculations at the B3LYP/6-311++G(d,p) level of theory, together with an investigation of the in silico pharmaceutical potential of 2Cl3MQ in comparison to 2ClnMQ (n = 3,4,7,8,9,10) substituted quinolines. The experimental measurements were recorded as follows: UV-Vis spectra were obtained in the range of 200-400 nm in water and ethanol; 1H and 13C NMR spectra were recorded in CDCl3; vibrational spectra were obtained in the region of 4000-400 cm⁻¹ and 3500-10 cm⁻¹ for FT-IR and FT-Raman spectra, respectively. Structural and spectroscopic features obtained through theoretical evaluations include: electrostatic features, atomic charges and molecular electrostatic potential surface, the frontier molecular orbital characteristics, the density of states and their overlapping nature, the electronic transition properties, thermodynamical and nonlinear optical characteristics, and predicted UV-Vis, 1H and 13C NMR, FT-IR and FT-Raman spectra. Ligand-enzyme interactions of 2ClnMQ (n = 3,4,7,8,9,10) substituted quinolines with Malate Synthase from Mycobacterium Tuberculosis (MtbMS) were investigated via molecular docking. The role of the position of methyl substitution in the inhibitor character of the ligands was discussed on the basis of noncovalent interaction profiles.
Radujkovic, Aleksandar; Guglielmi, Cesare; Bergantini, Stefania; Iacobelli, Simona; van Biezen, Anja; Milojkovic, Dragana; Gratwohl, Alois; Schattenberg, Antonius V M B; Verdonck, Leo F; Niederwieser, Dietger W; de Witte, Theo; Kröger, Nicolaus; Olavarria, Eduardo
2015-07-01
Donor lymphocyte infusions (DLI) are an effective treatment for relapsed chronic myeloid leukemia (CML) after allogeneic stem cell transplantation (alloSCT). Leukemia resistance and secondary graft-versus-host disease (GVHD) are major obstacles to success with DLI. The aim of this study was to identify pre-DLI factors associated with prolonged survival in remission without secondary GVHD. We retrospectively analyzed 500 patients treated with DLI for CML relapse (16% molecular, 30% cytogenetic, and 54% hematological) after alloSCT. The overall probabilities of failure- and secondary GVHD-free survival (FGFS) were 29% and 27% at 5 and 10 years after DLI, respectively. The type of relapse was the major factor influencing FGFS (40% for molecular and/or cytogenetic relapse and 20% for hematological relapse at 5 years, P < .001). Chronic GVHD before DLI and an interval <1 year between alloSCT and first DLI were independently associated with inferior FGFS in patients with molecular and/or cytogenetic relapse. Consequently, FGFS was 13%, 35%, and 56% at 5 years in patients with 2, 1, and 0 adverse features, respectively. In patients with hematological relapse, independent adverse prognostic factors for FGFS were initial dose of CD3+ cells ≥ 50 × 10⁶/kg, donor-recipient sex mismatch, and chronic GVHD before DLI. FGFS was 0%, 17%, 33%, and 37% in patients with 3, 2, 1, and 0 adverse features, respectively. The probability of survival in remission without secondary GVHD was highest (>50% at 5 years) when DLI were given beyond 1 year from alloSCT for molecular and/or cytogenetic CML relapse that was not preceded by chronic GVHD. Copyright © 2015 American Society for Blood and Marrow Transplantation. Published by Elsevier Inc. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Esposito, Emilio Xavier, E-mail: emilio@exeResearch.com; The Chem21 Group, Inc., 1780 Wilson Drive, Lake Forest, IL 60045; Hopfinger, Anton J., E-mail: hopfingr@gmail.com
2015-10-01
Carbon nanotubes have become widely used in a variety of applications including biosensors and drug carriers. Therefore, the issue of carbon nanotube toxicity is increasingly an area of focus and concern. While previous studies have focused on the gross mechanisms of action relating to nanomaterials interacting with biological entities, this study proposes detailed mechanisms of action, relating to nanotoxicity, for a series of decorated (functionalized) carbon nanotube complexes based on previously reported QSAR models. Possible mechanisms of nanotoxicity for six endpoints (bovine serum albumin, carbonic anhydrase, chymotrypsin, hemoglobin along with cell viability and nitrogen oxide production) have been extracted from the corresponding optimized QSAR models. The molecular features relevant to each endpoint's respective mechanism of action for the decorated nanotubes are also discussed. Based on the molecular information contained within the optimal QSAR models for each nanotoxicity endpoint, either the decorator attached to the nanotube is directly responsible for the expression of a particular activity, irrespective of the decorator's 3D-geometry and independent of the nanotube, or those decorators having structures that place the functional groups of the decorators as far as possible from the nanotube surface most strongly influence the biological activity. These molecular descriptors are further used to hypothesize specific interactions involved in the expression of each of the six biological endpoints. - Highlights: • Proposed toxicity mechanism of action for decorated nanotubes complexes • Discussion of the key molecular features for each endpoint's mechanism of action • Unique mechanisms of action for each of the six biological systems • Hypothesized mechanisms of action based on QSAR/QNAR predictive models.
Alberini, Giulio; Benfenati, Fabio
2017-01-01
Tight-junctions between epithelial cells of biological barriers are specialized molecular structures that regulate the flux of solutes across the barrier, parallel to cell walls. The tight-junction backbone is made of strands of transmembrane proteins from the claudin family, but the molecular mechanism of its function is still not completely understood. Recently, the crystal structure of a mammalian claudin-15 was reported, displaying for the first time the detailed features of transmembrane and extracellular domains. Successively, a structural model of claudin-15-based paracellular channels has been proposed, suggesting a putative assembly that illustrates how claudins associate in the same cell (via cis interactions) and across adjacent cells (via trans interactions). Although very promising, the model offers only a static conformation, with residues missing in the most important extracellular regions and potential steric clashes. Here we present detailed atomic models of paracellular single and double pore architectures, obtained from the putative assembly and refined via structural modeling and all-atom molecular dynamics simulations in double membrane bilayer and water environment. Our results show an overall stable configuration of the complex with a fluctuating pore size. Extracellular residue loops in trans interaction are able to form stable contacts and regulate the size of the pore, which displays a stationary radius of 2.5–3.0 Å at the narrowest region. The side-by-side interactions of the cis configuration are preserved via stable hydrogen bonds, already predicted by cysteine crosslinking experiments. Overall, this work introduces an improved version of the claudin-15-based paracellular channel model that strengthens its validity and that can be used in further computational studies to understand the structural features of tight-junctions regulation. PMID:28863193
Trucchi, Emiliano; Sbordoni, Valerio
2009-05-18
Biological invasions can be considered one of the main threats to biodiversity, and the recognition of common ecological and evolutionary features among invaders can help to develop a predictive framework to control further invasions. In particular, the analysis of successful invasive species and of their autochthonous source populations by means of genetic, phylogeographic and demographic tools can provide novel insights into the study of biological invasion patterns. Today, long-term dynamics of biological invasions are still poorly understood and need further investigation. Moreover, distribution and molecular data on native populations could contribute to the recognition of common evolutionary features of successful aliens. We analyzed 2,195 mitochondrial base pairs, including Cytochrome b, Control Region and rRNA 12S, in 161 Italian and 27 African specimens and assessed the ancient invasive origin of Italian crested porcupine (Hystrix cristata) populations from Tunisia. Molecular coalescent-based Bayesian analyses proposed the Roman Age as a putative timeframe of introduction and suggested a retention of genetic diversity during the early phases of colonization. The characterization of the native African genetic background revealed the existence of two differentiated clades: a Mediterranean group and a Sub-Saharan one. Neither standard population genetic tools nor advanced molecular demography tools (Bayesian Skyline Plot) showed a clear genetic signature of the expected increase in population size after introduction. Along with the genetic diversity retention during the bottlenecked steps of introduction, this finding could be better described by hypothesizing a multi-invasion event. Evidence of the ancient anthropogenic invasive origin of the Italian Hystrix cristata populations was clearly shown and the native African genetic background was preliminarily described. 
A more complex pattern than a simple demographic exponential growth from a single propagule seems to have characterized this long-term invasion.
A Feature and Algorithm Selection Method for Improving the Prediction of Protein Structural Class.
Ni, Qianwu; Chen, Lei
2017-01-01
Correct prediction of protein structural class is beneficial to the investigation of protein functions, regulations and interactions. In recent years, several computational methods have been proposed in this regard. However, given the variety of available features, it remains a great challenge to select a proper classification algorithm and to extract the essential features for classification. In this study, a feature and algorithm selection method was presented for improving the accuracy of protein structural class prediction. The amino acid compositions and physicochemical features were adopted to represent proteins, and thirty-eight machine learning algorithms collected in Weka were employed. All features were first analyzed by a feature selection method, minimum redundancy maximum relevance (mRMR), producing a feature list. Then, several feature sets were constructed by adding features in the list one by one. For each feature set, the thirty-eight algorithms were executed on a dataset in which proteins were represented by the features in the set. The predicted classes yielded by these algorithms and the true class of each protein were collected to construct a dataset, which was analyzed by the mRMR method, yielding an algorithm list. From the algorithm list, algorithms were taken one by one to build ensemble prediction models. Finally, we selected the ensemble prediction model with the best performance as the optimal ensemble prediction model. Experimental results indicate that the constructed model is much superior to models using a single algorithm and to other models that adopt only the feature selection procedure or only the algorithm selection procedure. The feature selection and algorithm selection procedures are both helpful for building an ensemble prediction model that can yield better performance. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
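The incremental step described in this abstract (build nested feature sets from a ranked list, score each one, keep the best) can be sketched as follows. This is a minimal illustration only: the ranking is assumed to be given (the abstract obtains it with mRMR, which is not implemented here), and the function names, the toy feature names and the `evaluate` callback are hypothetical, not taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple


def nested_feature_sets(ranked_features: Sequence[str]) -> List[List[str]]:
    """Return the nested sets [f1], [f1, f2], ... built from a ranked list."""
    return [list(ranked_features[:k]) for k in range(1, len(ranked_features) + 1)]


def select_best_feature_set(
    ranked_features: Sequence[str],
    evaluate: Callable[[List[str]], float],
) -> Tuple[List[str], float]:
    """Score every nested set and return the best one with its score.

    Sets are visited from smallest to largest and only a strictly better
    score replaces the current best, so ties favour the smaller set.
    """
    best_set: List[str] = []
    best_score = float("-inf")
    for candidate in nested_feature_sets(ranked_features):
        score = evaluate(candidate)
        if score > best_score:
            best_set, best_score = candidate, score
    return best_set, best_score


# Hypothetical ranking and scorer, for illustration only.
ranking = ["aa_composition", "hydrophobicity", "polarity", "noise"]
useful = {"aa_composition", "hydrophobicity"}


def toy_score(features: List[str]) -> float:
    # Reward useful features and penalise every extra feature slightly.
    return sum(f in useful for f in features) - 0.1 * len(features)


best, score = select_best_feature_set(ranking, toy_score)
```

In practice `evaluate` would wrap a cross-validated classifier; the same loop could be reused over a ranked list of algorithms to grow an ensemble, as the abstract describes.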
2012-01-01
Background: While lenalidomide (LEN) shows high efficacy in myelodysplastic syndromes (MDS) with del[5q], responses can also be seen in patients presenting without del[5q]. We hypothesized that improved detection of chromosomal abnormalities with new karyotyping tools may better predict response to LEN. Design and methods: We have studied clinical, molecular and cytogenetic features of 42 patients with MDS, myeloproliferative neoplasms (MPN), MDS/MPN overlap syndromes and secondary acute myeloid leukemia (sAML) without del[5q] by metaphase cytogenetics (MC) who underwent therapy with LEN. Results: Fluorescence in situ hybridization (FISH) or single nucleotide polymorphism array (SNP-A)-based karyotyping marginally increased the diagnostic yield over MC, detecting 2/42 (4.8%) additional cases with del[5q], one of whom responded to LEN. Responses were more often observed in patients with a normal karyotype by MC (60% vs abnormal MC; 17%, p = .08) and those with gain of chromosome 8 material by any of the 3 karyotyping methods (83% vs all other chromosomal abnormalities; 44% p = .11). However, 5 out of those 6 patients received combined LEN/AZA therapy, which may also suggest that those with gain of chromosome 8 material respond well to AZA. The addition of FISH or SNP-A did not improve the predictive value of normal cytogenetics by MC. Mutational analysis of TET2, UTX, CBL, EZH2, ASXL1, TP53, RAS, IDH1/2, and DNMT-3A was performed on 21 of 41 patients, and revealed 13 mutations in 11 patients, but did not show any molecular markers of responsiveness to LEN. Conclusions: Normal karyotype and gain of chromosome 8 material were predictive of response to LEN in non-del[5q] patients with myeloid malignancies. PMID:22390313
ERIC Educational Resources Information Center
Sweller, Naomi
2015-01-01
Individuals with autism have difficulty generalising information from one situation to another, a process that requires the learning of categories and concepts. Category information may be learned through: (1) classifying items into categories, or (2) predicting missing features of category items. Predicting missing features has to this point been…
Chaube, Udit; Chhatbar, Dhara; Bhatt, Hardik
2016-02-01
According to WHO statistics, lung cancer is one of the leading causes of death among all types of cancer. Many genes are mutated in lung cancer, but involvement of EGFR and KRAS is most common. Unavailability of drugs or resistance to the available drugs is the major problem in the treatment of lung cancer. In the present research, mTOR, part of the PI3K/AKT/mTOR pathway, was selected as an alternative target for the treatment of lung cancer. 28 synthetic mTOR inhibitors were selected from the literature. A ligand based approach (CoMFA and CoMSIA) and a structure based approach (molecular dynamics simulation assisted molecular docking study) were applied to identify the important features of the benzoxazepine moiety responsible for mTOR inhibition. Three different alignments were tried to obtain the best QSAR model, of which distil was found to be the best method, as it gave good statistical results. In CoMFA, the Leave One Out (LOO) cross validated coefficient (q²), conventional coefficient (r²) and predicted correlation coefficient (r²pred) values were found to be 0.615, 0.990 and 0.930, respectively. Similarly in CoMSIA, q², r²ncv and r²pred values were found to be 0.748, 0.986 and 0.933, respectively. The molecular dynamics simulation study revealed that the B-chain of the mTOR protein was stable at and above 500 fs with respect to temperature (at and above 298 K), potential energy (at and above 7669.72 kJ/mol) and kinetic energy (at and above 4009.77 kJ/mol). A molecular docking study was performed on the simulated mTOR protein, which helped to correlate interactions of the amino acids surrounding the ligand with the contour maps generated by the QSAR method. Important features of benzoxazepine were identified by the contour maps and the molecular docking study, which would be useful for designing novel molecules as mTOR inhibitors for the treatment of lung cancer. Copyright © 2015 Elsevier Ltd. All rights reserved.
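The abstract reports leave-one-out cross-validated coefficients without stating the formula. For orientation, the conventional definition used in QSAR work is given below; whether the authors used exactly this form is an assumption. Here y_i is the observed activity of compound i, \hat{y}_{(i)} is the activity predicted for compound i by a model fitted without it, \bar{y} is the mean observed activity, and n is the number of training compounds.

```latex
q^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_{(i)} \right)^2}
               {\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}
```

The conventional (non-cross-validated) r² has the same form, with \hat{y}_{(i)} replaced by the fitted value from the model trained on all n compounds.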
Protein Interactome of Muscle Invasive Bladder Cancer
Bhat, Akshay; Heinzel, Andreas; Mayer, Bernd; Perco, Paul; Mühlberger, Irmgard; Husi, Holger; Merseburger, Axel S.; Zoidakis, Jerome; Vlahou, Antonia; Schanstra, Joost P.; Mischak, Harald; Jankowski, Vera
2015-01-01
Muscle invasive bladder carcinoma is a complex, multifactorial disease caused by disruptions and alterations of several molecular pathways that result in heterogeneous phenotypes and variable disease outcome. Combining this disparate knowledge may offer insights for deciphering relevant molecular processes regarding targeted therapeutic approaches guided by molecular signatures allowing improved phenotype profiling. The aim of the study is to characterize muscle invasive bladder carcinoma on a molecular level by incorporating scientific literature screening and signatures from omics profiling. Public domain omics signatures together with molecular features associated with muscle invasive bladder cancer were derived from literature mining to provide 286 unique protein-coding genes. These were integrated in a protein-interaction network to obtain a molecular functional map of the phenotype. This feature map pointed to three novel disease-associated pathways with plausible involvement in bladder cancer, namely Regulation of actin cytoskeleton, Neurotrophin signalling pathway and Endocytosis. Systematic integration approaches make it possible to study the molecular context of individual features reported as associated with a clinical phenotype and could potentially help to improve the molecular mechanistic description of the disorder. PMID:25569276
LICRE: unsupervised feature correlation reduction for lipidomics.
Wong, Gerard; Chan, Jeffrey; Kingwell, Bronwyn A; Leckie, Christopher; Meikle, Peter J
2014-10-01
Recent advances in high-throughput lipid profiling by liquid chromatography electrospray ionization tandem mass spectrometry (LC-ESI-MS/MS) have made it possible to quantify hundreds of individual molecular lipid species (e.g. fatty acyls, glycerolipids, glycerophospholipids, sphingolipids) in a single experimental run for hundreds of samples. This enables the lipidome of large cohorts of subjects to be profiled to identify lipid biomarkers significantly associated with disease risk, progression and treatment response. Clinically, these lipid biomarkers can be used to construct classification models for the purpose of disease screening or diagnosis. However, the inclusion of a large number of highly correlated biomarkers within a model may reduce classification performance, unnecessarily inflate associated costs of a diagnosis or a screen and reduce the feasibility of clinical translation. An unsupervised feature reduction approach can reduce feature redundancy in lipidomic biomarkers by limiting the number of highly correlated lipids while retaining informative features to achieve good classification performance for various clinical outcomes. Good predictive models based on a reduced number of biomarkers are also more cost effective and feasible from a clinical translation perspective. The application of LICRE to various lipidomic datasets in diabetes and cardiovascular disease demonstrated superior discrimination in terms of the area under the receiver operator characteristic curve while using fewer lipid markers when predicting various clinical outcomes. The MATLAB implementation of LICRE is available from http://ww2.cs.mu.oz.au/∼gwong/LICRE © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
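The general idea of unsupervised correlation reduction can be illustrated with a simple greedy filter. This is explicitly not the LICRE algorithm, whose details are not given in the abstract; it is a generic sketch in which a feature is kept only if its absolute Pearson correlation with every previously kept feature stays below a threshold. The function names, the threshold of 0.9 and the lipid labels are hypothetical.

```python
from math import sqrt
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length samples (0.0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def drop_correlated(
    features: Dict[str, Sequence[float]], threshold: float = 0.9
) -> List[str]:
    """Greedily keep features whose |r| with all kept features is < threshold.

    Features are visited in insertion order, so earlier entries win when
    two features are highly correlated. No outcome labels are used, which
    is what makes the filter unsupervised.
    """
    kept: List[str] = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical measurements for three lipid species across four samples.
lipids = {
    "lipid_A": [1.0, 2.0, 3.0, 4.0],
    "lipid_B": [2.0, 4.0, 6.0, 8.1],  # almost a rescaled copy of lipid_A
    "lipid_C": [4.0, 1.0, 3.0, 2.0],
}
selected = drop_correlated(lipids)
```

Because the result depends on visiting order, a real pipeline would need a principled ordering (for example by measurement reliability); that choice is left open here.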
A framework for feature extraction from hospital medical data with applications in risk prediction.
Tran, Truyen; Luo, Wei; Phung, Dinh; Gupta, Sunil; Rana, Santu; Kennedy, Richard Lee; Larkins, Ann; Venkatesh, Svetha
2014-12-30
Feature engineering is a time-consuming component of predictive modeling. We propose a versatile platform to automatically extract features for risk prediction, based on a pre-defined and extensible entity schema. The extraction is independent of disease type or risk prediction task. We contrast auto-extracted features to baselines generated from the Elixhauser comorbidities. Hospital medical records were transformed to event sequences, to which filters were applied to extract feature sets capturing diversity in temporal scales and data types. The features were evaluated on a readmission prediction task, comparing with baseline feature sets generated from the Elixhauser comorbidities. The prediction model was logistic regression with elastic net regularization. Prediction horizons of 1, 2, 3, 6, and 12 months were considered for four diverse diseases: diabetes, COPD, mental disorders and pneumonia, with derivation and validation cohorts defined on non-overlapping data-collection periods. For unplanned readmissions, the auto-extracted feature set using socio-demographic information and medical records outperformed baselines derived from the socio-demographic information and Elixhauser comorbidities over 20 settings (5 prediction horizons over 4 diseases). In particular, for 30-day prediction, the AUCs are: COPD-baseline: 0.60 (95% CI: 0.57, 0.63), auto-extracted: 0.67 (0.64, 0.70); diabetes-baseline: 0.60 (0.58, 0.63), auto-extracted: 0.67 (0.64, 0.69); mental disorders-baseline: 0.57 (0.54, 0.60), auto-extracted: 0.69 (0.64, 0.70); pneumonia-baseline: 0.61 (0.59, 0.63), auto-extracted: 0.70 (0.67, 0.72). The advantages of auto-extracted standard features from complex medical records, in a disease- and task-agnostic manner, were demonstrated. Auto-extracted features have good predictive power over multiple time horizons. Such feature sets have potential to form the foundation of complex automated analytic tasks.
La Rosa, Stefano; Sessa, Fausto; Capella, Carlo
2015-01-01
Acinar cell carcinomas (ACCs) of the pancreas are rare pancreatic neoplasms accounting for about 1-2% of pancreatic tumors in adults and about 15% in pediatric subjects. They show different clinical symptoms at presentation, different morphological features, different outcomes, and different molecular alterations. This heterogeneous clinicopathological spectrum may give rise to difficulties in the clinical and pathological diagnosis with consequential therapeutic and prognostic implications. The molecular mechanisms involved in the onset and progression of ACCs are still not completely understood, although in recent years, several attempts have been made to clarify the molecular mechanisms involved in ACC biology. In this paper, we will review the main clinicopathological and molecular features of pancreatic ACCs of both adult and pediatric subjects to give the reader a comprehensive overview of this rare tumor type.
Materna-Kiryluk, Anna; Kiryluk, Krzysztof; Burgess, Katelyn E; Bieleninik, Arkadiusz; Sanna-Cherchi, Simone; Gharavi, Ali G.; Latos-Bielenska, Anna
2014-01-01
Background: Copy number variants (CNVs) are increasingly recognized as an important cause of congenital malformations and likely explain over 16% of cases of CAKUT. Here, we illustrate how a molecular diagnosis of CNV can inform the clinical management of a pediatric patient presenting with CAKUT and other organ defects. Methods: We describe a 14-year-old girl with a large de novo deletion of chromosome 3q13.31-22.1 that disrupts 101 known genes and manifests with CAKUT, neurodevelopmental delay, agenesis of corpus callosum (ACC), cardiac malformations, electrolyte and endocrine disorders, skeletal abnormalities and dysmorphic features. We perform extensive annotation of the deleted region to prioritize genes for specific phenotypes and to predict future disease risk. Results: Our case defined new minimal chromosomal candidate regions for both CAKUT and ACC. Moreover, the presence of the CASR gene in the deleted interval predicted a diagnosis of hypocalciuric hypercalcemia, which was confirmed by serum and urine chemistries. Our gene annotation explained clinical hypothyroidism and predicted that the index case is at increased risk of thoracic aortic aneurysm, renal cell carcinoma and myeloproliferative disorder. Conclusions: Extended annotation of CNV regions refines diagnosis and uncovers previously unrecognized phenotypic features. This approach enables personalized treatment and prevention strategies in patients harboring genomic deletions. PMID:24292865
Bechtold-Dalla Pozza, Susanne; Hiedl, Stefan; Roeb, Julia; Lohse, Peter; Malik, Raleigh E.; Park, Soyoung; Durán-Prado, Mario; Rhodes, Simon J.
2012-01-01
Background/Aims: Recessive mutations in the LHX3 homeodomain transcription factor gene are associated with developmental disorders affecting the pituitary and nervous system. We describe pediatric patients with combined pituitary hormone deficiency (CPHD) who harbor a novel mutation in LHX3. Methods: Two female siblings from related parents were examined. Both patients had neonatal complications. The index patient had CPHD featuring deficiencies of GH, LH, FSH, PRL, and TSH, with later onset of ACTH deficiency. She also had a hypoplastic anterior pituitary, respiratory distress, hearing impairment, and limited neck rotation. The LHX3 gene was sequenced and the biochemical properties of the predicted altered proteins were characterized. Results: A novel homozygous mutation predicted to change amino acid 194 from threonine to arginine (T194R) was detected in both patients. This amino acid is conserved in the DNA-binding homeodomain. Computer modeling predicted that the T194R change would alter the homeodomain structure. The T194R protein did not bind tested LHX3 DNA recognition sites and did not activate the α-glycoprotein and PRL target genes. Conclusion: The T194R mutation affects a critical residue in the LHX3 protein. This study extends our understanding of the phenotypic features, molecular mechanism, and developmental course associated with mutations in the LHX3 gene. PMID:22286346
Prediction of stream volatilization coefficients
Rathbun, Ronald E.
1990-01-01
Equations are developed for predicting the liquid-film and gas-film reference-substance parameters for quantifying volatilization of organic solutes from streams. Molecular weight and molecular-diffusion coefficients of the solute are used as correlating parameters. Equations for predicting molecular-diffusion coefficients of organic solutes in water and air are developed, with molecular weight and molal volume as parameters. Mean absolute errors of prediction for diffusion coefficients in water are 9.97% for the molecular-weight equation, 6.45% for the molal-volume equation. The mean absolute error for the diffusion coefficient in air is 5.79% for the molal-volume equation. Molecular weight is not a satisfactory correlating parameter for diffusion in air because two equations are necessary to describe the values in the data set. The best predictive equation for the liquid-film reference-substance parameter has a mean absolute error of 5.74%, with molal volume as the correlating parameter. The best equation for the gas-film parameter has a mean absolute error of 7.80%, with molecular weight as the correlating parameter.
NASA Astrophysics Data System (ADS)
Ghavami, Raouf; Sadeghi, Faridoon; Rasouli, Zolikha; Djannati, Farhad
2012-12-01
Experimental values for the 13C NMR chemical shifts (ppm, TMS = 0) at 300 K ranging from 96.28 ppm (C4' of indole derivative 17) to 159.93 ppm (C4' of indole derivative 23) relative to deuterated chloroform (CDCl3, 77.0 ppm) or dimethylsulfoxide (DMSO, 39.50 ppm) as internal reference in CDCl3 or DMSO-d6 solutions have been collected from the literature for thirty 2-functionalized 5-(methylsulfonyl)-1-phenyl-1H-indole derivatives containing different substituted groups. Effective quantitative structure-property relationship (QSPR) models were built using a hybrid method combining a genetic algorithm (GA) based on stepwise selection multiple linear regression (SWS-MLR) as feature-selection tools and correlation models between each carbon atom of the indole derivative and calculated descriptors. Each compound was depicted by molecular structural descriptors that encode constitutional, topological, geometrical, electrostatic, and quantum chemical features. The accuracy of all developed models was confirmed using different types of internal and external procedures and various statistical tests. Furthermore, the domain of applicability for each model, which indicates the area of reliable predictions, was defined.
Barbieri, Federica; Albertelli, Manuela; Grillo, Federica; Mohamed, Amira; Saveanu, Alexandru; Barlier, Anne; Ferone, Diego; Florio, Tullio
2014-04-01
Neuroendocrine tumors (NETs) are heterogeneous neoplasms with respect to molecular characteristics and clinical outcome. Although slow-growing, NETs are often diagnosed late, already showing invasion of adjacent tissues and metastases. Precise knowledge of NET biological and molecular features has opened the door to the identification of novel pharmacological targets. Therapeutic options include somatostatin analogs, alone or in combination with interferon-α, multi-targeted tyrosine kinase inhibitors (e.g. sunitinib) or mammalian target of rapamycin (mTOR) inhibitors (e.g. everolimus). Antiangiogenic approaches and anti-insulin-like growth factor receptor (IGFR) compounds have also been proposed as combination therapies with the aforementioned compounds. This review will focus on recent studies that have improved therapeutic strategies in NETs, discussing management challenges such as drug resistance development as well as focusing on the need for predictive biomarkers to design distinct drug combinations and optimize pharmacological control. Copyright © 2013 Elsevier Ltd. All rights reserved.
Rosenfeld, Carine; Serra, Christophe; Brochon, Cyril; Hadziioannou, Georges
2008-10-01
The influence of interdigital multilamination micromixer characteristics on monomer conversions, molecular weights and especially on the polydispersity index of block copolymers synthesized continuously in two microtube reactors is investigated. The micromixers are used to mix, before copolymerization, a polymer solution with different viscosities and the second monomer. Different geometries of micromixer (number of microchannels, characteristic lengths) have been studied. It was found that polydispersity indices of the copolymers follow a linear relationship with the Reynolds number in the micromixer, represented by a form factor. Thus, beside the operating conditions (nature of the first block and comonomer flow rate), the choice of the micromixer geometry and dimension is essential to control the copolymerization in terms of molecular weights and polydispersity indices. This linear correlation allows the prediction of copolymer features. It can also be a new method to optimize existing micromixers or design other geometries so that mixing could be more efficient.
Molecular detection with terahertz waves based on absorption-induced transparency metamaterials
NASA Astrophysics Data System (ADS)
G. Rodrigo, Sergio; Martín-Moreno, L.
2016-10-01
A system for the detection of spectral signatures of chemical compounds in the terahertz regime is presented. The system consists of a holey metal film in which the presence of a given substance provokes the appearance of spectral features in transmission and reflection induced by the molecular specimen. These induced effects can be regarded as an extraordinary optical transmission phenomenon called absorption-induced transparency (AIT). The phenomenon consists precisely in the appearance of peaks in transmission and dips in reflection after sputtering of a chemical compound onto an initially opaque holey metal film. The spectral signatures due to AIT occur unexpectedly close to the absorption energies of the molecules. The presence of a target chemical compound would thus be revealed as a strong drop in reflectivity measurements. We theoretically predict that the AIT-based system would serve to detect hydrocyanic acid (HCN) at low concentrations.
Hu, Yan-Hong; Chen, Xiao-Ming; Yang, Pu; Ding, Wei-Feng
2018-04-01
Ericerus pela Chavannes (Hemiptera: Coccoidae) is an economically important scale insect because the second instar males secrete a harvestable wax-like substance. In this study, we report the molecular cloning of a fatty acyl-CoA reductase gene (EpFAR) of E. pela. We predicted a 520-aa protein with the FAR family features from the deduced amino acid sequence. The EpFAR mRNA was expressed in all five tested tissues: testis, alimentary canal, fat body, Malpighian tubules, and mostly in cuticle. The EpFAR protein was localized by immunofluorescence only in the wax glands and testis. EpFAR expression in High Five insect cells documented that the recombinant EpFAR reduced 26-0:(S) CoA and to its corresponding alcohol. The data illuminate the molecular mechanism for fatty alcohol biosynthesis in a beneficial insect, E. pela. © 2017 Wiley Periodicals, Inc.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moberg, Daniel R.; Straight, Shelby C.; Knight, Christopher
Here, an unambiguous assignment of the vibrational spectra of ice Ih remains a matter of debate. This study demonstrates that an accurate representation of many-body interactions between water molecules, combined with an explicit treatment of nuclear quantum effects through many-body molecular dynamics (MB-MD), leads to a unified interpretation of the vibrational spectra of ice Ih in terms of the structure and dynamics of the underlying hydrogen-bond network. All features of the infrared and Raman spectra in the OH stretching region can be unambiguously assigned by taking into account both the symmetry and the delocalized nature of the lattice vibrations as well as the local electrostatic environment experienced by each water molecule within the crystal. The high level of agreement with experiment raises prospects for predictive MB-MD simulations that, complementing analogous measurements, will provide molecular-level insights into fundamental processes taking place in bulk ice and on ice surfaces under different thermodynamic conditions.
Manoj Kumar, Palanivelu; Karthikeyan, Chandrabose; Hari Narayana Moorthy, Narayana Subbiah; Trivedi, Piyush
2006-11-01
In the present paper, quantitative structure activity relationship (QSAR) approach was applied to understand the affinity and selectivity of a novel series of triaryl imidazole derivatives towards glucagon receptor. Statistically significant and highly predictive QSARs were derived for glucagon receptor inhibition by triaryl imidazoles using QuaSAR descriptors of molecular operating environment (MOE) employing computer-assisted multiple regression procedure. The generated QSAR models revealed that factors related to hydrophobicity, molecular shape and geometry predominantly influences glucagon receptor binding affinity of the triaryl imidazoles indicating the relevance of shape specific steric interactions between the molecule and the receptor. Further, QSAR models formulated for selective inhibition of glucagon receptor over p38 mitogen activated protein (MAP) kinase of the compounds in the series highlights that the same structural features, which influence the glucagon receptor affinity, also contribute to their selective inhibition.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wilbois, Timo; Helm, Hanspeter
2011-11-15
Strong-field ionization of molecular hydrogen is studied at wavelengths ranging from 300 to 800 nm using pulses of 100-fs duration. We find that over this wide wavelength range, from nominally 4-photon to 11-photon ionization, resonance features dominate the ionization probability at intensities below 10^14 W/cm^2. Photoelectron momentum maps recorded by an imaging spectrometer are analyzed to identify the wavelength-dependent ionization pathways in single ionization of molecular hydrogen. A number of models, some empirical, which are appropriate for a quantitative interpretation of the spectra and the ionization yield are introduced. A near-absolute comparison of measured ionization yields at 398 nm is made with the predictions based on a numerical solution [Y. V. Vanne and A. Saenz, Phys. Rev. A 79, 023421 (2009)] of the time-dependent Schroedinger equation for two correlated electrons.
Relationship Between Frequency and Deflection Angle in the DNA Prism
Chen, Zhen; Dorfman, Kevin D.
2013-01-01
The DNA prism is a modification of the standard pulsed-field electrophoresis protocol to provide a continuous separation, where the DNA are deflected at an angle that depends on their molecular weight. The standard switchback model for the DNA prism predicts a monotonic increase in the deflection angle as a function of the frequency for switching the field until a plateau regime is reached. However, experiments indicate that the deflection angle achieves a maximum value before decaying to a size-independent value at high frequencies. Using Brownian dynamics simulations, we show that the maximum in the deflection angle is related to the reorientation time for the DNA and the decay in deflection angle at high frequencies is due to inadequate stretching. The generic features of the dependence of the deflection angle on molecular weight, switching frequency, and electric field strength explain a number of experimental phenomena. PMID:23410375
NASA Astrophysics Data System (ADS)
Yakub, Eugene; Ronchi, Claudio; Staicu, Dragos
2007-09-01
Results of molecular dynamics (MD) simulation of UO2 in a wide temperature range are presented and discussed. A new approach to the calibration of a partly ionic Busing-Ida-type model is proposed. A potential parameter set is obtained reproducing the experimental density of solid UO2 in a wide range of temperatures. A conventional simulation of the high-temperature stoichiometric UO2 on large MD cells, based on a novel fast method of computation of Coulomb forces, reveals characteristic features of a premelting λ transition at a temperature near to that experimentally observed (Tλ = 2670 K). A strong deviation from the Arrhenius behavior of the oxygen self-diffusion coefficient was found in the vicinity of the transition point. Predictions for liquid UO2, based on the same potential parameter set, are in good agreement with existing experimental data and theoretical calculations.
Creation of Rydberg Polarons in a Bose Gas
NASA Astrophysics Data System (ADS)
Schmidt, Richard
2017-04-01
In this talk we review the theory of various types of Bose polarons that can be realized in ultracold atomic systems. We then report the spectroscopic observation of Rydberg polarons in a Bose gas, which is in excellent agreement with theoretical predictions. This novel type of polaron is created by excitation of Rydberg atoms in a strontium Bose-Einstein condensate and it is distinguished by the occupation of a large number of bound molecular states. The cross-over from few-body bound molecular oligomers to many-body polaron features is described with a functional determinant theory that solves an extended Froehlich Hamiltonian for an impurity in a Bose gas. The detailed analysis of the red-detuned tail of the excitation spectrum describes the contribution from the region of highest density in the condensate and provides a clear signature of Rydberg polarons. This work has been performed in collaboration with groups at Rice University, Harvard University, and the TU Vienna.
NASA Astrophysics Data System (ADS)
Babin, Volodymr; Baucom, Jason; Darden, Thomas; Sagui, Celeste
2006-03-01
We have investigated to what extent molecular dynamics (MD) simulations can reproduce DNA sequence-specific features, given different electrostatic descriptions and different cell environments. For this purpose, we have carried out multiple unrestrained MD simulations of the duplex d(CCAACGTTGG)2. With respect to the electrostatic descriptions, two different force fields were studied: a traditional description based on atomic point charges and a polarizable force field. With respect to the cell environment, the difference between crystal and solution environments is emphasized, as well as the structural importance of divalent ions. By imposing the correct experimental unit cell environment, an initial configuration with two ideal B-DNA duplexes in the unit cell is shown to converge to the crystallographic structure. To the best of our knowledge, this provides the first example of a multiple-nanosecond MD trajectory that shows an ideal structure converging to an experimental one, with a significant decay of the RMSD.
NASA Astrophysics Data System (ADS)
Ayyappan, S.; Sundaraganesan, N.; Aroulmoji, V.; Murano, E.; Sebastian, S.
2010-09-01
FT-IR and FT-Raman spectral studies of methotrexate (MTX) were carried out. The equilibrium geometry, various bonding features and harmonic vibrational frequencies of MTX have been investigated with the help of B3LYP density functional theory (DFT) using 6-31G(d) as the basis set. Detailed analysis of the vibrational spectra has been made with the aid of theoretically predicted vibrational frequencies. The vibrational analysis confirms the differently acting ring modes, steric repulsion, conjugation and back-donation. The energy and oscillator strength calculated by Time-Dependent Density Functional Theory (TD-DFT) complement the experimental findings. The calculated HOMO and LUMO energies show that charge transfer occurs within the molecule. Good correlations between the experimental 1H and 13C NMR chemical shifts in DMSO solution and calculated GIAO shielding tensors were found.
Jeong, Youngtae; Hoang, Ngoc T.; Lovejoy, Alexander; Stehr, Henning; Newman, Aaron M.; Gentles, Andrew J.; Kong, William; Truong, Diana; Martin, Shanique; Chaudhuri, Aadel; Heiser, Diane; Zhou, Li; Say, Carmen; Carter, Justin N.; Hiniker, Susan M.; Loo, Billy W.; West, Robert B.; Beachy, Philip; Alizadeh, Ash A.; Diehn, Maximilian
2016-01-01
The pathogenesis of lung squamous cell carcinoma (LSCC) remains incompletely understood, and biomarkers predicting treatment response remain lacking. Here we describe novel murine LSCC models driven by loss of Trp53 and Keap1, both of which are frequently mutated in human LSCCs. Homozygous inactivation of Keap1 or Trp53 promoted airway basal stem cell (ABSC) self-renewal, suggesting that mutations in these genes lead to expansion of mutant stem cell clones. Deletion of Trp53 and Keap1 in ABSCs, but not more differentiated tracheal cells, produced tumors recapitulating histological and molecular features of human LSCCs, indicating that they represent the likely cell of origin in this model. Deletion of Keap1 promoted tumor aggressiveness, metastasis, and resistance to oxidative stress and radiotherapy (RT). KEAP1/NRF2 mutation status predicted risk of local recurrence after RT in non-small cell lung cancer (NSCLC) patients and could be non-invasively identified in circulating tumor DNA. Thus, KEAP1/NRF2 mutations could serve as predictive biomarkers for personalization of therapeutic strategies for NSCLCs. PMID:27663899
PPSP: prediction of PK-specific phosphorylation site with Bayesian decision theory.
Xue, Yu; Li, Ao; Wang, Lirong; Feng, Huanqing; Yao, Xuebiao
2006-03-20
As a reversible and dynamic post-translational modification (PTM) of proteins, phosphorylation plays essential regulatory roles in a broad spectrum of biological processes. Although many studies have addressed the molecular mechanism of phosphorylation dynamics, the intrinsic features of substrate specificity are still elusive and remain to be delineated. In this work, we present a novel, versatile and comprehensive program, PPSP (Prediction of PK-specific Phosphorylation site), built on Bayesian decision theory (BDT). PPSP can predict potential phosphorylation sites accurately for approximately 70 PK (protein kinase) groups. Compared with four existing tools (Scansite, NetPhosK, KinasePhos and GPS), PPSP is more accurate and powerful. Moreover, PPSP also provides predictions for many novel PKs, such as TRK, mTOR, SyK and MET/RON. The accuracy for these novel PKs is also satisfactory. Taken together, we propose that PPSP could be a powerful tool for experimentalists who focus on identifying phosphorylation substrates and their PK-specific sites. Moreover, the BDT strategy could also be a general approach for other PTMs, such as sumoylation and ubiquitination.
Gastric biomarkers: a global review.
Baniak, Nick; Senger, Jenna-Lynn; Ahmed, Shahid; Kanthan, S C; Kanthan, Rani
2016-08-11
Gastric cancer is an aggressive disease with a poor 5-year survival and large global burden of disease. The disease is biologically and genetically heterogeneous with a poorly understood carcinogenesis at the molecular level. Despite the many prognostic, predictive, and therapeutic biomarkers investigated to date, gastric cancer continues to be detected at an advanced stage with resultant poor clinical outcomes. This is a global review of gastric biomarkers with an emphasis on HER2, E-cadherin, fibroblast growth factor receptor, mammalian target of rapamycin, and hepatocyte growth factor receptor as well as sections on microRNAs, long noncoding RNAs, matrix metalloproteinases, PD-L1, TP53, and microsatellite instability. A deeper understanding of the pathogenesis and biological features of gastric cancer, including the identification and characterization of diagnostic, prognostic, predictive, and therapeutic biomarkers, hopefully will provide improved clinical outcomes.
Lin, Daniel W; Crawford, E David; Keane, Thomas; Evans, Brent; Reid, Julia; Rajamani, Saradha; Brown, Krystal; Gutin, Alexander; Tward, Jonathan; Scardino, Peter; Brawer, Michael; Stone, Steven; Cuzick, Jack
2018-06-01
A combined clinical cell-cycle risk (CCR) score that incorporates prognostic molecular and clinical information has been recently developed and validated to improve prostate cancer mortality (PCM) risk stratification over clinical features alone. As clinical features are currently used to select men for active surveillance (AS), we developed and validated a CCR score threshold to improve the identification of men with low-risk disease who are appropriate for AS. The score threshold was selected based on the 90th percentile of CCR scores among men who might typically be considered for AS based on NCCN low/favorable-intermediate risk criteria (CCR = 0.8). The threshold was validated using 10-year PCM in an unselected, conservatively managed cohort and in the subset of the same cohort after excluding men with high-risk features. The clinical effect was evaluated in a contemporary clinical cohort. In the unselected validation cohort, men with CCR scores below the threshold had a predicted mean 10-year PCM of 2.7%, and the threshold significantly dichotomized low- and high-risk disease (P = 1.2 × 10^-5). After excluding high-risk men from the validation cohort, men with CCR scores below the threshold had a predicted mean 10-year PCM of 2.3%, and the threshold significantly dichotomized low- and high-risk disease (P = 0.020). There were no prostate cancer-specific deaths in men with CCR scores below the threshold in either analysis. The proportion of men in the clinical testing cohort identified as candidates for AS was substantially higher using the threshold (68.8%) compared to clinicopathologic features alone (42.6%), while mean 10-year predicted PCM risks remained essentially identical (1.9% vs. 2.0%, respectively). The CCR score threshold appropriately dichotomized patients into low- and high-risk groups for 10-year PCM, and may enable more appropriate selection of patients for AS. Copyright © 2018 Elsevier Inc. All rights reserved.
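The threshold-selection step described above (take the 90th percentile of scores in a reference group, then dichotomize on it) can be sketched as follows. The abstract does not say which percentile definition was used or how ties at the threshold were handled, so the nearest-rank definition and the "at or below" rule here are assumptions, and the scores are invented values chosen only to exercise the code.

```python
import math


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: the smallest observed value such that at
    least pct% of the observations are at or below it (an assumed definition)."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def is_low_risk(score, threshold):
    """Hypothetical rule: scores at or below the threshold count as low risk."""
    return score <= threshold


# Invented reference-group scores; not data from the study.
reference_scores = [0.2, -0.5, 0.7, 0.1, -1.0, 0.4, 0.6, -0.2, 1.5, 0.0]
threshold = percentile_nearest_rank(reference_scores, 90)

# Apply the threshold to a second invented cohort.
new_scores = [-0.3, 0.7, 0.9, 2.1]
flags = [is_low_risk(s, threshold) for s in new_scores]
```

With ten reference values the nearest-rank 90th percentile is the ninth-smallest value, so roughly nine in ten reference scores fall at or below the threshold by construction.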
Prediction of protein-protein interactions based on PseAA composition and hybrid feature selection.
Liu, Liang; Cai, Yudong; Lu, Wencong; Feng, Kaiyan; Peng, Chunrong; Niu, Bing
2009-03-06
Based on pseudo amino acid (PseAA) composition and a novel hybrid feature selection framework, this paper presents a computational system to predict protein-protein interactions (PPIs) using 8796 protein pairs. These pairs are coded by PseAA composition, resulting in 114 features. A hybrid feature selection system, mRMR-KNNs-wrapper, is applied to obtain an optimized feature set by excluding poorly performing and/or redundant features, resulting in 103 remaining features. Using the optimized 103-feature subset, a prediction model is trained and tested in the k-nearest neighbors (KNNs) learning system. This prediction model achieves an overall prediction accuracy of 76.18%, evaluated by 10-fold cross-validation, which is 1.46% higher than using the initial 114 features and 6.51% higher than using the 20 features coded by amino acid composition. The PPI predictor developed for this research is available for public use at http://chemdata.shu.edu.cn/ppi.
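The evaluation protocol named above, a k-nearest-neighbors classifier scored by 10-fold cross-validation, can be illustrated with a small pure-Python sketch. The distance metric, the value of k, the interleaved fold assignment and the two-dimensional toy data are all assumptions; the abstract does not state these details, and the mRMR and wrapper selection stages are not reproduced here.

```python
from collections import Counter


def knn_predict(train, query, k=3):
    """Majority vote among the k training points nearest to `query`.

    `train` is a list of (feature_vector, label) pairs; squared Euclidean
    distance is an assumed choice of metric.
    """
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def cross_val_accuracy(data, n_folds=10, k=3):
    """Accuracy over n_folds folds; sample i goes to fold i % n_folds."""
    correct = 0
    for fold in range(n_folds):
        test = data[fold::n_folds]
        train = [item for i, item in enumerate(data) if i % n_folds != fold]
        correct += sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(data)


# Invented, well-separated toy data: class 0 near the origin, class 1 near (10, 10).
data = [((i * 0.1, i * 0.1), 0) for i in range(10)]
data += [((10 + i * 0.1, 10 + i * 0.1), 1) for i in range(10)]
accuracy = cross_val_accuracy(data, n_folds=10, k=3)
```

Each sample is held out exactly once, so the reported accuracy is the fraction of all samples classified correctly while absent from the training set.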
Ferkol, Thomas W.; Davis, Stephanie D.; Lee, Hye-Seung; Rosenfeld, Margaret; Dell, Sharon D.; Sagel, Scott D.; Milla, Carlos; Olivier, Kenneth N.; Sullivan, Kelli M.; Zariwala, Maimoona A.; Pittman, Jessica E.; Shapiro, Adam J.; Carson, Johnny L.; Krischer, Jeffrey; Hazucha, Milan J.
2016-01-01
Rationale: Primary ciliary dyskinesia (PCD), a genetically heterogeneous, recessive disorder of motile cilia, is associated with distinct clinical features. Diagnostic tests, including ultrastructural analysis of cilia, nasal nitric oxide measurements, and molecular testing for mutations in PCD genes, have inherent limitations. Objectives: To define a statistically valid combination of systematically defined clinical features that strongly associates with PCD in children and adolescents. Methods: Investigators at seven North American sites in the Genetic Disorders of Mucociliary Clearance Consortium prospectively and systematically assessed individuals (aged 0–18 yr) referred due to high suspicion for PCD. The investigators defined specific clinical questions for the clinical report form based on expert opinion. Diagnostic testing was performed using standardized protocols and included nasal nitric oxide measurement, ciliary biopsy for ultrastructural analysis of cilia, and molecular genetic testing for PCD-associated genes. Final diagnoses were assigned as “definite PCD” (hallmark ultrastructural defects and/or two mutations in a PCD-associated gene), “probable/possible PCD” (no ultrastructural defect or genetic diagnosis, but compatible clinical features and nasal nitric oxide level in PCD range), and “other diagnosis or undefined.” Criteria were developed to define early childhood clinical features on the basis of responses to multiple specific queries. Each defined feature was tested by logistic regression. Sensitivity and specificity analyses were conducted to define the most robust set of clinical features associated with PCD. 
Measurements and Main Results: From 534 participants 18 years of age and younger, 205 were identified as having “definite PCD” (including 164 with two mutations in a PCD-associated gene), 187 were categorized as “other diagnosis or undefined,” and 142 were defined as having “probable/possible PCD.” Participants with “definite PCD” were compared with the “other diagnosis or undefined” group. Four criteria-defined clinical features were statistically predictive of PCD: laterality defect; unexplained neonatal respiratory distress; early-onset, year-round nasal congestion; and early-onset, year-round wet cough (adjusted odds ratios of 7.7, 6.6, 3.4, and 3.1, respectively). The sensitivity and specificity based on the number of criteria-defined clinical features were four features, 0.21 and 0.99, respectively; three features, 0.50 and 0.96, respectively; and two features, 0.80 and 0.72, respectively. Conclusions: Systematically defined early clinical features could help identify children, including infants, likely to have PCD. Clinical trial registered with ClinicalTrials.gov (NCT00323167). PMID:27070726
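The sensitivity and specificity figures above come from counting how many of the four criteria-defined features each participant has and calling the screen positive when the count reaches a chosen minimum. A minimal sketch of that calculation follows; the function name and the eight-participant data set are invented for illustration and do not reproduce the study's data.

```python
def sensitivity_specificity(feature_counts, has_disease, min_features):
    """Screen is positive when a participant has >= min_features features.

    Returns (sensitivity, specificity) = (TP / (TP + FN), TN / (TN + FP)).
    """
    tp = fn = tn = fp = 0
    for count, diseased in zip(feature_counts, has_disease):
        positive = count >= min_features
        if diseased and positive:
            tp += 1
        elif diseased:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Invented toy cohort: number of clinical features per participant and true status.
counts = [4, 3, 2, 1, 0, 2, 1, 0]
status = [True, True, True, True, False, False, False, False]

sens2, spec2 = sensitivity_specificity(counts, status, min_features=2)
sens4, spec4 = sensitivity_specificity(counts, status, min_features=4)
```

Raising the minimum number of features trades sensitivity for specificity, which is the same pattern the abstract reports when moving from two features to four.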
Wang, ShaoPeng; Zhang, Yu-Hang; Lu, Jing; Cui, Weiren; Hu, Jerry; Cai, Yu-Dong
2016-01-01
The development of biochemistry and molecular biology has revealed an increasingly important role of compounds in several biological processes. Like the aptamer-protein interaction, the aptamer-compound interaction is attracting increasing attention. However, it is time-consuming to select proper aptamers against compounds using traditional methods, such as exponential enrichment. Thus, there is an urgent need to design effective computational methods for finding effective aptamers against compounds. This study attempted to extract important features for aptamer-compound interactions using feature selection methods, namely Maximum Relevance Minimum Redundancy and incremental feature selection. Each aptamer-compound pair was represented by properties derived from the aptamer and compound, including frequencies of single nucleotides and dinucleotides for the aptamer, as well as the constitutional, electrostatic, quantum-chemical, and space conformational descriptors of the compounds. As a result, some important features were obtained. To confirm the importance of the obtained features, we further discussed the associations between them and aptamer-compound interactions. In addition, an optimal prediction model based on the nearest neighbor algorithm was built to identify aptamer-compound interactions, which has the potential to be a useful tool for the identification of novel aptamer-compound interactions. The program is available upon request. PMID:26955638
Yugandhar, K; Gromiha, M Michael
2014-09-01
Protein-protein interactions are intrinsic to virtually every cellular process. Predicting the binding affinity of protein-protein complexes is one of the challenging problems in computational and molecular biology. In this work, we related sequence features of protein-protein complexes to their binding affinities using machine learning approaches. We set up a database of 185 protein-protein complexes for which the interacting pairs are heterodimers and experimental binding affinities are available. We developed a set of 610 features from the sequences of the protein complexes and selected specific features with the Ranker search method, which combines an attribute evaluator with the Ranker method. We analyzed several machine learning algorithms for discriminating protein-protein complexes into high and low affinity groups based on their Kd values. Our results showed a 10-fold cross-validation accuracy of 76.1% with the combination of nine features using support vector machines. Further, we observed an accuracy of 83.3% on an independent test set of 30 complexes. We suggest that our method would serve as an effective tool for identifying the interacting partners in protein-protein interaction networks and human-pathogen interactions based on the strength of interactions. © 2014 Wiley Periodicals, Inc.
Radiomics-based Prognosis Analysis for Non-Small Cell Lung Cancer
NASA Astrophysics Data System (ADS)
Zhang, Yucheng; Oikonomou, Anastasia; Wong, Alexander; Haider, Masoom A.; Khalvati, Farzad
2017-04-01
Radiomics characterizes tumor phenotypes by extracting large numbers of quantitative features from radiological images. Radiomic features have been shown to provide prognostic value in predicting clinical outcomes in several studies. However, several challenges including feature redundancy, unbalanced data, and small sample sizes have led to relatively low predictive accuracy. In this study, we explore different strategies for overcoming these challenges and improving predictive performance of radiomics-based prognosis for non-small cell lung cancer (NSCLC). CT images of 112 patients (mean age 75 years) with NSCLC who underwent stereotactic body radiotherapy were used to predict recurrence, death, and recurrence-free survival using a comprehensive radiomics analysis. Different feature selection and predictive modeling techniques were used to determine the optimal configuration of prognosis analysis. To address feature redundancy, comprehensive analysis indicated that Random Forest models and Principal Component Analysis were optimum predictive modeling and feature selection methods, respectively, for achieving high prognosis performance. To address unbalanced data, Synthetic Minority Over-sampling technique was found to significantly increase predictive accuracy. A full analysis of variance showed that data endpoints, feature selection techniques, and classifiers were significant factors in affecting predictive accuracy, suggesting that these factors must be investigated when building radiomics-based predictive models for cancer prognosis.
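The over-sampling idea mentioned above can be sketched in a few lines. This is a simplified, SMOTE-like interpolation written in plain Python under stated assumptions: the function name, the toy points, and the parameter defaults are hypothetical, and it is not the implementation used in the study.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority samples by interpolating between
    a randomly chosen minority sample and one of its k nearest minority
    neighbours (squared Euclidean distance)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        u = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(base, other)))
    return synthetic


# Toy two-feature minority class, purely illustrative.
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = smote_like(minority_points, n_new=4)
```

Because every synthetic point lies on a segment between two real minority samples, over-sampling should be applied to training folds only, so that interpolated copies of a sample never leak into the data used for evaluation.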
Arnaud, Pauline; Hanna, Nadine; Aubart, Mélodie; Leheup, Bruno; Dupuis-Girod, Sophie; Naudion, Sophie; Lacombe, Didier; Milleron, Olivier; Odent, Sylvie; Faivre, Laurence; Bal, Laurence; Edouard, Thomas; Collod-Beroud, Gwenaëlle; Langeois, Maud; Spentchian, Myrtille; Gouya, Laurent; Jondeau, Guillaume; Boileau, Catherine
2017-02-01
Marfan syndrome (MFS) is an autosomal-dominant connective tissue disorder usually associated with heterozygous mutations in the gene encoding fibrillin-1 (FBN1). Homozygous and compound heterozygous cases are rare events and have been associated with a severe clinical presentation. We report unexpected findings of homozygosity and compound heterozygosity in the course of molecular diagnosis of heterozygous MFS and compare the findings with published cases. In the context of molecular diagnosis of heterozygous MFS, systematic sequencing of the FBN1 gene was performed in 2500 probands referred nationwide. Of these, 1400 probands carried a heterozygous mutation in this gene. Unexpectedly, among them four homozygous cases (0.29%) and five compound heterozygous cases (0.36%) were identified (total: 0.64%). Interestingly, none of these cases carried two premature termination codon mutations in the FBN1 gene. Clinical features for these carriers and their families were gathered and compared. There was a large spectrum of disease severity in probands carrying two mutated FBN1 alleles, but none of them presented extremely severe manifestations of MFS in any system compared with carriers of only one mutated FBN1 allele. This observation is not in line with the severe clinical features reported in the literature for four homozygous and three compound heterozygous probands. Homozygotes and compound heterozygotes were unexpectedly identified in the course of molecular diagnosis of MFS. Contrary to previous reports, the presence of two mutated alleles was not associated with severe forms of MFS. Although homozygosity and compound heterozygosity are rarely found in molecular diagnosis, they should not be overlooked, especially among consanguineous families. However, no predictive evaluation of severity should be provided. Published by the BMJ Publishing Group Limited.
Zlobec, Inti; Bihl, Michel; Foerster, Anja; Rufle, Alex; Lugli, Alessandro
2011-11-01
CpG island methylator phenotype (CIMP) is being investigated for its role in the molecular and prognostic classification of colorectal cancer patients but is also emerging as a factor with the potential to influence clinical decision-making. We report a comprehensive analysis of clinico-pathological and molecular features (KRAS, BRAF and microsatellite instability, MSI) as well as of selected tumour- and host-related protein markers characterizing CIMP-high (CIMP-H), -low, and -negative colorectal cancers. Immunohistochemical analysis for 48 protein markers and molecular analysis of CIMP (CIMP-H: ≥ 4/5 methylated genes), MSI (MSI-H: ≥ 2 instable genes), KRAS, and BRAF were performed on 337 colorectal cancers. Simple and multiple regression analysis and receiver operating characteristic (ROC) curve analysis were performed. CIMP-H was found in 24 cases (7.1%) and linked (p < 0.0001) to more proximal tumour location, BRAF mutation, MSI-H, MGMT methylation (p = 0.022), advanced pT classification (p = 0.03), mucinous histology (p = 0.069), and less frequent KRAS mutation (p = 0.067) compared to CIMP-low or -negative cases. Of the 48 protein markers, decreased levels of RKIP (p = 0.0056), EphB2 (p = 0.0045), CK20 (p = 0.002), and Cdx2 (p < 0.0001) and increased numbers of CD8+ intra-epithelial lymphocytes (p < 0.0001) were related to CIMP-H, independently of MSI status. In addition to the expected clinico-pathological and molecular associations, CIMP-H colorectal cancers are characterized by a loss of protein markers associated with differentiation, and metastasis suppression, and have increased CD8+ T-lymphocytes regardless of MSI status. In particular, Cdx2 loss seems to strongly predict CIMP-H in both microsatellite-stable (MSS) and MSI-H colorectal cancers. Cdx2 is proposed as a surrogate marker for CIMP-H. Copyright © 2011 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd.
Horiuchi, Katsumi; Ariga, Tadashi; Fujioka, Hirotaka; Kawashima, Kunihiro; Yamamoto, Yuhei; Igawa, Hiroharu; Sugihara, Tsuneki; Sakiyama, Yukio
2005-05-01
Treacher Collins Syndrome (TCS) (OMIM 154500) is a congenital craniofacial disorder inherited as an autosomal dominant trait. The gene responsible for TCS, TCOF1, was mapped to 5q32-33.1 and identified in 1996. Since then, TCOF1 mutations in patients with TCS have been reported from Europe and North and South America; however, no TCS cases from an Asian country have been molecularly characterized. Here we report the first mutational analysis of 11 Japanese patients with TCS and have identified TCOF1 mutations in 9 of them. The mutations detected were varied, but all are predicted to result in a truncated gene product, known as treacle. One frequently reported mutation was found among our cases, but no missense mutations were detected. These findings are similar to those of previous studies of TCS in other populations. We have speculated about the molecular mechanisms of the mutations in most cases. Collectively, we have defined some of the characteristic molecular features commonly observed in TCS patients, irrespective of racial difference. © 2005 Wiley-Liss, Inc.
Molecular Predictors of 3D Morphogenesis by Breast Cancer Cell Lines in 3D Culture
DOE Office of Scientific and Technical Information (OSTI.GOV)
Han, Ju; Chang, Hang; Giricz, Orsi
Correlative analysis of molecular markers with phenotypic signatures is the simplest model for hypothesis generation. In this paper, a panel of 24 breast cell lines was grown in 3D culture, their morphology was imaged through phase contrast microscopy, and computational methods were developed to segment and represent each colony at multiple dimensions. Subsequently, subpopulations from these morphological responses were identified through consensus clustering to reveal three clusters of round, grape-like, and stellate phenotypes. In some cases, cell lines with particular pathobiological phenotypes clustered together (e.g., ERBB2 amplified cell lines sharing the same morphometric properties as the grape-like phenotype). Next, associations with molecular features were realized through (i) differential analysis within each morphological cluster, and (ii) regression analysis across the entire panel of cell lines. In both cases, the dominant genes that are predictive of the morphological signatures were identified. Specifically, PPARγ has been associated with the invasive stellate morphological phenotype, which corresponds to triple-negative pathobiology. PPARγ has been validated through two supporting biological assays.
Trezza, Alfonso; Bernini, Andrea; Langella, Andrea; Ascher, David B; Pires, Douglas E V; Sodi, Andrea; Passerini, Ilaria; Pelo, Elisabetta; Rizzo, Stanislao; Niccolai, Neri; Spiga, Ottavia
2017-10-01
The aim of this article is to report the investigation of the structural features of ABCA4, a protein associated with a genetic retinal disease. A new database collecting knowledge of ABCA4 structure may facilitate predictions about the possible functional consequences of gene mutations observed in clinical practice. In order to correlate structural and functional effects of the observed mutations, the structure of mouse P-glycoprotein was used as a template for homology modeling. The obtained structural information and genetic data are the basis of our relational database (ABCA4Database). Sequence variability among all ABCA4-deposited entries was calculated and reported as Shannon entropy score at the residue level. The three-dimensional model of ABCA4 structure was used to locate the spatial distribution of the observed variable regions. Our predictions from structural in silico tools were able to accurately link the functional effects of mutations to phenotype. The development of the ABCA4Database gathers all the available genetic and structural information, yielding a global view of the molecular basis of some retinal diseases. ABCA4 modeled structure provides a molecular basis on which to analyze protein sequence mutations related to genetic retinal disease in order to predict the risk of retinal disease across all possible ABCA4 mutations. Additionally, our ABCA4 predicted structure is a good starting point for the creation of a new data analysis model, appropriate for precision medicine, in order to develop a deeper knowledge network of the disease and to improve the management of patients.
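The residue-level Shannon entropy score mentioned above can be illustrated with a short Python sketch. It assumes the sequences are already aligned to equal length; the toy alignment and function names are hypothetical and are not taken from the ABCA4Database.

```python
import math
from collections import Counter


def shannon_entropy(column):
    """Shannon entropy (in bits) of the residues observed in one column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def per_residue_entropy(aligned_sequences):
    """Entropy score for every position of equal-length aligned sequences."""
    return [shannon_entropy(col) for col in zip(*aligned_sequences)]


# Toy alignment of four made-up four-residue sequences.
toy_alignment = ["MKTA", "MKSA", "MRTA", "MKGA"]
scores = per_residue_entropy(toy_alignment)
```

A fully conserved position scores 0 bits, and the score rises as more residue types occur with more even frequencies, so high-entropy positions mark the variable regions that can then be located on a 3D model.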
Predictive ecotoxicity of MoA 1 of organic chemicals using in silico approaches.
de Morais E Silva, Luana; Alves, Mateus Feitosa; Scotti, Luciana; Lopes, Wilton Silva; Scotti, Marcus Tullius
2018-05-30
Persistent organic products are compounds used for various purposes, such as personal care products, surfactants, colorants, industrial additives, food, pesticides and pharmaceuticals. These substances are constantly introduced into the environment and many of these pollutants are difficult to degrade. Toxic compounds classified as MoA 1 (Mode of Action 1) are low toxicity compounds that comprise nonreactive chemicals. In silico methods such as Quantitative Structure-Activity Relationships (QSARs) have been used to develop important models for prediction in several areas of science, including aquatic toxicity studies. The aim of the present study was to build a QSAR model based on a set of theoretical Volsurf molecular descriptors, using the fish acute toxicity values of compounds defined as MoA 1, to identify the molecular properties related to this mechanism. The selected Partial Least Squares (PLS) model shows the following values: cross-validation coefficient of determination $Q^2_{cv}$ = 0.793, coefficient of determination $R^2$ = 0.823, and explained variance in external prediction $Q^2_{ext}$ = 0.87. The selected descriptors indicate that not only hydrophobicity is related to toxicity, as already mentioned in previously published studies, but that other physicochemical properties in combination also contribute to the activity of these compounds. The symmetric distribution of the hydrophobic moieties in the structure of the compounds, as well as the shape (for example, branched chains), are important features related to the toxicity. This information from the model can be useful for prediction, so as to minimize the toxicity of organic compounds. Copyright © 2018. Published by Elsevier Inc.
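For readers unfamiliar with these statistics, the sketch below shows the usual coefficient-of-determination formula, 1 - SS_res / SS_tot. It is an assumption that the study used this standard form; the numbers are toy values, not the study's data. Cross-validated and external Q² values are commonly obtained by applying the same formula to predictions for held-out compounds, although conventions differ on which mean enters SS_tot.

```python
def coefficient_of_determination(observed, predicted):
    """Return 1 - SS_res / SS_tot for paired observed/predicted values."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Toy values, purely illustrative.
observed = [1.0, 2.0, 3.0, 4.0]
fitted = [1.1, 1.9, 3.2, 3.8]
r2 = coefficient_of_determination(observed, fitted)
```

Passing fitted values from the training set gives R², while passing predictions made for compounds left out of model fitting gives a Q²-type statistic.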
Ahmed, Shiek S. S. J.; Ramakrishnan, V.
2012-01-01
Background: Poor oral bioavailability is an important factor accounting for the failure of drug candidates. Approximately 50% of drugs in development fail because of unfavorable oral bioavailability. In silico prediction of oral bioavailability (%F) based on physicochemical properties is highly needed. Although many computational models have been developed to predict oral bioavailability, their accuracy remains low, with a significant number of false positives. In this study, we present an oral bioavailability model based on a systems biology approach, using a machine learning algorithm coupled with an optimal discriminative set of physicochemical properties. Results: The models were developed from 247 computationally derived physicochemical descriptors of 2279 molecules, of which 969, 605 and 705 molecules corresponded to the oral bioavailability, intestinal absorption (HIA) and caco-2 permeability data sets, respectively. Partial least squares discriminant analysis showed that 49 descriptors of HIA and 50 descriptors of caco-2 are the major contributing descriptors in classifying into groups. Of these descriptors, 47 were common to HIA and caco-2, which suggests that they play a vital role in classifying oral bioavailability. To determine the best machine learning algorithm, 21 classifiers were compared using a bioavailability data set of 969 molecules with 47 descriptors. Each molecule in the data set was represented by a set of 47 physicochemical properties, with the functional relevance labeled as (+bioavailability/−bioavailability) to indicate good-bioavailability/poor-bioavailability molecules. The best-performing algorithm was the logistic algorithm. The correlation-based feature selection (CFS) algorithm was also applied, and it confirmed that these 47 descriptors are the fundamental descriptors for oral bioavailability prediction.
Conclusion: The logistic algorithm with the 47 selected descriptors predicted oral bioavailability with an accuracy of more than 71%. Overall, the method captures the fundamental molecular descriptors, which can be used together to facilitate prediction of oral bioavailability. PMID:22815781
Computing Prediction and Functional Analysis of Prokaryotic Propionylation.
Wang, Li-Na; Shi, Shao-Ping; Wen, Ping-Ping; Zhou, Zhi-You; Qiu, Jian-Ding
2017-11-27
Identification and systematic analysis of candidates for protein propionylation are crucial steps for understanding its molecular mechanisms and biological functions. Although several proteome-scale methods have been used to delineate potential propionylated proteins, the majority of lysine-propionylated substrates and their role in pathological physiology remain largely unknown. Experimental prokaryotic propionylation data were collated from various databases and the literature and used to train a support vector machine, with features chosen by a three-step feature selection method. A novel online tool for seeking potential lysine-propionylated sites (PropSeek) ( http://bioinfo.ncu.edu.cn/PropSeek.aspx ) was built. Independent test results of leave-one-out and n-fold cross-validation were similar to each other, showing that PropSeek is a stable and robust predictor with satisfactory performance. Meanwhile, analyses of Gene Ontology, Kyoto Encyclopedia of Genes and Genomes pathways, and protein-protein interactions implied a potential role of prokaryotic propionylation in protein synthesis and metabolism.
Game theory and neural basis of social decision making
Lee, Daeyeol
2008-01-01
Decision making in a social group displays two unique features. First, humans and other animals routinely alter their behaviors in response to changes in their physical and social environment. As a result, the outcomes of decisions that depend on the behaviors of multiple decision makers are difficult to predict, and this requires highly adaptive decision-making strategies. Second, decision makers may have other-regarding preferences and therefore choose their actions to improve or reduce the well-being of others. Recently, many neurobiological studies have exploited game theory to probe the neural basis of decision making, and found that these unique features of social decision making might be reflected in the functions of brain areas involved in reward evaluation and reinforcement learning. Molecular genetic studies have also begun to identify genetic mechanisms for personal traits related to reinforcement learning and complex social decision making, further illuminating the biological basis of social behavior. PMID:18368047
An electronegativity-induced spin repulsion effect.
Stirling, Andras; Pasquarello, Alfredo
2005-09-22
We present a spin delocalization effect in radical Si-containing systems, featuring a heteroatom of high electronegativity (such as N, O, or Cl) bonded to the unsaturated Si atom. We find that the higher the electronegativity of the heteroatom, the more the localized spin shifts away from the unsaturated Si atom and the heteroatom toward saturated Si neighbors. We demonstrate that this spin repulsion toward saturated Si atoms is induced by the electronegativity difference between the Si atom and the heteroatoms. We present a simple molecular-orbital-based mechanism which fully explains the structural and electronic effects. We contrast the present spin delocalization mechanism with the classical hyperconjugation in organic chemistry. The most important consequences of this spin redistribution are the electron-spin-resonance activity of the saturated Si neighbors and the enhanced stability of the radical centers. We predict a similar effect for Ge radicals and discuss why organic systems based on carbon do not feature such spin repulsion.
Mo, Shaobo; Dai, Weixing; Xiang, Wenqiang; Li, Qingguo; Wang, Renjie; Cai, Guoxiang
2018-05-03
The objective of this study was to summarize the clinicopathological and molecular features of synchronous colorectal peritoneal metastases (CPM). We then combined clinical and pathological variables associated with synchronous CPM into a nomogram and confirmed its utilities using decision curve analysis. Synchronous metastatic colorectal cancer (mCRC) patients who received primary tumor resection and underwent KRAS, NRAS, and BRAF gene mutation detection at our center from January 2014 to September 2015 were included in this retrospective study. An analysis was performed to investigate the clinicopathological and molecular features for independent risk factors of synchronous CPM and to subsequently develop a nomogram for synchronous CPM based on multivariate logistic regression. Model performance was quantified in terms of calibration and discrimination. We studied the utility of the nomogram using decision curve analysis. In total, 226 patients were diagnosed with synchronous mCRC, of whom 50 patients (22.1%) presented with CPM. After uni- and multivariate analysis, a nomogram was built based on tumor site, histological type, age, and T4 status. The model had good discrimination with an area under the curve (AUC) at 0.777 (95% CI 0.703-0.850) and adequate calibration. By decision curve analysis, the model was shown to be relevant between thresholds of 0.10 and 0.66. Synchronous CPM is more likely to happen to patients with age ≤60, right-sided primary lesions, signet ring cell cancer or T4 stage. This is the first nomogram to predict synchronous CPM. To ensure generalizability, this model needs to be externally validated. Copyright © 2018 IJS Publishing Group Ltd. Published by Elsevier Ltd. All rights reserved.
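Decision curve analysis, as used above, compares models by their net benefit at a given threshold probability. The sketch below uses the commonly cited definition, TP/n - (FP/n) * pt / (1 - pt); the labels and predicted probabilities are toy values, the function name is hypothetical, and nothing here reproduces the study's nomogram.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating everyone whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Toy labels and predicted risks, purely illustrative.
labels = [1, 0, 1, 0, 0]
risks = [0.9, 0.2, 0.6, 0.7, 0.1]
nb_model = net_benefit(labels, risks, threshold=0.5)
# "Treat all" reference strategy: every patient is flagged as high risk.
nb_treat_all = net_benefit(labels, [1.0] * len(labels), threshold=0.5)
```

A model is considered useful over the range of thresholds where its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is 0 by definition.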
Systemic Corticosteroid Responses in Children with Severe Asthma: Phenotypic and Endotypic Features.
Fitzpatrick, Anne M; Stephenson, Susan T; Brown, Milton R; Nguyen, Khristopher; Douglas, Shaneka; Brown, Lou Ann S
Severe asthma in children is a heterogeneous disorder associated with variable responses to corticosteroid treatment. Criterion standards for corticosteroid responsiveness assessment in children are lacking. This study sought to characterize systemic corticosteroid responses in children with severe asthma after treatment with intramuscular triamcinolone and to identify phenotypic and molecular predictors of an intramuscular triamcinolone response. Asthma-related quality of life, exhaled nitric oxide, blood eosinophils, lung function, and inflammatory cytokine and chemokine mRNA gene expression in peripheral blood mononuclear cells were assessed in 56 children with severe asthma at baseline and 14 days after intramuscular triamcinolone injection. The Asthma Control Questionnaire was used to classify children with severe asthma into corticosteroid response groups. Three groups of children with severe asthma were identified: controlled severe asthma, children who achieved control after triamcinolone, and children who did not achieve control. At baseline, these groups were phenotypically similar. After triamcinolone, discordance between symptoms, lung function, exhaled nitric oxide, and blood eosinophils was noted. Clinical phenotypic predictors were of limited utility in predicting the triamcinolone response, whereas systemic mRNA expression of inflammatory cytokines and chemokines related to IL-2, IL-10, and TNF signaling pathways, namely, AIMP1, CCR2, IL10RB, and IL5, strongly differentiated children who failed to achieve control with triamcinolone administration. Systemic corticosteroid responsiveness in children with severe asthma is heterogeneous. Alternative prediction models that include molecular endotypic as well as clinical phenotypic features are needed to identify which children derive the most clinical benefit from systemic corticosteroid step-up therapy given the potential side effects. Copyright © 2016 American Academy of Allergy, Asthma & Immunology. 
Published by Elsevier Inc. All rights reserved.
The value of nodal information in predicting lung cancer relapse using 4DPET/4DCT
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Heyse; Becker, Nathan; Raman, Srinivas
2015-08-15
Purpose: There is evidence that computed tomography (CT) and positron emission tomography (PET) imaging metrics are prognostic and predictive in nonsmall cell lung cancer (NSCLC) treatment outcomes. However, few studies have explored the use of standardized uptake value (SUV)-based image features of nodal regions as predictive features. The authors investigated and compared the use of tumor and node image features extracted from the radiotherapy target volumes to predict relapse in a cohort of NSCLC patients undergoing chemoradiation treatment. Methods: A prospective cohort of 25 patients with locally advanced NSCLC underwent 4DPET/4DCT imaging for radiation planning. Thirty-seven image features were derived from the CT-defined volumes and SUVs of the PET image from both the tumor and nodal target regions. The machine learning methods of logistic regression and repeated stratified five-fold cross-validation (CV) were used to predict local and overall relapses in 2 yr. The authors used well-known feature selection methods (Spearman’s rank correlation, recursive feature elimination) within each fold of CV. Classifiers were ranked on their Matthews correlation coefficient (MCC) after CV. Area under the curve, sensitivity, and specificity values are also presented. Results: For predicting local relapse, the best classifier found had a mean MCC of 0.07 and was composed of eight tumor features. For predicting overall relapse, the best classifier found had a mean MCC of 0.29 and was composed of a single feature: the volume greater than 0.5 times the maximum SUV (N). Conclusions: The best classifier for predicting local relapse had only tumor features. In contrast, the best classifier for predicting overall relapse included a node feature. Overall, the methods showed that nodes add value in predicting overall relapse but not local relapse.
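Since the classifiers above are ranked by MCC, a short Python sketch of the standard Matthews correlation coefficient formula may help in interpreting values such as 0.07 and 0.29. The counts are hypothetical and the zero-denominator convention is an assumption; neither comes from the study.

```python
import math


def matthews_corrcoef_counts(tp, fn, tn, fp):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero (a common convention).
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical confusion matrix, NOT the study's results.
mcc = matthews_corrcoef_counts(tp=6, fn=4, tn=10, fp=5)
```

MCC ranges from -1 to 1, with 0 corresponding to chance-level agreement, which is why it is often preferred over plain accuracy for small or unbalanced cohorts.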
Bernini, Andrea; Henrici De Angelis, Lucia; Morandi, Edoardo; Spiga, Ottavia; Santucci, Annalisa; Assfalg, Michael; Molinari, Henriette; Pillozzi, Serena; Arcangeli, Annarosa; Niccolai, Neri
2014-03-01
Hotspot delineation on protein surfaces represents a fundamental step for targeting protein-protein interfaces. Disruptors of protein-protein interactions can be designed provided that the steric features of binding pockets, including the transient ones, can be defined. Molecular dynamics (MD) simulations have been used as a reliable framework for identifying transient pocket openings on the protein surface. Accessible surface area and intramolecular H-bond involvement of protein backbone amides are proposed as descriptors for characterizing binding pocket occurrence and evolution along MD trajectories. TEMPOL-induced paramagnetic perturbations on ¹H-¹⁵N HSQC signals of protein backbone amides have been analyzed as a fragment-based search for surface hotspots, in order to validate MD-predicted pockets. This procedure has been applied to CXCL12, a small chemokine responsible for tumor progression and proliferation. From combined analysis of MD data and paramagnetic profiles, two CXCL12 sites suitable for the binding of small molecules were identified. One of these sites is the already well characterized CXCL12 region involved in binding to the CXCR4 receptor. The other is a transient pocket predicted by MD simulations, which could not be observed from static analysis of CXCL12 PDB structures. The present results indicate how TEMPOL, instrumental in identifying this transient pocket, can be a powerful tool to delineate minor conformations which can be highly relevant in dynamic discovery of antitumoral drugs. Copyright © 2013 Elsevier B.V. All rights reserved.
Fulkerson, Christopher M; Dhawan, Deepika; Ratliff, Timothy L; Hahn, Noah M; Knapp, Deborah W
2017-01-01
Genomic analyses are defining numerous new targets for cancer therapy. Therapies aimed at specific genetic and epigenetic targets in cancer cells as well as expanded development of immunotherapies are placing increased demands on animal models. Traditional experimental models do not possess the collective features (cancer heterogeneity, molecular complexity, invasion, metastasis, and immune cell response) critical to predict success or failure of emerging therapies in humans. There is growing evidence, however, that dogs with specific forms of naturally occurring cancer can serve as highly relevant animal models to complement traditional models. Invasive urinary bladder cancer (invasive urothelial carcinoma (InvUC)) in dogs, for example, closely mimics the cancer in humans in pathology, molecular features, biological behavior including sites and frequency of distant metastasis, and response to chemotherapy. Genomic analyses are defining further intriguing similarities between InvUC in dogs and that in humans. Multiple canine clinical trials have been completed, and others are in progress with the aim of translating important findings into humans to increase the success rate of human trials, as well as helping pet dogs. Examples of successful targeted therapy studies and the challenges to be met to fully utilize naturally occurring dog models of cancer will be reviewed. PMID:28487862
NASA Astrophysics Data System (ADS)
Ji, Cuiying; Zhang, Xuewei; Yan, Xiaogang; Mostafizar Rahman, M.; Prates, Luciana L.; Yu, Peiqiang
2017-08-01
The objectives of this study were to: 1) investigate forage carbohydrate (CHO) molecular structure profiles; 2) determine bio-functions in terms of CHO rumen degradation characteristics and hourly effective degradation ratio of N to OM (HEDN/OM); and 3) quantify the interactive association between molecular structures, bio-functions and nutrient availability. Vibrational molecular spectroscopy was applied to investigate structural features on a molecular basis. Alfalfa forages from two sources were used as model forages. The results showed that the carbohydrate molecular structure profiles were highly linked to the bio-functions in terms of rumen degradation characteristics and hourly effective degradation ratio. The molecular spectroscopic technique can be used to detect forage carbohydrate structure features on a molecular basis and to study the interactive association between forage molecular structure and bio-functions.
Statistical analyses and computational prediction of helical kinks in membrane proteins
NASA Astrophysics Data System (ADS)
Huang, Y.-H.; Chen, C.-M.
2012-10-01
We have carried out statistical analyses and computer simulations of helical kinks for TM helices in the PDBTM database. About 59 % of 1,562 TM helices showed a significant kink, and 38 % of these kinks are associated with prolines in a range of ±4 residues. Our analyses show that helical kinks are more populated in the central region of helices, particularly in the range of 1-3 residues away from the helix center. Among 1,053 helical kinks analyzed, 88 % of kinks are bends (change in helix axis without loss of helical character) and 12 % are disruptions (change in helix axis and loss of helical character). It is found that proline residues tend to cause larger kink angles in helical bends, while this effect is not observed in helical disruptions. A further analysis of these kinked helices suggests that a kinked helix usually has 1-2 broken backbone hydrogen bonds with the corresponding N-O distance in the range of 4.2-8.7 Å, whose distribution is sharply peaked at 4.9 Å followed by an exponential decay with increasing distance. The main aims of this study are to understand the formation of helical kinks and to predict their structural features. Therefore we further performed molecular dynamics (MD) simulations under four simulation scenarios to investigate kink formation in 37 kinked TM helices and 5 unkinked TM helices. The representative models of these kinked helices are predicted by a clustering algorithm, SPICKER, from numerous decoy structures possessing the above generic features of kinked helices. Our results show an accuracy of 95 % in predicting the kink position of kinked TM helices and an error of less than 10° in the angle prediction for 71.4 % of kinked helices. For unkinked helices, based on various structure similarity tests, our predicted models are highly consistent with their crystal structures. These results provide strong support for the validity of our method in predicting the structure of TM helices.
Fu, Yong-Bi; Yang, Mo-Hua; Zeng, Fangqin; Biligetu, Bill
2017-01-01
Molecular plant breeding with the aid of molecular markers has played an important role in modern plant breeding over the last two decades. Many marker-based predictions for quantitative traits have been made to enhance parental selection, but the trait prediction accuracy remains generally low, even with the aid of dense, genome-wide SNP markers. To search for more accurate trait-specific prediction with informative SNP markers, we conducted a literature review on the prediction issues in molecular plant breeding and on the applicability of an RNA-Seq technique for developing function-associated specific trait (FAST) SNP markers. To understand whether and how FAST SNP markers could enhance trait prediction, we also reasoned theoretically about the effectiveness of these markers in trait-specific prediction, and verified the reasoning through computer simulation. In the end, the search yielded an alternative to regular genomic selection with FAST SNP markers that could be explored to achieve more accurate trait-specific prediction. Continuous search for better alternatives is encouraged to enhance marker-based predictions for an individual quantitative trait in molecular plant breeding. PMID:28729875
Visual Prediction Error Spreads Across Object Features in Human Visual Cortex
Summerfield, Christopher; Egner, Tobias
2016-01-01
Visual cognition is thought to rely heavily on contextual expectations. Accordingly, previous studies have revealed distinct neural signatures for expected versus unexpected stimuli in visual cortex. However, it is presently unknown how the brain combines multiple concurrent stimulus expectations such as those we have for different features of a familiar object. To understand how an unexpected object feature affects the simultaneous processing of other expected feature(s), we combined human fMRI with a task that independently manipulated expectations for color and motion features of moving-dot stimuli. Behavioral data and neural signals from visual cortex were then interrogated to adjudicate between three possible ways in which prediction error (surprise) in the processing of one feature might affect the concurrent processing of another, expected feature: (1) feature processing may be independent; (2) surprise might “spread” from the unexpected to the expected feature, rendering the entire object unexpected; or (3) pairing a surprising feature with an expected feature might promote the inference that the two features are not in fact part of the same object. To formalize these rival hypotheses, we implemented them in a simple computational model of multifeature expectations. Across a range of analyses, behavior and visual neural signals consistently supported a model that assumes a mixing of prediction error signals across features: surprise in one object feature spreads to its other feature(s), thus rendering the entire object unexpected. These results reveal neurocomputational principles of multifeature expectations and indicate that objects are the unit of selection for predictive vision. SIGNIFICANCE STATEMENT We address a key question in predictive visual cognition: how does the brain combine multiple concurrent expectations for different features of a single object such as its color and motion trajectory? 
By combining a behavioral protocol that independently varies expectation of (and attention to) multiple object features with computational modeling and fMRI, we demonstrate that behavior and fMRI activity patterns in visual cortex are best accounted for by a model in which prediction error in one object feature spreads to other object features. These results demonstrate how predictive vision forms object-level expectations out of multiple independent features. PMID:27810936
Goodarzi, Mohammad; Jensen, Richard; Vander Heyden, Yvan
2012-12-01
A Quantitative Structure-Retention Relationship (QSRR) is proposed to estimate the chromatographic retention of 83 diverse drugs on a Unisphere poly butadiene (PBD) column, using isocratic elutions at pH 11.7. Previous work has generated QSRR models for them using Classification And Regression Trees (CART). In this work, Ant Colony Optimization is used as a feature selection method to find the best molecular descriptors from a large pool. In addition, several other selection methods have been applied, such as Genetic Algorithms, Stepwise Regression and the Relief method, not only to evaluate Ant Colony Optimization as a feature selection method but also to investigate its ability to find the important descriptors in QSRR. Multiple Linear Regression (MLR) and Support Vector Machines (SVMs) were applied as linear and nonlinear regression methods, respectively, giving excellent correlation between the experimental, i.e. extrapolated to a mobile phase consisting of pure water, and predicted logarithms of the retention factors of the drugs (logk(w)). The overall best model was the SVM one built using descriptors selected by ACO. Copyright © 2012 Elsevier B.V. All rights reserved.
Human-specific features of spatial gene expression and regulation in eight brain regions.
Xu, Chuan; Li, Qian; Efimova, Olga; He, Liu; Tatsumoto, Shoji; Stepanova, Vita; Oishi, Takao; Udono, Toshifumi; Yamaguchi, Katsushi; Shigenobu, Shuji; Kakita, Akiyoshi; Nawa, Hiroyuki; Khaitovich, Philipp; Go, Yasuhiro
2018-06-13
Molecular maps of the human brain alone do not inform us of the features unique to humans. Yet, the identification of these features is important for understanding both the evolution and nature of human cognition. Here, we approached this question by analyzing gene expression and H3K27ac chromatin modification data collected in eight brain regions of humans, chimpanzees, gorillas, a gibbon and macaques. An analysis of spatial transcriptome trajectories across eight brain regions in four primate species revealed 1,851 genes showing human-specific transcriptome differences in one or multiple brain regions, in contrast to 240 chimpanzee-specific ones. More than half of these human-specific differences represented elevated expression of genes enriched in neuronal and astrocytic markers in the human hippocampus, while the rest were enriched in microglial markers and displayed human-specific expression in several frontal cortical regions and the cerebellum. An analysis of the predicted regulatory interactions driving these differences revealed the role of transcription factors in species-specific transcriptome changes, while epigenetic modifications were linked to spatial expression differences conserved across species. Published by Cold Spring Harbor Laboratory Press.
Molecular Pathology: Predictive, Prognostic, and Diagnostic Markers in Uterine Tumors.
Ritterhouse, Lauren L; Howitt, Brooke E
2016-09-01
This article focuses on the diagnostic, prognostic, and predictive molecular biomarkers in uterine malignancies, in the context of morphologic diagnoses. The histologic classification of endometrial carcinomas is reviewed first, followed by the description and molecular classification of endometrial epithelial malignancies in the context of histologic classification. Taken together, the molecular and histologic classifications help clinicians to approach troublesome areas encountered in clinical practice and evaluate the utility of molecular alterations in the diagnosis and subclassification of endometrial carcinomas. Putative prognostic markers are reviewed. The use of molecular alterations and surrogate immunohistochemistry as prognostic and predictive markers is also discussed. Copyright © 2016 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Sakkiah, Sugunadevi; Thangapandian, Sundarapandian; John, Shalini; Lee, Keun Woo
2011-01-01
This study was performed to find the selective chemical features for Aurora kinase-B inhibitors using methods such as HipHop, virtual screening, homology modeling, molecular dynamics and docking. The best hypothesis, Hypo1, was validated against a wide-ranging test set containing selective inhibitors of Aurora kinase-B. Homology modeling and molecular dynamics studies were carried out to prepare for the molecular docking studies. Hypo1 was used as a 3D query to screen the chemical databases. The screened molecules from the databases were sorted based on ADME and drug-like properties. The selective hit compounds were docked, and the hydrogen bond interactions with the critical amino acids present in Aurora kinase-B were compared with the chemical features present in Hypo1. Finally, we suggest that the chemical features present in Hypo1 are vital for a molecule to inhibit Aurora kinase-B activity.
Machine learning in computational docking.
Khamis, Mohamed A; Gomaa, Walid; Ahmed, Walaa F
2015-03-01
The objective of this paper is to highlight the state-of-the-art machine learning (ML) techniques in computational docking. The use of smart computational methods in the life cycle of drug design is a relatively recent development that has gained much popularity and interest over the last few years. Central to this methodology is the notion of computational docking, which is the process of predicting the best pose (orientation + conformation) of a small molecule (drug candidate) when bound to a larger target receptor molecule (protein) in order to form a stable complex molecule. In computational docking, a large number of binding poses are evaluated and ranked using a scoring function. The scoring function is a mathematical predictive model that produces a score that represents the binding free energy, and hence the stability, of the resulting complex molecule. Generally, such a function should produce a set of plausible ligands ranked according to their binding stability along with their binding poses. In more practical terms, an effective scoring function should produce promising drug candidates which can then be synthesized and physically screened using a high-throughput screening process. Therefore, the key to computer-aided drug design is the design of an efficient, highly accurate scoring function (using ML techniques). The methods presented in this paper are specifically based on ML techniques. Although many traditional techniques have been proposed, their performance was generally poor. Only in the last few years has ML technology been applied to the design of scoring functions, and the results have been very promising. The ML-based techniques are based on various molecular features extracted from the abundance of protein-ligand information in the public molecular databases, e.g., protein data bank bind (PDBbind). 
In this paper, we present this paradigm shift, elaborating on the main constituent elements of the ML approach to molecular docking along with the state-of-the-art research in this area. For instance, the best random forest (RF)-based scoring function on PDBbind v2007 achieves a Pearson correlation coefficient between the predicted and experimentally determined binding affinities of 0.803, while the best conventional scoring function achieves 0.644. In terms of ranking power, the best RF-based function ranks the ligands correctly based on their experimentally determined binding affinities with an accuracy of 62.5% and identifies the top binding ligand with an accuracy of 78.1%. We conclude with open questions and potential future research directions that can be pursued in smart computational docking: using molecular features of different natures (geometrical, energy terms, pharmacophore), using advanced ML techniques (e.g., deep learning), and combining more than one ML model. Copyright © 2015 Elsevier B.V. All rights reserved.
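The scoring-power figures quoted in this abstract are Pearson correlation coefficients between predicted and measured binding affinities. A minimal sketch of that calculation in plain Python follows; the function name is hypothetical and no data from the paper is reproduced.

```python
import math
from typing import Sequence


def pearson_r(predicted: Sequence[float], measured: Sequence[float]) -> float:
    """Pearson correlation between predicted and measured affinities."""
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    # Sums of cross-products and squared deviations from the means.
    s_pm = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    s_mm = sum((m - mean_m) ** 2 for m in measured)
    if s_pp == 0 or s_mm == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return s_pm / math.sqrt(s_pp * s_mm)
```

A scoring function would be evaluated by passing its predicted affinities for a benchmark set together with the experimentally determined values; values closer to 1 indicate better scoring power.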
DeScipio, Cheryl; Conlin, Laura; Rosenfeld, Jill; Tepperberg, James; Pasion, Romela; Patel, Ankita; McDonald, Marie T; Aradhya, Swaroop; Ho, Darlene; Goldstein, Jennifer; McGuire, Marianne; Mulchandani, Surabhi; Medne, Livija; Rupps, Rosemarie; Serrano, Alvaro H.; Thorland, Erik C; Tsai, Anne C-H; Hilhorst-Hofstee, Yvonne; Ruivenkamp, Claudia AL; Van Esch, Hilde; Addor, Marie-Claude; Martinet, Danielle; Mason, Thornton B.A.; Clark, Dinah; Spinner, Nancy B; Krantz, Ian D
2012-01-01
We describe 19 unrelated individuals with submicroscopic deletions involving 10p15.3 characterized by chromosomal microarray (CMA). Interestingly, to our knowledge, only two individuals with isolated, submicroscopic 10p15.3 deletion have been reported to date; however, only limited clinical information is available for these probands and the deleted region has not been molecularly mapped. Comprehensive clinical history was obtained for 12 of the 19 individuals described in this study. Common features among these 12 individuals include: cognitive/behavioral/developmental differences (11/11), speech delay/language disorder (10/10), motor delay (10/10), craniofacial dysmorphism (9/12), hypotonia (7/11), brain anomalies (4/6) and seizures (3/7). Parental studies were performed for nine of the 19 individuals; the 10p15.3 deletion was de novo in seven of the probands, not maternally inherited in one proband and inherited from an apparently affected mother in one proband. Molecular mapping of the 19 individuals reported in this study has identified two genes, ZMYND11 (OMIM# 608668) and DIP2C (OMIM# 611380) (UCSC Genome Browser), mapping within 10p15.3 which are most commonly deleted. Although no single gene has been identified which is deleted in all 19 individuals studied, the deleted region in all but one individual includes ZMYND11 and the deleted region in all but one other individual includes DIP2C. There is not a clearly identifiable phenotypic difference between these two individuals and the size of the deleted region does not generally predict clinical features. Little is currently known about these genes, complicating a direct genotype/phenotype correlation at this time. These data, however, suggest that ZMYND11 and/or DIP2C haploinsufficiency contributes to the clinical features associated with 10p15 deletions in probands described in this study. PMID:22847950
DeScipio, Cheryl; Conlin, Laura; Rosenfeld, Jill; Tepperberg, James; Pasion, Romela; Patel, Ankita; McDonald, Marie T; Aradhya, Swaroop; Ho, Darlene; Goldstein, Jennifer; McGuire, Marianne; Mulchandani, Surabhi; Medne, Livija; Rupps, Rosemarie; Serrano, Alvaro H; Thorland, Erik C; Tsai, Anne C-H; Hilhorst-Hofstee, Yvonne; Ruivenkamp, Claudia A L; Van Esch, Hilde; Addor, Marie-Claude; Martinet, Danielle; Mason, Thornton B A; Clark, Dinah; Spinner, Nancy B; Krantz, Ian D
2012-09-01
We describe 19 unrelated individuals with submicroscopic deletions involving 10p15.3 characterized by chromosomal microarray (CMA). Interestingly, to our knowledge, only two individuals with isolated, submicroscopic 10p15.3 deletion have been reported to date; however, only limited clinical information is available for these probands and the deleted region has not been molecularly mapped. Comprehensive clinical history was obtained for 12 of the 19 individuals described in this study. Common features among these 12 individuals include: cognitive/behavioral/developmental differences (11/11), speech delay/language disorder (10/10), motor delay (10/10), craniofacial dysmorphism (9/12), hypotonia (7/11), brain anomalies (4/6) and seizures (3/7). Parental studies were performed for nine of the 19 individuals; the 10p15.3 deletion was de novo in seven of the probands, not maternally inherited in one proband and inherited from an apparently affected mother in one proband. Molecular mapping of the 19 individuals reported in this study has identified two genes, ZMYND11 (OMIM 608668) and DIP2C (OMIM 611380; UCSC Genome Browser), mapping within 10p15.3 which are most commonly deleted. Although no single gene has been identified which is deleted in all 19 individuals studied, the deleted region in all but one individual includes ZMYND11 and the deleted region in all but one other individual includes DIP2C. There is not a clearly identifiable phenotypic difference between these two individuals and the size of the deleted region does not generally predict clinical features. Little is currently known about these genes, complicating a direct genotype/phenotype correlation at this time. These data, however, suggest that ZMYND11 and/or DIP2C haploinsufficiency contributes to the clinical features associated with 10p15 deletions in probands described in this study. Copyright © 2012 Wiley Periodicals, Inc.
Davatzikos, Christos; Rathore, Saima; Bakas, Spyridon; Pati, Sarthak; Bergman, Mark; Kalarot, Ratheesh; Sridharan, Patmaa; Gastounioti, Aimilia; Jahani, Nariman; Cohen, Eric; Akbari, Hamed; Tunc, Birkan; Doshi, Jimit; Parker, Drew; Hsieh, Michael; Sotiras, Aristeidis; Li, Hongming; Ou, Yangming; Doot, Robert K; Bilello, Michel; Fan, Yong; Shinohara, Russell T; Yushkevich, Paul; Verma, Ragini; Kontos, Despina
2018-01-01
The growth of multiparametric imaging protocols has paved the way for quantitative imaging phenotypes that predict treatment response and clinical outcome, reflect underlying cancer molecular characteristics and spatiotemporal heterogeneity, and can guide personalized treatment planning. This growth has underlined the need for efficient quantitative analytics to derive high-dimensional imaging signatures of diagnostic and predictive value in this emerging era of integrated precision diagnostics. This paper presents cancer imaging phenomics toolkit (CaPTk), a new and dynamically growing software platform for analysis of radiographic images of cancer, currently focusing on brain, breast, and lung cancer. CaPTk leverages the value of quantitative imaging analytics along with machine learning to derive phenotypic imaging signatures, based on two-level functionality. First, image analysis algorithms are used to extract comprehensive panels of diverse and complementary features, such as multiparametric intensity histogram distributions, texture, shape, kinetics, connectomics, and spatial patterns. At the second level, these quantitative imaging signatures are fed into multivariate machine learning models to produce diagnostic, prognostic, and predictive biomarkers. Results from clinical studies in three areas are shown: (i) computational neuro-oncology of brain gliomas for precision diagnostics, prediction of outcome, and treatment planning; (ii) prediction of treatment response for breast and lung cancer, and (iii) risk assessment for breast cancer.
Jiang, Xiaoyu; Fuchs, Mathias
2017-01-01
As modern biotechnologies advance, it has become increasingly frequent that different modalities of high-dimensional molecular data (termed “omics” data in this paper), such as gene expression, methylation, and copy number, are collected from the same patient cohort to predict the clinical outcome. While prediction based on omics data has been widely studied in the last fifteen years, little has been done in the statistical literature on the integration of multiple omics modalities to select a subset of variables for prediction, which is a critical task in personalized medicine. In this paper, we propose a simple penalized regression method to address this problem by assigning different penalty factors to different data modalities for feature selection and prediction. The penalty factors can be chosen in a fully data-driven fashion by cross-validation or by taking practical considerations into account. In simulation studies, we compare the prediction performance of our approach, called IPF-LASSO (Integrative LASSO with Penalty Factors) and implemented in the R package ipflasso, with the standard LASSO and sparse group LASSO. The use of IPF-LASSO is also illustrated through applications to two real-life cancer datasets. All data and codes are available on the companion website to ensure reproducibility. PMID:28546826
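The penalty-factor idea in this abstract can be written out as a worked objective. The following is a sketch for a continuous outcome with squared-error loss, which is an assumption here since the abstract does not state the loss. The symbols $M$ (number of modalities), $\lambda_m$ (penalty for modality $m$), $\beta^{(m)}$ (coefficients of modality $m$) and $x_i^{(m)}$ (features of patient $i$ in modality $m$) are introduced for illustration only.

```latex
\[
\hat{\beta} \;=\; \arg\min_{\beta_0,\,\beta^{(1)},\dots,\beta^{(M)}}
\;\sum_{i=1}^{n}\Bigl(y_i - \beta_0 - \sum_{m=1}^{M} x_i^{(m)\top}\beta^{(m)}\Bigr)^{2}
\;+\; \sum_{m=1}^{M} \lambda_m \,\bigl\lVert \beta^{(m)} \bigr\rVert_1
\]
```

Setting all $\lambda_m$ equal recovers the standard LASSO, while a larger $\lambda_m$ shrinks the coefficients of modality $m$ more strongly and so selects fewer of its variables.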
Expectation and Surprise Determine Neural Population Responses in the Ventral Visual Stream
Egner, Tobias; Monti, Jim M.; Summerfield, Christopher
2014-01-01
Visual cortex is traditionally viewed as a hierarchy of neural feature detectors, with neural population responses being driven by bottom-up stimulus features. Conversely, “predictive coding” models propose that each stage of the visual hierarchy harbors two computationally distinct classes of processing unit: representational units that encode the conditional probability of a stimulus and provide predictions to the next lower level; and error units that encode the mismatch between predictions and bottom-up evidence, and forward prediction error to the next higher level. Predictive coding therefore suggests that neural population responses in category-selective visual regions, like the fusiform face area (FFA), reflect a summation of activity related to prediction (“face expectation”) and prediction error (“face surprise”), rather than a homogenous feature detection response. We tested the rival hypotheses of the feature detection and predictive coding models by collecting functional magnetic resonance imaging data from the FFA while independently varying both stimulus features (faces vs houses) and subjects’ perceptual expectations regarding those features (low vs medium vs high face expectation). The effects of stimulus and expectation factors interacted, whereby FFA activity elicited by face and house stimuli was indistinguishable under high face expectation and maximally differentiated under low face expectation. Using computational modeling, we show that these data can be explained by predictive coding but not by feature detection models, even when the latter are augmented with attentional mechanisms. Thus, population responses in the ventral visual stream appear to be determined by feature expectation and surprise rather than by stimulus features per se. PMID:21147999
Molecular transistors based on BDT-type molecular bridges.
Wheeler, W D; Dahnovsky, Yu
2008-10-21
In this work we study the effect of electron correlations in molecular transistors with molecular bridges based on 1,4-benzene-dithiol (BDT) and 2-nitro-1,4-benzene-dithiol (nitro-BDT) by using ab initio electron propagator calculations. We find that there is no gate field effect for the BDT based transistor in accordance with the experimental data. After verifying the computational method on the BDT molecule, we consider a transistor with a nitro-BDT molecular bridge. From the electron propagator calculations, we predict strong negative differential resistance at small positive and negative values of source-drain voltages. The explanation of the peak and the minimum in the current is given in terms of the molecular orbital picture and switch-on (-off) properties due to the voltage dependencies of the Dyson poles (ionization potentials). When the current is off, the electronic states on both electrodes are populated resulting in the vanishing tunneling probability due to the Pauli principle. Besides the minimum and the maximum in the I-V characteristics, we find a strong gate field effect in the conductance where the peak at V_sd = 0.15 eV and E_g = 4 × 10^-3 a.u. switches to the minimum at E_g = -4 × 10^-3 a.u. A similar behavior is discovered at negative V_sd. Such a feature can be used for fast current modulation by changing the polarity of a gate field.
Molecular determinants archetypical to the phylum Nematoda
2009-01-01
Background: Nematoda diverged from other animals between 600–1,200 million years ago and has become one of the most diverse animal phyla on earth. Most nematodes are free-living animals, but many are parasites of plants and animals including humans, posing major ecological and economic challenges around the world. Results: We investigated phylum-specific molecular characteristics in Nematoda by exploring over 214,000 polypeptides from 32 nematode species including 27 parasites. Over 50,000 nematode protein families were identified based on primary sequence, including ~10% with members from at least three different species. Nearly 1,600 of the multi-species families did not share homology to Pfam domains, including a total of 758 restricted to Nematoda. The majority of the 462 families that were conserved among both free-living and parasitic species contained members from multiple nematode clades, yet ~90% of the 296 parasite-specific families originated only from a single clade. Features of these protein families were revealed through extrapolation of essential functions from observed RNAi phenotypes in C. elegans, bioinformatics-based functional annotations, identification of distant homology based on protein folds, and prediction of expression at accessible nematode surfaces. In addition, we identified a group of nematode-restricted sequence features in energy-generating electron transfer complexes as potential targets for new chemicals with minimal or no toxicity to the host. Conclusion: This study identified and characterized the molecular determinants that help in defining the phylum Nematoda, and therefore improved our understanding of nematode protein evolution and provided novel insights for the development of next generation parasite control strategies. PMID:19296854
Adrenocortical adenoma and carcinoma: histopathological and molecular comparative analysis.
Stojadinovic, Alexander; Brennan, Murray F; Hoos, Axel; Omeroglu, Atilla; Leung, Denis H Y; Dudas, Maria E; Nissan, Aviram; Cordon-Cardo, Carlos; Ghossein, Ronald A
2003-08-01
We compared histomorphological features and molecular expression profiles of adrenocortical adenomas (ACAd) and carcinomas (ACCa). A critical histopathological review (mean, 11 slides per patient) was conducted of 37 ACAd and 67 ACCa. Paraffin-embedded tissue cores of ACAd (n = 33) and ACCa (n = 38) were arrayed in triplicate on tissue microarrays. Expression profiles of p53, mdm-2, p21, Bcl-2, cyclin D1, p27, and Ki-67 were investigated by immunohistochemistry and correlated with histopathology and patient outcome using standard statistical methodology. Median follow-up period was 5 years. Tumor necrosis, atypical mitoses, and >1 mitosis per 50 high-power fields were factors that were highly specific for ACCa (P <.001). The number (0 to 4) of unfavorable markers [Ki-67 (+), p21 (+), p27 (+), mdm-2 (-)] expressed was significantly associated with mitotic activity and morphologic index (i.e., number of adverse morphologic features) and highly predictive of malignancy (P <.001). Ki-67 overexpression occurred in 0 ACAd and 36% ACCa (P <.001) and was significantly associated with mitotic rate and unfavorable morphologic index (P <.001). Tumor necrosis, atypical mitoses, >5 mitoses per 50 high-power fields, sinusoidal invasion, histologic index of >5, and presence of more than two unfavorable molecular markers were associated significantly with metastasis in ACCa. Well-established histopathologic criteria and Ki-67 can specifically distinguish ACAd from ACCa. Tumor cell proliferation (Ki-67) correlates with mitotic activity and morphologic index. Tumor morphology is a better predictor of metastatic risk in ACCa than current immunohistochemistry-detected cell cycle regulatory and proliferation-associated proteins.
Control of Ultracold Photodissociation with Magnetic Fields
NASA Astrophysics Data System (ADS)
McDonald, M.; Majewska, I.; Lee, C.-H.; Kondov, S. S.; McGuyer, B. H.; Moszynski, R.; Zelevinsky, T.
2018-01-01
Photodissociation of a molecule produces a spatial distribution of photofragments determined by the molecular structure and the characteristics of the dissociating light. Performing this basic reaction at ultracold temperatures allows its quantum mechanical features to dominate. In this regime, weak applied fields can be used to control the reaction. Here, we photodissociate ultracold diatomic strontium in magnetic fields below 10 G and observe striking changes in photofragment angular distributions. The observations are in excellent agreement with a multichannel quantum chemistry model that includes nonadiabatic effects and predicts strong mixing of partial waves in the photofragment energy continuum. The experiment is enabled by precise quantum-state control of the molecules.
PET imaging: implications for the future of therapy monitoring with PET/CT in oncology.
Tomasi, Giampaolo; Rosso, Lula
2012-10-01
Among the methods based on molecular imaging, the measure of the tracer uptake variation between a baseline and follow-up scan with the SUV and [(18)F]FDG-PET/CT is a very powerful tool for assessing response to treatment in oncology. However, the development of new targeted therapeutics and tissue pharmacokinetic evaluation of existing ones are increasingly requiring therapy monitoring with alternative tracers and indicators. In parallel, the potential predictive and prognostic value of other image-derived parameters, such as tumour volume and textural features, relating to tumoral heterogeneity, has recently emerged from several works. Copyright © 2012 Elsevier Ltd. All rights reserved.
Nayana, M Ravi Shashi; Sekhar, Y Nataraja; Nandyala, Haritha; Muttineni, Ravikumar; Bairy, Santosh Kumar; Singh, Kriti; Mahmood, S K
2008-10-01
In the present study, a series of 179 quinoline and quinazoline heterocyclic analogues exhibiting inhibitory activity against Gastric (H+/K+)-ATPase were investigated using the comparative molecular field analysis (CoMFA) and comparative molecular similarity indices (CoMSIA) methods. Both the models exhibited good correlation between the calculated 3D-QSAR fields and the observed biological activity for the respective training set compounds. The most optimal CoMFA and CoMSIA models yielded significant leave-one-out cross-validation coefficient, q(2) of 0.777, 0.744 and conventional cross-validation coefficient, r(2) of 0.927, 0.914 respectively. The predictive ability of generated models was tested on a set of 52 compounds having broad range of activity. CoMFA and CoMSIA yielded predicted activities for test set compounds with r(pred)(2) of 0.893 and 0.917 respectively. These validation tests not only revealed the robustness of the models but also demonstrated that for our models r(pred)(2) based on the mean activity of test set compounds can accurately estimate external predictivity. The factors affecting activity were analyzed carefully according to standard coefficient contour maps of steric, electrostatic, hydrophobic, acceptor and donor fields derived from the CoMFA and CoMSIA. These contour plots identified several key features which explain the wide range of activities. The results obtained from models offer important structural insight into designing novel peptic-ulcer inhibitors prior to their synthesis.
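The external predictivity statistic mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common definition r²(pred) = 1 − Σ(obs − pred)² / Σ(obs − reference mean)²; the function name, the `reference_mean` parameter, and the numbers are hypothetical and are not taken from the study. Whether the reference mean comes from the training set or the test set is a convention the caller chooses.

```python
def predictive_r2(observed, predicted, reference_mean):
    """Return 1 - SSE/SST for an external test set.

    reference_mean is the mean activity used as the baseline model
    (training-set or test-set mean, depending on the convention adopted).
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    # Sum of squared prediction errors on the test compounds.
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Sum of squared deviations from the baseline mean.
    sst = sum((o - reference_mean) ** 2 for o in observed)
    if sst == 0:
        raise ValueError("observed activities do not vary around the reference mean")
    return 1.0 - sse / sst


# Hypothetical activities (e.g. pIC50 values), not data from the paper.
obs = [1.0, 2.0, 3.0]
pred = [1.1, 1.9, 3.2]
r2_pred = predictive_r2(obs, pred, reference_mean=sum(obs) / len(obs))
```

A value close to 1 means the model predicts the test compounds much better than simply guessing the reference mean.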
A thermodynamic unification of jamming
NASA Astrophysics Data System (ADS)
Lu, Kevin; Brodsky, E. E.; Kavehpour, H. P.
2008-05-01
Fragile materials ranging from sand to fire retardant to toothpaste are able to exhibit both solid and fluid-like properties across the jamming transition. Unlike ordinary fusion, systems of grains, foams and colloids jam and cease to flow under conditions that still remain unknown. Here, we quantify jamming using a thermodynamic approach by accounting for the structural ageing and the shear-induced compressibility of dry sand. Specifically, the jamming threshold is defined using a non-thermal temperature that measures the 'fluffiness' of a granular mixture. The thermodynamic model, cast in terms of pressure, temperature and free volume, also successfully predicts the entropic data of five molecular glasses. Notably, the predicted configurational entropy entirely averts the Kauzmann paradox, an unresolved crisis where the configurational entropy becomes negative. Without any free parameters, the proposed equation-of-state also governs the mechanism of shear banding and the associated features of shear softening and thickness invariance.
Blood analysis by Raman spectroscopy.
Enejder, Annika M K; Koo, Tae-Woong; Oh, Jeankun; Hunter, Martin; Sasic, Slobodan; Feld, Michael S; Horowitz, Gary L
2002-11-15
Concentrations of multiple analytes were simultaneously measured in whole blood with clinical accuracy, without sample processing, using near-infrared Raman spectroscopy. Spectra were acquired with an instrument employing nonimaging optics, designed using Monte Carlo simulations of the influence of light-scattering-absorbing blood cells on the excitation and emission of Raman light in turbid medium. Raman spectra were collected from whole blood drawn from 31 individuals. Quantitative predictions of glucose, urea, total protein, albumin, triglycerides, hematocrit, and hemoglobin were made by means of partial least-squares (PLS) analysis with clinically relevant precision (r(2) values >0.93). The similarity of the features of the PLS calibration spectra to those of the respective analyte spectra illustrates that the predictions are based on molecular information carried by the Raman light. This demonstrates the feasibility of using Raman spectroscopy for quantitative measurements of biomolecular contents in highly light-scattering and absorbing media.
NASA Astrophysics Data System (ADS)
Shtykova, E. V.; Bogacheva, E. N.; Dadinova, L. A.; Jeffries, C. M.; Fedorova, N. V.; Golovko, A. O.; Baratova, L. A.; Batishchev, O. V.
2017-11-01
A complex structural analysis of nuclear export protein NS2 (NEP) of influenza virus A has been performed using bioinformatics predictive methods and small-angle X-ray scattering data. The behavior of NEP molecules in a solution (their aggregation, oligomerization, and dissociation, depending on the buffer composition) has been investigated. It was shown that stable associates are formed even in a conventional aqueous salt solution at physiological pH value. For the first time we have managed to get NEP dimers in solution, to analyze their structure, and to compare the models obtained using the method of the molecular tectonics with the spatial protein structure predicted by us using the bioinformatics methods. The results of the study provide a new insight into the structural features of nuclear export protein NS2 (NEP) of the influenza virus A, which is very important for viral infection development.
Binato, Renata; Santos, Everton Cruz; Boroni, Mariana; Demachki, Samia; Assumpção, Paulo; Abdelhay, Eliana
2018-01-26
Gastric carcinoma (GC) is one of the most aggressive cancers and the second leading cause of cancer death in the world. According to the Lauren classification, this adenocarcinoma is divided into two subtypes, intestinal and diffuse, which differ in their clinical, epidemiological and molecular features. Several studies have attempted to delineate the molecular signature of gastric cancer to develop new and non-invasive screening tests that improve diagnosis and lead to new treatment strategies. However, a consensus signature has not yet been identified for each condition. Thus, this work aimed to analyze the gene expression profile of Brazilian intestinal-type GC tissues using microarrays and compare the results to those of non-tumor tissue samples. Moreover, we compared our intestinal-type gastric carcinoma profile with those obtained from populations worldwide to assess their similarity. The results identified a molecular signature for intestinal-type GC and revealed that 38 genes differentially expressed in Brazilian intestinal-type gastric carcinoma samples can successfully distinguish gastric tumors from non-tumor tissue in the global population. These differentially expressed genes participate in biological processes important to cell homeostasis. Furthermore, Kaplan-Meier analysis suggested that 7 of these genes could individually be able to predict overall survival in intestinal-type gastric cancer patients.
Margreitter, Christian; Mayrhofer, Patrick; Kunert, Renate; Oostenbrink, Chris
2016-06-01
Monoclonal antibodies represent the fastest growing class of biotherapeutic proteins. However, as they are often initially derived from rodent organisms, there is a severe risk of immunogenic reactions, hampering their applicability. The humanization of these antibodies remains a challenging task in the context of rational drug design. "Superhumanization" describes the direct transfer of the complementarity determining regions to a human germline framework, but this humanization approach often results in loss of binding affinity. In this study, we present a new approach for predicting promising backmutation sites using molecular dynamics simulations of the model antibody Ab2/3H6. The simulation method was developed in close conjunction with novel specificity experiments. Binding properties of mAb variants were evaluated directly from crude supernatants and confirmed using established binding affinity assays for purified antibodies. Our approach provides access to the dynamical features of the actual binding sites of an antibody, based solely on the antibody sequence. Thus we do not need structural data on the antibody-antigen complex and circumvent cumbersome methods to assess binding affinities. © 2016 The Authors Journal of Molecular Recognition Published by John Wiley & Sons Ltd.
NASA Astrophysics Data System (ADS)
Fytilis, N.; Lamb, R.; Kerans, B.; Stevens, L.; Rizzo, D. M.
2011-12-01
Fish diseases are often caused by waterborne parasites, making them ideal systems for modeling the non-linear relationships between disease dynamics, stream dwelling oligochaete communities and geochemical features. Myxobolus cerebralis, the causative agent of whirling disease in salmonid fishes, has been a major contributor to the loss of wild rainbow trout populations in numerous streams within the Intermountain West. The parasite alternates between an invertebrate and vertebrate host, being transmitted between the sediment feeding worm Tubifex tubifex (T. tubifex) and salmonid fishes. Worm community biodiversity and abundance are influenced by biogeochemical features and have been linked to disease severity in fish. The worm (T. tubifex) lives in communities with 3-4 other types of worms in stream sediments. Unfortunately, taxonomic identification of oligochaetes is largely dependent on morphological characteristics of sexually mature adults. We have collected and identified ~700 worms from eight sites using molecular genetic probes and a taxonomic key. Additionally, ~1700 worms were identified using only molecular genetic probes. To facilitate distinguishing among tubificids, we developed two multiplex molecular genetic probe-based quantitative polymerase chain reaction (qPCR) assays to assess tubificid communities in the study area. Similar qPCR techniques specific for M. cerebralis were used to determine if individual worms were infected with the parasite. We show how simple Bayesian analysis of the qPCR data can predict the worm community structure and reveal relationships between biodiversity of host communities and host-parasite dynamics. To our knowledge, this is the first study that combines molecular data of both the host and the parasite to examine the effects of host community structure on the transmission of a parasite.
Our work can be extended to examine the links between worm community structure and biogeochemical features using molecular genetics and Bayesian statistics to assist in identifying new nonlinear relationships and suggest new subsets of input parameters. Future work includes the development of a new complex systems tool capable of assimilating biological DNA sequence data and biogeochemical features using artificial neural networks and Bayesian analysis. The methodologies developed here helped mine the relationships between biodiversity of host communities and host-parasite dynamics. The results from our study will be useful to managers and researchers for assessing the risk of whirling disease in drainages where tubificid community composition data are needed. This collaboration between modelers, field ecologists and geneticists will prove useful in modeling efforts and will enable more effective, high-volume hypothesis generation. The ability to characterize areas of high whirling disease risk is essential for improving our understanding of the dynamics of M.cerebralis such that appropriate management strategies can be implemented.
Predictive information processing in music cognition. A critical review.
Rohrmeier, Martin A; Koelsch, Stefan
2012-02-01
Expectation and prediction constitute central mechanisms in the perception and cognition of music, which have been explored in theoretical and empirical accounts. We review the scope and limits of theoretical accounts of musical prediction with respect to feature-based and temporal prediction. While the concept of prediction is unproblematic for basic single-stream features such as melody, it is not straight-forward for polyphonic structures or higher-order features such as formal predictions. Behavioural results based on explicit and implicit (priming) paradigms provide evidence of priming in various domains that may reflect predictive behaviour. Computational learning models, including symbolic (fragment-based), probabilistic/graphical, or connectionist approaches, provide well-specified predictive models of specific features and feature combinations. While models match some experimental results, full-fledged music prediction cannot yet be modelled. Neuroscientific results regarding the early right-anterior negativity (ERAN) and mismatch negativity (MMN) reflect expectancy violations on different levels of processing complexity, and provide some neural evidence for different predictive mechanisms. At present, the combinations of neural and computational modelling methodologies are at early stages and require further research. Copyright © 2012 Elsevier B.V. All rights reserved.
Tamez-Peña, Jose-Gerardo; Rodriguez-Rojas, Juan-Andrés; Gomez-Rueda, Hugo; Celaya-Padilla, Jose-Maria; Rivera-Prieto, Roxana-Alicia; Palacios-Corona, Rebeca; Garza-Montemayor, Margarita; Cardona-Huerta, Servando; Treviño, Victor
2018-01-01
In breast cancer, well-known gene expression subtypes have been related to a specific clinical outcome. However, their impact on the breast tissue phenotype has been poorly studied. Here, we investigate the association of imaging data of tumors to gene expression signatures from 71 patients with breast cancer that underwent pre-treatment digital mammograms and tumor biopsies. From digital mammograms, a semi-automated radiogenomics analysis generated 1,078 features describing the shape, signal distribution, and texture of tumors along their contralateral image used as control. From tumor biopsy, we estimated the OncotypeDX and PAM50 recurrence scores using gene expression microarrays. Then, we used multivariate analysis under stringent cross-validation to train models predicting recurrence scores. Few univariate features reached Spearman correlation coefficients above 0.4. Nevertheless, multivariate analysis yielded significantly correlated models for both signatures (correlation of OncotypeDX = 0.49 ± 0.07 and PAM50 = 0.32 ± 0.10 in stringent cross-validation and OncotypeDX = 0.83 and PAM50 = 0.78 for a unique model). Equivalent models trained from the unaffected contralateral breast were not correlated suggesting that the image signatures were tumor-specific and that overfitting was not a considerable issue. We also noted that models were improved by combining clinical information (triple negative status and progesterone receptor). The models used mostly wavelets and fractal features suggesting their importance to capture tumor information. Our results suggest that molecular-based recurrence risk and breast cancer subtypes have observable radiographic phenotypes. To our knowledge, this is the first study associating mammographic information to gene expression recurrence signatures.
Amendoeira, Isabel; Maia, Tiago; Sobrinho-Simões, Manuel
2018-04-01
The 2017 edition of the WHO book on Classification of Tumours of Endocrine Organs includes a new section entitled 'Other encapsulated follicular-patterned thyroid tumours', in which the newly created NIFTP (non-invasive follicular thyroid neoplasm with papillary-like nuclear features) is identified and described in detail. Although the word 'carcinoma' has been deleted from its name, NIFTP is not a benign tumor either and is best regarded as a neoplasm with 'very low malignant potential'. The main goal of introducing the NIFTP category is to prevent overdiagnosis and overtreatment. Sampling constraints, especially when dealing with heterogeneous and/or large nodules, and difficulties in evaluating invasiveness are the major weaknesses of the histological characterization of NIFTP. At the cytological level, NIFTP can be separated from classic papillary carcinoma (cPTC) but not from encapsulated, invasive follicular variant PTC. The impact of NIFTP individualization on cytopathology is a drop in the rates of malignancy for each Bethesda category in general and for indeterminate categories in particular. The biggest impact will be seen in institutions with a high frequency of FVPTC. The introduction of NIFTP has changed the utility of predictive values of molecular tests because RAS mutations and PAX8-PPARg rearrangements are frequently detected in NIFTP. This makes the application of mutation detection panels as indicators of malignancy less promising and will probably contribute to a switch to a rule-out approach to molecular testing. Selection for surgery will continue to be determined by the combined detection of suspicious clinical, cytological and ultrasound features. © 2018 Society for Endocrinology.
Tamez-Peña, Jose-Gerardo; Rodriguez-Rojas, Juan-Andrés; Gomez-Rueda, Hugo; Celaya-Padilla, Jose-Maria; Rivera-Prieto, Roxana-Alicia; Palacios-Corona, Rebeca; Garza-Montemayor, Margarita; Cardona-Huerta, Servando
2018-01-01
In breast cancer, well-known gene expression subtypes have been related to a specific clinical outcome. However, their impact on the breast tissue phenotype has been poorly studied. Here, we investigate the association of imaging data of tumors to gene expression signatures from 71 patients with breast cancer that underwent pre-treatment digital mammograms and tumor biopsies. From digital mammograms, a semi-automated radiogenomics analysis generated 1,078 features describing the shape, signal distribution, and texture of tumors along their contralateral image used as control. From tumor biopsy, we estimated the OncotypeDX and PAM50 recurrence scores using gene expression microarrays. Then, we used multivariate analysis under stringent cross-validation to train models predicting recurrence scores. Few univariate features reached Spearman correlation coefficients above 0.4. Nevertheless, multivariate analysis yielded significantly correlated models for both signatures (correlation of OncotypeDX = 0.49 ± 0.07 and PAM50 = 0.32 ± 0.10 in stringent cross-validation and OncotypeDX = 0.83 and PAM50 = 0.78 for a unique model). Equivalent models trained from the unaffected contralateral breast were not correlated suggesting that the image signatures were tumor-specific and that overfitting was not a considerable issue. We also noted that models were improved by combining clinical information (triple negative status and progesterone receptor). The models used mostly wavelets and fractal features suggesting their importance to capture tumor information. Our results suggest that molecular-based recurrence risk and breast cancer subtypes have observable radiographic phenotypes. To our knowledge, this is the first study associating mammographic information to gene expression recurrence signatures. PMID:29596496
Horta, Rodrigo S; Lavalle, Gleidice E; Monteiro, Lidianne N; Souza, Mayara C C; Cassali, Geovanni D; Araújo, Roberto B
2018-03-01
Mast cell tumor (MCT) is a frequent cutaneous neoplasm in dogs that is heterogeneous in clinical presentation and biological behavior, with a variable potential for recurrence and metastasis. Accurate prediction of clinical outcomes has been challenging. The study objective was to develop a system for classification of canine MCT according to the mortality risk based on individual assessment of clinical, histologic, immunohistochemical, and molecular features. The study included 149 dogs with a histologic diagnosis of cutaneous or subcutaneous MCT. By univariate analysis, MCT metastasis and related death were significantly associated with clinical stage (P < .0001, r P = -0.610), history of tumor recurrence (P < .0001, r P = -0.550), Patnaik (P < .0001, r P = -0.380) and Kiupel grades (P < .0001, r P = -0.500), predominant organization of neoplastic cells (P < .0001, r P = -0.452), mitotic count (P < .0001, r P = -0.325), Ki-67 labeling index (P < .0001, r P = -0.414), KITr pattern (P = .02, r P = 0.207), and c-KIT mutational status (P < .0001, r P = -0.356). By multivariate analysis with Cox proportional hazard model, only 2 features were independent predictors of overall survival: an amendment of the World Health Organization clinical staging system (hazard ratio [95% CI]: 1.824 [1.210-4.481]; P = .01) and a history of tumor recurrence (hazard ratio [95% CI]: 9.250 [2.158-23.268]; P < .001). From these results, we propose an amendment of the WHO staging system, a method of risk analysis, and a suggested approach to clinical and laboratory evaluation of dogs with cutaneous MCT.
Tsai, Chen-An; Lee, Kuan-Ting; Liu, Jen-Pei
2016-01-01
A key feature of precision medicine is that it takes individual variability at the genetic or molecular level into account in determining the best treatment for patients diagnosed with diseases detected by recently developed novel biotechnologies. The enrichment design is an efficient design that enrolls only the patients testing positive for specific molecular targets and randomly assigns them to the targeted treatment or the concurrent control. However, there is no diagnostic device with perfect accuracy and precision for detecting molecular targets. In particular, the positive predictive value (PPV) can be quite low for rare diseases with low prevalence. Under the enrichment design, some patients testing positive for specific molecular targets may not have the molecular targets. The efficacy of the targeted therapy may be underestimated in the patients that actually do have the molecular targets. To address the loss of efficiency due to misclassification error, we apply the discrete mixture modeling for time-to-event data proposed by Eng and Hanlon [8] to develop an inferential procedure, based on the Cox proportional hazard model, for the effect of the targeted treatment in the true-positive patients with the molecular targets. Our proposed procedure incorporates both inaccuracy of diagnostic devices and uncertainty of estimated accuracy measures. We employed the expectation-maximization algorithm in conjunction with the bootstrap technique for estimation of the hazard ratio and its estimated variance. We report the results of simulation studies which empirically investigated the performance of the proposed method. Our proposed method is illustrated by a numerical example.
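The remark that PPV can be low at low prevalence follows directly from Bayes' rule. The sketch below is illustrative only: the function name and the sensitivity, specificity, and prevalence values are assumptions chosen for the example, not figures from the study.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV = P(target truly present | test positive), by Bayes' rule."""
    true_pos = sensitivity * prevalence                    # P(test+, target present)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)   # P(test+, target absent)
    if true_pos + false_pos == 0:
        raise ValueError("the test never returns a positive result")
    return true_pos / (true_pos + false_pos)


# Hypothetical device with 95% sensitivity and 95% specificity.
ppv_rare = positive_predictive_value(0.95, 0.95, prevalence=0.01)    # about 0.16
ppv_common = positive_predictive_value(0.95, 0.95, prevalence=0.50)  # 0.95
```

With these assumed values, only about one in six test-positive patients at 1% prevalence truly carries the target, which is the misclassification that dilutes the estimated treatment effect under an enrichment design.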
Molecular landscape of acute myeloid leukemia in younger adults and its clinical relevance
Ivey, Adam; Huntly, Brian J. P.
2016-01-01
Recent major advances in understanding the molecular basis of acute myeloid leukemia (AML) provide a double-edged sword. Although defining the topology and key features of the molecular landscape are fundamental to development of novel treatment approaches and provide opportunities for greater individualization of therapy, confirmation of the genetic complexity presents a huge challenge to successful translation into routine clinical practice. It is now clear that many genes are recurrently mutated in AML; moreover, individual leukemias harbor multiple mutations and are potentially composed of subclones with differing mutational composition, rendering each patient’s AML genetically unique. In order to make sense of the overwhelming mutational data and capitalize on this clinically, it is important to identify (1) critical AML-defining molecular abnormalities that distinguish biological disease entities; (2) mutations, typically arising in subclones, that may influence prognosis but are unlikely to be ideal therapeutic targets; (3) mutations associated with preleukemic clones; and (4) mutations that have been robustly shown to confer independent prognostic information or are therapeutically relevant. The reward of identifying AML-defining molecular lesions present in all leukemic populations (including subclones) has been exemplified by acute promyelocytic leukemia, where successful targeting of the underlying PML-RARα oncoprotein has eliminated the need for chemotherapy for disease cure. Despite the molecular heterogeneity and recognizing that treatment options for other forms of AML are limited, this review will consider the scope for using novel molecular information to improve diagnosis, identify subsets of patients eligible for targeted therapies, refine outcome prediction, and track treatment response. PMID:26660431
Practical quantum mechanics-based fragment methods for predicting molecular crystal properties.
Wen, Shuhao; Nanda, Kaushik; Huang, Yuanhang; Beran, Gregory J O
2012-06-07
Significant advances in fragment-based electronic structure methods have created a real alternative to force-field and density functional techniques in condensed-phase problems such as molecular crystals. This perspective article highlights some of the important challenges in modeling molecular crystals and discusses techniques for addressing them. First, we survey recent developments in fragment-based methods for molecular crystals. Second, we use examples from our own recent research on a fragment-based QM/MM method, the hybrid many-body interaction (HMBI) model, to analyze the physical requirements for a practical and effective molecular crystal model chemistry. We demonstrate that it is possible to predict molecular crystal lattice energies to within a couple kJ mol(-1) and lattice parameters to within a few percent in small-molecule crystals. Fragment methods provide a systematically improvable approach to making predictions in the condensed phase, which is critical to making robust predictions regarding the subtle energy differences found in molecular crystals.
Desbordes, Paul; Ruan, Su; Modzelewski, Romain; Pineau, Pascal; Vauclin, Sébastien; Gouel, Pierrick; Michel, Pierre; Di Fiore, Frédéric; Vera, Pierre; Gardin, Isabelle
2017-01-01
In oncology, texture features extracted from positron emission tomography with 18-fluorodeoxyglucose images (FDG-PET) are of increasing interest for predictive and prognostic studies, leading to several tens of features per tumor. To select the best features, the use of a random forest (RF) classifier was investigated. Sixty-five patients with an esophageal cancer treated with a combined chemo-radiation therapy were retrospectively included. All patients underwent a pretreatment whole-body FDG-PET. The patients were followed for 3 years after the end of the treatment. The response assessment was performed 1 month after the end of the therapy. Patients were classified as complete responders and non-complete responders. Sixty-one features were extracted from medical records and PET images. First, Spearman's analysis was performed to eliminate correlated features. Then, the best predictive and prognostic subsets of features were selected using a RF algorithm. These results were compared to those obtained by a Mann-Whitney U test (predictive study) and a univariate Kaplan-Meier analysis (prognostic study). Among the 61 initial features, 28 were not correlated. From these 28 features, the best subset of complementary features found using the RF classifier to predict response was composed of 2 features: metabolic tumor volume (MTV) and homogeneity from the co-occurrence matrix. The corresponding predictive value (AUC = 0.836 ± 0.105, Se = 82 ± 9%, Sp = 91 ± 12%) was higher than the best predictive results found using the Mann-Whitney test: busyness from the gray level difference matrix (P < 0.0001, AUC = 0.810, Se = 66%, Sp = 88%). The best prognostic subset found using RF was composed of 3 features: MTV and 2 clinical features (WHO status and nutritional risk index) (AUC = 0.822 ± 0.059, Se = 79 ± 9%, Sp = 95 ± 6%), while no feature was significantly prognostic according to the Kaplan-Meier analysis. 
The RF classifier can improve predictive and prognostic values compared to the Mann-Whitney U test and the univariate Kaplan-Meier survival analysis when applied to several tens of features in a limited patient database.
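The first step of the selection procedure described in this abstract, removing correlated features with Spearman's analysis, can be sketched in plain Python. This is a minimal illustration and not the authors' implementation: the greedy keep-first strategy, the 0.8 threshold, the function names, and the feature values are all assumptions. The random forest step is not reproduced here.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def drop_correlated(features, threshold=0.8):
    """Greedily keep a feature only if |rho| with every kept feature is below threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(spearman(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature table: one list of per-patient values per feature.
example = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 11],   # monotone in "a", so rho = 1 and it is dropped
    "c": [3, 1, 4, 1, 5],
}
selected = drop_correlated(example)
```

The surviving features would then be passed to a classifier such as a random forest; the order in which features are visited affects which member of a correlated pair is kept.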
Draper, John; Enot, David P; Parker, David; Beckmann, Manfred; Snowdon, Stuart; Lin, Wanchang; Zubair, Hassan
2009-01-01
Background Metabolomics experiments using Mass Spectrometry (MS) technology measure the mass to charge ratio (m/z) and intensity of ionised molecules in crude extracts of complex biological samples to generate high dimensional metabolite 'fingerprint' or metabolite 'profile' data. High resolution MS instruments perform routinely with a mass accuracy of < 5 ppm (parts per million), thus potentially providing a direct method for putative signal annotation using databases containing metabolite mass information. Most database interfaces support only simple queries with the default assumption that molecules either gain or lose a single proton when ionised. In reality the annotation process is confounded by the fact that many ionisation products will be not only molecular isotopes but also salt/solvent adducts and neutral loss fragments of original metabolites. This report describes an annotation strategy that will allow searching based on all potential ionisation products predicted to form during electrospray ionisation (ESI). Results Metabolite 'structures' harvested from publicly accessible databases were converted into a common format to generate a comprehensive archive in MZedDB. 'Rules' were derived from chemical information that allowed MZedDB to generate a list of adducts and neutral loss fragments putatively able to form for each structure and calculate, on the fly, the exact molecular weight of every potential ionisation product to provide targets for annotation searches based on accurate mass. We demonstrate that data matrices representing populations of ionisation products generated from different biological matrices contain a large proportion (sometimes > 50%) of molecular isotopes, salt adducts and neutral loss fragments. Correlation analysis of ESI-MS data features confirmed the predicted relationships of m/z signals. An integrated isotope enumerator in MZedDB allowed verification of exact isotopic pattern distributions to corroborate experimental data.
Conclusion We conclude that although ultra-high accurate mass instruments provide major insight into the chemical diversity of biological extracts, the facile annotation of a large proportion of signals is not possible by simple, automated query of current databases using computed molecular formulae. Parameterising MZedDB to take into account predicted ionisation behaviour and the biological source of any sample improves greatly both the frequency and accuracy of potential annotation 'hits' in ESI-MS data. PMID:19622150
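The idea of searching on several predicted ionisation products within a ppm tolerance can be sketched as follows. This is a simplified illustration, not MZedDB's rule set: it assumes only four singly charged adducts, uses standard monoisotopic mass shifts (proton 1.007276 Da; Na+ and K+ shifts are the atomic masses minus one electron), and the function names and the glucose example are hypothetical.

```python
# Mass shifts (Da) added to the neutral monoisotopic mass for singly charged ions.
ADDUCT_SHIFTS = {
    "[M+H]+": 1.007276,
    "[M+Na]+": 22.989218,
    "[M+K]+": 38.963158,
    "[M-H]-": -1.007276,
}


def adduct_mz(neutral_mass, adduct):
    """Theoretical m/z of a singly charged adduct of a neutral molecule."""
    return neutral_mass + ADDUCT_SHIFTS[adduct]


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def matching_adducts(observed_mz, neutral_mass, tolerance_ppm=5.0):
    """List the adducts whose theoretical m/z lies within the ppm tolerance."""
    return [
        name
        for name in ADDUCT_SHIFTS
        if abs(ppm_error(observed_mz, adduct_mz(neutral_mass, name))) <= tolerance_ppm
    ]


# Glucose, C6H12O6, monoisotopic mass about 180.063388 Da.
GLUCOSE = 180.063388
hits = matching_adducts(203.0526, GLUCOSE)
```

A query that assumed only protonation would miss this signal, since the [M+H]+ ion of glucose lies near m/z 181.0707 while the observed value corresponds to the sodium adduct.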
The Use of Molecular Modeling Programs in Medicinal Chemistry Instruction.
ERIC Educational Resources Information Center
Harrold, Marc W.
1992-01-01
This paper describes and evaluates the use of a molecular modeling computer program (Alchemy II) in a pharmaceutical education program. Provided are the hardware requirements and basic program features as well as several examples of how this program and its features have been applied in the classroom. (GLR)
The GenTechnique Project: Developing an Open Environment for Learning Molecular Genetics.
ERIC Educational Resources Information Center
Calza, R. E.; Meade, J. T.
1998-01-01
The GenTechnique project at Washington State University uses a networked learning environment for molecular genetics learning. The project is developing courseware featuring animation, hyper-link controls, and interactive self-assessment exercises focusing on fundamental concepts. The first pilot course featured a Web-based module on DNA…
In silico quantitative structure-toxicity relationship study of aromatic nitro compounds.
Pasha, Farhan Ahmad; Neaz, Mohammad Morshed; Cho, Seung Joo; Ansari, Mohiuddin; Mishra, Sunil Kumar; Tiwari, Sharvan
2009-05-01
Small molecules often have toxicities that are a function of molecular structural features. Minor variations in structural features can make a large difference in such toxicity. Consequently, in silico techniques may be used to correlate such molecular toxicities with their structural features. Using nine different sets of aromatic nitro compounds having known observed toxicities against different targets, we developed ligand-based 2D quantitative structure-toxicity relationship models using 20 selected topological descriptors. Topological descriptors have several advantages, such as conformational independence and facile, less time-consuming computation that still yields good results. Multiple linear regression analysis was used to correlate variations of toxicity with molecular properties. The information index on molecular size, lopping centric index and Kier flexibility index were identified as fundamental descriptors for different kinds of toxicity, and further showed that molecular size, branching and molecular flexibility might be particularly important factors in quantitative structure-toxicity relationship analysis. This study revealed that topological descriptor-guided quantitative structure-toxicity relationship provided a very useful, cost- and time-efficient in silico tool for describing small-molecule toxicities.
Al Sharif, Merilin; Tsakovska, Ivanka; Pajeva, Ilza; Alov, Petko; Fioravanzo, Elena; Bassan, Arianna; Kovarich, Simona; Yang, Chihae; Mostrag-Szlichtyng, Aleksandra; Vitcheva, Vessela; Worth, Andrew P; Richarz, Andrea-N; Cronin, Mark T D
2017-12-01
The aim of this paper was to provide a proof of concept demonstrating that molecular modelling methodologies can be employed as a part of an integrated strategy to support toxicity prediction consistent with the mode of action/adverse outcome pathway (MoA/AOP) framework. To illustrate the role of molecular modelling in predictive toxicology, a case study was undertaken in which molecular modelling methodologies were employed to predict the activation of the peroxisome proliferator-activated nuclear receptor γ (PPARγ) as a potential molecular initiating event (MIE) for liver steatosis. A stepwise procedure combining different in silico approaches (virtual screening based on docking and pharmacophore filtering, and molecular field analysis) was developed to screen for PPARγ full agonists and to predict their transactivation activity (EC50). The performance metrics of the classification model to predict PPARγ full agonists were balanced accuracy = 81%, sensitivity = 85% and specificity = 76%. The 3D QSAR model developed to predict EC50 of PPARγ full agonists had the following statistical parameters: $q^2_{cv} = 0.610$, $N_{opt} = 7$, $SEP_{cv} = 0.505$, $r^2_{pr} = 0.552$. To support the linkage of PPARγ agonism predictions to prosteatotic potential, molecular modelling was combined with independently performed mechanistic mining of available in vivo toxicity data followed by ToxPrint chemotypes analysis. The approaches investigated demonstrated a potential to predict the MIE, to facilitate the process of MoA/AOP elaboration, to increase the scientific confidence in AOP, and to become a basis for 3D chemotype development. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
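As a quick consistency check on the classification metrics quoted in this abstract, the sketch below uses the usual definition of balanced accuracy as the mean of sensitivity and specificity. The function name is an assumption for illustration; only the two input percentages come from the abstract.

```python
def balanced_accuracy(sensitivity, specificity):
    """Balanced accuracy: the arithmetic mean of sensitivity and specificity."""
    return (sensitivity + specificity) / 2.0


# Values reported in the abstract: sensitivity 85%, specificity 76%.
ba = balanced_accuracy(0.85, 0.76)

if __name__ == "__main__":
    print(f"balanced accuracy = {ba:.3f}")
```

The mean of the two reported rates is 80.5%, which is consistent with the reported 81% once rounding of the underlying values is allowed for.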
Prediction of essential proteins based on gene expression programming.
Zhong, Jiancheng; Wang, Jianxin; Peng, Wei; Zhang, Zhen; Pan, Yi
2013-01-01
Essential proteins are indispensable for cell survival. Identifying essential proteins is very important for improving our understanding of how a cell works. There are various types of features related to the essentiality of proteins, and many methods have been proposed that combine some of them to predict essential proteins. However, it is still a big challenge to design an effective method that predicts them by integrating different features and that explains how the selected features determine the essentiality of a protein. Gene expression programming (GEP) is a learning algorithm that learns relationships between variables in sets of data and then builds models to explain these relationships. In this work, we propose a GEP-based method to predict essential proteins by combining some biological features and topological features. We carry out experiments on S. cerevisiae data. The experimental results show that our method achieves better prediction performance than methods using individual features. Moreover, our method outperforms some machine learning methods and performs as well as a method obtained by combining the outputs of eight machine learning methods. The accuracy of predicting essential proteins can be improved by using the GEP method to combine some topological features and biological features.
Houshyarifar, Vahid; Chehel Amirani, Mehdi
2016-08-12
In this paper we present a method to predict Sudden Cardiac Arrest (SCA) with higher order spectral (HOS) and linear (time-domain) features extracted from the heart rate variability (HRV) signal. Predicting the occurrence of SCA is important in order to reduce the probability of Sudden Cardiac Death (SCD). The challenge addressed in this work is to predict SCA five minutes before its onset. The method consists of four steps: pre-processing, feature extraction, feature reduction, and classification. In the first step, the QRS complexes are detected from the electrocardiogram (ECG) signal and then the HRV signal is extracted. In the second step, bispectrum features of the HRV signal and time-domain features are obtained. Six features are extracted from the bispectrum and two features from the time domain. In the next step, these features are reduced to one feature by the linear discriminant analysis (LDA) technique. Finally, KNN and support vector machine-based classifiers are used to classify the HRV signals. We used two databases, the MIT/BIH Sudden Cardiac Death (SCD) Database and the Physiobank Normal Sinus Rhythm (NSR) database. In this work we achieved prediction of SCD occurrence six minutes before the SCA with an accuracy of over 91%.
Akahori, Masakazu; Itabashi, Takeshi; Nishino, Jo; Yoshitake, Kazutoshi; Ikeo, Kazuho; Tsuneoka, Hiroshi
2014-01-01
Purpose. To investigate genetic and clinical features of patients with rhodopsin (RHO) mutations in two Japanese families with autosomal dominant retinitis pigmentosa (adRP). Methods. Whole-exome sequence analysis was performed in ten adRP families. Identified RHO mutations for the cosegregation analysis were confirmed by Sanger sequencing. Ophthalmic examinations were performed to evaluate the RP phenotypes. The impact of the RHO mutation on the rhodopsin conformation was examined by molecular modeling analysis. Results. In two adRP families, we identified two RHO mutations (c.377G>T (p.W126L) and c.1036G>C (p.A346P)), one of which was novel. Complete cosegregation was confirmed for each mutation exhibiting the RP phenotype in both families. Molecular modeling predicted that the novel mutation (p.W126L) might impair rhodopsin function by affecting its conformational transition in the light-adapted form. Clinical phenotypes showed that patients with p.W126L exhibited sector RP, whereas patients with p.A346P exhibited classic RP. Conclusions. Our findings demonstrated that the novel mutation (p.W126L) may be associated with the phenotype of sector RP. Identification of RHO mutations is a very useful tool for predicting disease severity and providing precise genetic counseling. PMID:25485142
A structurally driven analysis of thiol reactivity in mammalian albumins.
Spiga, Ottavia; Summa, Domenico; Cirri, Simone; Bernini, Andrea; Venditti, Vincenzo; De Chiara, Matteo; Priora, Raffaella; Frosali, Simona; Margaritis, Antonios; Di Giuseppe, Danila; Di Simplicio, Paolo; Niccolai, Neri
2011-04-01
Understanding the structural basis of protein redox activity is still an open question. Hence, by using a structural genomics approach, different albumins have been chosen to correlate protein structural features with the corresponding reaction rates of thiol exchange between albumin and disulfide DTNB. Predicted structures of rat, porcine, and bovine albumins have been compared with the experimentally derived human albumin. High structural similarity among these four albumins can be observed, in spite of their markedly different reactivity with DTNB. Sequence alignments offered preliminary hints on the contributions of sequence-specific local environments modulating albumin reactivity. Molecular dynamics simulations performed on experimental and predicted albumin structures reveal that thiolation rates are influenced by the hydrogen bonding pattern and stability of the acceptor C34 sulphur atom with donor groups of nearby residues. Atom depth evolution of albumin C34 thiol groups has been monitored during molecular dynamics trajectories. The most reactive albumins were also the ones presenting the C34 sulphur atom on the protein surface with the highest accessibility. High C34 sulphur atom reactivity in rat and porcine albumins seems to be determined by the presence of additional positively charged amino acid residues favoring both the C34 S⁻ form and the approach of DTNB. Copyright © 2011 Wiley Periodicals, Inc.
Superdiffusive gas recovery from nanopores
NASA Astrophysics Data System (ADS)
Wu, Haiyi; He, Yadong; Qiao, Rui
2016-11-01
Understanding the recovery of gas from reservoirs featuring pervasive nanopores is essential for effective shale gas extraction. Classical theories cannot accurately predict such gas recovery and many experimental observations are not well understood. Here we report molecular simulations of the recovery of gas from single nanopores, explicitly taking into account molecular gas-wall interactions. We show that, in very narrow pores, the strong gas-wall interactions are essential in determining the gas recovery behavior both quantitatively and qualitatively. These interactions cause the total diffusion coefficients of the gas molecules in nanopores to be smaller than those predicted by kinetic theories, hence slowing down the rate of gas recovery. These interactions also lead to significant adsorption of gas molecules on the pore walls. Because of the desorption of these gas molecules during gas recovery, the gas recovery from the nanopore does not exhibit the usual diffusive scaling law (i.e., the accumulative recovery scales as $R \sim t^{1/2}$) but follows a superdiffusive scaling law $R \sim t^{n}$ ($n > 0.5$), which is similar to that observed in some field experiments. For the system studied here, the superdiffusive gas recovery scaling law can be captured well by continuum models in which the gas adsorption and desorption from pore walls are taken into account using the Langmuir model.
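The distinction between diffusive and superdiffusive recovery comes down to the exponent n in a power law R ∼ t^n. A minimal sketch of how such an exponent could be estimated from recovery-versus-time data is a least-squares fit in log-log space. The function name and the synthetic data below are assumptions for illustration and are not taken from the study.

```python
import math


def scaling_exponent(times, recovery):
    """Estimate n in R ~ t**n as the least-squares slope of log(R) against log(t)."""
    xs = [math.log(t) for t in times]
    ys = [math.log(r) for r in recovery]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic curves (made-up exponents, arbitrary units) to exercise the fit.
times = [1.0, 2.0, 4.0, 8.0, 16.0]
diffusive = [t ** 0.5 for t in times]        # classical diffusive scaling
superdiffusive = [t ** 0.7 for t in times]   # hypothetical superdiffusive case

if __name__ == "__main__":
    print(scaling_exponent(times, diffusive))
    print(scaling_exponent(times, superdiffusive))
```

A fitted slope clearly above 0.5 would indicate superdiffusive recovery in the sense used in the abstract. With real data the fit range matters, since a single power law may hold only over part of the recovery curve.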
Yun, Dapeng; Zhao, Yingjie; Wang, Jingkun; Xu, Tao; Li, Xiaoying; Wang, Yuqi; Yuan, Li; Sun, Ruochuan; Song, Xiao; Huai, Cong; Hu, Lingna; Yang, Song; Min, Taishan; Chen, Juxiang; Chen, Hongyan; Lu, Daru
2015-01-01
Glioma is the most malignant brain tumor and glioblastoma (GBM) is the most aggressive type. The involvement of N-myc (and STAT) interactor (NMI) in tumorigenesis was sporadically reported but far from elucidation. This study aims to investigate roles of NMI in human glioma. Three independent cohorts, the Chinese tissue microarray (TMA) cohort (N = 209), the Repository for Molecular Brain Neoplasia Data (Rembrandt) cohort (N = 371) and The Cancer Genome Atlas (TCGA) cohort (N = 528 or 396) were employed. Transcriptional or protein levels of NMI expression were significantly increased according to tumor grade in all three cohorts. High expression of NMI predicted significantly unfavorable clinical outcome for GBM patients, which was further determined as an independent prognostic factor. Additionally, expression and prognostic value of NMI were associated with molecular features of GBM including PTEN deletion and EGFR amplification in TCGA cohort. Furthermore, overexpression or depletion of NMI revealed its regulation on G1/S progression and cell proliferation (both in vitro and in vivo), and this effect was partially dependent on STAT1, which interacted with and was regulated by NMI. These data demonstrate that NMI may serve as a novel prognostic biomarker and a potential therapeutic target for glioblastoma. PMID:25669971
NASA Astrophysics Data System (ADS)
Yuan, K. Y.; Yuan, W.; Ju, J. W.; Yang, J. M.; Kao, W.; Carlson, L.
2012-04-01
As asphalt pavements age and deteriorate, recurring pothole repair failures and propagating alligator cracks have become a serious problem in daily life and have resulted in high repair costs for pavement and vehicles. To solve this urgent issue, pothole repair materials with superior durability and long service life are needed. In the present work, pothole patching materials with high toughness and high fatigue resistance, reinforced with nano-molecular resins, have been developed to enhance the resistance to traffic loads and the service life of repaired potholes. In particular, DCPD resin (dicyclopentadiene, C10H12) with a ruthenium-based catalyst is employed to develop controlled properties that are compatible with aggregates and asphalt binders. In this paper, a multi-level numerical micromechanics-based model is developed to predict the mechanical properties of these innovative nanomolecular resin reinforced pothole patching materials. Coarse aggregates in the finite element analysis are modeled as irregular shapes through image processing techniques and randomly-dispersed coated particles. The overall properties of asphalt mastic, which consists of fine aggregates, asphalt binder, cured DCPD and air voids, are theoretically estimated by the homogenization technique of micromechanics. Numerical predictions are compared with suitably designed experimental laboratory results.
Predicting chromatin architecture from models of polymer physics.
Bianco, Simona; Chiariello, Andrea M; Annunziatella, Carlo; Esposito, Andrea; Nicodemi, Mario
2017-03-01
We review the picture of chromatin large-scale 3D organization emerging from the analysis of Hi-C data and polymer modeling. In higher mammals, Hi-C contact maps reveal a complex higher-order organization, extending from the sub-Mb to chromosomal scales, hierarchically folded in a structure of domains-within-domains (metaTADs). The domain folding hierarchy is partially conserved throughout differentiation, and deeply correlated to epigenomic features. Rearrangements in the metaTAD topology relate to gene expression modifications: in particular, in neuronal differentiation models, topologically associated domains (TADs) tend to have coherent expression changes within architecturally conserved metaTAD niches. To identify the nature of architectural domains and their molecular determinants within a principled approach, we discuss models based on polymer physics. We show that basic concepts of interacting polymer physics explain chromatin spatial organization across chromosomal scales and cell types. The 3D structure of genomic loci can be derived with high accuracy and its molecular determinants identified by crossing information with epigenomic databases. In particular, we illustrate the case of the Sox9 locus, linked to human congenital disorders. The model's in silico predictions on the effects of genomic rearrangements are confirmed by available 5C data. This can help establish new diagnostic tools for diseases linked to chromatin mis-folding, such as congenital disorders and cancer.
Meng, Delong; Chen, Yuanyuan; Yun, Dapeng; Zhao, Yingjie; Wang, Jingkun; Xu, Tao; Li, Xiaoying; Wang, Yuqi; Yuan, Li; Sun, Ruochuan; Song, Xiao; Huai, Cong; Hu, Lingna; Yang, Song; Min, Taishan; Chen, Juxiang; Chen, Hongyan; Lu, Daru
2015-03-10
Glioma is the most malignant brain tumor and glioblastoma (GBM) is the most aggressive type. The involvement of N-myc (and STAT) interactor (NMI) in tumorigenesis was sporadically reported but far from elucidation. This study aims to investigate roles of NMI in human glioma. Three independent cohorts, the Chinese tissue microarray (TMA) cohort (N = 209), the Repository for Molecular Brain Neoplasia Data (Rembrandt) cohort (N = 371) and The Cancer Genome Atlas (TCGA) cohort (N = 528 or 396) were employed. Transcriptional or protein levels of NMI expression were significantly increased according to tumor grade in all three cohorts. High expression of NMI predicted significantly unfavorable clinical outcome for GBM patients, which was further determined as an independent prognostic factor. Additionally, expression and prognostic value of NMI were associated with molecular features of GBM including PTEN deletion and EGFR amplification in TCGA cohort. Furthermore, overexpression or depletion of NMI revealed its regulation on G1/S progression and cell proliferation (both in vitro and in vivo), and this effect was partially dependent on STAT1, which interacted with and was regulated by NMI. These data demonstrate that NMI may serve as a novel prognostic biomarker and a potential therapeutic target for glioblastoma.
Glioma CpG island methylator phenotype (G-CIMP): biological and clinical implications.
Malta, Tathiane M; de Souza, Camila F; Sabedot, Thais S; Silva, Tiago C; Mosella, Maritza S; Kalkanis, Steven N; Snyder, James; Castro, Ana Valeria B; Noushmehr, Houtan
2018-04-09
Gliomas are a heterogeneous group of brain tumors with distinct biological and clinical properties. Despite advances in surgical techniques and clinical regimens, treatment of high-grade glioma remains challenging and carries dismal rates of therapeutic success and overall survival. Challenges include the molecular complexity of gliomas, as well as inconsistencies in histopathological grading, resulting in an inaccurate prediction of disease progression and failure in the use of standard therapy. The updated 2016 World Health Organization (WHO) classification of tumors of the central nervous system reflects a refinement of tumor diagnostics by integrating the genotypic and phenotypic features, thereby narrowing the defined subgroups. The new classification recommends molecular diagnosis of isocitrate dehydrogenase (IDH) mutational status in gliomas. IDH-mutant gliomas manifest the cytosine-phosphate-guanine (CpG) island methylator phenotype (G-CIMP). Notably, the recent identification of clinically relevant subsets of G-CIMP tumors (G-CIMP-high and G-CIMP-low) provides a further refinement in glioma classification that is independent of grade and histology. This scheme may be useful for predicting patient outcome and may be translated into effective therapeutic strategies tailored to each patient. In this review, we highlight the evolution of our understanding of the G-CIMP subsets and how recent advances in characterizing the genome and epigenome of gliomas may influence future basic and translational research.
NASA Astrophysics Data System (ADS)
Lan, Ping; Xie, Mei-Qi; Yao, Yue-Mei; Chen, Wan-Na; Chen, Wei-Min
2010-12-01
Fructose-1,6-bisphosphatase has been regarded as a novel therapeutic target for the treatment of type 2 diabetes mellitus (T2DM). 3D-QSAR and docking studies were performed on a series of [5-(4-amino-1H-benzoimidazol-2-yl)-furan-2-yl]-phosphonic acid derivatives as fructose-1,6-bisphosphatase inhibitors. The CoMFA and CoMSIA models, built using thirty-seven molecules in the training set, gave $r^2_{cv}$ values of 0.614 and 0.598 and $r^2$ values of 0.950 and 0.928, respectively. The external validation indicated that our CoMFA and CoMSIA models possessed high predictive power, with $r^2_0$ values of 0.994 and 0.994 and $r^2_m$ values of 0.751 and 0.690, respectively. Molecular docking studies revealed that a phosphonic group was essential for binding to the receptor, and some key features were also identified. A set of forty new analogues was designed using the results of the present study, and these were predicted by the developed models to have significantly improved potencies. The findings can help guide the design of new fructose-1,6-bisphosphatase inhibitors with improved biological response.
Computationally Guided Design of Polymer Electrolytes for Battery Applications
NASA Astrophysics Data System (ADS)
Wang, Zhen-Gang; Webb, Michael; Savoie, Brett; Miller, Thomas
We develop an efficient computational framework for guiding the design of polymer electrolytes for Li battery applications. Short-time molecular dynamics (MD) simulations are employed to identify key structural and dynamic features in the solvation and motion of Li ions, such as the structure of the solvation shells, the spatial distribution of solvation sites, and the polymer segmental mobility. Comparative studies on six polyester-based polymers and polyethylene oxide (PEO) yield good agreement with experimental data on the ion conductivities, and reveal significant differences in the ion diffusion mechanism between PEO and the polyesters. The molecular insights from the MD simulations are used to build a chemically specific coarse-grained model in the spirit of the dynamic bond percolation model of Druger, Ratner and Nitzan. We apply this coarse-grained model to characterize Li ion diffusion in several existing and yet-to-be-synthesized polyethers that differ by oxygen content and backbone stiffness. Good agreement is obtained between the predictions of the coarse-grained model and long-timescale atomistic MD simulations, thus providing validation of the model. Our study predicts higher Li ion diffusivity in poly(trimethylene oxide-alt-ethylene oxide) than in PEO. These results demonstrate the potential of this computational framework for rapid screening of new polymer electrolytes based on ion diffusivity.
Verma, Meghna; Erwin, Samantha; Abedi, Vida; Hontecillas, Raquel; Hoops, Stefan; Leber, Andrew; Bassaganya-Riera, Josep; Ciupe, Stanca M
2017-01-01
Human immunodeficiency virus (HIV)-infected patients are at an increased risk of co-infection with human papilloma virus (HPV), and subsequent malignancies such as oral cancer. To determine the role of HIV-associated immune suppression on HPV persistence and pathogenesis, and to investigate the mechanisms underlying the modulation of HPV infection and oral cancer by HIV, we developed a mathematical model of HIV/HPV co-infection. Our model captures known immunological and molecular features such as impaired HPV-specific effector T helper 1 (Th1) cell responses, and enhanced HPV infection due to HIV. We used the model to determine HPV prognosis in the presence of HIV infection, and identified conditions under which HIV infection alters HPV persistence in the oral mucosa system. The model predicts that conditions leading to HPV persistence during HIV/HPV co-infection are the permissive immune environment created by HIV and molecular interactions between the two viruses. The model also determines when HPV infection continues to persist in the short run in a co-infected patient undergoing antiretroviral therapy. Lastly, the model predicts that, under efficacious antiretroviral treatment, HPV infections will decrease in the long run due to the restoration of CD4+ T cell numbers and protective immune responses.
Influence of Na+ and Mg2+ ions on RNA structures studied with molecular dynamics simulations.
Fischer, Nina M; Polêto, Marcelo D; Steuer, Jakob; van der Spoel, David
2018-06-01
The structure of ribonucleic acid (RNA) polymers is strongly dependent on the presence of cations, in particular Mg2+, to stabilize structural features. Only in high-resolution X-ray crystallography structures can ions be identified reliably. Here, we perform molecular dynamics simulations of 24 RNA structures with varying ion concentrations. Twelve of the structures were helical and the others complex folded. The aim of the study is to predict ion positions but also to evaluate the impact of different types of ions (Na+ or Mg2+) and the ionic strength on structural stability and variations of RNA. As a general conclusion, Mg2+ is found to conserve the experimental structure better than Na+ and, where experimental ion positions are available, they can be reproduced with reasonable accuracy. If a large surplus of ions is present, the added electrostatic screening makes prediction of binding sites less reproducible. Distinct differences in ion binding between helical and complex folded structures are found. The strength of binding (ΔG‡ for breaking RNA atom-ion interactions) is found to differ between roughly 10 and 26 kJ/mol for the different RNA atoms. Differences in stability between helical and complex folded structures and of the influence of metal ions on either are discussed.
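To give a feel for what a spread of roughly 10 to 26 kJ/mol in ΔG‡ means, the sketch below assumes that the rate of breaking an RNA atom-ion contact follows a transition-state-type expression proportional to exp(-ΔG‡/RT), with the same prefactor for all atoms. Both that assumption and the temperature of 300 K are illustrative choices; the abstract states neither.

```python
import math

R_GAS = 8.314462618  # molar gas constant, J/(mol K)


def rate_ratio(dg_low_kj, dg_high_kj, temperature_k=300.0):
    """Ratio of dissociation rates k(low barrier) / k(high barrier).

    Assumes k is proportional to exp(-dG/(R T)) with identical prefactors,
    so only the barrier difference matters.
    """
    delta_j = (dg_high_kj - dg_low_kj) * 1000.0
    return math.exp(delta_j / (R_GAS * temperature_k))


if __name__ == "__main__":
    # Barrier range quoted in the abstract; temperature is an assumed value.
    print(rate_ratio(10.0, 26.0))
```

Under these assumptions the weakest and strongest binding sites would differ in dissociation rate by several hundred-fold, which illustrates why ion residence times can vary so strongly between RNA atoms.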
Functional odor classification through a medicinal chemistry approach.
Poivet, Erwan; Tahirova, Narmin; Peterlin, Zita; Xu, Lu; Zou, Dong-Jing; Acree, Terry; Firestein, Stuart
2018-02-01
Crucial for any hypothesis about odor coding is the classification and prediction of sensory qualities in chemical compounds. The relationship between perceptual quality and molecular structure has occupied olfactory scientists throughout the 20th century, but details of the mechanism remain elusive. Odor molecules are typically organic compounds of low molecular weight that may be aliphatic or aromatic, may be saturated or unsaturated, and may have diverse functional polar groups. However, many molecules conforming to these characteristics are odorless. One approach recently used to solve this problem was to apply machine learning strategies to a large set of odors and human classifiers in an attempt to find common and unique chemical features that would predict a chemical's odor. We use an alternative method that relies more on the biological responses of olfactory sensory neurons and then applies the principles of medicinal chemistry, a technique widely used in drug discovery. We demonstrate the effectiveness of this strategy through a classification for esters, an important odorant for the creation of flavor in wine. Our findings indicate that computational approaches that do not account for biological responses will be plagued by both false positives and false negatives and fail to provide meaningful mechanistic data. However, the two approaches used in tandem could resolve many of the paradoxes in odor perception.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Van Kuiken, Benjamin E.; Valiev, Marat; Daifuku, Stephanie L.
2013-05-01
Ruthenium L2,3-edge X-ray absorption (XA) spectroscopy probes transitions from core 2p orbitals to the 4d levels of the atom and is a powerful tool for interrogating the local electronic and molecular structure around the metal atom. However, a molecular-level interpretation of the Ru L2,3-edge spectral lineshapes is often complicated by spin–orbit coupling (SOC) and multiplet effects. In this study, we develop spin-free time-dependent density functional theory (TDDFT) as a viable and predictive tool to simulate the Ru L3-edge spectra. We successfully simulate and analyze the ground state Ru L3-edge XA spectra of a series of RuII and RuIII complexes: [Ru(NH3)6]2+/3+, [Ru(CN)6]4-/3-, [RuCl6]4-/3-, and the ground (1A1) and photoexcited (3MLCT) transient states of [Ru(bpy)3]2+ and Ru(dcbpy)2(NCS)2 (termed N3). The TDDFT simulations reproduce all the experimentally observed features in Ru L3-edge XA spectra. The advantage of using TDDFT to assign complicated Ru L3-edge spectra is illustrated by its ability to identify ligand specific charge transfer features in complex molecules. We conclude that the B3LYP functional is the most reliable functional for accurately predicting the location of charge transfer features in these spectra. Experimental and simulated Ru L3-edge XA spectra are presented for the transition metal mixed-valence dimers [(NC)5MII-CN-RuIII(NH3)5]- (where M = Fe or Ru) dissolved in water. We explore the spectral signatures of electron delocalization in Ru L3-edge XA spectroscopy and our simulations reveal that the inclusion of explicit solvent molecules is crucial for reproducing the experimentally determined valencies, highlighting the importance of the role of the solvent in transition metal charge transfer chemistry.
Ma, Xin; Guo, Jing; Sun, Xiao
2015-01-01
The prediction of RNA-binding proteins is one of the most challenging problems in computational biology. Although some studies have investigated this problem, the accuracy of prediction is still not sufficient. In this study, a highly accurate method was developed to predict RNA-binding proteins from amino acid sequences using random forests with the minimum redundancy maximum relevance (mRMR) method, followed by incremental feature selection (IFS). We incorporated conjoint triad features and three novel features: binding propensity (BP), nonbinding propensity (NBP), and evolutionary information combined with physicochemical properties (EIPP). The results showed that these novel features have important roles in improving the performance of the predictor. Using the mRMR-IFS method, our predictor achieved the best performance (86.62% accuracy and 0.737 Matthews correlation coefficient). The high prediction accuracy suggests that our method can be a useful approach to identify RNA-binding proteins from sequence information.
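The two performance figures quoted here, accuracy and the Matthews correlation coefficient (MCC), are both computed from a binary confusion matrix. The sketch below uses the standard definitions. The function names and the example counts are hypothetical and are not the counts behind the reported 86.62% and 0.737.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns 0.0 when the denominator vanishes (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts for an RNA-binding / non-binding classifier.
    tp, tn, fp, fn = 90, 85, 15, 10
    print(accuracy(tp, tn, fp, fn))
    print(matthews_corrcoef(tp, tn, fp, fn))
```

MCC is often reported alongside accuracy because it stays informative when the two classes are unbalanced, whereas accuracy alone can look high for a predictor that favours the majority class.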
NASA Astrophysics Data System (ADS)
Sherlin, Y. Sheeba; Vijayakumar, T.; Roy, S. D. D.; Jayakumar, V. S.
2018-05-01
Molecular geometry of Parkinson's drug 2-(3,4-Dihydroxyphenyl)ethylamine hydrochloride (Dopamine, DA) has been evaluated and compared with experimental XRD data. Molecular docking and vibrational spectral analysis of DA have been carried out using FT-Raman and FT-IR spectra aided by Density Functional Theory at B3LYP/6-311++G(d,p). The present investigation deals with the analysis of structural and spectral features responsible for drug activities, nature of hydrogen bonding interactions of the molecule and the correlation of Parkinson's nature with its molecular structural features.
Harnessing atomistic simulations to predict the rate at which dislocations overcome obstacles
NASA Astrophysics Data System (ADS)
Saroukhani, S.; Nguyen, L. D.; Leung, K. W. K.; Singh, C. V.; Warner, D. H.
2016-05-01
Predicting the rate at which dislocations overcome obstacles is key to understanding the microscopic features that govern the plastic flow of modern alloys. In this spirit, the current manuscript examines the rate at which an edge dislocation overcomes an obstacle in aluminum. Predictions were made using different popular variants of Harmonic Transition State Theory (HTST) and compared to those of direct Molecular Dynamics (MD) simulations. The HTST predictions were found to be grossly inaccurate due to the large entropy barrier associated with the dislocation-obstacle interaction. Considering the importance of finite temperature effects, the utility of the Finite Temperature String (FTS) method was then explored. While this approach was found capable of identifying a prominent reaction tube, it was not capable of computing the free energy profile along the tube. Lastly, the utility of the Transition Interface Sampling (TIS) approach was explored, which does not need a free energy profile and is known to be less reliant on the choice of reaction coordinate. The TIS approach was found capable of accurately predicting the rate, relative to direct MD simulations. This finding was utilized to examine the temperature and load dependence of the dislocation-obstacle interaction in a simple periodic cell configuration. An attractive rate prediction approach combining TST and simple continuum models is identified, and the strain rate sensitivity of individual dislocation obstacle interactions is predicted.
Computational modeling of membrane proteins
Leman, Julia Koehler; Ulmschneider, Martin B.; Gray, Jeffrey J.
2014-01-01
The determination of membrane protein (MP) structures has always trailed that of soluble proteins due to difficulties in their overexpression, reconstitution into membrane mimetics, and subsequent structure determination. The percentage of MP structures in the protein databank (PDB) has been at a constant 1-2% for the last decade. In contrast, over half of all drugs target MPs, only highlighting how little we understand about drug-specific effects in the human body. To reduce this gap, researchers have attempted to predict structural features of MPs even before the first structure was experimentally elucidated. In this review, we present current computational methods to predict MP structure, starting with secondary structure prediction, prediction of trans-membrane spans, and topology. Even though these methods generate reliable predictions, challenges such as predicting kinks or precise beginnings and ends of secondary structure elements are still waiting to be addressed. We describe recent developments in the prediction of 3D structures of both α-helical MPs as well as β-barrels using comparative modeling techniques, de novo methods, and molecular dynamics (MD) simulations. The increase of MP structures has (1) facilitated comparative modeling due to availability of more and better templates, and (2) improved the statistics for knowledge-based scoring functions. Moreover, de novo methods have benefitted from the use of correlated mutations as restraints. Finally, we outline current advances that will likely shape the field in the forthcoming decade. PMID:25355688
Image Feature Types and Their Predictions of Aesthetic Preference and Naturalness
Ibarra, Frank F.; Kardan, Omid; Hunter, MaryCarol R.; Kotabe, Hiroki P.; Meyer, Francisco A. C.; Berman, Marc G.
2017-01-01
Previous research has investigated ways to quantify visual information of a scene in terms of a visual processing hierarchy, i.e., making sense of the visual environment by segmentation and integration of elementary sensory input. Guided by this research, studies have developed categories for low-level visual features (e.g., edges, colors), high-level visual features (scene-level entities that convey semantic information such as objects), and how models of those features predict aesthetic preference and naturalness. For example, in Kardan et al. (2015a), 52 participants provided aesthetic preference and naturalness ratings, which are used in the current study, for 307 images of mixed natural and urban content. Kardan et al. (2015a) then developed a model using low-level features to predict aesthetic preference and naturalness and could do so with high accuracy. What has yet to be explored is the ability of higher-level visual features (e.g., horizon line position relative to viewer, geometry of building distribution relative to visual access) to predict aesthetic preference and naturalness of scenes, and whether higher-level features mediate some of the association between the low-level features and aesthetic preference or naturalness. In this study we investigated these relationships and found that low- and high-level features explain 68.4% of the variance in aesthetic preference ratings and 88.7% of the variance in naturalness ratings. Additionally, several high-level features mediated the relationship between the low-level visual features and aesthetic preference. In a multiple mediation analysis, the high-level feature mediators accounted for over 50% of the variance in predicting aesthetic preference. These results show that high-level visual features play a prominent role in predicting aesthetic preference, but do not completely eliminate the predictive power of the low-level visual features. 
These strong predictors provide powerful insights for future research relating to landscape and urban design with the aim of maximizing subjective well-being, which could lead to improved health outcomes on a larger scale. PMID:28503158
Atomic and Molecular Systems in Intense Ultrashort Laser Pulses
NASA Astrophysics Data System (ADS)
Saenz, A.
2008-07-01
The full quantum mechanical treatment of atomic and molecular systems exposed to intense laser pulses is a so far unsolved challenge, even for systems as small as molecular hydrogen. Therefore, a number of simplified qualitative and quantitative models have been introduced in order to provide at least some interpretational tools for experimental data. The assessment of these models describing the molecular response is complicated, since a comparison to experiment often requires a number of averages to be performed. This includes in many cases averaging over different orientations of the molecule with respect to the laser field, focal volume effects, etc. Furthermore, the pulse shape and even the peak intensity are not known experimentally with very high precision, which matters given, e.g., the exponential intensity dependence of the ionization signal. Finally, experiments usually provide only relative yields. As a consequence of all these averagings and uncertainties, it is possible that different models may successfully explain some experimental results or features, although these models disagree substantially if their predictions are compared before averaging. Therefore, fully quantum-mechanical approaches at least for small atomic and molecular systems are highly desirable and have been developed in our group. This includes efficient codes for solving the time-dependent Schrödinger equation of atomic hydrogen, helium or other effective one- or two-electron atoms as well as for the electronic motion in linear (effective) one- and two-electron diatomic molecules like H_2. Very recently, a code for larger molecular systems that adopts the so-called single-active electron approximation was also successfully implemented and applied. In the first part of this talk popular models describing intense laser-field ionization of atoms and their extensions to molecules are described. Then their validity is discussed on the basis of quantum-mechanical calculations. 
Finally, some peculiar molecular strong-field effects and the possibility of strong-field control mechanisms will be demonstrated. This includes phenomena like enhanced ionization and bond softening as well as the creation of a vibrational wavepacket in the non-ionized electronic ground state of H_2 by creating a Schrödinger-cat state between the ionized and the non-ionized molecules. The latter, theoretically predicted phenomenon was very recently observed experimentally and led to the real-time observation of the fastest molecular motion observed so far.
Molecular clouds and galactic spiral structure
NASA Technical Reports Server (NTRS)
Dame, T. M.
1984-01-01
Galactic CO line emission at 115 GHz was surveyed in order to study the distribution of molecular clouds in the inner galaxy. Comparison of this survey with similar H I data reveals a detailed correlation with the most intense 21 cm features. To each of the classical 21 cm H I spiral arms of the inner galaxy there corresponds a CO molecular arm which is generally more clearly defined and of higher contrast. A simple model is devised for the galactic distribution of molecular clouds. The modeling results suggest that molecular clouds are essentially transient objects, existing for 15 to 40 million years after their formation in a spiral arm, and are largely confined to spiral features about 300 pc wide.
Zheng, Lu-Lu; Niu, Shen; Hao, Pei; Feng, KaiYan; Cai, Yu-Dong; Li, Yixue
2011-01-01
Pyrrolidone carboxylic acid (PCA) is formed during a common post-translational modification (PTM) of extracellular and multi-pass membrane proteins. In this study, we developed a new predictor of PCA modification sites based on maximum relevance minimum redundancy (mRMR) and incremental feature selection (IFS). We incorporated 727 features belonging to 7 kinds of protein properties to predict the modification sites, including sequence conservation, residual disorder, amino acid factor, secondary structure and solvent accessibility, gain/loss of amino acid during evolution, propensity of amino acid to be conserved at protein-protein interface and protein surface, and deviation of side chain carbon atom number. Among these 727 features, 244 features were selected by mRMR and IFS as the optimized features for the prediction, with which the prediction model achieved a maximum MCC of 0.7812. Feature analysis showed that all feature types contributed to the modification process. Further site-specific feature analysis showed that the features derived from PCA's surrounding sites contributed more to the determination of PCA sites than other sites. The detailed feature analysis in this paper might provide important clues for understanding the mechanism of the PCA formation and guide relevant experimental validations. PMID:22174779
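The incremental feature selection step described above can be sketched as follows: features are taken in ranked order (for example, an mRMR ranking), a model is evaluated on each growing prefix, and the prefix with the highest MCC is kept. This is a minimal illustration, not the authors' code; the `evaluate` callback stands in for whatever cross-validated classifier is used and is an assumption.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator


def incremental_feature_selection(ranked_features, evaluate):
    """Return (best_subset, best_score) over all prefixes of a ranked list.

    ranked_features: feature names, most important first.
    evaluate:        hypothetical callback mapping a feature list to a score
                     such as a cross-validated MCC.
    """
    best_subset, best_score = [], float("-inf")
    for size in range(1, len(ranked_features) + 1):
        subset = ranked_features[:size]
        score = evaluate(subset)
        if score > best_score:
            best_subset, best_score = subset, score
    return best_subset, best_score
```

In the study above this procedure would stop at the 244-feature prefix, the size at which the reported MCC peaked.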
NASA Astrophysics Data System (ADS)
Ma, Chuang; Bao, Zhong-Kui; Zhang, Hai-Feng
2017-10-01
So far, many network-structure-based link prediction methods have been proposed. However, these methods highlight only one or two structural features of networks and are then applied to predict missing links in different networks. The performance of these existing methods is not satisfactory in all cases, since each network has its own underlying structural features. In this paper, by analyzing different real networks, we find that the structural features of different networks are remarkably different. In particular, even within the same network, the inner structural features are utterly different. Therefore, more structural features should be considered. However, owing to the remarkably different structural features, the contributions of different features are hard to specify in advance. Inspired by these facts, an adaptive fusion model for link prediction is proposed to incorporate multiple structural features. In the model, a logistic function combining multiple structural features is defined, and the weight of each feature in the logistic function is then adaptively determined by exploiting the known structure information. Finally, we use the "learnt" logistic function to predict the connection probabilities of missing links. According to our experimental results, the performance of our adaptive fusion model is better than that of many similarity indices.
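The idea of a logistic function whose feature weights are learnt from the known structure can be sketched as below. The two structural features used here (common-neighbour count and Jaccard index) and the plain stochastic gradient fit are assumptions chosen for illustration; the abstract does not say which features or which fitting procedure the authors use.

```python
import math


def pair_features(adjacency, u, v):
    """Illustrative structural features for the node pair (u, v).

    adjacency maps each node to the set of its neighbours. The edge (u, v)
    itself is ignored so that features do not leak the label being predicted.
    """
    neighbours_u = adjacency[u] - {v}
    neighbours_v = adjacency[v] - {u}
    common = len(neighbours_u & neighbours_v)
    union = len(neighbours_u | neighbours_v)
    jaccard = common / union if union else 0.0
    return [1.0, float(common), jaccard]  # leading 1.0 acts as the bias term


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def link_probability(weights, features):
    return sigmoid(sum(w * x for w, x in zip(weights, features)))


def fit_weights(samples, labels, learning_rate=0.1, epochs=500):
    """Fit logistic weights by stochastic gradient ascent on the log-likelihood."""
    weights = [0.0] * len(samples[0])
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = y - link_probability(weights, x)
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
    return weights
```

In use, known links (label 1) and sampled non-links (label 0) would supply the training pairs, and `link_probability` would then score the unobserved pairs.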
Wright, Bernice; Watson, Kimberly A; McGuffin, Liam J; Lovegrove, Julie A; Gibbins, Jonathan M
2015-11-01
Flavonoids reduce cardiovascular disease risk through anti-inflammatory, anti-coagulant and anti-platelet actions. One key flavonoid inhibitory mechanism is blocking kinase activity that drives these processes. Flavonoids attenuate activities of kinases including phosphoinositide-3-kinase, Fyn, Lyn, Src, Syk, PKC, PIM1/2, ERK, JNK and PKA. X-ray crystallographic analyses of kinase-flavonoid complexes show that flavonoid ring systems and their hydroxyl substitutions are important structural features for their binding to kinases. A clearer understanding of structural interactions of flavonoids with kinases is necessary to allow construction of more potent and selective counterparts. We examined flavonoid (quercetin, apigenin and catechin) interactions with Src family kinases (Lyn, Fyn and Hck) applying the Sybyl docking algorithm and GRID. A homology model (Lyn) was used in our analyses to demonstrate that high-quality predicted kinase structures are suitable for flavonoid computational studies. Our docking results revealed potential hydrogen bond contacts between flavonoid hydroxyls and kinase catalytic site residues. Identification of plausible contacts indicated that quercetin formed the most energetically stable interactions, apigenin lacked hydroxyl groups necessary for important contacts and the non-planar structure of catechin could not support predicted hydrogen bonding patterns. GRID analysis using a hydroxyl functional group supported docking results. Based on these findings, we predicted that quercetin would inhibit activities of Src family kinases with greater potency than apigenin and catechin. We validated this prediction using in vitro kinase assays. We conclude that our study can be used as a basis to construct virtual flavonoid interaction libraries to guide drug discovery using these compounds as molecular templates. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.
Beyond λmax Part 2: Predicting Molecular Color
ERIC Educational Resources Information Center
Williams, Darren L.; Flaherty, Thomas J.; Alnasleh, Bassam K.
2009-01-01
A concise roadmap for using computational chemistry programs (i.e., Gaussian 03W) to predict the color of a molecular species is presented. A color-predicting spreadsheet is available with the online material that uses transition wavelengths and peak-shape parameters to predict the visible absorbance spectrum, transmittance spectrum, chromaticity…
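To illustrate the step from transition wavelengths and peak-shape parameters to absorbance and transmittance spectra, here is a minimal sketch. It assumes Gaussian band shapes in wavelength, parameterized by centre, peak absorbance, and full width at half maximum; the spreadsheet's actual line shape and parameterization are not given in the abstract, and the band values below are hypothetical.

```python
import math


def absorbance(wavelength_nm, bands):
    """Sum of Gaussian absorption bands evaluated at one wavelength.

    bands: iterable of (center_nm, peak_absorbance, fwhm_nm) tuples.
    """
    total = 0.0
    for center_nm, peak, fwhm_nm in bands:
        # Convert full width at half maximum to a Gaussian standard deviation.
        sigma = fwhm_nm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
        total += peak * math.exp(-((wavelength_nm - center_nm) ** 2) / (2.0 * sigma ** 2))
    return total


def transmittance(absorbance_value):
    """Fraction of light transmitted, T = 10 ** (-A)."""
    return 10.0 ** (-absorbance_value)


if __name__ == "__main__":
    # One hypothetical transition absorbing in the green region.
    example_bands = [(520.0, 1.2, 60.0)]
    for wavelength in range(400, 701, 50):
        a = absorbance(wavelength, example_bands)
        print(wavelength, round(a, 4), round(transmittance(a), 4))
```

Turning the transmittance spectrum into chromaticity coordinates additionally requires an illuminant and colour-matching functions, which are omitted here.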
Lauria, Antonino; Tutone, Marco; Almerico, Anna Maria
2011-09-01
In recent years the application of computational methodologies in medicinal chemistry has developed remarkably. Efforts have focused on the search for new leads with high affinity for a specific biological target, and different molecular modeling approaches have been employed to simulate molecular behavior toward such a target. In spite of the increasing reliability of computational methodologies, the designed leads, once synthesized and screened, are not always suitable for the chosen biological target. To give these compounds another chance, this work revisits the old concept of Fischer's lock-and-key model. The same can be done for the "re-purposing" of old drugs. In fact, it is known that drugs may have many physiological targets; therefore it may be useful to identify them. This aspect, called "polypharmacology", is known to be therapeutically essential in different treatments. The proposed protocol, the virtual lock-and-key approach (VLKA), consists of the "virtualization" of biological targets through their respective known inhibitors. To release a real lock, the key must fit the pins of the lock. The molecular descriptors can be considered as pins. A tested compound can be considered a potential inhibitor of a biological target if the values of its molecular descriptors fall within the ranges calculated for the set of known inhibitors. The proposed protocol thus transforms a biological target into a "lock model" starting from its known inhibitors. To release a real lock all pins must fit. In the proposed protocol, it was assumed that the higher the number of fitting pins, the higher the affinity for the considered biological target. Therefore, each biological target was converted into a sequence of "weighted" molecular descriptor range values (locks) by using the structural features of the known inhibitors. 
Each biological target lock was tested by performing a molecular descriptors "fitting" on known inhibitors not used in the model construction (keys or test set). The results showed a good predictive capability of the protocol (confidence level 80%). This method gives interesting and convenient results because of the user-defined descriptors and biological targets choice in the process of new inhibitors discovery. Copyright © 2011 Elsevier Masson SAS. All rights reserved.
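The lock-and-key idea described in this abstract can be sketched in a few lines: a "lock" is the per-descriptor range spanned by known inhibitors, and a candidate's score is the weighted fraction of descriptors ("pins") that fall inside those ranges. The descriptor names, the use of simple min/max ranges, and the default equal weights are assumptions for illustration; the abstract does not specify how the ranges or weights are computed.

```python
def build_lock(inhibitor_descriptors):
    """Build a lock: descriptor name -> (min, max) over the known inhibitors.

    inhibitor_descriptors: list of dicts, one per known inhibitor, all
    sharing the same descriptor names.
    """
    names = inhibitor_descriptors[0].keys()
    return {
        name: (
            min(d[name] for d in inhibitor_descriptors),
            max(d[name] for d in inhibitor_descriptors),
        )
        for name in names
    }


def fit_score(lock, compound, weights=None):
    """Weighted fraction of descriptors ("pins") that fall inside the lock."""
    if weights is None:
        weights = {name: 1.0 for name in lock}  # equal weights by default
    total = sum(weights[name] for name in lock)
    fitted = sum(
        weights[name]
        for name, (low, high) in lock.items()
        if low <= compound[name] <= high
    )
    return fitted / total


if __name__ == "__main__":
    # Hypothetical descriptor values for two known inhibitors and one candidate.
    known = [{"logP": 1.0, "MW": 300.0}, {"logP": 3.0, "MW": 450.0}]
    lock = build_lock(known)
    print(fit_score(lock, {"logP": 2.0, "MW": 500.0}))
```

A compound would then be ranked against several target locks, with higher scores read as a higher expected affinity under the protocol's stated assumption.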
Gao, Yu-Fei; Li, Bi-Qing; Cai, Yu-Dong; Feng, Kai-Yan; Li, Zhan-Dong; Jiang, Yang
2013-01-27
Identification of catalytic residues plays a key role in understanding how enzymes work. Although numerous computational methods have been developed to predict catalytic residues and active sites, the prediction accuracy remains relatively low with high false positives. In this work, we developed a novel predictor based on the Random Forest algorithm (RF) aided by the maximum relevance minimum redundancy (mRMR) method and incremental feature selection (IFS). We incorporated features of physicochemical/biochemical properties, sequence conservation, residual disorder, secondary structure and solvent accessibility to predict active sites of enzymes and achieved an overall accuracy of 0.885687 and MCC of 0.689226 on an independent test dataset. Feature analysis showed that every category of the features except disorder contributed to the identification of active sites. It was also shown via the site-specific feature analysis that the features derived from the active site itself contributed most to the active site determination. Our prediction method may become a useful tool for identifying the active sites and the key features identified by the paper may provide valuable insights into the mechanism of catalysis.
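The mRMR ranking mentioned above greedily picks, at each step, the feature with the highest relevance to the target minus its mean redundancy with features already chosen. The sketch below uses mutual information on discrete values and the difference form of the criterion; discretization, the exact criterion variant, and all names are assumptions rather than details from the paper.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    count_x, count_y, count_xy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log((c * n) / (count_x[x] * count_y[y]))
        for (x, y), c in count_xy.items()
    )


def mrmr_rank(features, target):
    """Rank features by greedy max-relevance, min-redundancy selection.

    features: dict mapping feature name -> list of discrete values.
    target:   list of discrete class labels.
    """
    remaining = list(features)
    selected = []
    while remaining:
        def score(name):
            relevance = mutual_information(features[name], target)
            if not selected:
                return relevance
            redundancy = sum(
                mutual_information(features[name], features[s]) for s in selected
            ) / len(selected)
            return relevance - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

The resulting ranked list is what an incremental feature selection loop would consume, adding one feature at a time and keeping the best-scoring prefix.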
Towards enhanced and interpretable clustering/classification in integrative genomics
Lu, Yang Young; Lv, Jinchi; Fuhrman, Jed A.
2017-01-01
High-throughput technologies have led to large collections of different types of biological data that provide unprecedented opportunities to unravel molecular heterogeneity of biological processes. Nevertheless, how to jointly explore data from multiple sources into a holistic, biologically meaningful interpretation remains challenging. In this work, we propose a scalable and tuning-free preprocessing framework, Heterogeneity Rescaling Pursuit (Hetero-RP), which weighs important features more highly than less important ones in accord with implicitly existing auxiliary knowledge. Finally, we demonstrate effectiveness of Hetero-RP in diverse clustering and classification applications. More importantly, Hetero-RP offers an interpretation of feature importance, shedding light on the driving forces of the underlying biology. In metagenomic contig binning, Hetero-RP automatically weighs abundance and composition profiles according to the varying number of samples, resulting in markedly improved performance of contig binning. In RNA-binding protein (RBP) binding site prediction, Hetero-RP not only improves the prediction performance measured by the area under the receiver operating characteristic curves (AUC), but also uncovers the evidence supported by independent studies, including the distribution of the binding sites of IGF2BP and PUM2, the binding competition between hnRNPC and U2AF2, and the intron–exon boundary of U2AF2 [availability: https://github.com/younglululu/Hetero-RP]. PMID:28977511
Informing the Human Plasma Protein Binding of ...
The free fraction of a xenobiotic in plasma (Fub) is an important determinant of chemical absorption, distribution, metabolism, elimination, and toxicity, yet experimental plasma protein binding data are scarce for environmentally relevant chemicals. The presented work explores the merit of utilizing available pharmaceutical data to predict Fub for environmentally relevant chemicals via machine learning techniques. Quantitative structure-activity relationship (QSAR) models were constructed with k nearest neighbors (kNN), support vector machines (SVM), and random forest (RF) machine learning algorithms from a training set of 1045 pharmaceuticals. The models were then evaluated with independent test sets of pharmaceuticals (200 compounds) and environmentally relevant ToxCast chemicals (406 total, in two groups of 238 and 168 compounds). The selection of a minimal feature set of 10-15 2D molecular descriptors allowed for both informative feature interpretation and practical applicability domain assessment via a bounded box of descriptor ranges and principal component analysis. The diverse pharmaceutical and environmental chemical sets exhibit similarities in terms of chemical space (99-82% overlap), as well as comparable bias and variance in constructed learning curves. All the models exhibit significant predictability with mean absolute errors (MAE) in the range of 0.10-0.18 Fub. The models performed best for highly bound chemicals (MAE 0.07-0.12), neutrals (MAE 0
García-Jiménez, Beatriz; Pons, Tirso; Sanchis, Araceli; Valencia, Alfonso
2014-01-01
Biological pathways are important elements of systems biology and in the past decade, an increasing number of pathway databases have been set up to document the growing understanding of complex cellular processes. Although more genome-sequence data are becoming available, a large fraction of it remains functionally uncharacterized. Thus, it is important to be able to predict the mapping of poorly annotated proteins to original pathway models. We have developed a Relational Learning-based Extension (RLE) system to investigate pathway membership through a function prediction approach that mainly relies on combinations of simple properties attributed to each protein. RLE searches for proteins with molecular similarities to specific pathway components. Using RLE, we associated 383 uncharacterized proteins to 28 pre-defined human Reactome pathways, demonstrating relative confidence after proper evaluation. Indeed, in specific cases manual inspection of the database annotations and the related literature supported the proposed classifications. Examples of possible additional components of the Electron transport system, Telomere maintenance and Integrin cell surface interactions pathways are discussed in detail. All the human predicted proteins in the 2009 and 2012 releases 30 and 40 of Reactome are available at http://rle.bioinfo.cnio.es.
Antunes, Deborah; Jorge, Natasha A. N.; Caffarena, Ernesto R.; Passetti, Fabio
2018-01-01
RNA molecules are essential players in many fundamental biological processes. Prokaryotes and eukaryotes have distinct RNA classes with specific structural features and functional roles. Computational prediction of protein structures is a research field in which high confidence three-dimensional protein models can be proposed based on the sequence alignment between target and templates. However, to date, only a few approaches have been developed for the computational prediction of RNA structures. Similar to proteins, RNA structures may be altered due to the interaction with various ligands, including proteins, other RNAs, and metabolites. A riboswitch is a molecular mechanism, found in the three kingdoms of life, in which the RNA structure is modified by the binding of a metabolite. It can regulate multiple gene expression mechanisms, such as transcription, translation initiation, and mRNA splicing and processing. Due to their nature, these entities also act on the regulation of gene expression and detection of small metabolites and have the potential to help in the discovery of new classes of antimicrobial agents. In this review, we describe software and web servers currently available for riboswitch aptamer identification and secondary and tertiary structure prediction, including applications. PMID:29403526
NASA Astrophysics Data System (ADS)
Dell, Zachary E.; Schweizer, Kenneth S.
A unified, microscopic, theoretical understanding of polymer dynamics in concentrated liquids from segmental to macromolecular scales remains an open problem. We have formulated a statistical mechanical theory for this problem that explicitly accounts for intra- and inter-molecular forces at the Kuhn segment level. The theory is self-consistently closed at the level of a matrix of dynamical second moments of a tagged chain. Two distinct regimes of isotropic transient localization are predicted. In semidilute solutions, weak localization is predicted on a mesoscopic length scale between segment and chain scales which is a power law function of the invariant packing length. This is consistent with the breakdown of Rouse dynamics and the emergence of entanglements. The chain structural correlations in the dynamically arrested state are also computed. In dense melts, strong localization is predicted on a scale much smaller than the segment size which is weakly dependent on chain connectivity and signals the onset of glassy dynamics. Predictions of the dynamic plateau shear modulus are consistent with the known features of emergent rubbery and glassy elasticity. Generalizations to treat the effects of chemical crosslinking and physical bond formation in polymer gels are possible.
ERIC Educational Resources Information Center
Kelly, Resa M.; Jones, Loretta L.
2007-01-01
Animations of molecular structure and dynamics are often used to help students understand the abstract ideas of chemistry. This qualitative study investigated how the features of two different styles of molecular-level animation affected students' explanations of how sodium chloride dissolves in water. In small group sessions 18 college-level…
Predicting age groups of Twitter users based on language and metadata features
Morgan-Lopez, Antonio A.; Chew, Robert F.; Ruddle, Paul
2017-01-01
Health organizations are increasingly using social media, such as Twitter, to disseminate health messages to target audiences. Determining the extent to which the target audience (e.g., age groups) was reached is critical to evaluating the impact of social media education campaigns. The main objective of this study was to examine the separate and joint predictive validity of linguistic and metadata features in predicting the age of Twitter users. We created a labeled dataset of Twitter users across different age groups (youth, young adults, adults) by collecting publicly available birthday announcement tweets using the Twitter Search application programming interface. We manually reviewed results and, for each age-labeled handle, collected the 200 most recent publicly available tweets and user handles’ metadata. The labeled data were split into training and test datasets. We created separate models to examine the predictive validity of language features only, metadata features only, language and metadata features, and words/phrases from another age-validated dataset. We estimated accuracy, precision, recall, and F1 metrics for each model. An L1-regularized logistic regression model was fit for each age group, and predicted probabilities between the training and test sets were compared for each age group. Cohen’s d effect sizes were calculated to examine the relative importance of significant features. Models containing both Tweet language features and metadata features performed the best (74% precision, 74% recall, 74% F1) while the model containing only Twitter metadata features was least accurate (58% precision, 60% recall, and 57% F1 score). Top predictive features included use of terms such as “school” for youth and “college” for young adults. Overall, it was more challenging to predict older adults accurately. 
These results suggest that examining linguistic and Twitter metadata features to predict youth and young adult Twitter users may be helpful for informing public health surveillance and evaluation research. PMID:28850620
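The per-class precision, recall, and F1 metrics reported above are computed from true and predicted labels as in the following sketch. The age-group labels in the example are made up for illustration and are not data from the study.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 for one class treated as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical labels for four user handles.
    truth = ["youth", "youth", "adult", "adult"]
    predicted = ["youth", "adult", "adult", "adult"]
    print(precision_recall_f1(truth, predicted, positive="youth"))
```

Averaging these per-class values across the age groups gives a single summary figure per model, which is one common way such headline numbers are produced.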
Predicting age groups of Twitter users based on language and metadata features.
Morgan-Lopez, Antonio A; Kim, Annice E; Chew, Robert F; Ruddle, Paul
2017-01-01
Health organizations are increasingly using social media, such as Twitter, to disseminate health messages to target audiences. Determining the extent to which the target audience (e.g., age groups) was reached is critical to evaluating the impact of social media education campaigns. The main objective of this study was to examine the separate and joint predictive validity of linguistic and metadata features in predicting the age of Twitter users. We created a labeled dataset of Twitter users across different age groups (youth, young adults, adults) by collecting publicly available birthday announcement tweets using the Twitter Search application programming interface. We manually reviewed results and, for each age-labeled handle, collected the 200 most recent publicly available tweets and user handles' metadata. The labeled data were split into training and test datasets. We created separate models to examine the predictive validity of language features only, metadata features only, language and metadata features, and words/phrases from another age-validated dataset. We estimated accuracy, precision, recall, and F1 metrics for each model. An L1-regularized logistic regression model was fit for each age group, and predicted probabilities between the training and test sets were compared for each age group. Cohen's d effect sizes were calculated to examine the relative importance of significant features. Models containing both Tweet language features and metadata features performed the best (74% precision, 74% recall, 74% F1) while the model containing only Twitter metadata features was least accurate (58% precision, 60% recall, and 57% F1 score). Top predictive features included use of terms such as "school" for youth and "college" for young adults. Overall, it was more challenging to predict older adults accurately. 
These results suggest that examining linguistic and Twitter metadata features to predict youth and young adult Twitter users may be helpful for informing public health surveillance and evaluation research.
Molecular Pathogenesis and Diagnostic, Prognostic and Predictive Molecular Markers in Sarcoma.
Mariño-Enríquez, Adrián; Bovée, Judith V M G
2016-09-01
Sarcomas are infrequent mesenchymal neoplasms characterized by notable morphological and molecular heterogeneity. Molecular studies in sarcoma provide refinements to morphologic classification, and contribute diagnostic information (frequently), prognostic stratification (rarely) and predict therapeutic response (occasionally). Herein, we summarize the major molecular mechanisms underlying sarcoma pathogenesis and present clinically useful diagnostic, prognostic and predictive molecular markers for sarcoma. Five major molecular alterations are discussed, illustrated with representative sarcoma types, including 1. the presence of chimeric transcription factors, in vascular tumors; 2. abnormal kinase signaling, in gastrointestinal stromal tumor; 3. epigenetic deregulation, in chondrosarcoma, chondroblastoma, and other tumors; 4. deregulated cell survival and proliferation, due to focal copy number alterations, in dedifferentiated liposarcoma; 5. extreme genomic instability, in conventional osteosarcoma as a representative example of sarcomas with highly complex karyotype. Copyright © 2016 Elsevier Inc. All rights reserved.
Jelonek, Karol; Pietrowska, Monika; Widlak, Piotr
2017-07-01
Blood is the most common replacement tissue used to study systemic responses of organisms to different types of pathological conditions and environmental insults. Local irradiation during cancer radiotherapy induces whole body responses that can be observed at the blood proteome and metabolome levels. Hence, comparative blood proteomics and metabolomics are emerging approaches used in the discovery of radiation biomarkers. These techniques enable the simultaneous measurement of hundreds of molecules and the identification of sets of components that can discriminate different physiological states of the human body. Radiation-induced changes are affected by the dose and volume of irradiated tissues; hence, the molecular composition of blood is a hypothetical source of biomarkers for dose assessment and the prediction and monitoring of systemic responses to radiation. This review aims to provide a comprehensive overview on the available evidence regarding molecular responses to ionizing radiation detected at the level of the human blood proteome and metabolome. It focuses on patients exposed to radiation during cancer radiotherapy and emphasizes effects related to radiation-induced toxicity and inflammation. Systemic responses to radiation detected at the blood proteome and metabolome levels are primarily related to the intensity of radiation-induced toxicity, including inflammatory responses. Thus, several inflammation-associated molecules can be used to monitor or even predict radiation-induced toxicity. However, these abundant molecular features have a rather limited applicability as universal biomarkers for dose assessment, reflecting the individual predisposition of the immune system and tissue-specific mechanisms involved in radiation-induced damage.
Zhang, Jun; Hao, Qing-Qing; Liu, Xin; Jing, Zhi; Jia, Wen-Qing; Wang, Shu-Qing; Xu, Wei-Ren; Cheng, Xian-Chao; Wang, Run-Ling
2017-01-01
Telmisartan, a bifunctional agent that lowers blood pressure and reduces glycemia, was previously reported to antagonize the angiotensin II type 1 (AT1) receptor and partially activate peroxisome proliferator-activated receptor γ (PPARγ) simultaneously. By modifying telmisartan, researchers designed and obtained imidazo-\\pyridine derivatives with IC50 values of 0.49∼94.1 nM against AT1 and EC50 values of 20∼3640 nM for PPARγ partial activation. To examine the interaction modes with the relevant receptors in detail and to analyze the structure-activity relationships, molecular docking and 3D-QSAR (three-dimensional quantitative structure-activity relationship) analysis of these imidazo-\\pyridines on the dual targets were conducted in this work. Docking of these derivatives with both receptors revealed explicit interaction behaviors and an excellent fit with the binding pockets. The best CoMFA (Comparative Molecular Field Analysis) models exhibited predictive results of q2=0.553, r2=0.954, SEE=0.127, r2pred=0.779 for AT1 and q2=0.503, r2=1.00, SEE=0.019, r2pred=0.604 for PPARγ, respectively. The contour maps from the optimal model showed detailed information on the structural features (steric and electrostatic fields) relevant to the biological activity. Combining bioisosterism with the information from the above studies, we designed six molecules with better predicted activities towards AT1 and PPARγ partial activation. Overall, these results could be useful for designing potential dual AT1 antagonists and partial PPARγ agonists. PMID:28445965