Science.gov

Sample records for molecular features predicting

  1. Radiomic analysis reveals DCE-MRI features for prediction of molecular subtypes of breast cancer.

    PubMed

    Fan, Ming; Li, Hui; Wang, Shijian; Zheng, Bin; Zhang, Juan; Li, Lihua

    2017-01-01

    The purpose of this study was to investigate the role of features derived from breast dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) and to incorporate clinical information to predict the molecular subtypes of breast cancer. In particular, 60 breast cancers with the following four molecular subtypes were analyzed: luminal A, luminal B, human epidermal growth factor receptor-2 (HER2)-over-expressing and basal-like. The breast region was segmented and the suspicious tumor was depicted on sequentially scanned MR images from each case. In total, 90 features were obtained, including 88 imaging features related to morphology and texture as well as dynamic features from tumor and background parenchymal enhancement (BPE) and 2 clinical information-based parameters, namely, age and menopausal status. An evolutionary algorithm was used to select an optimal subset of features for classification. Using these features, we trained a multi-class logistic regression classifier and evaluated it by the area under the receiver operating characteristic curve (AUC). The results of a prediction model using 24 selected features showed high overall classification performance, with an AUC value of 0.869. The predictive model discriminated among the luminal A, luminal B, HER2 and basal-like subtypes, with AUC values of 0.867, 0.786, 0.888 and 0.923, respectively. An additional independent dataset with 36 patients was utilized to validate the results. A similar classification analysis of the validation dataset showed an AUC of 0.872 using 15 image features, 10 of which were identical to those from the first cohort. We identified clinical information and 3D imaging features from DCE-MRI as candidate biomarkers for discriminating among four molecular subtypes of breast cancer.
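
The per-subtype AUC values quoted in this abstract are one-vs-rest figures: each subtype is scored against the other three. As a minimal sketch of how such a value can be computed, the Python below uses the rank-sum (Mann-Whitney) identity for AUC. The function name `auc_one_vs_rest` and the toy scores are illustrative assumptions; they are not the authors' code or data.

```python
def auc_one_vs_rest(scores, labels, positive):
    """AUC for one class against all others via the rank-sum identity.

    scores: classifier scores for the `positive` class, one per case.
    labels: true class label of each case.
    """
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Count (positive, negative) pairs ranked correctly; ties count as half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: score assigned to "basal-like" for six hypothetical tumors.
labels = ["luminal A", "basal-like", "luminal B", "basal-like", "HER2", "luminal A"]
scores = [0.10, 0.80, 0.30, 0.60, 0.70, 0.20]
basal_auc = auc_one_vs_rest(scores, labels, "basal-like")
print(basal_auc)  # 7 of 8 pairs are ranked correctly -> 0.875
```

Repeating the call with each subtype as `positive` would give one AUC per subtype, analogous in form to the four values reported above.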

  2. Radiomic analysis reveals DCE-MRI features for prediction of molecular subtypes of breast cancer

    PubMed Central

    Fan, Ming; Li, Hui; Wang, Shijian; Zheng, Bin; Zhang, Juan; Li, Lihua

    2017-01-01

    The purpose of this study was to investigate the role of features derived from breast dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) and to incorporate clinical information to predict the molecular subtypes of breast cancer. In particular, 60 breast cancers with the following four molecular subtypes were analyzed: luminal A, luminal B, human epidermal growth factor receptor-2 (HER2)-over-expressing and basal-like. The breast region was segmented and the suspicious tumor was depicted on sequentially scanned MR images from each case. In total, 90 features were obtained, including 88 imaging features related to morphology and texture as well as dynamic features from tumor and background parenchymal enhancement (BPE) and 2 clinical information-based parameters, namely, age and menopausal status. An evolutionary algorithm was used to select an optimal subset of features for classification. Using these features, we trained a multi-class logistic regression classifier and evaluated it by the area under the receiver operating characteristic curve (AUC). The results of a prediction model using 24 selected features showed high overall classification performance, with an AUC value of 0.869. The predictive model discriminated among the luminal A, luminal B, HER2 and basal-like subtypes, with AUC values of 0.867, 0.786, 0.888 and 0.923, respectively. An additional independent dataset with 36 patients was utilized to validate the results. A similar classification analysis of the validation dataset showed an AUC of 0.872 using 15 image features, 10 of which were identical to those from the first cohort. We identified clinical information and 3D imaging features from DCE-MRI as candidate biomarkers for discriminating among four molecular subtypes of breast cancer. PMID:28166261

  3. Evaluation of tumor-derived MRI-texture features for discrimination of molecular subtypes and prediction of 12-month survival status in glioblastoma

    PubMed Central

    Yang, Dalu; Rao, Ganesh; Martinez, Juan; Veeraraghavan, Ashok; Rao, Arvind

    2015-01-01

    Purpose: Glioblastoma multiforme (GBM) is the most common and aggressive primary brain cancer. Four molecular subtypes of GBM have been described but can only be determined by an invasive brain biopsy. The goal of this study is to evaluate the utility of texture features extracted from magnetic resonance imaging (MRI) scans as a potential noninvasive method to characterize molecular subtypes of GBM and to predict 12-month overall survival status for GBM patients. Methods: The authors manually segmented the tumor regions from postcontrast T1 weighted and T2 fluid-attenuated inversion recovery (FLAIR) MRI scans of 82 patients with de novo GBM. For each patient, the authors extracted five sets of computer-extracted texture features, namely, 48 segmentation-based fractal texture analysis (SFTA) features, 576 histogram of oriented gradients (HOGs) features, 44 run-length matrix (RLM) features, 256 local binary patterns features, and 52 Haralick features, from the tumor slice corresponding to the maximum tumor area in axial, sagittal, and coronal planes, respectively. The authors used an ensemble classifier called random forest on each feature family to predict GBM molecular subtypes and 12-month survival status (a dichotomized version of overall survival at the 12-month time point indicating if the patient was alive or not at 12 months). The performance of the prediction was quantified and compared using receiver operating characteristic (ROC) curves. Results: With the appropriate combination of texture feature set, image plane (axial, coronal, or sagittal), and MRI sequence, the area under ROC curve values for predicting different molecular subtypes and 12-month survival status are 0.72 for classical (with Haralick features on T1 postcontrast axial scan), 0.70 for mesenchymal (with HOG features on T2 FLAIR axial scan), 0.75 for neural (with RLM features on T2 FLAIR axial scan), 0.82 for proneural (with SFTA features on T1 postcontrast coronal scan), and 0.69 for 12

  4. Semen molecular and cellular features: these parameters can reliably predict subsequent ART outcome in a goat model

    PubMed Central

    Berlinguer, Fiammetta; Madeddu, Manuela; Pasciu, Valeria; Succu, Sara; Spezzigu, Antonio; Satta, Valentina; Mereu, Paolo; Leoni, Giovanni G; Naitana, Salvatore

    2009-01-01

    Currently, the assessment of sperm function in a raw or processed semen sample is not able to reliably predict sperm ability to withstand freezing and thawing procedures and in vivo fertility and/or assisted reproductive biotechnologies (ART) outcome. The aim of the present study was to investigate which parameters among a battery of analyses could predict subsequent spermatozoa in vitro fertilization ability and hence blastocyst output in a goat model. Ejaculates were obtained by artificial vagina from 3 adult goats (Capra hircus) aged 2 years (A, B and C). A logistic regression analysis was used to assess the predictive value of viability, computer assisted sperm analyzer (CASA) motility parameters and ATP intracellular concentration before and after thawing, and of DNA integrity after thawing, on subsequent embryo output after an in vitro fertility test. Individual differences in semen parameters were evident for semen viability after thawing and DNA integrity. Results of the IVF test showed that spermatozoa collected from A and B led to higher cleavage rates (p < 0.01) and blastocyst output (p < 0.05) compared with C. The logistic regression model explained a deviance of 72% (p < 0.0001), directly related to the mean percentage of rapid spermatozoa in fresh semen (p < 0.01), semen viability after thawing (p < 0.01), and two of the three comet parameters considered, i.e., tail DNA percentage and comet length (p < 0.0001). DNA integrity alone had a high predictive value on IVF outcome with frozen/thawed semen (deviance explained: 57%). The model proposed here represents one of the many possible ways to explain differences found in embryo output following IVF with different semen donors and may represent a useful tool to select the most suitable donors for semen cryopreservation. PMID:19900288

  5. Predicting aqueous solubility of environmentally relevant compounds from molecular features: a simple but highly effective four-dimensional model based on Project to Latent Structures.

    PubMed

    Xiao, Feng; Gulliver, John S; Simcik, Matt F

    2013-09-15

    The aqueous solubility (log S) of xenobiotic chemicals has been identified as a key characteristic in determining their bioaccessibility/bioavailability and their fate and transport in aquatic environments. We here explore and evaluate the use of a state-of-the-art data analysis technique (Project to Latent Structures, PLS) to estimate log S of environmentally relevant chemicals. A large number (n = 624) of molecular descriptors were computed for over 1400 organic chemicals, and then refined by a feature selection technique. Candidate predictor descriptors were fitted to data by means of PLS, which was optimized by an internal leave-one-out cross-validation technique and validated by an external data set. The final (best) PLS model with only four variables (AlogP, X1sol, Mv, and E) exhibited noteworthy stability and good predictive power. It was able to explain 91% of the data (n = 1400) variance with an average absolute error of 0.5 log units across solubilities spanning more than 12 orders of magnitude. The newly proposed model is transparent, easily portable from one user to another, and robust enough to accurately estimate log S of a wide range of emerging contaminants.

  6. Predicting discovery rates of genomic features.

    PubMed

    Gravel, Simon

    2014-06-01

    Successful sequencing experiments require judicious sample selection. However, this selection must often be performed on the basis of limited preliminary data. Predicting the statistical properties of the final sample based on preliminary data can be challenging, because numerous uncertain model assumptions may be involved. Here, we ask whether we can predict "omics" variation across many samples by sequencing only a fraction of them. In the infinite-genome limit, we find that a pilot study sequencing 5% of a population is sufficient to predict the number of genetic variants in the entire population within 6% of the correct value, using an estimator agnostic to demography, selection, or population structure. To reach similar accuracy in a finite genome with millions of polymorphisms, the pilot study would require ∼15% of the population. We present computationally efficient jackknife and linear programming methods that exhibit substantially less bias than the state of the art when applied to simulated data and subsampled 1000 Genomes Project data. Extrapolating based on the National Heart, Lung, and Blood Institute Exome Sequencing Project data, we predict that 7.2% of sites in the capture region would be variable in a sample of 50,000 African Americans and 8.8% in a European sample of equal size. Finally, we show how the linear programming method can also predict discovery rates of various genomic features, such as the number of transcription factor binding sites across different cell types.
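
The abstract does not spell out its jackknife or linear programming estimators, so they cannot be reproduced here. As a loosely related illustration only, the sketch below implements the classic first-order jackknife for the total number of distinct features (the richness estimator of Burnham and Overton), which extrapolates from features seen in exactly one pilot sample. The function name `jackknife1_richness` and the toy variant IDs are assumptions made for this example.

```python
def jackknife1_richness(incidence):
    """First-order jackknife estimate of the total number of distinct features.

    incidence: list of sets, one per pilot sample, each holding the IDs of the
    features (for example variant sites) observed in that sample.
    Estimate = S_obs + Q1 * (n - 1) / n, where Q1 counts features seen in
    exactly one of the n samples.
    """
    n = len(incidence)
    if n == 0:
        raise ValueError("need at least one sample")
    counts = {}
    for sample in incidence:
        for feature in sample:
            counts[feature] = counts.get(feature, 0) + 1
    observed = len(counts)
    singletons = sum(1 for c in counts.values() if c == 1)
    return observed + singletons * (n - 1) / n


# Made-up pilot study: four samples, four distinct variants observed.
pilot = [{"v1", "v2"}, {"v1", "v3"}, {"v1", "v2", "v4"}, {"v1"}]
estimate = jackknife1_richness(pilot)
print(estimate)  # 4 observed + 2 singletons * 3/4 = 5.5
```

The estimate exceeds the observed count whenever some features are seen only once, which is the intuition behind predicting discovery rates from a pilot sample.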

  7. Clinical features and molecular bases of neuroacanthocytosis.

    PubMed

    Rampoldi, Luca; Danek, Adrian; Monaco, Anthony P

    2002-08-01

    The term acanthocytosis is derived from the Greek for "thorn" and is used to describe a peculiar spiky appearance of erythrocytes. Acanthocytosis is found to be associated with at least three hereditary neurological disorders that are generally referred to as neuroacanthocytosis. Abetalipoproteinaemia is an autosomal recessive condition, characterised by absence of serum apolipoprotein B containing lipoproteins leading to fat intolerance and fat-soluble vitamin deficiency. This results in a progressive spinocerebellar ataxia with peripheral neuropathy and retinitis pigmentosa. Chorea-acanthocytosis is also an autosomal recessive condition and is characterised by chorea, orofaciolingual dyskinesia, dysphagia, dysarthria, areflexia, seizures and dementia. Some of its features, including choreic movements, peripheral neuropathy with areflexia, elevated serum creatine kinase levels and myopathy are shared by another form of neuroacanthocytosis, McLeod syndrome. Patients affected by this X-linked disorder also show abnormal expression of Kell blood group antigens and a permanent haemolytic state. In addition to these cases, acanthocytosis is occasionally associated with other neurological disorders, such as Hallervorden-Spatz disease. For each of the neuroacanthocytosis syndromes we review the main clinical features and their molecular bases. The recent molecular genetics findings are the first step towards the understanding of the pathogenetic mechanisms and eventually the search for effective treatments.

  8. Molecular absorption features in translucent clouds

    NASA Astrophysics Data System (ADS)

    Krelowski, Jacek

    2007-12-01

    Interstellar clouds, composed of neutral hydrogen, constitute about 90% of the total mass of the interstellar medium. Their absorption spectra contain continuous extinction, atomic lines, molecular features and the unidentified diffuse interstellar bands (DIBs). The latter are also believed to be carried by some rather complex molecules. The vast majority of DIBs are characterized by small central depths, which is why they became observable only once solid-state detectors were widely applied in astrophysics. It should be emphasized that interstellar absorptions seen along the same line of sight may in fact originate in several different environments (clouds). An extensive database of echelle spectra made it possible to show that the CaII column density clearly correlates with the parallaxes of OB-3 stars, in contrast to other interstellar species. Thus CaII is quite evenly distributed in the interstellar medium while other species (NaI, KI, CaI, CH, CN, DIB carriers) are not. This fact is of basic importance, as the observed spectra cannot be physically interpreted if they mix features originating in different clouds, i.e. in different environments. The abundance ratios of interstellar molecules (identified and DIB carriers) differ from cloud to cloud due to the different physical processes which govern their formation. High-resolution, high-S/N spectra show that the profiles of diffuse bands also vary from cloud to cloud. This fact strongly supports a molecular origin of these still unidentified features and motivates investigation of their relations to other molecules, which can reveal physical conditions that facilitate formation of the DIB carriers and lead to their identification.

  9. Feature Selection for Neural Network Based Stock Prediction

    NASA Astrophysics Data System (ADS)

    Sugunnasil, Prompong; Somhom, Samerkae

    We propose a new methodology of feature selection for stock movement prediction. The methodology is based upon finding those features which minimize the correlation relation function. We first produce all combinations of features and evaluate each of them using our evaluation function. We search through the generated set with a hill-climbing approach. A self-organizing map based stock prediction model is utilized as the prediction method. We conduct the experiment on data sets of the Microsoft Corporation, General Electric Co. and Ford Motor Co. The results show that our feature selection method can improve the efficiency of neural network based stock prediction.
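
The abstract defines neither its "correlation relation function" nor its evaluation function, so the following is only a sketch of the general idea: hill climbing over feature subsets with a stand-in score. Here the score `redundancy` is assumed to be the mean absolute pairwise Pearson correlation within the subset, and neighbouring subsets are assumed to differ by swapping one feature; both choices are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def redundancy(columns, subset):
    """Stand-in score: mean |correlation| over feature pairs (subset size >= 2)."""
    pairs = list(combinations(sorted(subset), 2))
    return sum(abs(pearson(columns[i], columns[j])) for i, j in pairs) / len(pairs)


def hill_climb(columns, start):
    """Greedy descent: move to the best swap neighbour until none improves."""
    current = frozenset(start)
    best = redundancy(columns, current)
    while True:
        # Neighbours: replace one selected feature with one unselected feature.
        neighbours = [(current - {i}) | {j}
                      for i in current
                      for j in range(len(columns)) if j not in current]
        if not neighbours:
            return sorted(current), best
        score, subset = min((redundancy(columns, s), sorted(s)) for s in neighbours)
        if score >= best:  # no strict improvement: local optimum reached
            return sorted(current), best
        current, best = frozenset(subset), score


# Made-up data: features 0 and 1 are perfectly correlated, feature 2 is not.
columns = [[1, 2, 3, 4, 5], [2, 4, 6, 8, 10], [2, -1, -2, -1, 2]]
chosen, score = hill_climb(columns, start={0, 1})
print(chosen, score)
```

Starting from the redundant pair, the search swaps one of the correlated features for the uncorrelated one. Like any hill climber, it can stop at a local optimum, so the result depends on the starting subset.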

  10. Prediction of DNA-binding proteins from relational features

    PubMed Central

    2012-01-01

    Background: The process of protein-DNA binding has an essential role in the biological processing of genetic information. We use relational machine learning to predict DNA-binding propensity of proteins from their structures. Automatically discovered structural features are able to capture some characteristic spatial configurations of amino acids in proteins. Results: Prediction based only on structural relational features already achieves results competitive with existing methods based on physicochemical properties on several protein datasets. Predictive performance is further improved when structural features are combined with physicochemical features. Moreover, the structural features provide some insights not revealed by physicochemical features. Our method is able to detect common spatial substructures. We demonstrate this in experiments with zinc finger proteins. Conclusions: We introduced a novel approach for DNA-binding propensity prediction using relational machine learning which could potentially be used also for protein function prediction in general. PMID:23146001

  11. Learning through Feature Prediction: An Initial Investigation into Teaching Categories to Children with Autism through Predicting Missing Features

    ERIC Educational Resources Information Center

    Sweller, Naomi

    2015-01-01

    Individuals with autism have difficulty generalising information from one situation to another, a process that requires the learning of categories and concepts. Category information may be learned through: (1) classifying items into categories, or (2) predicting missing features of category items. Predicting missing features has to this point been…

  12. Protein molecular function prediction by Bayesian phylogenomics.

    PubMed

    Engelhardt, Barbara E; Jordan, Michael I; Muratore, Kathryn E; Brenner, Steven E

    2005-10-01

    We present a statistical graphical model to infer specific molecular function for unannotated protein sequences using homology. Based on phylogenomic principles, SIFTER (Statistical Inference of Function Through Evolutionary Relationships) accurately predicts molecular function for members of a protein family given a reconciled phylogeny and available function annotations, even when the data are sparse or noisy. Our method produced specific and consistent molecular function predictions across 100 Pfam families in comparison to the Gene Ontology annotation database, BLAST, GOtcha, and Orthostrapper. We performed a more detailed exploration of functional predictions on the adenosine-5'-monophosphate/adenosine deaminase family and the lactate/malate dehydrogenase family, in the former case comparing the predictions against a gold standard set of published functional characterizations. Given function annotations for 3% of the proteins in the deaminase family, SIFTER achieves 96% accuracy in predicting molecular function for experimentally characterized proteins as reported in the literature. The accuracy of SIFTER on this dataset is a significant improvement over other currently available methods such as BLAST (75%), GeneQuiz (64%), GOtcha (89%), and Orthostrapper (11%). We also experimentally characterized the adenosine deaminase from Plasmodium falciparum, confirming SIFTER's prediction. The results illustrate the predictive power of exploiting a statistical model of function evolution in phylogenomic problems. A software implementation of SIFTER is available from the authors.

  13. Stabilizing l1-norm prediction models by supervised feature grouping.

    PubMed

    Kamkar, Iman; Gupta, Sunil Kumar; Phung, Dinh; Venkatesh, Svetha

    2016-02-01

    Emerging Electronic Medical Records (EMRs) have reformed modern healthcare. These records have great potential to be used for building clinical prediction models. However, a problem in using them is their high dimensionality. Since a lot of information may not be relevant for prediction, the underlying complexity of the prediction models may not be high. A popular way to deal with this problem is to employ feature selection. Lasso and l1-norm based feature selection methods have shown promising results. However, in the presence of correlated features, these methods select features that change considerably with small changes in data. This prevents clinicians from obtaining a stable feature set, which is crucial for clinical decision making. Grouping correlated variables together can improve the stability of feature selection; however, such grouping is usually not known and needs to be estimated for optimal performance. Addressing this problem, we propose a new model that can simultaneously learn the grouping of correlated features and perform stable feature selection. We formulate the model as a constrained optimization problem and provide an efficient solution with guaranteed convergence. Our experiments with both synthetic and real-world datasets show that the proposed model is significantly more stable than Lasso and many existing state-of-the-art shrinkage and classification methods. We further show that in terms of prediction performance, the proposed method consistently outperforms Lasso and other baselines. Our model can be used for selecting stable risk factors for a variety of healthcare problems, so it can assist clinicians toward accurate decision making.

  14. Deep Feature Transfer Learning in Combination with Traditional Features Predicts Survival Among Patients with Lung Adenocarcinoma

    PubMed Central

    Paul, Rahul; Hawkins, Samuel H.; Balagurunathan, Yoganand; Schabath, Matthew B.; Gillies, Robert J.; Hall, Lawrence O.; Goldgof, Dmitry B.

    2016-01-01

    Lung cancer is the most common cause of cancer-related deaths in the USA. It can be detected and diagnosed using computed tomography images. For an automated classifier, identifying predictive features from medical images is a key concern. Deep feature extraction using pretrained convolutional neural networks (CNNs) has recently been successfully applied in some image domains. Here, we applied a pretrained CNN to extract deep features from 40 computed tomography images, with contrast, of non-small cell adenocarcinoma lung cancer, and combined deep features with traditional image features and trained classifiers to predict short- and long-term survivors. We experimented with several pretrained CNNs and several feature selection strategies. The best previously reported accuracy when using traditional quantitative features was 77.5% (area under the curve [AUC], 0.712), which was achieved by a decision tree classifier. The best reported accuracy from transfer learning and deep features was 77.5% (AUC, 0.713) using a decision tree classifier. When extracted deep neural network features were combined with traditional quantitative features, we obtained an accuracy of 90% (AUC, 0.935) with the 5 best post-rectified linear unit features extracted from a vgg-f pretrained CNN and the 5 best traditional features. The best results were achieved with the symmetric uncertainty feature ranking algorithm followed by a random forests classifier. PMID:28066809

  15. Deep Feature Transfer Learning in Combination with Traditional Features Predicts Survival Among Patients with Lung Adenocarcinoma.

    PubMed

    Paul, Rahul; Hawkins, Samuel H; Balagurunathan, Yoganand; Schabath, Matthew B; Gillies, Robert J; Hall, Lawrence O; Goldgof, Dmitry B

    2016-12-01

    Lung cancer is the most common cause of cancer-related deaths in the USA. It can be detected and diagnosed using computed tomography images. For an automated classifier, identifying predictive features from medical images is a key concern. Deep feature extraction using pretrained convolutional neural networks (CNNs) has recently been successfully applied in some image domains. Here, we applied a pretrained CNN to extract deep features from 40 computed tomography images, with contrast, of non-small cell adenocarcinoma lung cancer, and combined deep features with traditional image features and trained classifiers to predict short- and long-term survivors. We experimented with several pretrained CNNs and several feature selection strategies. The best previously reported accuracy when using traditional quantitative features was 77.5% (area under the curve [AUC], 0.712), which was achieved by a decision tree classifier. The best reported accuracy from transfer learning and deep features was 77.5% (AUC, 0.713) using a decision tree classifier. When extracted deep neural network features were combined with traditional quantitative features, we obtained an accuracy of 90% (AUC, 0.935) with the 5 best post-rectified linear unit features extracted from a vgg-f pretrained CNN and the 5 best traditional features. The best results were achieved with the symmetric uncertainty feature ranking algorithm followed by a random forests classifier.

  16. Generalized perceptual linear prediction features for animal vocalization analysis.

    PubMed

    Clemins, Patrick J; Johnson, Michael T

    2006-07-01

    A new feature extraction model, generalized perceptual linear prediction (gPLP), is developed to calculate a set of perceptually relevant features for digital signal analysis of animal vocalizations. The gPLP model is a generalized adaptation of the perceptual linear prediction model, popular in human speech processing, which incorporates perceptual information such as frequency warping and equal loudness normalization into the feature extraction process. Since such perceptual information is available for a number of animal species, this new approach integrates that information into a generalized model to extract perceptually relevant features for a particular species. To illustrate, qualitative and quantitative comparisons are made between the species-specific model, generalized perceptual linear prediction (gPLP), and the original PLP model using a set of vocalizations collected from captive African elephants (Loxodonta africana) and wild beluga whales (Delphinapterus leucas). The models that incorporate perceptual information outperform the original human-based models in both visualization and classification tasks.

  17. Feature Fusion Based SVM Classifier for Protein Subcellular Localization Prediction.

    PubMed

    Rahman, Julia; Mondal, Md Nazrul Islam; Islam, Md Khaled Ben; Hasan, Md Al Mehedi

    2016-12-18

    Because of the importance of protein subcellular localization in different branches of life science and drug discovery, researchers have focused their attention on protein subcellular localization prediction. Effective representation of features from protein sequences plays a vital role in protein subcellular localization prediction, especially for machine learning techniques. Single feature representations such as pseudo amino acid composition (PseAAC), physiochemical property models (PPM), and amino acid index distribution (AAID) contain insufficient information from protein sequences. To deal with this problem, we propose two feature fusion representations, AAIDPAAC and PPMPAAC, to work with Support Vector Machine classifiers, which fuse PseAAC with AAID and PPM, respectively. We evaluated the performance of both single and fused feature representations on a Gram-negative bacterial dataset. We obtained at least 3% higher actual accuracy with AAIDPAAC and 2% higher locative accuracy with PPMPAAC than with single feature representations.

  18. Actigraphy features for predicting mobility disability in older adults.

    PubMed

    Kheirkhahan, Matin; Tudor-Locke, Catrine; Axtell, Robert; Buman, Matthew P; Fielding, Roger A; Glynn, Nancy W; Guralnik, Jack M; King, Abby C; White, Daniel K; Miller, Michael E; Siddique, Juned; Brubaker, Peter; Rejeski, W Jack; Ranshous, Stephen; Pahor, Marco; Ranka, Sanjay; Manini, Todd M

    2016-09-21

    Actigraphy has attracted much attention for assessing physical activity in the past decade. Many algorithms have been developed to automate the analysis process, but none has targeted a general model to discover related features for detecting or predicting mobility function, or more specifically, mobility impairment and major mobility disability (MMD). Men (N = 357) and women (N = 778) aged 70-89 years wore a tri-axial accelerometer (Actigraph GT3X) on the right hip during free-living conditions for 8.4 ± 3.0 days. One-second epoch data were summarized into 67 features. Several machine learning techniques were used to select features from the free-living condition to predict mobility impairment, defined as 400 m walking speed <0.80 m/s. Selected features were also included in a model to predict the first occurrence of MMD, defined as the loss of the ability to walk 400 m. Each method yielded a similar estimate of 400 m walking speed, with a root mean square error of ~0.07 m/s and R-squared values ranging from 0.37 to 0.41. Sensitivity and specificity of identifying slow walkers were approximately 70% and 80%, respectively, for all methods. The top five features, which were related to movement pace and amount (activity counts and steps), length of activity engagement (bout length), accumulation patterns of activity, and movement variability, significantly improved the prediction of MMD beyond that found with common covariates (age, diseases, anthropometry, etc). This study identified a subset of actigraphy features collected in free-living conditions that are moderately accurate in identifying persons with clinically assessed mobility impairment and significantly improve the prediction of MMD. These findings suggest that the combination of features, as opposed to a specific feature, is important to consider when choosing features and/or combinations of features for prediction of mobility phenotypes in older adults.
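
To make the sensitivity and specificity figures in this abstract concrete, the sketch below scores predicted 400 m walking speeds against measured ones, using the 0.80 m/s impairment cut-off quoted above. The function name `sensitivity_specificity` and the speed values are made up for illustration; they are not study data.

```python
SLOW_THRESHOLD = 0.80  # m/s, the mobility-impairment cut-off quoted in the abstract


def sensitivity_specificity(true_speeds, predicted_speeds, threshold=SLOW_THRESHOLD):
    """Sensitivity and specificity for flagging slow walkers (speed < threshold).

    Assumes the data contain at least one slow and one non-slow walker;
    otherwise one of the ratios is undefined and a ZeroDivisionError is raised.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(true_speeds, predicted_speeds):
        is_slow, called_slow = truth < threshold, pred < threshold
        if is_slow and called_slow:
            tp += 1
        elif is_slow:
            fn += 1
        elif called_slow:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical measured and model-predicted walking speeds (m/s).
true_speeds = [0.70, 0.75, 0.78, 0.90, 1.00, 1.10, 0.85]
predicted = [0.72, 0.82, 0.76, 0.95, 0.79, 1.05, 0.88]
sens, spec = sensitivity_specificity(true_speeds, predicted)
print(sens, spec)  # 2 of 3 slow walkers caught, 3 of 4 others cleared
```

Because a regression output is thresholded, predictions that fall close to 0.80 m/s are the ones most likely to be misclassified, which is consistent with a moderate R-squared yielding only moderate sensitivity.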

  19. SCRATCH: a protein structure and structural feature prediction server

    PubMed Central

    Cheng, J.; Randall, A. Z.; Sweredoski, M. J.; Baldi, P.

    2005-01-01

    SCRATCH is a server for predicting protein tertiary structure and structural features. The SCRATCH software suite includes predictors for secondary structure, relative solvent accessibility, disordered regions, domains, disulfide bridges, single mutation stability, residue contacts versus average, individual residue contacts and tertiary structure. The user simply provides an amino acid sequence and selects the desired predictions, then submits to the server. Results are emailed to the user. The server is available at . PMID:15980571

  20. Protein Function Prediction using Text-based Features extracted from the Biomedical Literature: The CAFA Challenge

    PubMed Central

    2013-01-01

    Background: Advances in sequencing technology over the past decade have resulted in an abundance of sequenced proteins whose function is yet unknown. As such, computational systems that can automatically predict and annotate protein function are in demand. Most computational systems use features derived from protein sequence or protein structure to predict function. In an earlier work, we demonstrated the utility of biomedical literature as a source of text features for predicting protein subcellular location. We have also shown that the combination of text-based and sequence-based prediction improves the performance of location predictors. Following up on this work, for the Critical Assessment of Function Annotations (CAFA) Challenge, we developed a text-based system that aims to predict molecular function and biological process (using Gene Ontology terms) for unannotated proteins. In this paper, we present the preliminary work and evaluation that we performed for our system, as part of the CAFA challenge. Results: We have developed a preliminary system that represents proteins using text-based features and predicts protein function using a k-nearest neighbour classifier (Text-KNN). We selected text features for our classifier by extracting key terms from biomedical abstracts based on their statistical properties. The system was trained and tested using 5-fold cross-validation over a dataset of 36,536 proteins. System performance was measured using the standard measures of precision, recall, F-measure and overall accuracy. The performance of our system was compared to two baseline classifiers: one that assigns function based solely on the prior distribution of protein function (Base-Prior) and one that assigns function based on sequence similarity (Base-Seq). The overall prediction accuracy of Text-KNN, Base-Prior, and Base-Seq for molecular function classes are 62%, 43%, and 58% while the overall accuracy for biological process classes are 17%, 11%, and 28

  1. How to Predict Molecular Interactions between Species?

    PubMed Central

    Schulze, Sylvie; Schleicher, Jana; Guthke, Reinhard; Linde, Jörg

    2016-01-01

    Organisms constantly interact with other species through physical contact, which leads to changes on the molecular level, for example in the transcriptome. These changes can be monitored for all genes with the help of high-throughput experiments such as RNA-seq or microarrays. The adaptation of gene expression to environmental changes within cells is mediated through complex gene regulatory networks. Often, our knowledge of these networks is incomplete. Network inference predicts gene regulatory interactions based on transcriptome data. An emerging application of high-throughput transcriptome studies is the dual transcriptomics experiment. Here, the transcriptome of two or more interacting species is measured simultaneously. Based on a dual RNA-seq data set of murine dendritic cells infected with the fungal pathogen Candida albicans, the software tool NetGenerator was applied to predict an inter-species gene regulatory network. To promote further investigations of molecular inter-species interactions, we recently discussed dual RNA-seq experiments for host-pathogen interactions and extended the applied tool NetGenerator (Schulze et al., 2015). The updated version of NetGenerator makes use of measurement variances in the algorithmic procedure and accepts gene expression time series data with missing values. Additionally, we tested multiple modeling scenarios regarding the stimuli functions of the gene regulatory network. Here, we summarize the work by Schulze et al. (2015) and put it into a broader context. We review various studies making use of the dual transcriptomics approach to investigate the molecular basis of interacting species. Besides the application to host-pathogen interactions, dual transcriptomics data are also utilized to study mutualistic and commensalistic interactions. Furthermore, we give a short introduction to additional approaches for the prediction of gene regulatory networks and discuss their application to dual transcriptomics data. We

  2. Universality and predictability in molecular quantitative genetics.

    PubMed

    Nourmohammad, Armita; Held, Torsten; Lässig, Michael

    2013-12-01

    Molecular traits, such as gene expression levels or protein binding affinities, are increasingly accessible to quantitative measurement by modern high-throughput techniques. Such traits measure molecular functions and, from an evolutionary point of view, are important as targets of natural selection. We review recent developments in evolutionary theory and experiments that are expected to become building blocks of a quantitative genetics of molecular traits. We focus on universal evolutionary characteristics: these are largely independent of a trait's genetic basis, which is often at least partially unknown. We show that universal measurements can be used to infer selection on a quantitative trait, which determines its evolutionary mode of conservation or adaptation. Furthermore, universality is closely linked to predictability of trait evolution across lineages. We argue that universal trait statistics extends over a range of cellular scales and opens new avenues of quantitative evolutionary systems biology.

  3. Gene essentiality prediction based on fractal features and machine learning.

    PubMed

    Yu, Yongming; Yang, Licai; Liu, Zhiping; Zhu, Chuansheng

    2017-02-28

    Essential genes are required for the viability of an organism. Accurate and rapid identification of new essential genes is of substantial theoretical interest to synthetic biology and has practical applications in biomedicine. Fractals provide facilitated access to genetic structure analysis on a different scale. In this study, machine learning-based methods using solely fractal features are presented and the problem of predicting essential genes in bacterial genomes is evaluated. Six fractal features were investigated to learn the parameters of five supervised classification methods for the binary classification task. The optimal parameters of these classifiers are determined via a grid-based search technique. All the currently available identified genes from the database of essential genes were utilized to build the classifiers. The fractal features were proven to be more robust and powerful in the prediction performance. In a statistical sense, the ELM method shows superiority in predicting the essential genes. Non-parametric tests of the average AUC and ACC showed that the fractal feature set is much better than the other five compared feature sets. Our approach is promising and convenient to identify new bacterial essential genes.
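
    The two summary metrics named in this abstract, AUC and ACC, can be illustrated with a minimal sketch. It assumes binary labels (1 = essential, 0 = non-essential) and real-valued classifier scores; the function names and the toy data are hypothetical and are not taken from the study.

```python
def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """ACC: share of examples whose thresholded score equals the label."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    return sum(int(p == y) for p, y in zip(predicted, labels)) / len(labels)


# Hypothetical toy data: 3 essential and 3 non-essential genes.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

auc = auc_from_scores(labels, scores)
acc = accuracy(labels, scores)
```

    AUC is independent of any decision threshold, whereas ACC depends on the chosen cut-off, which is one reason the two are often reported together.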

  4. Interpretable Topic Features for Post-ICU Mortality Prediction

    PubMed Central

    Luo, Yen-Fu; Rumshisky, Anna

    2016-01-01

    Electronic health records provide valuable resources for understanding the correlation between various diseases and mortality. The analysis of post-discharge mortality is critical for healthcare professionals to follow up potential causes of death after a patient is discharged from the hospital and give prompt treatment. Moreover, it may reduce the cost derived from readmissions and improve the quality of healthcare. Our work focused on post-discharge ICU mortality prediction. In addition to features derived from physiological measurements, we incorporated the ICD-9-CM hierarchy into Bayesian topic model learning and extracted topic features from medical notes. We achieved highest AUCs of 0.835 and 0.829 for 30-day and 6-month post-discharge mortality prediction using baseline and topic proportions derived from Labeled-LDA. Moreover, our work emphasized the interpretability of topic features derived from the topic model, which may facilitate the understanding and investigation of the complexity between mortality and diseases. PMID:28269879

  5. Clinical and molecular features of young-onset colorectal cancer

    PubMed Central

    Ballester, Veroushka; Rashtak, Shahrooz; Boardman, Lisa

    2016-01-01

    Colorectal cancer (CRC) is one of the leading causes of cancer-related mortality worldwide. Although young-onset CRC raises the possibility of a hereditary component, hereditary CRC syndromes only explain a minority of young-onset CRC cases. There is evidence to suggest that young-onset CRC has a different molecular profile than late-onset CRC. While the pathogenesis of young-onset CRC is well characterized in individuals with an inherited CRC syndrome, knowledge regarding the molecular features of sporadic young-onset CRC is limited. Understanding the molecular mechanisms of young-onset CRC can help us tailor specific screening and management strategies. While the incidence of late-onset CRC has been decreasing, mainly attributed to an increase in CRC screening, the incidence of young-onset CRC is increasing. Differences in the molecular biology of these tumors and low suspicion of CRC in young symptomatic individuals may be possible explanations. Currently there is no evidence that screening of average-risk individuals less than 50 years of age will translate into early detection or increased survival. However, increasing understanding of the underlying molecular mechanisms of young-onset CRC could help us tailor specific screening and management strategies. The purpose of this review is to evaluate the current knowledge about young-onset CRC, its clinicopathologic features, and the newly recognized molecular alterations involved in tumor progression. PMID:26855533

  6. Predicting polymer nanofiber interactions via molecular simulations.

    PubMed

    Buell, Sezen; Rutledge, Gregory C; Vliet, Krystyn J Van

    2010-04-01

    Physical and functional properties of nonwoven textiles and other fiberlike materials depend strongly on the number and type of fiber-fiber interactions. For nanoscale polymeric fibers in particular, these interactions are governed by the surfaces of and contacts between fibers. We employ both molecular dynamics (MD) simulations at a temperature below the glass transition temperature T(g) of the polymer bulk, and molecular statics (MS), or energy minimization, to study the interfiber interactions between prototypical polymeric fibers of 4.6 nm diameter, comprising multiple macromolecular chains each of 100 carbon atoms per chain (C100). Our MD simulations show that fibers aligned parallel and within 9 nm of one another experience a significant force of attraction. These fibers tend toward coalescence on a very short time scale, even below T(g). In contrast, our MS calculations suggest an interfiber interaction that transitions from an attractive to a repulsive force at a separation distance of 6 nm. The results of either approach can be used to obtain a quantitative, closed-form relation describing fiber-fiber interaction energies U(s). However, the predicted form of interaction is quite different for the two approaches, and can be understood in terms of differences in the extent of molecular mobility within and between fibers for these different modeling perspectives. The results of these molecular-scale calculations of U(s) are used to interpret experimental observations for electrospun polymer nanofiber mats. These findings highlight the role of temperature and kinetically accessible molecular configurations in predicting interface-dominated interactions at polymer fiber surfaces, and prompt further experiments and simulations to confirm these effects in the properties of nonwoven mats comprising such nanoscale fibers.

  7. Prediction of subjective ratings of emotional pictures by EEG features

    NASA Astrophysics Data System (ADS)

    McFarland, Dennis J.; Parvaz, Muhammad A.; Sarnacki, William A.; Goldstein, Rita Z.; Wolpaw, Jonathan R.

    2017-02-01

    Objective. Emotion dysregulation is an important aspect of many psychiatric disorders. Brain-computer interface (BCI) technology could be a powerful new approach to facilitating therapeutic self-regulation of emotions. One possible BCI method would be to provide stimulus-specific feedback based on subject-specific electroencephalographic (EEG) responses to emotion-eliciting stimuli. Approach. To assess the feasibility of this approach, we studied the relationships between emotional valence/arousal and three EEG features: amplitude of alpha activity over frontal cortex; amplitude of theta activity over frontal midline cortex; and the late positive potential over central and posterior mid-line areas. For each feature, we evaluated its ability to predict emotional valence/arousal on both an individual and a group basis. Twenty healthy participants (9 men, 11 women; ages 22-68) rated each of 192 pictures from the IAPS collection in terms of valence and arousal twice (96 pictures on each of 4 d over 2 weeks). EEG was collected simultaneously and used to develop models based on canonical correlation to predict subject-specific single-trial ratings. Separate models were evaluated for the three EEG features: frontal alpha activity; frontal midline theta; and the late positive potential. In each case, these features were used to simultaneously predict both the normed ratings and the subject-specific ratings. Main results. Models using each of the three EEG features with data from individual subjects were generally successful at predicting subjective ratings on training data, but generalization to test data was less successful. Sparse models performed better than models without regularization. Significance. The results suggest that the frontal midline theta is a better candidate than frontal alpha activity or the late positive potential for use in a BCI-based paradigm designed to modify emotional reactions.

  8. Exploiting Information Diffusion Feature for Link Prediction in Sina Weibo

    NASA Astrophysics Data System (ADS)

    Li, Dong; Zhang, Yongchao; Xu, Zhiming; Chu, Dianhui; Li, Sheng

    2016-01-01

    The rapid development of online social networks (e.g., Twitter and Facebook) has promoted research related to social networks in which link prediction is a key problem. Although numerous attempts have been made for link prediction based on network structure, node attribute and so on, few of the current studies have considered the impact of information diffusion on link creation and prediction. This paper mainly addresses Sina Weibo, which is the largest microblog platform with Chinese characteristics, and proposes the hypothesis that information diffusion influences link creation and verifies the hypothesis based on real data analysis. We also detect an important feature from the information diffusion process, which is used to promote link prediction performance. Finally, the experimental results on Sina Weibo dataset have demonstrated the effectiveness of our methods.

  9. Common features of microRNA target prediction tools.

    PubMed

    Peterson, Sarah M; Thompson, Jeffrey A; Ufkin, Melanie L; Sathyanarayana, Pradeep; Liaw, Lucy; Congdon, Clare Bates

    2014-01-01

    The human genome encodes over 1800 microRNAs (miRNAs), which are short non-coding RNA molecules that function to regulate gene expression post-transcriptionally. Due to the potential for one miRNA to target multiple gene transcripts, miRNAs are recognized as a major mechanism to regulate gene expression and mRNA translation. Computational prediction of miRNA targets is a critical initial step in identifying miRNA:mRNA target interactions for experimental validation. The available tools for miRNA target prediction encompass a range of different computational approaches, from the modeling of physical interactions to the incorporation of machine learning. This review provides an overview of the major computational approaches to miRNA target prediction. Our discussion highlights three tools for their ease of use, reliance on relatively updated versions of miRBase, and range of capabilities, and these are DIANA-microT-CDS, miRanda-mirSVR, and TargetScan. In comparison across all miRNA target prediction tools, four main aspects of the miRNA:mRNA target interaction emerge as common features on which most target prediction is based: seed match, conservation, free energy, and site accessibility. This review explains these features and identifies how they are incorporated into currently available target prediction tools. MiRNA target prediction is a dynamic field with increasing attention on development of new analysis tools. This review attempts to provide a comprehensive assessment of these tools in a manner that is accessible across disciplines. Understanding the basis of these prediction methodologies will aid in user selection of the appropriate tools and interpretation of the tool output.
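
    Of the four common features listed above, the seed match is the simplest to illustrate. The sketch below assumes one common convention, a perfect Watson-Crick match between miRNA nucleotides 2-8 and the target transcript; it is not the scoring scheme of any of the tools named in the abstract. The function names, the miRNA sequence, and the UTR fragment are illustrative assumptions.

```python
# Watson-Crick pairing for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna, start=2, end=8):
    """Return the mRNA motif (written 5'->3') that pairs with miRNA
    nucleotides start..end (1-based, inclusive)."""
    seed = mirna[start - 1:end]
    # The two strands pair antiparallel, so reverse before complementing.
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def find_seed_matches(mirna, utr):
    """Return 0-based start positions of every perfect seed site in utr."""
    site = seed_site(mirna)
    hits = []
    pos = utr.find(site)
    while pos != -1:
        hits.append(pos)
        pos = utr.find(site, pos + 1)
    return hits


# Illustrative miRNA sequence and hypothetical 3'UTR fragment.
mirna = "UGAGGUAGUAGGUUGUAUAGUU"
utr = "AAACUACCUCAGGGCUACCUCUU"

site = seed_site(mirna)
hits = find_seed_matches(mirna, utr)
```

    A real predictor would combine such matches with the other three features (conservation, free energy, site accessibility) rather than rely on the seed alone.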

  10. Critical Features of Fragment Libraries for Protein Structure Prediction.

    PubMed

    Trevizani, Raphael; Custódio, Fábio Lima; Dos Santos, Karina Baptista; Dardenne, Laurent Emmanuel

    2017-01-01

    The use of fragment libraries is a popular approach among protein structure prediction methods and has proven to substantially improve the quality of predicted structures. However, some vital aspects of a fragment library that influence the accuracy of modeling a native structure remain to be determined. This study investigates some of these features. In particular, we analyze the effect of using secondary structure prediction to guide fragment selection, the effect of different fragment sizes, and the effect of structural clustering of fragments within libraries. To have a clearer view of how these factors affect protein structure prediction, we isolated the process of model building by fragment assembly from some common limitations associated with prediction methods, e.g., imprecise energy functions and optimization algorithms, by employing an exact structure-based objective function under a greedy algorithm. Our results indicate that shorter fragments reproduce the native structure more accurately than longer ones. Libraries composed of multiple fragment lengths generate even better structures, with longer fragments proving more useful at the beginning of the simulations. The use of many different fragment sizes shows little improvement when compared to predictions carried out with libraries that comprise only three different fragment sizes. Models obtained from libraries built using only sequence similarity are, on average, better than those built with a secondary structure prediction bias. However, we found that the use of secondary structure prediction allows greater reduction of the search space, which is invaluable for prediction methods. The results of this study can be critical guidelines for the use of fragment libraries in protein structure prediction.

  11. Critical Features of Fragment Libraries for Protein Structure Prediction

    PubMed Central

    dos Santos, Karina Baptista

    2017-01-01

    The use of fragment libraries is a popular approach among protein structure prediction methods and has proven to substantially improve the quality of predicted structures. However, some vital aspects of a fragment library that influence the accuracy of modeling a native structure remain to be determined. This study investigates some of these features. In particular, we analyze the effect of using secondary structure prediction to guide fragment selection, the effect of different fragment sizes, and the effect of structural clustering of fragments within libraries. To have a clearer view of how these factors affect protein structure prediction, we isolated the process of model building by fragment assembly from some common limitations associated with prediction methods, e.g., imprecise energy functions and optimization algorithms, by employing an exact structure-based objective function under a greedy algorithm. Our results indicate that shorter fragments reproduce the native structure more accurately than longer ones. Libraries composed of multiple fragment lengths generate even better structures, with longer fragments proving more useful at the beginning of the simulations. The use of many different fragment sizes shows little improvement when compared to predictions carried out with libraries that comprise only three different fragment sizes. Models obtained from libraries built using only sequence similarity are, on average, better than those built with a secondary structure prediction bias. However, we found that the use of secondary structure prediction allows greater reduction of the search space, which is invaluable for prediction methods. The results of this study can be critical guidelines for the use of fragment libraries in protein structure prediction. PMID:28085928

  12. Quantitative imaging features to predict cancer status in lung nodules

    NASA Astrophysics Data System (ADS)

    Liu, Ying; Balagurunathan, Yoganand; Atwater, Thomas; Antic, Sanja; Li, Qian; Walker, Ronald; Smith, Gary T.; Massion, Pierre P.; Schabath, Matthew B.; Gillies, Robert J.

    2016-03-01

    Background: We propose a systematic methodology to quantify incidentally identified lung nodules based on observed radiological traits on a point scale. This quantitative trait classification model was used to predict cancer status. Materials and Methods: We used 102 patients' low dose computed tomography (LDCT) images for this study; 24 semantic traits were systematically scored from each image. We built a machine learning classifier in a cross-validation setting to find the best predictive imaging features to differentiate malignant from benign lung nodules. Results: The best feature triplet to discriminate malignancy was based on long axis, concavity and lymphadenopathy with average AUC of 0.897 (Accuracy of 76.8%, Sensitivity of 64.3%, Specificity of 90%). A similar semantic triplet optimized on Sensitivity/Specificity (Youden's J index) included long axis, vascular convergence and lymphadenopathy which had an average AUC of 0.875 (Accuracy of 81.7%, Sensitivity of 76.2%, Specificity of 95%). Conclusions: Quantitative radiological image traits can differentiate malignant from benign lung nodules. These semantic features along with size measurement enhance the prediction accuracy.
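
    Youden's J index, mentioned above as an optimization target, is defined as sensitivity + specificity - 1. The short sketch below applies that definition to the sensitivity and specificity values reported in the abstract for the two feature triplets; the function and variable names are illustrative, and the J values are computed here rather than quoted from the study.

```python
def youden_j(sensitivity, specificity):
    """Youden's J index: sensitivity + specificity - 1 (range -1 to 1)."""
    return sensitivity + specificity - 1.0


# (sensitivity, specificity) as reported in the abstract, as fractions.
triplets = {
    "long axis + concavity + lymphadenopathy": (0.643, 0.90),
    "long axis + vascular convergence + lymphadenopathy": (0.762, 0.95),
}

j_scores = {name: youden_j(se, sp) for name, (se, sp) in triplets.items()}
best_by_j = max(j_scores, key=j_scores.get)
```

    This makes the trade-off visible: the first triplet has the higher AUC (0.897 versus 0.875), while the second has the higher J because it balances sensitivity and specificity better at its operating point.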

  13. Application of optimal prediction to molecular dynamics

    SciTech Connect

    Barber, IV, John Letherman

    2004-12-01

    Optimal prediction is a general system reduction technique for large sets of differential equations. In this method, which was devised by Chorin, Hald, Kast, Kupferman, and Levy, a projection operator formalism is used to construct a smaller system of equations governing the dynamics of a subset of the original degrees of freedom. This reduced system consists of an effective Hamiltonian dynamics, augmented by an integral memory term and a random noise term. Molecular dynamics is a method for simulating large systems of interacting fluid particles. In this thesis, I construct a formalism for applying optimal prediction to molecular dynamics, producing reduced systems from which the properties of the original system can be recovered. These reduced systems require significantly less computational time than the original system. I initially consider first-order optimal prediction, in which the memory and noise terms are neglected. I construct a pair approximation to the renormalized potential, and ignore three-particle and higher interactions. This produces a reduced system that correctly reproduces static properties of the original system, such as energy and pressure, at low-to-moderate densities. However, it fails to capture dynamical quantities, such as autocorrelation functions. I next derive a short-memory approximation, in which the memory term is represented as a linear frictional force with configuration-dependent coefficients. This allows the use of a Fokker-Planck equation to show that, in this regime, the noise is δ-correlated in time. This linear friction model reproduces not only the static properties of the original system, but also the autocorrelation functions of dynamical variables.

  14. A Prediction Model for Membrane Proteins Using Moments Based Features

    PubMed Central

    Butt, Ahmad Hassan; Khan, Sher Afzal; Jamil, Hamza; Rasool, Nouman; Khan, Yaser Daanial

    2016-01-01

    The most expedient unit of the human body is its cell. Encapsulated within the cell are many infinitesimal entities and molecules which are protected by a cell membrane. The proteins that are associated with this lipid-based bilayer cell membrane are known as membrane proteins and are considered to play a significant role. These membrane proteins exhibit their effect in cellular activities inside and outside of the cell. According to scientists in pharmaceutical organizations, these membrane proteins perform key tasks in drug interactions. In this study, a technique is presented that is based on various computationally intelligent methods used for the prediction of membrane proteins without the experimental use of mass spectrometry. Statistical moments were used to extract features, and a Multilayer Neural Network was trained using backpropagation for the prediction of membrane proteins. Results show that the proposed technique performs better than existing methodologies. PMID:26966690

  15. Magnetic resonance image features identify glioblastoma phenotypic subtypes with distinct molecular pathway activities

    PubMed Central

    Itakura, Haruka; Achrol, Achal S.; Mitchell, Lex A.; Loya, Joshua J.; Liu, Tiffany; Westbroek, Erick M.; Feroze, Abdullah H.; Rodriguez, Scott; Echegaray, Sebastian; Azad, Tej D.; Yeom, Kristen W.; Napel, Sandy; Rubin, Daniel L.; Chang, Steven D.; Harsh, Griffith R.; Gevaert, Olivier

    2015-01-01

    Glioblastoma (GBM) is the most common and highly lethal primary malignant brain tumor in adults. There is a dire need for easily accessible, noninvasive biomarkers that can delineate underlying molecular activities and predict response to therapy. To this end, we sought to identify subtypes of GBM, differentiated solely by quantitative MR imaging features, that could be used for better management of GBM patients. Quantitative image features capturing the shape, texture, and edge sharpness of each lesion were extracted from MR images of 121 patients with de novo, solitary, unilateral GBM. Three distinct phenotypic “clusters” emerged in the development cohort using consensus clustering with 10,000 iterations on these image features. These three clusters—pre-multifocal, spherical, and rim-enhancing, names reflecting their image features—were validated in an independent cohort consisting of 144 multi-institution patients with similar tumor characteristics from The Cancer Genome Atlas (TCGA). Each cluster mapped to a unique set of molecular signaling pathways using pathway activity estimates derived from analysis of TCGA tumor copy number and gene expression data with the PARADIGM algorithm. Distinct pathways, such as c-Kit and FOXA, were enriched in each cluster, indicating differential molecular activities as determined by image features. Each cluster also demonstrated differential probabilities of survival, indicating prognostic importance. Our imaging method offers a noninvasive approach to stratify GBM patients and also provides unique sets of molecular signatures to inform targeted therapy and personalized treatment of GBM. PMID:26333934

  16. Predicting drug pharmacokinetic properties using molecular interaction fields and SIMCA

    NASA Astrophysics Data System (ADS)

    Wolohan, Philippa R. N.; Clark, Robert D.

    2003-01-01

    We have developed a method that combines molecular interaction fields with soft independent modeling of class analogy (SIMCA; Wold, 1977) to predict pharmacokinetic drug properties. Several additional considerations to those made in traditional QSAR are required in order to develop a successful QSPR strategy that is capable of accommodating the many complex factors that contribute to key pharmacokinetic properties such as ADME (absorption, distribution, metabolism, and excretion) and toxicology. An accurate prediction of oral bioavailability, for example, requires that absorption and first-pass hepatic elimination both be taken into consideration. To accomplish this, general properties of molecules must be related to their solubility and ability to penetrate biological membranes, and specific features must be related to their particular metabolic and toxicological profiles. Here we describe a method that is applicable to structurally diverse data sets while utilizing as much detailed structural information as possible. We address the issue of the molecular alignment of a structurally diverse set of compounds using idiotropic field orientation (IFO), a generalization of inertial field orientation (Clark, 1998). We have developed a second flavor of this method, which directly incorporates electrostatics into the molecular alignment. Both variations of IFO produce a characteristic orientation for each structure and the corresponding molecular fields can then be analyzed using SIMCA. Models are presented for human intestinal absorption, blood-brain barrier penetration and bioavailability to demonstrate ways in which this tool can be used early in the drug development process to identify leads likely to exhibit poor pharmacokinetic behavior in pre-clinical studies, and we have explored the influence of conformation and molecular field type on the statistical properties of the models obtained.

  17. Features Predicting Sentinel Lymph Node Positivity in Merkel Cell Carcinoma

    PubMed Central

    Schwartz, Jennifer L.; Griffith, Kent A.; Lowe, Lori; Wong, Sandra L.; McLean, Scott A.; Fullen, Douglas R.; Lao, Christopher D.; Hayman, James A.; Bradford, Carol R.; Rees, Riley S.; Johnson, Timothy M.; Bichakjian, Christopher K.

    2011-01-01

    Purpose Merkel cell carcinoma (MCC) is a relatively rare, potentially aggressive cutaneous malignancy. We examined the clinical and histologic features of primary MCC that may correlate with the probability of a positive sentinel lymph node (SLN). Methods Ninety-five patients with MCC who underwent SLN biopsy at the University of Michigan were identified. SLN biopsy was performed on 97 primary tumors, and an SLN was identified in 93 instances. These were reviewed for clinical and histologic features and associated SLN positivity. Univariate associations between these characteristics and a positive SLN were tested for by using either the χ2 or the Fisher's exact test. A backward elimination algorithm was used to help create a best multiple variable model to explain a positive SLN. Results SLN positivity was significantly associated with the clinical size of the lesion, greatest horizontal histologic dimension, tumor thickness, mitotic rate, and histologic growth pattern. Two competing multivariate models were generated to predict a positive SLN. The histologic growth pattern was present in both models and combined with either tumor thickness or mitotic rate. Conclusion Increasing clinical size, increasing tumor thickness, increasing mitotic rate, and infiltrative tumor growth pattern were significantly associated with a greater likelihood of a positive SLN. By using the growth pattern and tumor thickness model, no subgroup of patients was predicted to have a lower than 15% to 20% likelihood of a positive SLN. This suggests that all patients presenting with MCC without clinical evidence of regional lymph node disease should be considered for SLN biopsy. PMID:21300936

  18. Predicting Presynaptic and Postsynaptic Neurotoxins by Developing Feature Selection Technique

    PubMed Central

    Yang, Yunchun; Zhang, Chunmei; Chen, Rong; Huang, Po

    2017-01-01

    Presynaptic and postsynaptic neurotoxins are proteins which act at the presynaptic and postsynaptic membrane. Correctly predicting presynaptic and postsynaptic neurotoxins will provide important clues for drug-target discovery and drug design. In this study, we developed a theoretical method to discriminate presynaptic neurotoxins from postsynaptic neurotoxins. A strict and objective benchmark dataset was constructed to train and test our proposed model. The dipeptide composition was used to formulate neurotoxin samples. Analysis of variance (ANOVA) was proposed to find the optimal feature set that produces the maximum accuracy. In the jackknife cross-validation test, an overall accuracy of 94.9% was achieved. We believe that the proposed model will provide important information to study neurotoxins. PMID:28303250
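
    The dipeptide composition used above to encode each protein can be sketched as a 400-dimensional vector of relative frequencies, one per ordered pair of the 20 standard amino acids. The sketch assumes the usual definition (counts of overlapping adjacent pairs divided by the total number of counted pairs); the abstract gives no implementation details, and the function name and the example sequence are hypothetical. The ANOVA-based feature ranking is not shown.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return 400 relative frequencies of overlapping adjacent residue pairs.

    Pairs containing a non-standard residue (e.g. 'X') are skipped.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


# Hypothetical short sequence: pairs are MK, KC, CK, KC.
features = dipeptide_composition("MKCKC")
kc_frequency = features[DIPEPTIDES.index("KC")]
```

    Because the vector length is fixed at 400 regardless of protein length, sequences of different sizes can be fed to the same classifier.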

  19. Predicting the Presence of Large Fish through Benthic Geomorphic Features

    NASA Astrophysics Data System (ADS)

    Knuth, F.; Sautter, L.; Levine, N. S.; Kracker, L.

    2013-12-01

    Marine Protected Areas are critical in sustaining the resilience of fish populations to commercial fishing operations. Using acoustic data to survey these areas promises efficiency, accuracy, and minimal environmental impact. In July 2013, the NOAA Ship Pisces collected bathymetric, backscatter and water column data for 10 proposed MPA sites along the U.S. Southeast Atlantic continental shelf. A total of 205 km² of seafloor was mapped between Mayport, FL and Wilmington, NC, using the SIMRAD ME70 and EK60 echosounder systems. These data were processed in Caris HIPS, QPS FMGT, MATLAB and ArcGIS. The backscatter and bathymetry reveal various benthic geomorphic features, including flat sand, rippled sand, and rugose hard bottom. Water column data directly above highly rugose hard bottom contain the greatest counts for large fish populations. Using spatial statistics, such as a geographically weighted regression model, we aim to identify features of the benthic profile, including rugosity, curvature and slope, that can predict the presence of large fish. The success of this approach will greatly expedite fishery surveys, minimize operational cost and aid in making timely management decisions.

  20. Mixed learning algorithms and features ensemble in hepatotoxicity prediction

    NASA Astrophysics Data System (ADS)

    Liew, Chin Yee; Lim, Yen Ching; Yap, Chun Wei

    2011-09-01

    Drug-induced liver injury, although infrequent, is an important safety concern that can lead to fatality in patients and failure in drug development. In this study, we have used an ensemble of mixed learning algorithms and mixed features for the development of a model to predict hepatic effects. This robust method is based on the premise that no single learning algorithm is optimum for all modelling problems. An ensemble model of 617 base classifiers was built from a diverse set of 1,087 compounds. The ensemble model was validated internally with five-fold cross-validation and 25 rounds of y-randomization. In the external validation of 120 compounds, the ensemble model achieved an accuracy of 75.0%, sensitivity of 81.9% and specificity of 64.6%. The model was also able to identify 22 of 23 withdrawn drugs or drugs with a black box warning against hepatotoxicity. Dronedarone, which is associated with severe liver injuries announced in a recent FDA drug safety communication, was predicted as hepatotoxic by the ensemble model. It was found that the ensemble model was capable of classifying positive compounds (with hepatic effects) well, but less so negative compounds when they were structurally similar. The ensemble model built in this study is made available for public use.
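
    The accuracy, sensitivity and specificity quoted for the external validation follow from a binary confusion matrix. The sketch below shows the arithmetic. The abstract does not report the individual counts, so the four numbers used here are hypothetical: they were chosen only because they sum to 120 compounds and reproduce the reported percentages after rounding.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity and specificity from confusion-matrix counts."""
    positives = tp + fn  # compounds with hepatic effects
    negatives = tn + fp  # compounds without hepatic effects
    return {
        "accuracy": (tp + tn) / (positives + negatives),
        "sensitivity": tp / positives,  # true-positive rate
        "specificity": tn / negatives,  # true-negative rate
    }


# Hypothetical counts (assumption, not reported in the abstract):
# 72 positives and 48 negatives, 120 compounds in total.
metrics = classification_metrics(tp=59, fn=13, tn=31, fp=17)
percentages = {name: round(value * 100, 1) for name, value in metrics.items()}
```

    With these assumed counts the lower specificity reflects the 17 false positives among 48 negatives, in line with the abstract's remark that negative compounds were classified less well.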

  1. Molecular Features of Wheat Endosperm Arabinoxylan Inclusion in Functional Bread

    PubMed Central

    Li, Weili; Hu, Hui; Wang, Qi; Brennan, Charles J.

    2013-01-01

    Arabinoxylan (AX) is a major dietary fibre component found in a variety of cereals. Numerous health benefits of arabinoxylans have been reported to be associated with their solubility and molecular features. The current study reports the development of a functional bread using a combination of AX-enriched material (AEM) and optimal commercial endoxylanase. The total AX content of bread was increased to 8.2 g per 100 g available carbohydrates. The extractability of AX in breads with and without endoxylanase was determined. The results demonstrate that water-extractable AX (WE-AX) increased progressively through the bread making process. The application of endoxylanase also increased WE-AX content. The presence of 360 ppm of endoxylanase had positive effects on the bread characteristics in terms of bread volume and firmness by converting the water unextractable (WU)-AX to WE-AX. In addition, the molecular weight (Mw) distribution of the WE-AX of bread with and without endoxylanase was characterized by size-exclusion chromatography. The results show that as the portion of WE-AX increased, the amount of high Mw WE-AX (higher than 100 kDa) decreased, whereas the amount of low Mw WE-AX (lower than 100 kDa) increased from 33.2% to 44.2% through the baking process. The low Mw WE-AX further increased to 75.5% with the application of the optimal endoxylanase (360 ppm). PMID:28239111

  2. Predictive Features of a Cockpit Traffic Display: A Workload Assessment

    NASA Technical Reports Server (NTRS)

    Wickens, Christopher D.; Morphew, Ephimia

    1997-01-01

    Eighteen pilots flew a series of traffic avoidance maneuvers in an experiment designed to assess the support offered and workload imposed by different levels of traffic display information in a free flight simulation. Three display prototypes were compared which differed in traffic information provided. A BASELINE (BL) display provided current and (2nd order) predicted information regarding ownship and current information of an intruder aircraft, represented on lateral and vertical displays in a coplanar suite. An INTRUDER PREDICTOR (IP) display augmented the baseline display by providing lateral and vertical prediction of the intruder aircraft. A THREAT VECTOR (TV) display added to the IP display a vector that indicates the direction from ownship to the intruder at the predicted point of closest contact (POCC). The length of the vector corresponds to the radius of the protected zone, and the distance of the intersection of the vector with the ownship predictor corresponds to the time available until POCC or loss of separation. Pilots time-shared the traffic avoidance task with a secondary task requiring them to monitor the top of the display for faint targets. This task simulated the visual demands of out-of-cockpit scanning, and hence was used to estimate the head-down time required by the different display formats. The results revealed that both display augmentations improved performance (safety) as assessed by predicted and actual loss of separation (i.e., penetration of the protected zone). Both enhancements also reduced workload, as assessed by the NASA TLX scale. The intruder predictor display produced these benefits with no substantial impact on the qualitative nature of the avoidance maneuvers that were selected. The threat vector produced the safety benefits by inducing a greater degree of (effective) lateral maneuvering, thus partially offsetting the benefits of reduced workload. The three displays did not differ in terms of their effect on performance of

  3. Beyond λmax Part 2: Predicting Molecular Color

    ERIC Educational Resources Information Center

    Williams, Darren L.; Flaherty, Thomas J.; Alnasleh, Bassam K.

    2009-01-01

    A concise roadmap for using computational chemistry programs (i.e., Gaussian 03W) to predict the color of a molecular species is presented. A color-predicting spreadsheet is available with the online material that uses transition wavelengths and peak-shape parameters to predict the visible absorbance spectrum, transmittance spectrum, chromaticity…

  4. Delta hepatitis: molecular biology and clinical and epidemiological features.

    PubMed Central

    Polish, L B; Gallagher, M; Fields, H A; Hadler, S C

    1993-01-01

    Hepatitis delta virus, discovered in 1977, requires the help of hepatitis B virus to replicate in hepatocytes and is an important cause of acute, fulminant, and chronic liver disease in many regions of the world. Because hepatitis delta virus depends on this helper function, infection with it occurs either as a coinfection with hepatitis B or as a superinfection of a carrier of hepatitis B surface antigen. Although the mechanisms of transmission are similar to those of hepatitis B virus, the patterns of transmission of delta virus vary widely around the world. In regions of the world in which hepatitis delta virus infection is not endemic, the disease is confined to groups at high risk of acquiring hepatitis B infection and high-risk hepatitis B carriers. Because of the propensity of this viral infection to cause fulminant as well as chronic liver disease, continued incursion of hepatitis delta virus into areas of the world where persistent hepatitis B infection is endemic will have serious implications. Prevention depends on the widespread use of hepatitis B vaccine. This review focuses on the molecular biology and the clinical and epidemiologic features of this important viral infection. PMID:8358704

  5. GOPred: GO Molecular Function Prediction by Combined Classifiers

    PubMed Central

    Saraç, Ömer Sinan; Atalay, Volkan; Cetin-Atalay, Rengul

    2010-01-01

    Functional protein annotation is an important matter for in vivo and in silico biology. Several computational methods have been proposed that make use of a wide range of features such as motifs, domains, homology, structure and physicochemical properties. There is no single method that performs best in all functional classification problems because information obtained using any of these features depends on the function to be assigned to the protein. In this study, we portray a novel approach that combines different methods to better represent protein function. First, we formulated the function annotation problem as a classification problem defined on 300 different Gene Ontology (GO) terms from molecular function aspect. We presented a method to form positive and negative training examples while taking into account the directed acyclic graph (DAG) structure and evidence codes of GO. We applied three different methods and their combinations. Results show that combining different methods improves prediction accuracy in most cases. The proposed method, GOPred, is available as an online computational annotation tool (http://kinaz.fen.bilkent.edu.tr/gopred). PMID:20824206

  6. Lung Cancer Prediction Using Neural Network Ensemble with Histogram of Oriented Gradient Genomic Features

    PubMed Central

    Adetiba, Emmanuel; Olugbara, Oludayo O.

    2015-01-01

    This paper reports an experimental comparison of artificial neural network (ANN) and support vector machine (SVM) ensembles and their “nonensemble” variants for lung cancer prediction. These machine learning classifiers were trained to predict lung cancer using samples of patient nucleotides with mutations in the epidermal growth factor receptor, Kirsten rat sarcoma viral oncogene, and tumor suppressor p53 genomes collected as biomarkers from the IGDB.NSCLC corpus. The Voss DNA encoding was used to map the nucleotide sequences of mutated and normal genomes to obtain the equivalent numerical genomic sequences for training the selected classifiers. The histogram of oriented gradient (HOG) and local binary pattern (LBP) state-of-the-art feature extraction schemes were applied to extract representative genomic features from the encoded sequences of nucleotides. The ANN ensemble and HOG best fit the training dataset of this study with an accuracy of 95.90% and mean square error of 0.0159. The result of the ANN ensemble and HOG genomic features is promising for automated screening and early detection of lung cancer. This will hopefully assist pathologists in administering targeted molecular therapy and offering counsel to early-stage lung cancer patients and persons in at-risk populations. PMID:25802891
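
    The Voss representation is usually described as four binary indicator sequences, one per nucleotide, each marking the positions where that base occurs. The sketch below shows only this mapping. The function name and the toy sequence are illustrative assumptions, and the abstract does not say how the encoded sequences are arranged before the HOG and LBP steps, so those steps are not shown.

```python
def voss_encode(sequence, alphabet="ACGT"):
    """Map a nucleotide string to one 0/1 indicator sequence per base."""
    seq = sequence.upper()
    return {base: [1 if s == base else 0 for s in seq] for base in alphabet}


# Toy sequence for illustration only (not taken from the IGDB.NSCLC corpus).
encoded = voss_encode("GATTACA")
```

    At every position exactly one of the four indicator sequences is 1, so the encoding turns symbolic sequences into numerical signals without imposing an artificial ordering on the bases.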

  7. Radiogenomics of Glioblastoma: Machine Learning-based Classification of Molecular Characteristics by Using Multiparametric and Multiregional MR Imaging Features.

    PubMed

    Kickingereder, Philipp; Bonekamp, David; Nowosielski, Martha; Kratz, Annekathrin; Sill, Martin; Burth, Sina; Wick, Antje; Eidel, Oliver; Schlemmer, Heinz-Peter; Radbruch, Alexander; Debus, Jürgen; Herold-Mende, Christel; Unterberg, Andreas; Jones, David; Pfister, Stefan; Wick, Wolfgang; von Deimling, Andreas; Bendszus, Martin; Capper, David

    2016-12-01

    Purpose To evaluate the association of multiparametric and multiregional magnetic resonance (MR) imaging features with key molecular characteristics in patients with newly diagnosed glioblastoma. Materials and Methods Retrospective data evaluation was approved by the local ethics committee, and the requirement to obtain informed consent was waived. Preoperative MR imaging features were correlated with key molecular characteristics within a single-institution cohort of 152 patients with newly diagnosed glioblastoma. Preoperative MR imaging features (n = 31) included multiparametric (anatomic and diffusion-, perfusion-, and susceptibility-weighted images) and multiregional (contrast-enhancing regions and hyperintense regions at nonenhanced fluid-attenuated inversion recovery imaging) information with histogram quantification of tumor volumes, volume ratios, apparent diffusion coefficients, cerebral blood flow, cerebral blood volume, and intratumoral susceptibility signals. Molecular characteristics determined included global DNA methylation subgroups (eg, mesenchymal, RTK I "PDGFRA," RTK II "classic"), MGMT promoter methylation status, and hallmark copy number variations (EGFR, PDGFRA, MDM4, and CDK4 amplification; PTEN, CDKN2A, NF1, and RB1 loss). Univariate analyses (voxel-lesion symptom mapping for tumor location, Wilcoxon test for all other MR imaging features) and machine learning models were applied to study the strength of association and discriminative value of MR imaging features for predicting underlying molecular characteristics. Results There was no tumor location predilection for any of the assessed molecular parameters (permutation-adjusted P > .05). Univariate imaging parameter associations were noted for EGFR amplification and CDKN2A loss, with both demonstrating increased Gaussian-normalized relative cerebral blood volume and Gaussian-normalized relative cerebral blood flow values (area under the receiver operating characteristics curve: 63

  8. Molecular biology of testicular germ cell tumors: unique features awaiting clinical application.

    PubMed

    Boublikova, Ludmila; Buchler, Tomas; Stary, Jan; Abrahamova, Jitka; Trka, Jan

    2014-03-01

    Testicular germ cell tumors (TGCTs) are the most common solid tumors in young adult men and are characterized by distinct biologic features and clinical behavior. Both genetic predispositions and environmental factors probably play a substantial role in their etiology. TGCTs arise from a malignant transformation of primordial germ cells in a process that starts prenatally, is often associated with a certain degree of gonadal dysgenesis, and involves the acquisition of several specific aberrations, including activation of SCF-CKIT, amplification of 12p with up-regulation of stem cell genes, and subsequent genetic and epigenetic alterations. Their embryonic and germ origin determines the unique sensitivity of TGCTs to platinum-based chemotherapy. Contrary to the vast majority of other malignancies, neither molecular prognostic/predictive factors nor targeted therapy are available for patients with these tumors. This review summarizes the principal molecular characteristics of TGCTs that could represent a potential basis for development of novel diagnostic and treatment approaches.

  9. Exploiting heterogeneous features to improve in silico prediction of peptide status – amyloidogenic or non-amyloidogenic

    PubMed Central

    2011-01-01

    Background Prediction of short stretches in protein sequences capable of forming amyloid-like fibrils is important in understanding the underlying cause of amyloid illnesses, thereby aiding in the discovery of sequence-targeted anti-aggregation pharmaceuticals. Due to the constraints of experimental molecular techniques in identifying such motif segments, it is highly desirable to develop computational methods to provide better and affordable in silico predictions. Results Accurate in silico prediction techniques of amyloidogenic peptide regions rely on the cooperation between informative features and classifier design. In this research article, we propose one such efficient fibril prediction implementation exploiting heterogeneous features based on bio-physio-chemical (BPC) properties, auto-correlation function of carefully selected amino acid indices and atomic composition within a protein fragment of amino acids in a window. In an attempt to get an optimal number of BPC features, an evolutionary Support Vector Machine (SVM) integrating a novel implementation of hybrid Genetic Algorithm termed Memetic Algorithm and SVM is utilized. Five prediction modules designed using Artificial Neural Network (ANN) models are trained with independent and integrated features in order to validate the fibril forming motifs. The results provide evidence that incorporating a new feature, namely the auto-correlation function, alongside the BPC properties strengthens the sequence interaction effect in forming the feature vector, thereby yielding better prediction quality in terms of sensitivity, specificity, Matthews Correlation Coefficient and Area under the Receiver Operating Characteristics curve. Conclusion A significant improvement in performance is observed by introducing features like auto-correlation function that maintains sequence order effect, in addition to the conventional BPC properties selected through a novel optimization strategy to predict the peptide status – amyloidogenic or

  10. Feature Parameter Optimization for Seizure Detection/Prediction

    DTIC Science & Technology

    2007-11-02

    the window length for the feature under consideration. Figure 4 illustrates the variation of the k-factor for the fractal dimension feature, as...r Figure 4: K-Factor from the Fractal Dimension for Different Window Sizes Typically, the window sizes that maximized the k-factor were...Esteller R., Ph.D dissertation “Detection of seizure onset in epileptic patients from intracranial EEG signals”, Georgia Institute of Technology

  11. Visual Prediction Error Spreads Across Object Features in Human Visual Cortex.

    PubMed

    Jiang, Jiefeng; Summerfield, Christopher; Egner, Tobias

    2016-12-14

    Visual cognition is thought to rely heavily on contextual expectations. Accordingly, previous studies have revealed distinct neural signatures for expected versus unexpected stimuli in visual cortex. However, it is presently unknown how the brain combines multiple concurrent stimulus expectations such as those we have for different features of a familiar object. To understand how an unexpected object feature affects the simultaneous processing of other expected feature(s), we combined human fMRI with a task that independently manipulated expectations for color and motion features of moving-dot stimuli. Behavioral data and neural signals from visual cortex were then interrogated to adjudicate between three possible ways in which prediction error (surprise) in the processing of one feature might affect the concurrent processing of another, expected feature: (1) feature processing may be independent; (2) surprise might "spread" from the unexpected to the expected feature, rendering the entire object unexpected; or (3) pairing a surprising feature with an expected feature might promote the inference that the two features are not in fact part of the same object. To formalize these rival hypotheses, we implemented them in a simple computational model of multifeature expectations. Across a range of analyses, behavior and visual neural signals consistently supported a model that assumes a mixing of prediction error signals across features: surprise in one object feature spreads to its other feature(s), thus rendering the entire object unexpected. These results reveal neurocomputational principles of multifeature expectations and indicate that objects are the unit of selection for predictive vision.

  12. Radiogenomic analysis of breast cancer: dynamic contrast enhanced - magnetic resonance imaging based features are associated with molecular subtypes

    NASA Astrophysics Data System (ADS)

    Wang, Shijian; Fan, Ming; Zhang, Juan; Zheng, Bin; Wang, Xiaojia; Li, Lihua

    2016-03-01

    Breast cancer is one of the most common malignant tumors in women, and its incidence is rising. The key to reducing mortality is early diagnosis and appropriate treatment. Molecular classification could provide better insights into patient-directed therapy and prognosis prediction of breast cancer. It is known that different molecular subtypes have different characteristics in magnetic resonance imaging (MRI) examination. Therefore, we assumed that imaging features can reflect molecular information in breast cancer. In this study, we investigated associations between dynamic contrast-enhanced MRI (DCE-MRI) features and molecular subtypes in breast cancer. Sixty patients with breast cancer were enrolled and the MR images were pre-processed for noise reduction, registration and segmentation. Sixty-five-dimensional imaging features including statistical characteristics, morphology, texture and dynamic enhancement in breast lesion and background regions were semiautomatically extracted. The associations between imaging features and molecular subtypes were assessed by using statistical analyses, including univariate logistic regression and multivariate logistic regression. The results of multivariate regression showed that imaging features are significantly associated with the molecular subtypes Luminal A (p=0.00473), HER2-enriched (p=0.00277) and Basal-like (p=0.0117). The results indicated that three molecular subtypes are correlated with DCE-MRI features in breast cancer. Specifically, patients with a higher level of compactness or lower level of skewness in the breast lesion are more likely to be of the Luminal A subtype. In addition, a higher value of dynamic enhancement at T1 time in the normal side indicates a higher likelihood of the HER2-enriched subtype in breast cancer.

  13. Heart failure: molecular, genetic and epigenetic features of the disease.

    PubMed

    D'Alessandro, R; Roselli, T; Valente, F; Iannaccone, M; Capogrosso, C; Petti, G; Alfano, G; Masarone, D; Ziello, B; Fimiani, F; Pacileo, G; Russo, M G; Calabrò, P; Limongelli, G; Maddaloni, V; Calabrò, R

    2012-12-01

    The factors that contribute to heart failure (HF) are not completely known. In recent years, several technological improvements have allowed in-depth study of the molecular and genetic aspects of this complex syndrome. This new approach to HF, based on discoveries in molecular biology, shows more clearly the pathophysiological bases of the disease and points to a future in which genetics may be useful in clinical practice, for screening high-risk populations as well as in the diagnosis and therapy of underlying myocardial diseases. The purpose of this review was to analyse the molecular, genetic and epigenetic factors of HF. We described the molecular anatomy of the sarcomere and the pathogenesis of the heart muscle diseases, abandoning the previous monogenic theory for the concept of a polygenic disease. Different actors play a role in causing the illness itself, in modifying the expression of the disease and, eventually, in the prognosis of the patient.

  14. Predicting new molecular targets for known drugs

    PubMed Central

    Keiser, Michael J.; Setola, Vincent; Irwin, John J.; Laggner, Christian; Abbas, Atheir; Hufeisen, Sandra J.; Jensen, Niels H.; Kuijer, Michael B.; Matos, Roberto C.; Tran, Thuy B.; Whaley, Ryan; Glennon, Richard A.; Hert, Jérôme; Thomas, Kelan L.H.; Edwards, Douglas D.; Shoichet, Brian K.; Roth, Bryan L.

    2009-01-01

    Whereas drugs are intended to be selective, at least some bind to several physiologic targets, explaining both side effects and efficacy. As many drug-target combinations exist, it would be useful to explore possible interactions computationally. Here, we compared 3,665 FDA-approved and investigational drugs against hundreds of targets, defining each target by its ligands. Chemical similarities between drugs and ligand sets predicted thousands of unanticipated associations. Thirty were tested experimentally, including the antagonism of the β1 receptor by the transporter inhibitor Prozac, the inhibition of the 5-HT transporter by the ion channel drug Vadilex, and antagonism of the histamine H4 receptor by the enzyme inhibitor Rescriptor. Overall, 23 new drug-target associations were confirmed, five of which were potent (< 100 nM). The physiological relevance of one such, the drug DMT on serotonergic receptors, was confirmed in a knock-out mouse. The chemical similarity approach is systematic and comprehensive, and may suggest side-effects and new indications for many drugs. PMID:19881490

  15. Structure and functional features of olive pollen pectin methylesterase using homology modeling and molecular docking methods.

    PubMed

    Jimenez-Lopez, Jose C; Kotchoni, Simeon O; Rodríguez-García, María I; Alché, Juan D

    2012-12-01

    Pectin methylesterases (PMEs), a multigene family of proteins with multiple differentially regulated isoforms, are key enzymes implicated in the carbohydrate (pectin) metabolism of cell walls. Olive pollen PME has been identified as a new allergen (Ole e 11) of potential relevance in allergy amelioration, since it exhibits high prevalence among atopic patients. In this work, the structural and functional characterization of two olive pollen PME isoforms and their comparison with other plant PMEs was performed by using different approaches: (1) the physicochemical properties and functional-regulatory motifs characterization, (2) primary sequence analysis, 2D and 3D comparative structural features study, (3) conservation and evolutionary analysis, (4) catalytic activity and regulation based on molecular docking analysis of a homologue PME inhibitor, and (5) B-cell epitopes prediction by sequence and structural based methods and protein-protein interaction tools, and T-cell epitopes prediction by inhibitory concentration and binding score methods. Our results indicate that the structural differences and low conservation of residues, together with differences in physicochemical and posttranslational motifs, might be a mechanism for PME isovariants generation, regulation, and differential surface epitopes generation. Olive PMEs exhibit a processive catalytic mechanism and a differential molecular interaction with a specific PME inhibitor, opening new possibilities for PME activity regulation. Despite the common function of PMEs, differential features found in this study will lead to a better understanding of the structural and functional characterization of plant PMEs and help to improve the component-resolving diagnosis and immunotherapy of olive pollen allergy by epitopes identification.

  16. Analysis and prediction of drug-drug interaction by minimum redundancy maximum relevance and incremental feature selection.

    PubMed

    Liu, Lili; Chen, Lei; Zhang, Yu-Hang; Wei, Lai; Cheng, Shiwen; Kong, Xiangyin; Zheng, Mingyue; Huang, Tao; Cai, Yu-Dong

    2017-02-01

    Drug-drug interaction (DDI) defines a situation in which one drug affects the activity of another when both are administered together. DDI is a common cause of adverse drug reactions and sometimes also leads to improved therapeutic effects. Therefore, it is of great interest to discover novel DDIs according to their molecular properties and mechanisms in a robust and rigorous way. This paper attempts to predict effective DDIs using the following properties: (1) chemical interaction between drugs; (2) protein interactions between the targets of drugs; and (3) target enrichment of KEGG pathways. The data consisted of 7323 pairs of DDIs collected from the DrugBank and 36,615 pairs of drugs constructed by randomly combining two drugs. Each drug pair was represented by 465 features derived from the aforementioned three categories of properties. The random forest algorithm was adopted to train the prediction model. Some feature selection techniques, including minimum redundancy maximum relevance and incremental feature selection, were used to extract key features as the optimal input for the prediction model. The extracted key features may help to gain insights into the mechanisms of DDIs and provide some guidelines for the relevant clinical medication developments, and the prediction model can give new clues for identification of novel DDIs.

  17. Epileptic Seizure Prediction based on Ratio and Differential Linear Univariate Features

    PubMed Central

    Rasekhi, Jalil; Mollaei, Mohammad Reza Karami; Bandarabadi, Mojtaba; Teixeira, César A.; Dourado, António

    2015-01-01

    Bivariate features, obtained from multichannel electroencephalogram recordings, quantify the relation between different brain regions. Studies based on bivariate features have shown optimistic results for tackling the epileptic seizure prediction problem in patients suffering from refractory epilepsy. A new bivariate approach using univariate features is proposed here. Differences and ratios of 22 linear univariate features were calculated using pairwise combination of 6 electroencephalogram channels, to create 330 differential and 330 relative features. The feature subsets were classified using support vector machines separately, as one of the two classes of preictal and nonpreictal. Furthermore, the minimum Redundancy Maximum Relevance feature reduction method is employed to improve the predictions and reduce the number of false alarms. The studies were carried out on features obtained from 10 patients. For a reduced subset of 30 features and using the differential approach, the seizures were on average predicted in 60.9% of the cases (28 out of 46 in 737.9 h of test data), with a low false prediction rate of 0.11 h−1. Results of bivariate approaches were compared with those achieved from original linear univariate features, extracted from 6 channels. The advantage of the proposed bivariate features is the smaller number of false predictions in comparison to the original 22 univariate features. In addition, reduction in feature dimension could provide a less complex and more cost-effective algorithm. Results indicate that applying machine learning methods on a multidimensional feature space resulting from relative/differential pairwise combination of 22 univariate features could predict seizure onsets with high performance. PMID:25709936
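
    The feature counts in this abstract follow from simple combinatorics: 6 channels give 15 unordered channel pairs, and 22 univariate features per pair give 15 × 22 = 330 differential and 330 relative features. The sketch below reproduces that construction. The function name, the placeholder feature values and the small-denominator guard are assumptions for illustration; the abstract does not say how near-zero denominators were handled.

```python
from itertools import combinations

N_CHANNELS = 6
N_FEATURES = 22


def pairwise_features(channel_features, eps=1e-12):
    """Build differential and relative features from per-channel feature lists.

    channel_features[c][f] is univariate feature f computed on channel c.
    Assumption: denominators smaller in magnitude than eps are replaced by eps.
    """
    differential, relative = [], []
    for a, b in combinations(range(len(channel_features)), 2):
        for fa, fb in zip(channel_features[a], channel_features[b]):
            differential.append(fa - fb)
            relative.append(fa / (fb if abs(fb) > eps else eps))
    return differential, relative


# Placeholder values standing in for real EEG features (illustration only).
channel_features = [
    [float(c + 1) * (f + 1) for f in range(N_FEATURES)] for c in range(N_CHANNELS)
]
differential, relative = pairwise_features(channel_features)
```

    Both output lists have 330 entries, matching the dimensions reported in the abstract before feature reduction to 30.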

  18. Weighted feature value based Drug Target Protein prediction.

    PubMed

    Hyun, Bo-ra; Jung, Hwiesung; Jang, Woo-Hyuk; Jung, Suk Hoon; Han, Dong-Soo

    2008-01-01

    Drug discovery is a long process in which only a few successful new therapeutic discoveries are made, and identification of drug target candidate proteins requires considerable time and effort. However, the accumulation of information on drugs has made it possible to devise new computational methods for classifying drug target candidates. In this paper, we devise a Drug Target Protein (DT-P) classification method based on the summation of weighted features extracted from known DT-Ps. The method is validated using Bayesian decision theory and SVM, and it was found to achieve a high specificity of 89.5% with 88% accuracy.

  19. Personalized Cancer Medicine: Molecular Diagnostics, Predictive biomarkers, and Drug Resistance

    PubMed Central

    Gonzalez de Castro, D; Clarke, P A; Al-Lazikani, B; Workman, P

    2013-01-01

    The progressive elucidation of the molecular pathogenesis of cancer has fueled the rational development of targeted drugs for patient populations stratified by genetic characteristics. Here we discuss general challenges relating to molecular diagnostics and describe predictive biomarkers for personalized cancer medicine. We also highlight resistance mechanisms for epidermal growth factor receptor (EGFR) kinase inhibitors in lung cancer. We envisage a future requiring the use of longitudinal genome sequencing and other omics technologies alongside combinatorial treatment to overcome cellular and molecular heterogeneity and prevent resistance caused by clonal evolution. PMID:23361103

  20. Accelerating ab initio molecular dynamics simulations by linear prediction methods

    NASA Astrophysics Data System (ADS)

    Herr, Jonathan D.; Steele, Ryan P.

    2016-09-01

    Acceleration of ab initio molecular dynamics (AIMD) simulations can be reliably achieved by extrapolation of electronic data from previous timesteps. Existing techniques utilize polynomial least-squares regression to fit previous steps' Fock or density matrix elements. In this work, the recursive Burg 'linear prediction' technique is shown to be a viable alternative to polynomial regression, and the extrapolation-predicted Fock matrix elements were three orders of magnitude closer to converged elements. Accelerations of 1.8-3.4× were observed in test systems, and in all cases, linear prediction outperformed polynomial extrapolation. Importantly, these accelerations were achieved without reducing the MD integration timestep.
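
    Burg's method fits an autoregressive model to a series by choosing, at each order, the reflection coefficient that minimises the summed forward and backward prediction errors; the fitted coefficients can then extrapolate the next value. The sketch below applies this to a single scalar series standing in for the history of one matrix element. It is a generic textbook-style implementation under that assumption, not the authors' code, and the function names, model order and synthetic history are hypothetical.

```python
import math


def burg_coefficients(x, order):
    """Return AR coefficients [1, a1, ..., ap] estimated by Burg's recursion."""
    f = list(x[1:])   # forward prediction errors
    b = list(x[:-1])  # backward prediction errors
    a = [1.0]
    for _ in range(order):
        num = -2.0 * sum(fi * bi for fi, bi in zip(f, b))
        den = sum(fi * fi for fi in f) + sum(bi * bi for bi in b)
        k = num / den if den else 0.0  # reflection coefficient, |k| <= 1
        ext = a + [0.0]
        a = [ext[i] + k * ext[len(ext) - 1 - i] for i in range(len(ext))]
        f_new = [fi + k * bi for fi, bi in zip(f, b)]
        b_new = [bi + k * fi for fi, bi in zip(f, b)]
        f, b = f_new[1:], b_new[:-1]
    return a


def predict_next(history, order):
    """Extrapolate one step ahead: x[n] = -sum_i a[i] * x[n - i]."""
    a = burg_coefficients(history, order)
    recent = history[::-1][:order]  # x[n-1], x[n-2], ...
    return -sum(a[i + 1] * recent[i] for i in range(order))


# Synthetic smooth history (assumption), e.g. one element over 40 timesteps.
history = [math.exp(-0.05 * n) * math.cos(0.3 * n) for n in range(40)]
estimate = predict_next(history, order=4)
```

    In an AIMD setting the extrapolated value would serve only as an initial guess for the self-consistent-field iterations, so a poor prediction costs extra iterations rather than accuracy.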

  1. Protein-ligand binding region prediction (PLB-SAVE) based on geometric features and CUDA acceleration

    PubMed Central

    2013-01-01

    Background Protein-ligand interactions are key processes in triggering and controlling biological functions within cells. Prediction of protein binding regions on the protein surface assists in understanding the mechanisms and principles of molecular recognition. In silico geometrical shape analysis is a primary step in analyzing the spatial characteristics of protein binding regions and facilitates applications of bioinformatics in drug discovery and design. Here, we describe the novel software, PLB-SAVE, which uses parallel processing technology and is ideally suited to extract the geometrical construct of solid angles from surface atoms. Representative clusters and corresponding anchors were identified from all surface elements and were assigned according to the ranking of their solid angles. In addition, cavity depth indicators were obtained by proportional transformation of solid angles and cavity volumes were calculated by scanning multiple directional vectors within each selected cavity. Both depth and volume characteristics were combined with various weighting coefficients to rank predicted potential binding regions. Results Two test datasets from LigASite, each containing 388 bound and unbound structures, were used to predict binding regions using PLB-SAVE and two well-known prediction systems, SiteHound and MetaPocket2.0 (MPK2). PLB-SAVE outperformed the other programs with accuracy rates of 94.3% for unbound proteins and 95.5% for bound proteins via a tenfold cross-validation process. Additionally, because the parallel processing architecture was designed to enhance the computational efficiency, we obtained an average 160-fold increase in computational speed. Conclusions In silico binding region prediction is considered the initial stage in structure-based drug design. To improve the efficacy of biological experiments for drug development, we developed PLB-SAVE, which uses only geometrical features of proteins and achieves a good overall performance

  2. Predictable convergence in hemoglobin function has unpredictable molecular underpinnings.

    PubMed

    Natarajan, Chandrasekhar; Hoffmann, Federico G; Weber, Roy E; Fago, Angela; Witt, Christopher C; Storz, Jay F

    2016-10-21

    To investigate the predictability of genetic adaptation, we examined the molecular basis of convergence in hemoglobin function in comparisons involving 56 avian taxa that have contrasting altitudinal range limits. Convergent increases in hemoglobin-oxygen affinity were pervasive among high-altitude taxa, but few such changes were attributable to parallel amino acid substitutions at key residues. Thus, predictable changes in biochemical phenotype do not have a predictable molecular basis. Experiments involving resurrected ancestral proteins revealed that historical substitutions have context-dependent effects, indicating that possible adaptive solutions are contingent on prior history. Mutations that produce an adaptive change in one species may represent precluded possibilities in other species because of differences in genetic background.

  3. Skeletal Muscle Laminopathies: A Review of Clinical and Molecular Features

    PubMed Central

    Maggi, Lorenzo; Carboni, Nicola; Bernasconi, Pia

    2016-01-01

    LMNA-related disorders are caused by mutations in the LMNA gene, which encodes for the nuclear envelope proteins, lamin A and C, via alternative splicing. Laminopathies are associated with a wide range of disease phenotypes, including neuromuscular, cardiac, metabolic disorders and premature aging syndromes. The most frequent diseases associated with mutations in the LMNA gene are characterized by skeletal and cardiac muscle involvement. This review will focus on genetics and clinical features of laminopathies affecting primarily skeletal muscle. Although only symptomatic treatment is available for these patients, many achievements have been made in clarifying the pathogenesis and improving the management of these diseases. PMID:27529282

  4. Application of high-dimensional feature selection: evaluation for genomic prediction in man.

    PubMed

    Bermingham, M L; Pong-Wong, R; Spiliopoulou, A; Hayward, C; Rudan, I; Campbell, H; Wright, A F; Wilson, J F; Agakov, F; Navarro, P; Haley, C S

    2015-05-19

    In this study, we investigated the effect of five feature selection approaches on the performance of a mixed model (G-BLUP) and a Bayesian (Bayes C) prediction method. We predicted height, high density lipoprotein cholesterol (HDL) and body mass index (BMI) within 2,186 Croatian and into 810 UK individuals using genome-wide SNP data. Using all SNP information Bayes C and G-BLUP had similar predictive performance across all traits within the Croatian data, and for the highly polygenic traits height and BMI when predicting into the UK data. Bayes C outperformed G-BLUP in the prediction of HDL, which is influenced by loci of moderate size, in the UK data. Supervised feature selection of a SNP subset in the G-BLUP framework provided a flexible, generalisable and computationally efficient alternative to Bayes C; but careful evaluation of predictive performance is required when supervised feature selection has been used.

  5. Synthesis of a specified, silica molecular sieve by using computationally predicted organic structure-directing agents.

    PubMed

    Schmidt, Joel E; Deem, Michael W; Davis, Mark E

    2014-08-04

    Crystalline molecular sieves are used in numerous applications, where the properties exploited for each technology are the direct consequence of structural features. New materials are typically discovered by trial and error, and in many cases, organic structure-directing agents (OSDAs) are used to direct their formation. Here, we report the first successful synthesis of a specified molecular sieve through the use of an OSDA that was predicted from a recently developed computational method that constructs chemically synthesizable OSDAs. Pentamethylimidazolium is computationally predicted to have the largest stabilization energy in the STW framework, and is experimentally shown to strongly direct the synthesis of pure-silica STW. Other OSDAs with lower stabilization energies did not form STW. The general method demonstrated here to create STW may lead to new, simpler OSDAs for existing frameworks and provide a way to predict OSDAs for desired, theoretical frameworks.

  6. Discrete Biogeography Based Optimization for Feature Selection in Molecular Signatures.

    PubMed

    Liu, Bo; Tian, Meihong; Zhang, Chunhua; Li, Xiangtao

    2015-04-01

    Biomarker discovery from high-dimensional data is a complex task in the development of efficient cancer diagnosis and classification. However, these data are usually redundant and noisy, and only a subset of them presents distinct profiles for different classes of samples. Thus, selecting highly discriminative genes from gene expression data has become increasingly interesting in the field of bioinformatics. In this paper, a discrete biogeography based optimization is proposed to select a good subset of informative genes relevant to the classification. In the proposed algorithm, first, the Fisher-Markov selector is used to choose a fixed number of genes. Second, to make biogeography based optimization suitable for the feature selection problem, a discrete migration model and a discrete mutation model are proposed to balance exploration and exploitation. Then, discrete biogeography based optimization, which we call DBBO, is obtained by integrating the discrete migration model and the discrete mutation model. Finally, the DBBO method is used for feature selection, and three classifiers are evaluated with 10-fold cross-validation. To show the effectiveness and efficiency of the algorithm, it is tested on four breast cancer benchmark datasets. In comparison with the genetic algorithm, particle swarm optimization, the differential evolution algorithm and hybrid biogeography based optimization, experimental results demonstrate that the proposed method is better than, or at least comparable with, previous methods from the literature when considering the quality of the solutions obtained.

  7. Imaging patterns predict patient survival and molecular subtype in glioblastoma via machine learning techniques

    PubMed Central

    Macyszyn, Luke; Akbari, Hamed; Pisapia, Jared M.; Da, Xiao; Attiah, Mark; Pigrish, Vadim; Bi, Yingtao; Pal, Sharmistha; Davuluri, Ramana V.; Roccograndi, Laura; Dahmane, Nadia; Martinez-Lage, Maria; Biros, George; Wolf, Ronald L.; Bilello, Michel; O'Rourke, Donald M.; Davatzikos, Christos

    2016-01-01

    Background MRI characteristics of brain gliomas have been used to predict clinical outcome and molecular tumor characteristics. However, previously reported imaging biomarkers have not been sufficiently accurate or reproducible to enter routine clinical practice and often rely on relatively simple MRI measures. The current study leverages advanced image analysis and machine learning algorithms to identify complex and reproducible imaging patterns predictive of overall survival and molecular subtype in glioblastoma (GB). Methods One hundred five patients with GB were first used to extract approximately 60 diverse features from preoperative multiparametric MRIs. These imaging features were used by a machine learning algorithm to derive imaging predictors of patient survival and molecular subtype. Cross-validation ensured generalizability of these predictors to new patients. Subsequently, the predictors were evaluated in a prospective cohort of 29 new patients. Results Survival curves yielded a hazard ratio of 10.64 for predicted long versus short survivors. The overall, 3-way (long/medium/short survival) accuracy in the prospective cohort approached 80%. Classification of patients into the 4 molecular subtypes of GB achieved 76% accuracy. Conclusions By employing machine learning techniques, we were able to demonstrate that imaging patterns are highly predictive of patient survival. Additionally, we found that GB subtypes have distinctive imaging phenotypes. These results reveal that when imaging markers related to infiltration, cell density, microvascularity, and blood–brain barrier compromise are integrated via advanced pattern analysis methods, they form very accurate predictive biomarkers. These predictive markers used solely preoperative images, hence they can significantly augment diagnosis and treatment of GB patients. PMID:26188015

  8. Molecular features in arsenic-induced lung tumors

    PubMed Central

    2013-01-01

    Arsenic is a well-known human carcinogen, which potentially affects ~160 million people worldwide via exposure to unsafe levels in drinking water. Lungs are one of the main target organs for arsenic-related carcinogenesis. These tumors exhibit particular features, such as squamous cell-type specificity and high incidence among never smokers. Arsenic-induced malignant transformation is mainly related to the biotransformation process intended for the metabolic clearing of the carcinogen, which results in specific genetic and epigenetic alterations that ultimately affect key pathways in lung carcinogenesis. Based on this, lung tumors induced by arsenic exposure could be considered an additional subtype of lung cancer, especially in the case of never-smokers, where arsenic is a known etiological agent. In this article, we review the current knowledge on the various mechanisms of arsenic carcinogenicity and the specific roles of this metalloid in signaling pathways leading to lung cancer. PMID:23510327

  9. Adaptive reliance on the most stable sensory predictions enhances perceptual feature extraction of moving stimuli.

    PubMed

    Kumar, Neeraj; Mutha, Pratik K

    2016-03-01

    The prediction of the sensory outcomes of action is thought to be useful for distinguishing self- vs. externally generated sensations, correcting movements when sensory feedback is delayed, and learning predictive models for motor behavior. Here, we show that aspects of another fundamental function, perception, are enhanced when they entail the contribution of predicted sensory outcomes and that this enhancement relies on the adaptive use of the most stable predictions available. We combined a motor-learning paradigm that imposes new sensory predictions with a dynamic visual search task to first show that perceptual feature extraction of a moving stimulus is poorer when it is based on sensory feedback that is misaligned with those predictions. This was possible because our novel experimental design allowed us to override the "natural" sensory predictions present when any action is performed and separately examine the influence of these two sources on perceptual feature extraction. We then show that if the new predictions induced via motor learning are unreliable, rather than just relying on sensory information for perceptual judgments, as is conventionally thought, then subjects adaptively transition to using other stable sensory predictions to maintain greater accuracy in their perceptual judgments. Finally, we show that when sensory predictions are not modified at all, these judgments are sharper when subjects combine their natural predictions with sensory feedback. Collectively, our results highlight the crucial contribution of sensory predictions to perception and also suggest that the brain intelligently integrates the most stable predictions available with sensory information to maintain high fidelity in perceptual decisions.

  10. Predictive Value of Morphological Features in Patients with Autism versus Normal Controls

    ERIC Educational Resources Information Center

    Ozgen, H.; Hellemann, G. S.; de Jonge, M. V.; Beemer, F. A.; van Engeland, H.

    2013-01-01

    We investigated the predictive power of morphological features in 224 autistic patients and 224 matched-pairs controls. To assess the relationship between the morphological features and autism, we used the receiver operator curves (ROC). In addition, we used recursive partitioning (RP) to determine a specific pattern of abnormalities that is…

  11. Feature Biases in Early Word Learning: Network Distinctiveness Predicts Age of Acquisition

    ERIC Educational Resources Information Center

    Engelthaler, Tomas; Hills, Thomas T.

    2017-01-01

    Do properties of a word's features influence the order of its acquisition in early word learning? Combining the principles of mutual exclusivity and shape bias, the present work takes a network analysis approach to understanding how feature distinctiveness predicts the order of early word learning. Distance networks were built from nouns with edge…

  12. Molecular predictive markers in tumors of the gastrointestinal tract

    PubMed Central

    Papadopoulou, Eirini; Metaxa-Mariatou, Vasiliki; Tsaousis, Georgios; Tsoulos, Nikolaos; Tsirigoti, Angeliki; Efstathiadou, Chrisoula; Apessos, Angela; Agiannitopoulos, Konstantinos; Pepe, Georgia; Bourkoula, Eugenia; Nasioulas, George

    2016-01-01

    Gastrointestinal malignancies are among the leading causes of cancer-related deaths worldwide. Like all human malignancies, they are characterized by the accumulation of mutations that lead to inactivation of tumor suppressor genes or activation of oncogenes. Advances in molecular biology techniques have allowed for more accurate analysis of tumors' genetic profiles using new breakthrough technologies such as next generation sequencing (NGS), leading to the development of targeted therapeutic approaches based upon biomarker selection. During the last 10 years, tremendous advances have been made in the development of targeted therapies for patients with advanced cancer; as a result, various targeted agents associated with predictive biomarkers have been developed or are in development for the treatment of patients with gastrointestinal cancer. This review summarizes the advances in the field of molecular biomarkers in tumors of the gastrointestinal tract, with a focus on the available NGS platforms that enable comprehensive tumor molecular profile analysis. PMID:27895815

  13. Separate and concurrent symbolic predictions of sound features are processed differently

    PubMed Central

    Pieszek, Marika; Schröger, Erich; Widmann, Andreas

    2014-01-01

    The studies investigated the impact of predictive visual information about the pitch and location of a forthcoming sound on sound processing. In Symbol-to-Sound matching paradigms, symbols induced predictions of particular sounds. The brain's error signals (IR and N2b components of the event-related potential) were measured in response to occasional violations of the prediction, i.e., when a sound was incongruent with the corresponding symbol. IR and N2b index the detection of prediction violations at different levels, IR at a sensory and N2b at a cognitive level. Participants evaluated the congruency between prediction and actual sound by button press. When the prediction referred to only the pitch or only the location feature (Experiment 1), the violation of each feature elicited IR and N2b. The IRs to pitch and location violations revealed differences in time course and topography, suggesting that they were generated in feature-specific sensory areas. When the prediction referred to both features concurrently (Experiment 2), that is, the symbol predicted the sound's pitch and location, either one or both predictions were violated. Unexpectedly, no significant effects in the IR range were obtained. However, N2b was elicited in response to all violations. N2b in response to concurrent violations of pitch and location had a shorter latency. We conclude that associative predictions can be established by arbitrary rule-based symbols and for different sound features, and that concurrent violations are processed in parallel. In complex situations as in Experiment 2, capacity limitations appear to affect processing in a hierarchical manner. While predictions were presumably not reliably established at sensory levels (absence of IR), they were established at more cognitive levels, where sounds are represented categorially (presence of N2b). PMID:25477832

  14. Clinical, Epidemiologic, Histopathologic and Molecular Features of an Unexplained Dermopathy

    PubMed Central

    Pearson, Michele L.; Selby, Joseph V.; Katz, Kenneth A.; Cantrell, Virginia; Braden, Christopher R.; Parise, Monica E.; Paddock, Christopher D.; Lewin-Smith, Michael R.; Kalasinsky, Victor F.; Goldstein, Felicia C.; Hightower, Allen W.; Papier, Arthur; Lewis, Brian; Motipara, Sarita; Eberhard, Mark L.

    2012-01-01

    Background Morgellons is a poorly characterized constellation of symptoms, with the primary manifestations involving the skin. We conducted an investigation of this unexplained dermopathy to characterize the clinical and epidemiologic features and explore potential etiologies. Methods A descriptive study was conducted among persons at least 13 years of age and enrolled in Kaiser Permanente Northern California (KPNC) during 2006–2008. A case was defined as the self-reported emergence of fibers or materials from the skin accompanied by skin lesions and/or disturbing skin sensations. We collected detailed epidemiologic data, performed clinical evaluations and geospatial analyses and analyzed materials collected from participants' skin. Results We identified 115 case-patients. The prevalence was 3.65 (95% CI = 2.98, 4.40) cases per 100,000 enrollees. There was no clustering of cases within the 13-county KPNC catchment area (p = .113). Case-patients had a median age of 52 years (range: 17–93) and were primarily female (77%) and Caucasian (77%). Multi-system complaints were common; 70% reported chronic fatigue and 54% rated their overall health as fair or poor with mean Physical Component Scores and Mental Component Scores of 36.63 (SD = 12.9) and 35.45 (SD = 12.89), respectively. Cognitive deficits were detected in 59% of case-patients and 63% had evidence of clinically significant somatic complaints; 50% had drugs detected in hair samples and 78% reported exposure to solvents. Solar elastosis was the most common histopathologic abnormality (51% of biopsies); skin lesions were most consistent with arthropod bites or chronic excoriations. No parasites or mycobacteria were detected. Most materials collected from participants' skin were composed of cellulose, likely of cotton origin. Conclusions This unexplained dermopathy was rare among this population of Northern California residents, but associated with significantly reduced health-related quality of

  15. Special Feature: Liquids and Structural Glasses Special Feature: An active biopolymer network controlled by molecular motors

    NASA Astrophysics Data System (ADS)

    Koenderink, Gijsje H.; Dogic, Zvonimir; Nakamura, Fumihiko; Bendix, Poul M.; MacKintosh, Frederick C.; Hartwig, John H.; Stossel, Thomas P.; Weitz, David A.

    2009-09-01

    We describe an active polymer network in which processive molecular motors control network elasticity. This system consists of actin filaments cross-linked by filamin A (FLNa) and contracted by bipolar filaments of muscle myosin II. The myosin motors stiffen the network by more than two orders of magnitude by pulling on actin filaments anchored in the network by FLNa cross-links, thereby generating internal stress. The stiffening response closely mimics the effects of external stress applied by mechanical shear. Both internal and external stresses can drive the network into a highly nonlinear, stiffened regime. The active stress reaches values that are equivalent to an external stress of 14 Pa, consistent with a 1-pN force per myosin head. This active network mimics many mechanical properties of cells and suggests that adherent cells exert mechanical control by operating in a nonlinear regime where cell stiffness is sensitive to changes in motor activity. This design principle may be applicable to engineering novel biologically inspired, active materials that adjust their own stiffness by internal catalytic control.

  16. Prediction of biomechanical trabecular bone properties with geometric features using MR imaging

    NASA Astrophysics Data System (ADS)

    Huber, Markus B.; Lancianese, Sarah L.; Ikpot, Imoh; Nagarajan, Mahesh B.; Lerner, Amy L.; Wismüller, Axel

    2010-03-01

    Trabecular bone parameters extracted from magnetic resonance (MR) images are compared in their ability to predict biomechanical properties determined through mechanical testing. Trabecular bone density and structural changes throughout the proximal tibia are indicative of several musculoskeletal disorders of the knee joint involving changes in the bone quality and the surrounding soft tissue. Recent studies have shown that MR imaging, most frequently applied in soft tissue imaging, also allows non-invasive 3-dimensional characterization of bone microstructure. Sophisticated MR image features that estimate local structural and geometric properties of the trabecular bone may improve the ability of MR imaging to determine local bone quality in vivo. The purpose of the current study is to use whole joint MR images to compare the performance of trabecular bone features extracted from the images in predicting biomechanical strength properties measured on the corresponding ex vivo specimens. The regional apparent bone volume fraction (appBVF) and scaling index method (SIM) derived features were calculated; a Multilayer Radial Basis Functions Network was then optimized to calculate the prediction accuracy as measured by the root mean square error (RMSE) for each bone feature. The best prediction result was obtained with a SIM feature with the lowest prediction error (RMSE = 0.246) and the highest coefficient of determination (R2 = 0.769). The current study demonstrates that the combination of sophisticated bone structure features and supervised learning techniques can improve MR imaging as an in vivo imaging tool in determining local trabecular bone quality.
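
    The two accuracy measures quoted in this abstract, the root mean square error and the coefficient of determination, can be computed as in the minimal sketch below. The function names and the sample values are illustrative assumptions, and R2 is taken here as one minus the ratio of residual to total sum of squares, which may differ from the definition the authors used (for example, a squared correlation).

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up measurements and predictions, purely for illustration.
measured = [1.0, 2.0, 3.0]
predicted = [2.0, 2.0, 2.0]
error = rmse(measured, predicted)
fit = r_squared(measured, predicted)
```

    A lower error and a higher R2 both indicate a better fit, which is why the abstract reports the best feature as having the lowest error together with the highest R2.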

  17. Feature selection for outcome prediction in oesophageal cancer using genetic algorithm and random forest classifier.

    PubMed

    Paul, Desbordes; Su, Ruan; Romain, Modzelewski; Sébastien, Vauclin; Pierre, Vera; Isabelle, Gardin

    2016-12-28

    The outcome prediction of patients can greatly help to personalize cancer treatment. A large number of quantitative features (clinical exams, imaging, …) are potentially useful to assess the patient outcome. The challenge is to choose the most predictive subset of features. In this paper, we propose a new feature selection strategy called GARF (genetic algorithm based on random forest), applied to features extracted from positron emission tomography (PET) images and clinical data. The most relevant features, predictive of the therapeutic response or prognostic of patient survival 3 years after the end of treatment, were selected using GARF on a cohort of 65 patients with locally advanced oesophageal cancer eligible for chemo-radiation therapy. The most relevant predictive results were obtained with a subset of 9 features leading to a random forest misclassification rate of 18±4% and an area under the receiver operating characteristic (ROC) curve (AUC) of 0.823±0.032. The most relevant prognostic results were obtained with 8 features leading to an error rate of 20±7% and an AUC of 0.750±0.108. Both predictive and prognostic results show better performances using GARF than using 4 other studied methods.
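
    For readers unfamiliar with the two figures of merit reported here, the sketch below computes a misclassification rate and an AUC for a binary classifier. It is a generic illustration with hypothetical function names and made-up labels and scores, not the GARF pipeline; the AUC is obtained from its pairwise interpretation as the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half.

```python
def misclassification_rate(labels, predictions):
    """Fraction of cases whose predicted class differs from the true class."""
    wrong = sum(1 for y, p in zip(labels, predictions) if y != p)
    return wrong / len(labels)


def auc_from_scores(labels, scores):
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = responder, 0 = non-responder.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = auc_from_scores(labels, scores)
```

    The pairwise form is quadratic in the number of cases, which is acceptable for cohorts of the size described in this record; rank-based formulas give the same value more efficiently.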

  18. Weekly fluctuations in nonjudging predict borderline personality disorder feature expression in women

    PubMed Central

    Peters, Jessica R.; Chamberlain, Kaitlyn D.; Rodriguez, Marcus

    2015-01-01

    Objectives Borderline personality disorder (BPD) features have been linked to deficits in mindfulness, or nonjudgmental attention to present-moment stimuli. However, no previous work has examined the role of fluctuations in mindfulness over time in predicting BPD features. The present study examines the impact of both between-person differences and within-person changes in mindfulness. Design 40 women recruited to achieve a flat distribution of BPD features completed 4 weekly assessments of mindfulness (Five Facet Mindfulness Questionnaire; FFMQ) and BPD features. Multilevel models predicted each outcome from both 1) a person’s average levels of each facet and 2) weekly deviations from a person’s average for each facet. Results Average acting with awareness, nonjudging, and nonreactivity predicted lower BPD features at the between-person level, and weekly deviations above one’s average (i.e., higher-than-usual) nonjudging predicted lower BPD feature expression at the within-person level. Conclusions Within-person fluctuations in the nonjudging facet of mindfulness may be relevant to the daily expression of BPD features over and above dispositional mindfulness. PMID:27231408
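
    The between-person and within-person predictors described in this abstract are commonly built by person-mean centering: each person's average score is the between-person term, and each weekly score minus that average is the within-person term. The sketch below shows only that decomposition, with a hypothetical function name and made-up scores; it does not fit the multilevel models themselves and is not taken from the study.

```python
from statistics import fmean


def split_between_within(weekly_scores):
    """Map each person to (personal mean, list of weekly deviations from it)."""
    result = {}
    for person, scores in weekly_scores.items():
        person_mean = fmean(scores)
        deviations = [s - person_mean for s in scores]
        result[person] = (person_mean, deviations)
    return result


# Made-up nonjudging scores for two participants over 4 weekly assessments.
example = {
    "p1": [2.0, 3.0, 4.0, 3.0],
    "p2": [5.0, 5.0, 5.0, 5.0],
}
parts = split_between_within(example)
```

    A positive deviation in a given week corresponds to the "higher-than-usual" case mentioned in the results.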

  19. Feature maps driven no-reference image quality prediction of authentically distorted images

    NASA Astrophysics Data System (ADS)

    Ghadiyaram, Deepti; Bovik, Alan C.

    2015-03-01

    Current blind image quality prediction models rely on benchmark databases comprised of singly and synthetically distorted images, thereby learning image features that are only adequate to predict human perceived visual quality on such inauthentic distortions. However, real world images often contain complex mixtures of multiple distortions. Rather than a) discounting the effect of these mixtures of distortions on an image's perceptual quality and considering only the dominant distortion or b) using features that are only proven to be efficient for singly distorted images, we deeply study the natural scene statistics of authentically distorted images, in different color spaces and transform domains. We propose a feature-maps-driven statistical approach which avoids any latent assumptions about the type of distortion(s) contained in an image, and focuses instead on modeling the remarkable consistencies in the scene statistics of real world images in the absence of distortions. We design a deep belief network that takes model-based statistical image features derived from a very large database of authentically distorted images as input and discovers good feature representations by generalizing over different distortion types, mixtures, and severities, which are later used to learn a regressor for quality prediction. We demonstrate the remarkable competence of our features for improving automatic perceptual quality prediction on a benchmark database and on the newly designed LIVE Authentic Image Quality Challenge Database and show that our approach of combining robust statistical features and the deep belief network dramatically outperforms the state-of-the-art.

  20. PROSNET: INTEGRATING HOMOLOGY WITH MOLECULAR NETWORKS FOR PROTEIN FUNCTION PREDICTION

    PubMed Central

    Wang, Sheng; Qu, Meng

    2016-01-01

    Automated annotation of protein function has become a critical task in the post-genomic era. Network-based approaches and homology-based approaches have been widely used and recently tested in large-scale community-wide assessment experiments. It is natural to integrate network data with homology information to further improve the predictive performance. However, integrating these two heterogeneous, high-dimensional and noisy datasets is non-trivial. In this work, we introduce a novel protein function prediction algorithm ProSNet. An integrated heterogeneous network is first built to include molecular networks of multiple species and link together homologous proteins across multiple species. Based on this integrated network, a dimensionality reduction algorithm is introduced to obtain compact low-dimensional vectors to encode proteins in the network. Finally, we develop machine learning classification algorithms that take the vectors as input and make predictions by transferring annotations both within each species and across different species. Extensive experiments on five major species demonstrate that our integration of homology with molecular networks substantially improves the predictive performance over existing approaches. PMID:27896959

  1. Improving structure-based function prediction using molecular dynamics

    PubMed Central

    Glazer, Dariya S.; Radmer, Randall J.; Altman, Russ B.

    2009-01-01

    Summary The number of molecules with solved three-dimensional structure but unknown function is increasing rapidly. Particularly problematic are novel folds with little detectable similarity to molecules of known function. Experimental assays can determine the functions of such molecules, but are time-consuming and expensive. Computational approaches can identify potential functional sites; however, these approaches generally rely on single static structures and do not use information about dynamics. In fact, structural dynamics can enhance function prediction: we coupled molecular dynamics simulations with structure-based function prediction algorithms that identify Ca2+ binding sites. When applied to 11 challenging proteins, both methods showed substantial improvement in performance, revealing 22 more sites in one case and 12 more in the other, with a modest increase in apparent false positives. Thus, we show that treating molecules as dynamic entities improves the performance of structure-based function prediction methods. PMID:19604472

  2. Toward Fully in Silico Melting Point Prediction Using Molecular Simulations.

    PubMed

    Zhang, Yong; Maginn, Edward J

    2013-03-12

    Melting point is one of the most fundamental and practically important properties of a compound. Molecular simulation methods have been developed for the accurate computation of melting points. However, all of these methods need an experimental crystal structure as input, which means that such calculations are not really predictive since the melting point can be measured easily in experiments once a crystal structure is known. On the other hand, crystal structure prediction (CSP) has become an active field and significant progress has been made, although challenges still exist. One of the main challenges is the existence of many crystal structures (polymorphs) that are very close in energy. Thermal effects and kinetic factors make the situation even more complicated, such that it is still not trivial to predict experimental crystal structures. In this work, we exploit the fact that free energy differences are often small between crystal structures. We show that accurate melting point predictions can be made by using a reasonable crystal structure from CSP as a starting point for a free energy-based melting point calculation. The key is that most crystal structures predicted by CSP have free energies that are close to that of the experimental structure. The proposed method was tested on two rigid molecules and the results suggest that a fully in silico melting point prediction method is possible.

  3. Toward Fully in Silico Melting Point Prediction Using Molecular Simulations

    SciTech Connect

    Zhang, Y; Maginn, EJ

    2013-03-01

    Melting point is one of the most fundamental and practically important properties of a compound. Molecular simulation methods have been developed for the accurate computation of melting points. However, all of these methods need an experimental crystal structure as input, which means that such calculations are not really predictive since the melting point can be measured easily in experiments once a crystal structure is known. On the other hand, crystal structure prediction (CSP) has become an active field and significant progress has been made, although challenges still exist. One of the main challenges is the existence of many crystal structures (polymorphs) that are very close in energy. Thermal effects and kinetic factors make the situation even more complicated, such that it is still not trivial to predict experimental crystal structures. In this work, we exploit the fact that free energy differences are often small between crystal structures. We show that accurate melting point predictions can be made by using a reasonable crystal structure from CSP as a starting point for a free energy-based melting point calculation. The key is that most crystal structures predicted by CSP have free energies that are close to that of the experimental structure. The proposed method was tested on two rigid molecules and the results suggest that a fully in silico melting point prediction method is possible.

  4. Harnessing Computational Biology for Exact Linear B-Cell Epitope Prediction: A Novel Amino Acid Composition-Based Feature Descriptor.

    PubMed

    Saravanan, Vijayakumar; Gautham, Namasivayam

    2015-10-01

    Proteins embody epitopes that serve as their antigenic determinants. Epitopes occupy a central place in integrative biology, not to mention as targets for novel vaccine, pharmaceutical, and systems diagnostics development. The presence of T-cell and B-cell epitopes has been extensively studied due to their potential in synthetic vaccine design. However, reliable prediction of linear B-cell epitopes remains a formidable challenge. Earlier studies have reported a discrepancy in amino acid composition between epitopes and non-epitopes. Hence, this study proposed and developed a novel amino acid composition-based feature descriptor, Dipeptide Deviation from Expected Mean (DDE), to distinguish linear B-cell epitopes from non-epitopes effectively. In this study, for the first time, only exact linear B-cell epitopes and non-epitopes have been utilized for developing the prediction method, unlike the use of epitope-containing regions in earlier reports. To evaluate the performance of the DDE feature vector, models have been developed with two widely used machine-learning techniques, Support Vector Machine and AdaBoost-Random Forest. Five-fold cross-validation performance of the proposed method with an error-free dataset and datasets from other studies achieved an overall accuracy between nearly 61% and 73%, with balance between sensitivity and specificity metrics. Performance of the DDE feature vector was better (with an accuracy difference of about 2% to 12%) in comparison to other amino acid-derived features on different datasets. This study reflects the efficiency of the DDE feature vector in enhancing linear B-cell epitope prediction performance, compared to other feature representations. The proposed method is made freely available as a stand-alone tool for researchers, particularly for those interested in vaccine design and novel molecular target development for systems therapeutics and diagnostics: https://github.com/brsaran/LBEEP.
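
    The abstract names the DDE descriptor without giving its formula. The sketch below follows the formulation commonly given for this descriptor: observed dipeptide composition, minus a theoretical mean derived from codon counts, divided by the square root of a theoretical variance. Treat the formula details, the function name `dde` and the handling of non-standard residues as assumptions, not as the authors' exact implementation.

```python
from itertools import product
from math import sqrt

# Number of codons per amino acid in the standard genetic code (61 sense codons).
CODONS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}
TOTAL_CODONS = sum(CODONS.values())  # 61
DIPEPTIDES = ["".join(p) for p in product(sorted(CODONS), repeat=2)]  # 400 pairs


def dde(sequence):
    """Return a dict mapping each of the 400 dipeptides to its DDE value.

    Assumed formulation (not taken from the abstract):
      Dc  = count(dipeptide) / n, with n = len(sequence) - 1
      Tm  = (codons(first) / 61) * (codons(second) / 61)
      Tv  = Tm * (1 - Tm) / n
      DDE = (Dc - Tm) / sqrt(Tv)
    """
    n = len(sequence) - 1  # number of overlapping dipeptides
    if n < 1:
        raise ValueError("sequence must contain at least two residues")
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(n):
        pair = sequence[i:i + 2]
        if pair in counts:  # dipeptides with non-standard residues are skipped
            counts[pair] += 1
    features = {}
    for pair in DIPEPTIDES:
        dc = counts[pair] / n
        tm = (CODONS[pair[0]] / TOTAL_CODONS) * (CODONS[pair[1]] / TOTAL_CODONS)
        tv = tm * (1 - tm) / n
        features[pair] = (dc - tm) / sqrt(tv)
    return features
```

    Calling `dde(peptide)` on an epitope or non-epitope string yields a fixed-length 400-value vector, which is the kind of input a Support Vector Machine or ensemble classifier expects.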

  5. Prediction of structural features and application to outer membrane protein identification

    PubMed Central

    Yan, Renxiang; Wang, Xiaofeng; Huang, Lanqing; Yan, Feidi; Xue, Xiaoyu; Cai, Weiwen

    2015-01-01

    Protein three-dimensional (3D) structures provide insightful information in many fields of biology. One-dimensional properties derived from 3D structures such as secondary structure, residue solvent accessibility, residue depth and backbone torsion angles are helpful to protein function prediction, fold recognition and ab initio folding. Here, we predict various structural features with the assistance of neural network learning. Based on an independent test dataset, protein secondary structure prediction generates an overall Q3 accuracy of ~80%. Meanwhile, the prediction of relative solvent accessibility obtains the highest mean absolute error of 0.164, and prediction of residue depth achieves the lowest mean absolute error of 0.062. We further improve the outer membrane protein identification by including the predicted structural features in a scoring function using a simple profile-to-profile alignment. The results demonstrate that the accuracy of outer membrane protein identification can be improved by ~3% at a 1% false positive level when structural features are incorporated. Finally, our methods are available as two convenient and easy-to-use programs. One is PSSM-2-Features for predicting secondary structure, relative solvent accessibility, residue depth and backbone torsion angles, the other is PPA-OMP for identifying outer membrane proteins from proteomes. PMID:26104144
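
    The two evaluation measures quoted above can be made concrete with a minimal sketch. It assumes secondary structure is encoded as a string over the three states H, E and C, and that solvent accessibility or depth values are plain numbers; the function names and the toy inputs are illustrative only.

```python
def q3_accuracy(predicted, observed):
    """Fraction of residues whose three-state label (H, E or C) is predicted correctly."""
    if len(predicted) != len(observed) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return correct / len(observed)


def mean_absolute_error(predicted, observed):
    """Average absolute difference between predicted and observed per-residue values."""
    if len(predicted) != len(observed) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


# Toy example: four of five residues are assigned the correct state.
print(q3_accuracy("HHHEC", "HHHCC"))
```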

  6. Prediction of structural features and application to outer membrane protein identification

    NASA Astrophysics Data System (ADS)

    Yan, Renxiang; Wang, Xiaofeng; Huang, Lanqing; Yan, Feidi; Xue, Xiaoyu; Cai, Weiwen

    2015-06-01

    Protein three-dimensional (3D) structures provide insightful information in many fields of biology. One-dimensional properties derived from 3D structures such as secondary structure, residue solvent accessibility, residue depth and backbone torsion angles are helpful to protein function prediction, fold recognition and ab initio folding. Here, we predict various structural features with the assistance of neural network learning. Based on an independent test dataset, protein secondary structure prediction generates an overall Q3 accuracy of ~80%. Meanwhile, the prediction of relative solvent accessibility obtains the highest mean absolute error of 0.164, and prediction of residue depth achieves the lowest mean absolute error of 0.062. We further improve the outer membrane protein identification by including the predicted structural features in a scoring function using a simple profile-to-profile alignment. The results demonstrate that the accuracy of outer membrane protein identification can be improved by ~3% at a 1% false positive level when structural features are incorporated. Finally, our methods are available as two convenient and easy-to-use programs. One is PSSM-2-Features for predicting secondary structure, relative solvent accessibility, residue depth and backbone torsion angles, the other is PPA-OMP for identifying outer membrane proteins from proteomes.

  7. Widespread convergence in toxin resistance by predictable molecular evolution

    PubMed Central

    Ujvari, Beata; Casewell, Nicholas R.; Sunagar, Kartik; Arbuckle, Kevin; Wüster, Wolfgang; Lo, Nathan; O’Meally, Denis; Beckmann, Christa; King, Glenn F.; Deplazes, Evelyne; Madsen, Thomas

    2015-01-01

    The question about whether evolution is unpredictable and stochastic or intermittently constrained along predictable pathways is the subject of a fundamental debate in biology, in which understanding convergent evolution plays a central role. At the molecular level, documented examples of convergence are rare and limited to occurring within specific taxonomic groups. Here we provide evidence of constrained convergent molecular evolution across the metazoan tree of life. We show that resistance to toxic cardiac glycosides produced by plants and bufonid toads is mediated by similar molecular changes to the sodium-potassium-pump (Na+/K+-ATPase) in insects, amphibians, reptiles, and mammals. In toad-feeding reptiles, resistance is conferred by two point mutations that have evolved convergently on four occasions, whereas evidence of a molecular reversal back to the susceptible state in varanid lizards migrating to toad-free areas suggests that toxin resistance is maladaptive in the absence of selection. Importantly, resistance in all taxa is mediated by replacements of 2 of the 12 amino acids comprising the Na+/K+-ATPase H1–H2 extracellular domain that constitutes a core part of the cardiac glycoside binding site. We provide mechanistic insight into the basis of resistance by showing that these alterations perturb the interaction between the cardiac glycoside bufalin and the Na+/K+-ATPase. Thus, similar selection pressures have resulted in convergent evolution of the same molecular solution across the breadth of the animal kingdom, demonstrating how a scarcity of possible solutions to a selective challenge can lead to highly predictable evolutionary responses. PMID:26372961

  8. Molecular orbital predictions of the vibrational frequencies of some molecular ions

    NASA Technical Reports Server (NTRS)

    Defrees, D. J.; Mclean, A. D.

    1985-01-01

    The initial detections of IR vibration-rotation bands in polyatomic molecular ions by recent spectroscopic advances were guided by ab initio prediction of vibrational frequencies. The present calculations predict the vibrational frequencies of additional ions which are candidates for laboratory analysis. Neutral molecule vibrational frequencies were computed at three levels of theory and then compared with experimental data; the effect of scaling was also investigated, in order to determine how accurately vibrational frequencies could be predicted. For 92 percent of the frequencies examined, the relatively simple HF/6-31G theory's vibrational frequencies were within 100/cm of experimental values, with a mean absolute error of 49/cm. On this basis, the frequencies of 30 molecular ions (many possessing astrophysical significance) were computed.

  9. Perceptual quality prediction on authentically distorted images using a bag of features approach

    PubMed Central

    Ghadiyaram, Deepti; Bovik, Alan C.

    2017-01-01

    Current top-performing blind perceptual image quality prediction models are generally trained on legacy databases of human quality opinion scores on synthetically distorted images. Therefore, they learn image features that effectively predict human visual quality judgments of inauthentic and usually isolated (single) distortions. However, real-world images usually contain complex composite mixtures of multiple distortions. We study the perceptually relevant natural scene statistics of such authentically distorted images in different color spaces and transform domains. We propose a “bag of feature maps” approach that avoids assumptions about the type of distortion(s) contained in an image and instead focuses on capturing consistencies—or departures therefrom—of the statistics of real-world images. Using a large database of authentically distorted images, human opinions of them, and bags of features computed on them, we train a regressor to conduct image quality prediction. We demonstrate the competence of the features toward improving automatic perceptual quality prediction by testing a learned algorithm using them on a benchmark legacy database as well as on a newly introduced distortion-realistic resource called the LIVE In the Wild Image Quality Challenge Database. We extensively evaluate the perceptual quality prediction model and algorithm and show that it is able to achieve good-quality prediction power that is better than other leading models. PMID:28129417

  10. Hypomanic symptoms predict an increase in narcissistic and histrionic personality disorder features in suicidal young adults.

    PubMed

    Shahar, Golan; Scotti, Margaret-Ann; Rudd, M David; Joiner, Thomas E

    2008-01-01

    Consistent with the "scar hypothesis", according to which mood depression might impact personality, we examined the effect of unipolar and hypomanic mood disturbances on cluster B (i.e., narcissistic, histrionic, and borderline) personality disorder features. Data from 113 suicidal young adults were utilized, and cross-lagged associations between unipolar and hypomanic mood disturbances and cluster B personality disorder features were examined using manifest-variable structural equation modeling (SEM). Hypomanic symptoms predicted an increase in narcissistic and histrionic personality disorder features over the Time 1-Time 2 period, as well as an increase in narcissistic personality disorder features over the Time 1-Time 3 period. Unipolar depressive symptoms and borderline features were reciprocally and longitudinally associated, albeit at different time periods. The sample's distinct features restrict generalization of the findings. An exclusive use of self-report measures might have contributed to shared method variance. Results are consistent with the notion that hypomanic symptoms increase narcissistic personality disorder tendencies.

  11. Reliable prediction of adsorption isotherms via genetic algorithm molecular simulation.

    PubMed

    LoftiKatooli, L; Shahsavand, A

    2017-01-01

    Conventional molecular simulation techniques such as grand canonical Monte Carlo (GCMC) strictly rely on purely random search inside the simulation box for predicting the adsorption isotherms. This blind search is usually extremely time demanding for providing a faithful approximation of the real isotherm and in some cases may lead to non-optimal solutions. A novel approach is presented in this article which does not use any of the classical steps of the standard GCMC method, such as displacement, insertion, and removal. The new approach is based on the well-known genetic algorithm to find the optimal configuration for adsorption of any adsorbate on a structured adsorbent under prevailing pressure and temperature. The proposed approach considers the molecular simulation problem as a global optimization challenge. A detailed flow chart of our so-called genetic algorithm molecular simulation (GAMS) method is presented, which is entirely different from traditional molecular simulation approaches. Three real case studies (for adsorption of CO2 and H2 over various zeolites) are borrowed from literature to clearly illustrate the superior performance of the proposed method over the standard GCMC technique. For the present method, the average absolute values of percentage errors are around 11% (RHO-H2), 5% (CHA-CO2), and 16% (BEA-CO2), while they were about 70%, 15%, and 40% for the standard GCMC technique, respectively.

  12. A molecular topology approach to predicting pesticide pollution of groundwater

    USGS Publications Warehouse

    Worrall, Fred

    2001-01-01

    Various models have proposed methods for the discrimination of polluting and nonpolluting compounds on the basis of simple parameters, typically adsorption and degradation constants. However, such attempts are prone to site variability and measurement error to the extent that compounds cannot be reliably classified nor the chemistry of pollution extrapolated from them. Using observations of pesticide occurrence in U.S. groundwater it is possible to show that polluting compounds can be distinguished from nonpolluting compounds purely on the basis of molecular topology. Topological parameters can be derived without measurement error or site-specific variability. A logistic regression model has been developed which explains 97% of the variation in the data, with 86% of the variation being explained by the rule that a compound will be found in groundwater if 6χp < 0.55, where 6χp is the sixth-order molecular path connectivity. One group of compounds cannot be classified by this rule and prediction requires reference to higher order connectivity parameters. The use of molecular approaches for understanding pollution at the molecular level and their application to agrochemical development and risk assessment is discussed.

  13. Ultra-low-molecular-weight heparins: precise structural features impacting specific anticoagulant activities.

    PubMed

    Lima, Marcelo A; Viskov, Christian; Herman, Frederic; Gray, Angel L; de Farias, Eduardo H C; Cavalheiro, Renan P; Sassaki, Guilherme L; Hoppensteadt, Debra; Fareed, Jawed; Nader, Helena B

    2013-03-01

    Ultra-low-molecular-weight heparins (ULMWHs) with better efficacy and safety ratios are under development; however, there are few structural data available. The main structural features and molecular weight of ULMWHs were studied and compared to enoxaparin. Their monosaccharide composition and average molecular weights were determined and preparations studied by nuclear magnetic resonance spectroscopy, scanning ultraviolet spectroscopy, circular dichroism and gel permeation chromatography. In general, ULMWHs presented higher 3-O-sulphated glucosamine and unsaturated uronic acid residues, the latter being comparable with their higher degree of depolymerisation. The analysis showed that ULMWHs are structurally related to LMWHs; however, their monosaccharide/oligosaccharide compositions and average molecular weights differed considerably explaining their different anticoagulant activities. The results relate structural features to activity, assisting the development of new and improved therapeutic agents, based on depolymerised heparin, for the prophylaxis and treatment of thrombotic disorders.

  14. TU-CD-BRB-01: Normal Lung CT Texture Features Improve Predictive Models for Radiation Pneumonitis

    SciTech Connect

    Krafft, S; Briere, T; Court, L; Martel, M

    2015-06-15

    Purpose: Existing normal tissue complication probability (NTCP) models for radiation pneumonitis (RP) traditionally rely on dosimetric and clinical data but are limited in terms of performance and generalizability. Extraction of pre-treatment image features provides a potential new category of data that can improve NTCP models for RP. We consider quantitative measures of total lung CT intensity and texture in a framework for prediction of RP. Methods: Available clinical and dosimetric data was collected for 198 NSCLC patients treated with definitive radiotherapy. Intensity- and texture-based image features were extracted from the T50 phase of the 4D-CT acquired for treatment planning. A total of 3888 features (15 clinical, 175 dosimetric, and 3698 image features) were gathered and considered candidate predictors for modeling of RP grade≥3. A baseline logistic regression model with mean lung dose (MLD) was first considered. Additionally, a least absolute shrinkage and selection operator (LASSO) logistic regression was applied to the set of clinical and dosimetric features, and subsequently to the full set of clinical, dosimetric, and image features. Model performance was assessed by comparing area under the curve (AUC). Results: A simple logistic fit of MLD was an inadequate model of the data (AUC∼0.5). Including clinical and dosimetric parameters within the framework of the LASSO resulted in improved performance (AUC=0.648). Analysis of the full cohort of clinical, dosimetric, and image features provided further and significant improvement in model performance (AUC=0.727). Conclusions: To achieve significant gains in predictive modeling of RP, new categories of data should be considered in addition to clinical and dosimetric features. We have successfully incorporated CT image features into a framework for modeling RP and have demonstrated improved predictive performance. Validation and further investigation of CT image features in the context of RP NTCP

  15. Predictive testing of early CIN behaviour by molecular biomarkers.

    PubMed

    Baak, Jan P A; Kruse, Arnold-Jan; Janssen, Emiel; van Diermen, Bianca

    2005-01-01

    Each year, 330,000 new Cervical Intraepithelial Neoplasias (CIN) occur in the European Union (EU), of which 120,000 are early CIN where grade (1, 2) indicates the progression-risk to CIN-3 and therefore determines the treatment choice. However, the Positive Predictive Value (PPV) of CIN grade to predict progression is low (10% and 20% for CIN-1 and -2 respectively, 16% on average), resulting in an enormous number of over-treatments and indicating worrisome grade reproducibility. Certain molecular biomarkers such as Ki-67 have a higher PPV (30%, an improvement of 14 percentage points), which in Europe alone could improve treatment for many thousands of women per year with considerable cost reduction for the health care system. The quantitative Ki-67 prognostic model has been validated in independent retrospective and prospective studies from different laboratories. Moreover, the PPV of Ki-67 alone can be improved by additional molecular biomarkers (retinoblastoma protein = Rb, cytokeratins = CK-14/-13). Combined Ki67-Rb allows a 2-tiered progression-risk subgroup assignment as very low (approximately 0% progression, 71% of all CIN-I/II patients) and high risk (48% progression risk, incidence 32%), leaving a small (7% of all) prognostically undetermined group (17% progression). Additional CK-14 and -13 analysis can sub-classify the high-risk group into an intermediate and a very high risk subgroup (with 40% and 100% progression risks respectively). Thus, molecular biomarkers are potentially important determinants of early CIN lesion behaviour. Important factors for widespread acceptance of molecular biomarkers are (1) market penetration by user-friendly equipment, (2) (inter)national keeping of GLP conditions (reproducibility, independent validation), requiring customer-driven industrial efforts, governmental measures, and additional PPV improvement to further reduce over-treatment.
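
    The link between PPV and over-treatment can be illustrated with a short calculation. The sketch assumes, for illustration only, that every lesion flagged as likely to progress is treated, so the share of unnecessary treatments is simply 1 minus the PPV; the function names are hypothetical.

```python
def positive_predictive_value(true_positives, false_positives):
    """Share of flagged lesions that actually progress."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("no lesions were flagged")
    return true_positives / flagged


def overtreated_per_100(ppv):
    """Flagged lesions per 100 that would not have progressed (assumes all flagged are treated)."""
    return round(100 * (1 - ppv))


# PPV values quoted in the abstract: 16% for CIN grade on average, 30% for Ki-67.
print(overtreated_per_100(0.16), overtreated_per_100(0.30))
```

    Under that assumption, raising the PPV from 16% to 30% lowers the number of non-progressing lesions treated from 84 to 70 per 100 flagged.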

  16. Prediction of protein secondary structure using probability based features and a hybrid system.

    PubMed

    Ghanty, Pradip; Pal, Nikhil R; Mudi, Rajani K

    2013-10-01

    In this paper, we propose some co-occurrence probability-based features for prediction of protein secondary structure. The features are extracted using occurrence/nonoccurrence of secondary structures in the protein sequences. We explore two types of features: position-specific (based on position of amino acid on fragments of protein sequences) as well as position-independent (independent of amino acid position on fragments of protein sequences). We use a hybrid system, NEUROSVM, consisting of neural networks and support vector machines for classification of secondary structures. We propose two schemes NSVMps and NSVM for protein secondary structure prediction. The NSVMps uses position-specific probability-based features and NEUROSVM classifier whereas NSVM uses the same classifier with position-independent probability-based features. The proposed method falls in the single-sequence category of methods because it does not use any sequence profile information such as position specific scoring matrices (PSSM) derived from PSI-BLAST. Two widely used datasets RS126 and CB513 are used in the experiments. The results obtained using the proposed features and NEUROSVM classifier are better than most of the existing single-sequence prediction methods. Most importantly, the results using NSVMps that are obtained using lower dimensional features, are comparable to those by other existing methods. The NSVMps and NSVM are finally tested on target proteins of the critical assessment of protein structure prediction experiment-9 (CASP9). A larger dataset is used to compare the performance of the proposed methods with that of two recent single-sequence prediction methods. We also investigate the impact of presence of different amino acid residues (in protein sequences) that are responsible for the formation of different secondary structures.

  17. Predictive and Prognostic Molecular Biomarkers for Response to Neoadjuvant Chemoradiation in Rectal Cancer

    PubMed Central

    Dayde, Delphine; Tanaka, Ichidai; Jain, Rekha; Tai, Mei Chee; Taguchi, Ayumu

    2017-01-01

    The standard of care in locally advanced rectal cancer is neoadjuvant chemoradiation (nCRT) followed by radical surgery. Response to nCRT varies among patients and pathological complete response is associated with better outcome. However, there is a lack of effective methods to select rectal cancer patients who would or would not benefit from nCRT. The utility of clinicopathological and radiological features is limited due to lack of adequate sensitivity and specificity. Molecular biomarkers have the potential to predict response to nCRT at an early time point, but none have currently reached the clinic. Integration of diverse types of biomarkers including clinicopathological and imaging features, identification of mechanistic links to tumor biology, and rigorous validation using samples which represent disease heterogeneity will allow the development of a sensitive and cost-effective molecular biomarker panel for precision medicine in rectal cancer. Here, we aim to review recent advances in tissue- and blood-based molecular biomarker research and illustrate their potential in predicting nCRT response in rectal cancer. PMID:28272347

  18. Predictive and Prognostic Molecular Biomarkers for Response to Neoadjuvant Chemoradiation in Rectal Cancer.

    PubMed

    Dayde, Delphine; Tanaka, Ichidai; Jain, Rekha; Tai, Mei Chee; Taguchi, Ayumu

    2017-03-07

    The standard of care in locally advanced rectal cancer is neoadjuvant chemoradiation (nCRT) followed by radical surgery. Response to nCRT varies among patients and pathological complete response is associated with better outcome. However, there is a lack of effective methods to select rectal cancer patients who would or would not benefit from nCRT. The utility of clinicopathological and radiological features is limited due to lack of adequate sensitivity and specificity. Molecular biomarkers have the potential to predict response to nCRT at an early time point, but none have currently reached the clinic. Integration of diverse types of biomarkers including clinicopathological and imaging features, identification of mechanistic links to tumor biology, and rigorous validation using samples which represent disease heterogeneity will allow the development of a sensitive and cost-effective molecular biomarker panel for precision medicine in rectal cancer. Here, we aim to review recent advances in tissue- and blood-based molecular biomarker research and illustrate their potential in predicting nCRT response in rectal cancer.

  19. Adaptive reliance on the most stable sensory predictions enhances perceptual feature extraction of moving stimuli

    PubMed Central

    Kumar, Neeraj

    2016-01-01

    The prediction of the sensory outcomes of action is thought to be useful for distinguishing self- vs. externally generated sensations, correcting movements when sensory feedback is delayed, and learning predictive models for motor behavior. Here, we show that aspects of another fundamental function—perception—are enhanced when they entail the contribution of predicted sensory outcomes and that this enhancement relies on the adaptive use of the most stable predictions available. We combined a motor-learning paradigm that imposes new sensory predictions with a dynamic visual search task to first show that perceptual feature extraction of a moving stimulus is poorer when it is based on sensory feedback that is misaligned with those predictions. This was possible because our novel experimental design allowed us to override the “natural” sensory predictions present when any action is performed and separately examine the influence of these two sources on perceptual feature extraction. We then show that if the new predictions induced via motor learning are unreliable, rather than just relying on sensory information for perceptual judgments, as is conventionally thought, then subjects adaptively transition to using other stable sensory predictions to maintain greater accuracy in their perceptual judgments. Finally, we show that when sensory predictions are not modified at all, these judgments are sharper when subjects combine their natural predictions with sensory feedback. Collectively, our results highlight the crucial contribution of sensory predictions to perception and also suggest that the brain intelligently integrates the most stable predictions available with sensory information to maintain high fidelity in perceptual decisions. PMID:26823516

  20. Scoring multiple features to predict drug disease associations using information fusion and aggregation.

    PubMed

    Moghadam, H; Rahgozar, M; Gharaghani, S

    2016-08-01

    Prediction of drug-disease associations is one of the current fields in drug repositioning that has turned into a challenging topic in pharmaceutical science. Several available computational methods use network-based and machine learning approaches to reposition old drugs for new indications. However, they often ignore features of drugs and diseases as well as the priority and importance of each feature, relation, or interactions between features and the degree of uncertainty. When predicting unknown drug-disease interactions there are diverse data sources and multiple features available that can provide more accurate and reliable results. This information can be collectively mined using data fusion methods and aggregation operators. Therefore, we can use the feature fusion method to make high-level features. We have proposed a computational method named scored mean kernel fusion (SMKF), which uses a new method to score the average aggregation operator called scored mean. To predict novel drug indications, this method systematically combines multiple features related to drugs or diseases at two levels: the drug-drug level and the drug-disease level. The purpose of this study was to investigate the effect of drug and disease features as well as data fusion to predict drug-disease interactions. The method was validated against a well-established drug-disease gold-standard dataset. When compared with the available methods, our proposed method outperformed them and competed well in performance, with an area under the curve (AUC) of 0.91, F-measure of 84.9% and Matthews correlation coefficient of 70.31%.
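
    Of the metrics reported above, the Matthews correlation coefficient is the least familiar; a minimal sketch of its standard definition from confusion-matrix counts follows. The function name is illustrative and the convention of returning 0.0 for an undefined value is an assumption; no counts from the study are used.

```python
from math import sqrt


def matthews_corrcoef(tp, fp, tn, fn):
    """Matthews correlation coefficient from true/false positive/negative counts."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # assumed convention when any marginal total is zero
    return (tp * tn - fp * fn) / denominator
```

    The value ranges from -1 (every prediction wrong) through 0 (no better than chance) to 1 (every prediction right), and unlike accuracy it stays informative when known associations are far rarer than non-associations.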

  1. Rigorous assessment and integration of the sequence and structure based features to predict hot spots

    PubMed Central

    2011-01-01

    Background: Systematic mutagenesis studies have shown that only a few interface residues, termed hot spots, contribute significantly to the binding free energy of protein-protein interactions. Therefore, hot spot prediction becomes increasingly important for understanding the essence of protein interactions and for helping narrow down the search space for drug design. Currently many computational methods have been developed by proposing different features. However, comparative assessment of these features and, furthermore, effective and accurate methods are still in pressing need. Results: In this study, we first comprehensively collect the features to discriminate hot spots and non-hot spots and analyze their distributions. We find that hot spots have lower relASA and larger relative change in ASA, suggesting hot spots tend to be protected from bulk solvent. In addition, hot spots have more contacts including hydrogen bonds, salt bridges, and atomic contacts, which favor complex formation. Interestingly, we find that conservation score and sequence entropy are not significantly different between hot spots and non-hot spots in the Ab+ dataset (all complexes), while in the Ab- dataset (antigen-antibody complexes are excluded) there are significant differences in these two features between hot spots and non-hot spots. Secondly, we explore the predictive ability for each feature and the combinations of features by support vector machines (SVMs). The results indicate that sequence-based feature outperforms other combinations of features with reasonable accuracy, with a precision of 0.69, a recall of 0.68, an F1 score of 0.68, and an AUC of 0.68 on independent test set. Compared with other machine learning methods and two energy-based approaches, our approach achieves the best performance. Moreover, we demonstrate the applicability of our method to predict hot spots of two protein complexes. Conclusion: Experimental results show that support vector machine classifiers are quite
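
    The reported precision, recall and F1 score can be checked against each other, since F1 is the harmonic mean of precision and recall. A minimal sketch (the function name is illustrative):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract for the independent test set.
print(round(f1_score(0.69, 0.68), 2))
```

    With a precision of 0.69 and a recall of 0.68 this gives approximately 0.685, consistent with the F1 score of 0.68 quoted in the abstract.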

  2. Stable feature selection for clinical prediction: exploiting ICD tree structure using Tree-Lasso.

    PubMed

    Kamkar, Iman; Gupta, Sunil Kumar; Phung, Dinh; Venkatesh, Svetha

    2015-02-01

    Modern healthcare is being reshaped by growing Electronic Medical Records (EMR). Recently, these records have been shown to be of great value for building clinical prediction models. In EMR data, patients' diseases and hospital interventions are captured through a set of diagnosis and procedure codes. These codes are usually represented in a tree form (e.g. ICD-10 tree) and the codes within a tree branch may be highly correlated. These codes can be used as features to build a prediction model, and an appropriate feature selection can inform a clinician about important risk factors for a disease. Traditional feature selection methods (e.g. Information Gain, T-test, etc.) consider each variable independently and usually end up with a long feature list. Recently, Lasso and related l1-penalty based feature selection methods have become popular due to their joint feature selection property. However, Lasso is known to select one feature at random from among many correlated features. This hinders clinicians from arriving at a stable feature set, which is crucial for the clinical decision making process. In this paper, we solve this problem by using a recently proposed Tree-Lasso model. Since the stability behavior of Tree-Lasso is not well understood, we study the stability behavior of Tree-Lasso and compare it with other feature selection methods. Using a synthetic and two real-world datasets (Cancer and Acute Myocardial Infarction), we show that Tree-Lasso based feature selection is significantly more stable than Lasso and comparable to other methods e.g. Information Gain, ReliefF and T-test. We further show that, using different types of classifiers such as logistic regression, naive Bayes, support vector machines, decision trees and Random Forest, the classification performance of Tree-Lasso is comparable to Lasso and better than other methods. Our result has implications in identifying stable risk factors for many healthcare problems and therefore can
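
    The abstract does not say which stability measure was used. One common way to quantify the stability of feature selection, shown here purely as an assumed illustration, is the average pairwise Jaccard similarity between the feature sets selected on different resamples of the data. The function names and the example codes are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two feature sets (1.0 when both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def selection_stability(feature_sets):
    """Mean pairwise Jaccard similarity across feature sets from repeated runs."""
    pairs = list(combinations(feature_sets, 2))
    if not pairs:
        raise ValueError("need at least two feature sets")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical selections from three resamples; the codes are placeholders.
runs = [{"I21", "I50"}, {"I21", "I50"}, {"E11"}]
print(selection_stability(runs))
```

    A value near 1 means the same features are chosen regardless of the resample, which is the property the abstract argues clinicians need; a method that picks arbitrarily among correlated codes would score lower.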

  3. Predicting the types of metabolic pathway of compounds using molecular fragments and sequential minimal optimization.

    PubMed

    Chen, Lei; Chu, Chen; Feng, Kaiyan

    2016-01-01

    A metabolic pathway is a series of biological processes that provides the molecules and energy an organism needs and can be essential to its survival. Most metabolic pathways involve compounds, and given a compound it is helpful to know which types of metabolic pathways it participates in. In this study, compounds are first represented by molecular fragments, which are then passed to a prediction engine called Sequential Minimal Optimization (SMO). Maximum relevance and minimum redundancy (mRMR) and incremental feature selection are adopted to extract key features, on which an optimal prediction engine is built. The proposed method is effective compared with random forest, Dagging and a popular method that integrates chemical-chemical interactions and chemical-chemical similarities. We also make predictions for some compounds with unknown metabolic pathways and choose 17 compounds for analysis. The results indicate that the proposed method may become a useful tool for predicting and analyzing metabolic pathways.
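
    The mRMR ranking step mentioned above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes discrete features, uses the "relevance minus mean redundancy" form of the criterion, and the names `mutual_information`, `mrmr_rank` and the toy data are hypothetical. Incremental feature selection would then train a classifier on the top 1, 2, 3, ... ranked features and keep the best-performing prefix; that step is not shown.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equally long discrete sequences."""
    n = len(xs)
    count_x, count_y = Counter(xs), Counter(ys)
    count_xy = Counter(zip(xs, ys))
    total = 0.0
    for (vx, vy), c in count_xy.items():
        p_xy = c / n
        total += p_xy * math.log2(p_xy / ((count_x[vx] / n) * (count_y[vy] / n)))
    return total


def mrmr_rank(features, labels):
    """Greedy mRMR ordering: relevance to the label minus mean redundancy
    with the features already selected (assumed form of the criterion)."""
    relevance = {name: mutual_information(col, labels) for name, col in features.items()}
    selected = []
    remaining = sorted(features)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(
            mutual_information(features[name], features[s]) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while remaining:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: "b" duplicates "a", "c" is weaker but complementary.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "a": [0, 0, 0, 0, 1, 1, 1, 0],
    "b": [0, 0, 0, 0, 1, 1, 1, 0],
    "c": [0, 1, 0, 0, 1, 0, 1, 1],
}
ranking = mrmr_rank(features, labels)
```

    In this toy case the duplicate feature "b" is pushed to the end of the ranking even though it is as relevant as "a", which is the behaviour the redundancy term is meant to produce.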

  4. Well-characterized sequence features of eukaryote genomes and implications for ab initio gene prediction.

    PubMed

    Huang, Ying; Chen, Shi-Yi; Deng, Feilong

    2016-01-01

    In silico analysis of DNA sequences is an important area of computational biology in the post-genomic era. Over the past two decades, computational approaches for ab initio prediction of gene structure from genome sequence alone have greatly advanced our understanding of a variety of biological questions. Although the computational prediction of protein-coding genes is well established, robustly finding non-coding RNA genes, such as miRNA and lncRNA, remains a challenge. The two main aspects of ab initio gene prediction are the computed values that describe sequence features and the algorithm used to train the discriminant function; various bioinformatic tools employ different combinations of the two. Herein, we briefly review these well-characterized sequence features in eukaryote genomes and their applications to ab initio gene prediction. The main purpose of this article is to provide an overview for beginners who aim to develop related bioinformatic tools.

  5. Prediction of pesticides chromatographic lipophilicity from the computational molecular descriptors.

    PubMed

    Casoni, Dorina; Petre, Jana; David, Victor; Sârbu, Costel

    2011-02-01

    Quantitative structure-property relationship models were developed for predicting the lipophilicity of pesticides and some PAH compounds, based on a wide set of computational molecular descriptors and a set of experimental chromatographic data. The chromatographic lipophilicity of pesticides was evaluated by high-performance liquid chromatography (HPLC) using different chemically bonded stationary phases (C18, C8, CN and Phenyl HPLC columns) and two different organic modifiers (methanol and acetonitrile) in the mobile phase. Through a systematic study using classic multivariate analysis, several multi-dimensional quantitative structure-property/lipophilicity models were established. Multiple linear regression and a genetic algorithm for variable subset selection were used. The internal and external statistical evaluation procedures revealed some appropriate models for predicting the chromatographic lipophilicity of pesticides. Moreover, the statistical parameters of regression, together with t-tests on the intercept (a(0)) and the slope (a(1)) of the relationship between experimental and predicted octanol-water partition coefficients for the test set compounds, revealed two statistically valid models that can be successfully used for lipophilicity prediction of pesticides.
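
    The intercept and slope check described above can be illustrated with an ordinary least-squares fit of experimental against predicted values. This is a sketch under stated assumptions: the abstract does not say which variable is regressed on which or which null hypotheses are tested, so the choices here (experimental on predicted, a(0) = 0 and a(1) = 1) are assumptions, and the function name `fit_line` and the four data points are made up for illustration.

```python
import math


def fit_line(x, y):
    """Least-squares fit y = a0 + a1 * x.

    Returns (a0, a1, t_a0, t_a1), where t_a0 is the t statistic for
    H0: a0 = 0 and t_a1 is the t statistic for H0: a1 = 1 (assumed hypotheses).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a1 = sxy / sxx
    a0 = mean_y - a1 * mean_x
    # Residual variance with n - 2 degrees of freedom.
    sse = sum((yi - (a0 + a1 * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)
    se_a1 = math.sqrt(s2 / sxx)
    se_a0 = math.sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx))
    return a0, a1, a0 / se_a0, (a1 - 1.0) / se_a1


# Hypothetical test-set values (not taken from the paper).
predicted = [1.0, 2.0, 3.0, 4.0]
experimental = [1.1, 1.9, 3.2, 3.8]
a0, a1, t_a0, t_a1 = fit_line(predicted, experimental)
```

    Small absolute t statistics, compared against a t distribution with n - 2 degrees of freedom, would indicate that the intercept and slope are statistically indistinguishable from 0 and 1.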

  6. Cellular automata with object-oriented features for parallel molecular network modeling.

    PubMed

    Zhu, Hao; Wu, Yinghui; Huang, Sui; Sun, Yan; Dhar, Pawan

    2005-06-01

    Cellular automata are an important modeling paradigm for studying the dynamics of large, parallel systems composed of multiple, interacting components. However, to model biological systems, cellular automata need to be extended beyond the large-scale parallelism and intensive communication in order to capture two fundamental properties characteristic of complex biological systems: hierarchy and heterogeneity. This paper proposes extensions to a cellular automata language, Cellang, to meet this purpose. The extended language, with object-oriented features, can be used to describe the structure and activity of parallel molecular networks within cells. Capabilities of this new programming language include object structure to define molecular programs within a cell, floating-point data type and mathematical functions to perform quantitative computation, message passing capability to describe molecular interactions, as well as new operators, statements, and built-in functions. We discuss relevant programming issues of these features, including the object-oriented description of molecular interactions with molecule encapsulation, message passing, and the description of heterogeneity and anisotropy at the cell and molecule levels. By enabling the integration of modeling at the molecular level with system behavior at cell, tissue, organ, or even organism levels, the program will help improve our understanding of how complex and dynamic biological activities are generated and controlled by parallel functioning of molecular networks. Index Terms-Cellular automata, modeling, molecular network, object-oriented.

  7. Multi-Center Prediction of Hemorrhagic Transformation in Acute Ischemic Stroke using Permeability Imaging Features

    PubMed Central

    Scalzo, Fabien; Alger, Jeffry R.; Hu, Xiao; Saver, Jeffrey L.; Dani, Krishna A.; Muir, Keith W.; Demchuk, Andrew M.; Coutts, Shelagh B.; Luby, Marie; Warach, Steven; Liebeskind, David S.

    2013-01-01

    Permeability images derived from magnetic resonance (MR) perfusion images are sensitive to blood-brain barrier derangement of the brain tissue and have been shown to correlate with subsequent development of hemorrhagic transformation (HT) in acute ischemic stroke. This paper presents a multi-center retrospective study that evaluates the power of six permeability MRI measures to predict HT: contrast slope (CS), final contrast (FC), maximum peak bolus concentration (MPB), peak bolus area (PB), relative recirculation (rR), and percentage recovery (%R). Dynamic T2*-weighted perfusion MR images were collected from 263 acute ischemic stroke patients from four medical centers. An essential aspect of this study is the use of a classifier-based framework to automatically identify predictive patterns in the overall intensity distribution of the permeability maps. The model is based on normalized intensity histograms that are used as input features to the predictive model. Linear and nonlinear predictive models are evaluated using cross-validation to measure generalization power on new patients, and a comparative analysis is provided for the different types of parameters. Results demonstrate that perfusion imaging in acute ischemic stroke can predict HT with an average accuracy of more than 85% using a predictive model based on a nonlinear regression model. Results also indicate that the permeability feature based on the percentage of recovery performs significantly better than the other features. This novel model may be used to refine treatment decisions in acute stroke. PMID:23587928
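
    A normalized intensity histogram of the kind used as model input above can be computed as in the sketch below. The bin count, the intensity range and the function name `normalized_histogram` are illustrative assumptions; the abstract does not specify them.

```python
def normalized_histogram(values, n_bins, lo, hi):
    """Return the fraction of in-range values falling in each of n_bins
    equal-width bins over [lo, hi]; the fractions sum to 1."""
    if hi <= lo or n_bins < 1:
        raise ValueError("need hi > lo and at least one bin")
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    kept = 0
    for v in values:
        if v < lo or v > hi:
            continue  # ignore voxels outside the chosen intensity range
        # The top edge (v == hi) is assigned to the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
        kept += 1
    if kept == 0:
        return [0.0] * n_bins
    return [c / kept for c in counts]


# Hypothetical voxel values from one permeability map.
toy_map = [0.1, 0.2, 0.2, 0.8, 0.9, 1.0]
features = normalized_histogram(toy_map, n_bins=4, lo=0.0, hi=1.0)
```

    Because the bins are normalized by the number of voxels, maps of different sizes yield feature vectors on the same scale.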

  8. Unbiased Prediction and Feature Selection in High-Dimensional Survival Regression

    PubMed Central

    Laimighofer, Michael; Krumsiek, Jan; Theis, Fabian J.

    2016-01-01

    With widespread availability of omics profiling techniques, the analysis and interpretation of high-dimensional omics data, for example, for biomarkers, is becoming an increasingly important part of clinical medicine because such datasets constitute a promising resource for predicting survival outcomes. However, early experience has shown that biomarkers often generalize poorly. Thus, it is crucial that models are not overfitted and give accurate results with new data. In addition, reliable detection of multivariate biomarkers with high predictive power (feature selection) is of particular interest in clinical settings. We present an approach that addresses both aspects in high-dimensional survival models. Within a nested cross-validation (CV), we fit a survival model, evaluate a dataset in an unbiased fashion, and select features with the best predictive power by applying a weighted combination of CV runs. We evaluate our approach using simulated toy data, as well as three breast cancer datasets, to predict the survival of breast cancer patients after treatment. In all datasets, we achieve more reliable estimation of predictive power for unseen cases and better predictive performance compared to the standard CoxLasso model. Taken together, we present a comprehensive and flexible framework for survival models, including performance estimation, final feature selection, and final model construction. The proposed algorithm is implemented in an open source R package (SurvRank) available on CRAN. PMID:26894327
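
    The nested cross-validation structure referred to above can be sketched as index bookkeeping: the outer loop holds out a test fold for unbiased evaluation, and the inner loop splits only the remaining samples for model and feature selection. This sketch is not the SurvRank algorithm; it omits the survival model and the weighted combination of CV runs, and the function names are hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n_samples // k + (1 if i < n_samples % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def nested_cv_splits(n_samples, outer_k, inner_k):
    """Yield (outer_train, outer_test, inner_splits) for each outer fold.

    inner_splits is a list of (fit, validate) index lists drawn only from
    outer_train, so the outer test fold never influences model selection.
    """
    for test in k_fold_indices(n_samples, outer_k):
        test_set = set(test)
        train = [i for i in range(n_samples) if i not in test_set]
        inner = []
        for fold in k_fold_indices(len(train), inner_k):
            validate = [train[j] for j in fold]
            validate_set = set(validate)
            fit = [i for i in train if i not in validate_set]
            inner.append((fit, validate))
        yield train, test, inner


splits = list(nested_cv_splits(n_samples=10, outer_k=5, inner_k=2))
```

    In practice the samples would be shuffled (and possibly stratified by event status) before splitting; contiguous folds are used here only to keep the example easy to check by hand.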

  9. Survival Prediction and Feature Selection in Patients with Breast Cancer Using Support Vector Regression

    PubMed Central

    Goli, Shahrbanoo; Faradmal, Javad; Mashayekhi, Hoda; Soltanian, Ali-Reza

    2016-01-01

    The Support Vector Regression (SVR) model has been broadly used for response prediction. However, few researchers have used SVR for survival analysis. In this study, a new SVR model is proposed, and SVR with different kernels and the traditional Cox model are trained. The models are compared based on different performance measures. We also select the best subset of features using three feature selection methods: a combination of SVR and statistical tests, univariate feature selection based on the concordance index, and recursive feature elimination. The evaluations are performed using available medical datasets and also a Breast Cancer (BC) dataset consisting of 573 patients who visited the Oncology Clinic of Hamadan province in Iran. Results show that, for the BC dataset, survival time can be predicted more accurately by linear SVR than nonlinear SVR. Based on the three feature selection methods, metastasis status, progesterone receptor status, and human epidermal growth factor receptor 2 status are the best features associated with survival. Also, according to the obtained results, performance of linear and nonlinear kernels is comparable. The proposed SVR model performs similarly to or slightly better than other models. Also, SVR performs similarly to or better than Cox when all features are included in the model. PMID:27882074
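
    The concordance index used above for univariate feature selection can be computed as in the sketch below. It follows the common Harrell-style definition for right-censored data; whether the paper uses exactly this variant is an assumption, and the toy data are invented.

```python
def concordance_index(times, predicted, events):
    """Fraction of comparable pairs ordered correctly by the prediction.

    times:     observed follow-up times
    predicted: predicted survival times (larger means longer survival)
    events:    1 if the event was observed, 0 if the record is censored
    A pair (i, j) is comparable when i had an observed event and a shorter
    observed time than j. Tied predictions count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored record cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if predicted[i] < predicted[j]:
                    concordant += 1.0
                elif predicted[i] == predicted[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: the second patient is censored at t = 10.
times = [5, 10, 12, 20]
events = [1, 0, 1, 1]
predicted = [6.0, 9.0, 15.0, 14.0]
c_index = concordance_index(times, predicted, events)
```

    A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering; here one of the four comparable pairs is ordered incorrectly.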

  10. Endosonographic features predictive of benign and malignant gastrointestinal stromal cell tumours

    PubMed Central

    Palazzo, L; Landi, B; Cellier, C; Cuillerier, E; Roseau, G; Barbier, J

    2000-01-01

    BACKGROUND/AIM: Some endoscopic ultrasonographic (EUS) features have been reported to be suggestive of malignancy in gastrointestinal stromal cell tumours (SCTs). The aim of this study was to assess the predictive value of these features for malignancy.
    METHODS: A total of 56 histologically proven cases of SCT studied by EUS between 1989 and 1996 were reviewed. There were 42 gastric tumours, 12 oesophageal tumours, and two rectal tumours. The tumours were divided into two groups: (a) benign SCT, comprising benign leiomyoma (n = 34); (b) malignant or borderline SCT (n = 22), comprising leiomyosarcoma (n = 9), leiomyoblastoma (n = 9), and leiomyoma of uncertain malignant potential (n = 4). The main EUS features recorded were tumour size, ulceration, echo pattern, cystic spaces, extraluminal margins, and lymph nodes with a malignant pattern. The two groups were compared by univariate and multivariate analysis.
    RESULTS: Irregular extraluminal margins, cystic spaces, and lymph nodes with a malignant pattern were most predictive of malignant or borderline SCT. Pairwise combinations of the three features had a specificity and positive predictive value of 100% for malignant or borderline SCT, but a sensitivity of only 23%. The presence of at least one of these three criteria had 91% sensitivity, 88% specificity, and 83% positive predictive value. In multivariate analysis, cystic spaces and irregular margins were the only two features independently predictive of malignant potential. The features most predictive of benign SCTs were regular margins, tumour size ⩽30 mm, and a homogeneous echo pattern. When the three features were combined, histology confirmed a benign SCT in all cases.
    CONCLUSIONS: The combined presence of two out of three EUS features (irregular extraluminal margins, cystic spaces, and lymph nodes with a malignant pattern) had a positive predictive value of 100% for malignant or borderline gastrointestinal SCT. Tumours less than 30
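
    The sensitivity, specificity and positive predictive value quoted in the results for "at least one of these three criteria" follow from a 2x2 table. The abstract does not give the cell counts; the counts below are an assumption, chosen because they are consistent with the stated group sizes (22 malignant or borderline, 34 benign) and round to the reported percentages.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, positive predictive value) as fractions."""
    sensitivity = tp / (tp + fn)   # detected among truly malignant or borderline
    specificity = tn / (tn + fp)   # cleared among truly benign
    ppv = tp / (tp + fp)           # truly malignant among positive calls
    return sensitivity, specificity, ppv


# Assumed counts (not stated in the abstract): 20 + 2 = 22 malignant or
# borderline tumours, 30 + 4 = 34 benign tumours.
sens, spec, ppv = diagnostic_metrics(tp=20, fn=2, tn=30, fp=4)
percentages = [round(100 * value) for value in (sens, spec, ppv)]
```

    Under the same assumption, the 23% sensitivity reported for pairwise combinations would correspond to 5 of the 22 malignant or borderline tumours.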

  11. Patient feature based dosimetric Pareto front prediction in esophageal cancer radiotherapy

    SciTech Connect

    Wang, Jiazhou; Zhao, Kuaike; Peng, Jiayuan; Xie, Jiang; Chen, Junchao; Zhang, Zhen; Hu, Weigang; Jin, Xiance; Studenski, Matthew

    2015-02-15

    Purpose: To investigate the feasibility of the dosimetric Pareto front (PF) prediction based on patient’s anatomic and dosimetric parameters for esophageal cancer patients. Methods: Eighty esophagus patients in the authors’ institution were enrolled in this study. A total of 2928 intensity-modulated radiotherapy plans were obtained and used to generate PF for each patient. On average, each patient had 36.6 plans. The anatomic and dosimetric features were extracted from these plans. The mean lung dose (MLD), mean heart dose (MHD), spinal cord max dose, and PTV homogeneity index were recorded for each plan. Principal component analysis was used to extract overlap volume histogram (OVH) features between PTV and other organs at risk. The full dataset was separated into two parts: a training dataset and a validation dataset. The prediction outcomes were the MHD and MLD. Spearman’s rank correlation coefficient was used to evaluate the correlation between the anatomical features and dosimetric features. The stepwise multiple regression method was used to fit the PF. The cross validation method was used to evaluate the model. Results: With 1000 repetitions, the mean prediction error of the MHD was 469 cGy. The most correlated factors were the first principal components of the OVH between heart and PTV and the overlap between heart and PTV in Z-axis. The mean prediction error of the MLD was 284 cGy. The most correlated factors were the first principal components of the OVH between heart and PTV and the overlap between lung and PTV in Z-axis. Conclusions: It is feasible to use patients’ anatomic and dosimetric features to generate a predicted Pareto front. Additional samples and further studies are required to improve the prediction model.
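
    Spearman's rank correlation, used above to relate anatomical and dosimetric features, is the Pearson correlation of the ranks. The sketch below uses average ranks for ties; the variable names and values are hypothetical and are not data from the study.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values: heart/PTV overlap along Z (cm) and mean heart dose (cGy).
overlap_z = [1.0, 2.5, 3.0, 4.2, 5.1]
mean_heart_dose = [100, 180, 260, 400, 900]
rho = spearman(overlap_z, mean_heart_dose)
```

    Because only the ordering matters, the strongly nonlinear but monotonic toy relationship still yields a coefficient of 1.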

  12. Clinical, pathologic, and molecular features of early-onset colorectal carcinoma.

    PubMed

    Yantiss, Rhonda K; Goodarzi, Mahmoud; Zhou, Xi K; Rennert, Hanna; Pirog, Edyta C; Banner, Barbara F; Chen, Yao-Tseng

    2009-04-01

    The incidence of colorectal carcinoma has increased among patients <40 years of age for unclear reasons. In this study, we describe the clinical, pathologic, and molecular features of colorectal carcinomas that developed in young patients. We compiled a study group of 24 patients <40 years of age with colorectal carcinoma, and 45 patients ≥40 years of age served as controls. Cases were evaluated for clinical risk factors of malignancy and pathologic features predictive of outcome. The tumors were immunohistochemically stained for O6-methylguanine methyltransferase, MLH-1, MSH-2, MSH-6, beta-catenin, chemokine (C-X-C motif) receptor 4, epidermal growth factor receptor, TP53, p16, survivin, and alpha-methylacyl-CoA racemase; assessed for microsatellite instability and mutations in beta-catenin, APC, EGFR, PIK3CA, KRAS, and BRAF; evaluated for micro-RNA expression (miR-21, miR-20a, miR-183, miR-192, miR-145, miR-106a, miR-181b, and miR-203); and examined for evidence of human papillomavirus infection. One study patient each had ulcerative colitis and hereditary nonpolyposis colorectal cancer. Ninety-two percent of tumors from young patients occurred in the distal colon (P=0.006), particularly the rectum (58%, P=0.02), and 75% were stage III or IV. Tumors from young patients showed more frequent lymphovascular (81%, P=0.03) and/or venous (48%, P=0.003) invasion, an infiltrative growth pattern (81%, P=0.03), and alpha-methylacyl-CoA racemase expression (83%, P=0.02) compared with controls. Carcinomas in this group showed significantly increased expression of miR-21, miR-20a, miR-145, miR-181b, and miR-203 (P≤0.005 for all comparisons with controls). These results indicate that early-onset carcinomas commonly show pathologic features associated with aggressive behavior. Posttranslational regulation of mRNA and subsequent protein expression may be particularly important to the development of colorectal carcinomas in young patients.

  13. PROSPER: An Integrated Feature-Based Tool for Predicting Protease Substrate Cleavage Sites

    PubMed Central

    Perry, Andrew J.; Akutsu, Tatsuya; Webb, Geoffrey I.; Whisstock, James C.; Pike, Robert N.

    2012-01-01

    The ability to catalytically cleave protein substrates after synthesis is fundamental for all forms of life. Accordingly, site-specific proteolysis is one of the most important post-translational modifications. The key to understanding the physiological role of a protease is to identify its natural substrate(s). Knowledge of the substrate specificity of a protease can dramatically improve our ability to predict its target protein substrates, but this information must be utilized in an effective manner in order to efficiently identify protein substrates by in silico approaches. To address this problem, we present PROSPER, an integrated feature-based server for in silico identification of protease substrates and their cleavage sites for twenty-four different proteases. PROSPER utilizes established specificity information for these proteases (derived from the MEROPS database) with a machine learning approach to predict protease cleavage sites by using different, but complementary sequence and structure characteristics. Features used by PROSPER include local amino acid sequence profile, predicted secondary structure, solvent accessibility and predicted native disorder. Thus, for proteases with known amino acid specificity, PROSPER provides a convenient, pre-prepared tool for use in identifying protein substrates for the enzymes. Systematic prediction analysis for the twenty-four proteases thus far included in the database revealed that the features we have included in the tool strongly improve performance in terms of cleavage site prediction, as evidenced by their contribution to performance improvement in terms of identifying known cleavage sites in substrates for these enzymes. In comparison with two state-of-the-art prediction tools, PoPS and SitePrediction, PROSPER achieves greater accuracy and coverage. To our knowledge, PROSPER is the first comprehensive server capable of predicting cleavage sites of multiple proteases within a single substrate sequence using

  14. DYNAMICS OF ATOMIC AND MOLECULAR EMISSION FEATURES FROM NANOSECOND, FEMTOSECOND LASER AND FILAMENT PRODUCED PLASMAS

    SciTech Connect

    Harilal, Sivanandan S.; Yeak, J.; Brumfield, Brian E.; Phillips, Mark C.

    2016-08-08

    In this presentation, the persistence of atomic and molecular emission features and their relation to fundamental properties (temperature and density) of ablation plumes generated using various irradiation methods (ns, fs, filaments) will be discussed in detail, along with their implications for remote sensing applications.

  15. Critical Features Predicting Sustained Implementation of School-Wide Positive Behavioral Interventions and Supports

    ERIC Educational Resources Information Center

    Mathews, Susanna; McIntosh, Kent; Frank, Jennifer L.; May, Seth L.

    2014-01-01

    The current study explored the extent to which a common measure of perceived implementation of critical features of Positive Behavioral Interventions and Supports (PBIS) predicted fidelity of implementation 3 years later. Respondents included school personnel from 261 schools across the United States implementing PBIS. School teams completed the…

  16. Critical Features Predicting Sustained Implementation of School-Wide Positive Behavior Support

    ERIC Educational Resources Information Center

    Mathews, Susanna; McIntosh, Kent; Frank, Jennifer; May, Seth

    2014-01-01

    The current study explored the extent to which a common measure of perceived implementation of critical features of School-wide Positive Behavior Support (SWPBS) predicted fidelity of implementation 3 years later. Respondents included school personnel from 261 schools across the United States implementing SWPBS. School teams completed the…

  17. Synergistic combination of clinical and imaging features predicts abnormal imaging patterns of pulmonary infections

    PubMed Central

    Bagci, Ulas; Jaster-Miller, Kirsten; Olivier, Kenneth N.; Yao, Jianhua; Mollura, Daniel J.

    2013-01-01

    We designed and tested a novel hybrid statistical model that accepts radiologic image features and clinical variables, and integrates this information in order to automatically predict abnormalities in chest computed-tomography (CT) scans and identify potentially important infectious disease biomarkers. In 200 patients, 160 with various pulmonary infections and 40 healthy controls, we extracted 34 clinical variables from laboratory tests and 25 textural features from CT images. From the CT scans, pleural effusion (PE), linear opacity (or thickening) (LT), tree-in-bud (TIB), pulmonary nodules, ground glass opacity (GGO), and consolidation abnormality patterns were analyzed and predicted through clinical, textural (imaging), or combined attributes. The presence and severity of each abnormality pattern was validated by visual analysis of the CT scans. The proposed biomarker identification system included two important steps: (i) a coarse identification of an abnormal imaging pattern by adaptively selected features (AmRMR), and (ii) a fine selection of the most important features from the previous step, and assigning them as biomarkers, depending on the prediction accuracy. Selected biomarkers were used to classify normal and abnormal patterns by using a boosted decision tree (BDT) classifier. For all abnormal imaging patterns, an average prediction accuracy of 76.15% was obtained. Experimental results demonstrated that our proposed biomarker identification approach is promising and may advance the data processing in clinical pulmonary infection research and diagnostic techniques. PMID:23930819

  18. Improved Species-Specific Lysine Acetylation Site Prediction Based on a Large Variety of Features Set

    PubMed Central

    Wuyun, Qiqige; Zheng, Wei; Zhang, Yanping; Ruan, Jishou; Hu, Gang

    2016-01-01

    Lysine acetylation is a major post-translational modification. It plays a vital role in numerous essential biological processes, such as gene expression and metabolism, and is related to some human diseases. To fully understand the regulatory mechanism of acetylation, identifying acetylation sites is the first and most important step. However, experimental identification of protein acetylation sites is often time consuming and expensive. Therefore, alternative computational methods are necessary. Here, we developed a novel tool, KA-predictor, to predict species-specific lysine acetylation sites based on a support vector machine (SVM) classifier. We incorporated different types of features and employed an efficient feature selection on each type to form the final optimal feature set for model learning. Our predictor was highly competitive for the majority of species when compared with other methods. Feature contribution analysis indicated that HSE features, which were introduced here for the first time for lysine acetylation prediction, significantly improved the predictive performance. In particular, we constructed a highly accurate structure dataset of H.sapiens from PDB to analyze the structural properties around lysine acetylation sites. Our datasets and a user-friendly local tool of KA-predictor are freely available at http://sourceforge.net/p/ka-predictor. PMID:27183223

  19. Prediction of Protein Structural Class Based on Gapped-Dipeptides and a Recursive Feature Selection Approach.

    PubMed

    Liu, Taigang; Qin, Yufang; Wang, Yongjie; Wang, Chunhua

    2015-12-24

    The prior knowledge of protein structural class may offer useful clues on understanding its functionality as well as its tertiary structure. Though various significant efforts have been made to find a fast and effective computational approach to address this problem, it is still a challenging topic in the field of bioinformatics. The position-specific score matrix (PSSM) profile has been shown to provide a useful source of information for improving the prediction performance of protein structural class. However, this information has not been adequately explored. To this end, in this study, we present a feature extraction technique which is based on gapped-dipeptides composition computed directly from PSSM. Then, a careful feature selection technique is performed based on support vector machine-recursive feature elimination (SVM-RFE). These optimal features are selected to construct a final predictor. The results of jackknife tests on four working datasets show that our method obtains satisfactory prediction accuracies by extracting features solely based on PSSM and could serve as a very promising tool to predict protein structural class.
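
    One way to compute gapped-dipeptide features directly from a PSSM is sketched below. The abstract does not give the formula, so the definition used here is an assumption: for each pair of PSSM columns (i, j), the products of the score for column i at position k and the score for column j at position k + gap + 1 are averaged over the sequence. The function name and the toy matrix are hypothetical; a real PSSM has 20 columns, giving 400 features per gap, which SVM-RFE could then prune.

```python
def gapped_dipeptide_features(pssm, gap):
    """Return n_cols * n_cols features for one gap size.

    pssm: list of rows, one per residue, each with the same number of columns
    gap:  number of residues skipped between the two positions (0 = adjacent)
    """
    length = len(pssm)
    n_cols = len(pssm[0])
    span = gap + 1  # index distance between the paired positions
    if gap < 0 or length <= span:
        raise ValueError("sequence too short for this gap")
    n_pairs = length - span
    features = []
    for i in range(n_cols):
        for j in range(n_cols):
            total = sum(pssm[k][i] * pssm[k + span][j] for k in range(n_pairs))
            features.append(total / n_pairs)
    return features


# Toy 3-residue, 2-column matrix standing in for a (normalized) PSSM.
toy_pssm = [[1, 0], [0, 1], [1, 0]]
adjacent = gapped_dipeptide_features(toy_pssm, gap=0)
one_gap = gapped_dipeptide_features(toy_pssm, gap=1)
```

    Concatenating the vectors for several gap sizes gives a fixed-length representation regardless of sequence length.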

  20. Established and emerging variants of glioblastoma multiforme: review of morphological and molecular features.

    PubMed

    Karsy, Michael; Gelbman, Marshall; Shah, Paarth; Balumbu, Odessa; Moy, Fred; Arslan, Erol

    2012-01-01

    Since the recent publication of the World Health Organization brain tumour classification guidelines in 2007, a significant expansion in the molecular understanding of glioblastoma multiforme (GBM) and its pathological as well as genomic variants has been evident. The purpose of this review article is to evaluate the histopathological, molecular and clinical features surrounding emerging and currently established GBM variants. The tumours discussed include classic glioblastoma multiforme and its four genomic variants, proneural, neural, mesenchymal, classical, as well as gliosarcoma (GS), and giant cell GBM (gcGBM). Furthermore, the emerging variants include fibrillary/epithelial GBM, small cell astrocytoma (SCA), GBM with oligodendroglial component (GBMO), GBM with primitive neuroectodermal features (GBM-PNET), gemistocytic astrocytoma (GA), granular cell astrocytoma (GCA), and paediatric high-grade glioma (HGG) as well as diffuse intrinsic pontine glioma (DIPG). Better understanding of the heterogeneous nature of GBM may provide improved treatment paradigms, prognostic classification, and approaches towards molecularly targeted treatments.

  1. Coarse-grained molecular dynamics simulations linking molecular features of polycations to polycation-polyanion complexation for gene delivery

    NASA Astrophysics Data System (ADS)

    McLeland, Anna; Johnson, Daniel; Jayaraman, Arthi

    2014-03-01

    Gene therapy is a method involving transfection or delivery of therapeutic DNA to target cells for expression of proteins that can cure diseases. Polycations have shown tremendous potential as DNA delivery vectors because the positive charges along the polycation interact with the negatively charged DNA backbone to form a polyplex that protects and transfects the DNA. Past work has shown that the structure and chemistry of the polycation affects DNA transfection efficiency. In this work, we use coarse grained models that are mapped from atomistic simulations, along with molecular dynamics simulations to study the binding of polycations and polyanions into polyplexes. We characterize the structure, surface composition and shape of the polyplex, features that impact DNA delivery, as a function of polycation chemistry, architecture (linear versus grafted), and molecular weight. The results from these simulations serve as valuable guidelines for experimentalists on what molecular characteristics they need to incorporate in the polycations to achieve higher transfection efficiency.

  2. Specific molecular signatures predict decitabine response in chronic myelomonocytic leukemia.

    PubMed

    Meldi, Kristen; Qin, Tingting; Buchi, Francesca; Droin, Nathalie; Sotzen, Jason; Micol, Jean-Baptiste; Selimoglu-Buet, Dorothée; Masala, Erico; Allione, Bernardino; Gioia, Daniela; Poloni, Antonella; Lunghi, Monia; Solary, Eric; Abdel-Wahab, Omar; Santini, Valeria; Figueroa, Maria E

    2015-05-01

    Myelodysplastic syndromes and chronic myelomonocytic leukemia (CMML) are characterized by mutations in genes encoding epigenetic modifiers and aberrant DNA methylation. DNA methyltransferase inhibitors (DMTis) are used to treat these disorders, but response is highly variable, with few means to predict which patients will benefit. Here, we examined baseline differences in mutations, DNA methylation, and gene expression in 40 CMML patients who were responsive or resistant to decitabine (DAC) in order to develop a molecular means of predicting response at diagnosis. While somatic mutations did not differentiate responders from nonresponders, we identified 167 differentially methylated regions (DMRs) of DNA at baseline that distinguished responders from nonresponders using next-generation sequencing. These DMRs were primarily localized to nonpromoter regions and overlapped with distal regulatory enhancers. Using the methylation profiles, we developed an epigenetic classifier that accurately predicted DAC response at the time of diagnosis. Transcriptional analysis revealed differences in gene expression at diagnosis between responders and nonresponders. In responders, the upregulated genes included those that are associated with the cell cycle, potentially contributing to effective DAC incorporation. Treatment with CXCL4 and CXCL7, which were overexpressed in nonresponders, blocked DAC effects in isolated normal CD34+ and primary CMML cells, suggesting that their upregulation contributes to primary DAC resistance.

  3. The identification of molecular surfaces' feature regions based on spherical mapping

    NASA Astrophysics Data System (ADS)

    Zhang, Meiling; Zhang, Jingqiao

    2017-02-01

    As possible active sites, the concave and convex feature regions of a molecule are the locations where molecular docking is most likely to occur, so finding those regions is a valuable problem to study. In this paper, a new method is proposed for identifying concave and convex regions. Based on an established spherical mapping between a molecular surface and its bounding-sphere surface, the concave and convex vertices of local areas can be determined according to the expansion distance defined by the spherical mapping. Then, through mesh growing, a feature region is formed from a concave or convex point, called the center point, and its neighboring faces whose normal vectors make an angle within a specified range with the center point. After that, the areas and volumes of the feature regions are calculated. The experimental results indicate that the method can identify the concave and convex characteristics of the molecule well.

  4. Dynamics of Molecular Emission Features from Nanosecond, Femtosecond Laser and Filament Ablation Plasmas

    SciTech Connect

    Harilal, Sivanandan S.; Yeak, J.; Brumfield, Brian E.; Suter, Jonathan D.; Phillips, Mark C.

    2016-06-15

    The evolutionary paths of molecular species and nanoparticles in laser ablation plumes are not well understood due to the complexity of numerous physical processes that occur simultaneously in a transient laser-plasma system. It is well known that the emission features of ions, atoms, molecules and nanoparticles in a laser ablation plume strongly depend on the laser irradiation conditions. In this letter we report the temporal emission features of AlO molecules in plasmas generated using a nanosecond laser, a femtosecond laser and filaments generated from a femtosecond laser. Our results show that, at a fixed laser energy, the persistence of AlO is highest in ns laser plasmas and lowest in filament laser plasmas, while molecular species are formed at early times for both ultrashort pulse (fs and filament) generated plasmas. Analysis of the AlO emission band features shows that the vibrational temperature of AlO decays rapidly in filament-assisted laser ablation plumes.

  5. Music-induced emotions can be predicted from a combination of brain activity and acoustic features.

    PubMed

    Daly, Ian; Williams, Duncan; Hallowell, James; Hwang, Faustina; Kirke, Alexis; Malik, Asad; Weaver, James; Miranda, Eduardo; Nasuto, Slawomir J

    2015-12-01

    It is widely acknowledged that music can communicate and induce a wide range of emotions in the listener. However, music is a highly complex audio signal composed of a wide range of complex time- and frequency-varying components. Additionally, music-induced emotions are known to differ greatly between listeners. Therefore, it is not immediately clear what emotions will be induced in a given individual by a piece of music. We attempt to predict the music-induced emotional response in a listener by measuring the activity in the listener's electroencephalogram (EEG). We combine these measures with acoustic descriptors of the music, an approach that allows us to consider music as a complex set of time-varying acoustic features, independently of any specific music theory. Regression models are found which allow us to predict the music-induced emotions of our participants with a correlation between the actual and predicted responses of up to r = 0.234, p < 0.001. This regression fit suggests that over 20% of the variance of the participants' music-induced emotions can be predicted by their neural activity and the properties of the music. Given the large amount of noise, non-stationarity, and non-linearity in both EEG and music, this is an encouraging result. Additionally, the combination of measures of brain activity and acoustic features describing the music played to our participants allows us to predict music-induced emotions with significantly higher accuracies than either feature type alone (p < 0.01).

  6. Robust feature generation for protein subchloroplast location prediction with a weighted GO transfer model.

    PubMed

    Li, Xiaomei; Wu, Xindong; Wu, Gongqing

    2014-04-21

    Chloroplasts are crucial organelles of green plants and eukaryotic algae since they conduct photosynthesis. Predicting the subchloroplast location of a protein can provide important insights for understanding its biological functions. The performance of subchloroplast location prediction algorithms often depends on deriving predictive and succinct features from genomic and proteomic data. In this work, a novel weighted Gene Ontology (GO) transfer model is proposed to generate discriminating features from sequence data and GO Categories. This model contains two components. First, we transfer the GO terms of the homologous protein, and then assign the bit-score as weights to GO features. Second, we employ term-selection methods to determine weights for GO terms. This model is capable of improving prediction accuracy due to the tolerance of the noise derived from homolog knowledge transfer. The proposed weighted GO transfer method based on bit-score and a logarithmic transformation of CHI-square (WS-LCHI) performs better than the baseline models, and also outperforms the four off-the-shelf subchloroplast prediction methods.

  7. Genomic Signal Processing: Predicting Basic Molecular Biological Principles

    NASA Astrophysics Data System (ADS)

    Alter, Orly

    2005-03-01

    Advances in high-throughput technologies enable acquisition of different types of molecular biological data, monitoring the flow of biological information as DNA is transcribed to RNA, and RNA is translated to proteins, on a genomic scale. Future discovery in biology and medicine will come from the mathematical modeling of these data, which hold the key to fundamental understanding of life on the molecular level, as well as answers to questions regarding diagnosis, treatment and drug development. Recently we described data-driven models for genome-scale molecular biological data, which use singular value decomposition (SVD) and the comparative generalized SVD (GSVD). Now we describe an integrative data-driven model, which uses pseudoinverse projection (1). We also demonstrate the predictive power of these matrix algebra models (2). The integrative pseudoinverse projection model formulates any number of genome-scale molecular biological data sets in terms of one chosen set of data samples, or of profiles extracted mathematically from data samples, designated the "basis" set. The mathematical variables of this integrative model, the pseudoinverse correlation patterns that are uncovered in the data, represent independent processes and corresponding cellular states (such as observed genome-wide effects of known regulators or transcription factors, the biological components of the cellular machinery that generate the genomic signals, and measured samples in which these regulators or transcription factors are over- or underactive). Reconstruction of the data in the basis simulates experimental observation of only the cellular states manifest in the data that correspond to those of the basis. Classification of the data samples according to their reconstruction in the basis, rather than their overall measured profiles, maps the cellular states of the data onto those of the basis, and gives a global picture of the correlations and possibly also causal coordination of

  8. Quantitative structure-property relationships for predicting Henry's law constant from molecular structure.

    PubMed

    Dearden, John C; Schüürmann, Gerrit

    2003-08-01

    Various models are available for the prediction of Henry's law constant (H) or the air-water partition coefficient (Kaw), its dimensionless counterpart. Incremental methods are based on structural features such as atom types, bond types, and local structural environments; other regression models employ physicochemical properties, structural descriptors such as connectivity indices, and descriptors reflecting the electronic structure. There are also methods to calculate H from the ratio of vapor pressure (p(v)) and water solubility (S(w)) that in turn can be estimated from molecular structure, and quantum chemical continuum-solvation models to predict H via the solvation-free energy (deltaG(s)). This review is confined to methods that calculate H from molecular structure without experimental information and covers more than 40 methods published in the last 26 years. For a subset of eight incremental methods and four continuum-solvation models, a comparative analysis of their prediction performance is made using a test set of 700 compounds that includes a significant number of more complex and drug-like chemical structures. The results reveal substantial differences in the application range as well as in the prediction capability, a general decrease in prediction performance with decreasing H, and surprisingly large individual prediction errors, which are particularly striking for some quantum chemical schemes. The overall best-performing method appears to be the bond contribution method as implemented in the HENRYWIN software package, yielding a predictive squared correlation coefficient (q2) of 0.87 and a standard error of 1.03 log units for the test set.
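
    As a minimal sketch of the vapor-pressure/solubility route mentioned in the abstract above, the Python below computes H as the ratio p(v)/S(w) and converts it to the dimensionless Kaw by dividing by RT. The function names, the units (Pa and mol/m^3), the default temperature, and the input values are illustrative assumptions and are not taken from the review.

```python
# Illustrative sketch only: names, units and inputs are assumptions.
R_GAS = 8.314462618  # molar gas constant, J/(mol*K)


def henry_constant(vapor_pressure_pa: float, solubility_mol_m3: float) -> float:
    """Return H in Pa*m^3/mol as vapor pressure divided by water solubility."""
    if solubility_mol_m3 <= 0:
        raise ValueError("water solubility must be positive")
    return vapor_pressure_pa / solubility_mol_m3


def air_water_partition(h_pa_m3_mol: float, temperature_k: float = 298.15) -> float:
    """Return the dimensionless air-water partition coefficient Kaw = H / (R*T)."""
    return h_pa_m3_mol / (R_GAS * temperature_k)


# Hypothetical compound: p(v) = 100 Pa, S(w) = 50 mol/m^3
h = henry_constant(100.0, 50.0)
kaw = air_water_partition(h)
print(h, kaw)
```

    The unit choice matters here: H and Kaw differ by the factor RT, so a value quoted without units cannot be compared across methods.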

  9. Sharp landscape features and their role in predictive hydrology and geomorphology

    NASA Astrophysics Data System (ADS)

    Belmont, P.; Foufoula, E.; Passalacqua, P.

    2012-12-01

    Sharp topographic features often represent critical boundaries, or discontinuities, in hydrologic and geomorphic processes. Many such features are found in the proximity of actively evolving river channels (e.g., small knickpoints, steep channel banks, natural levees, scroll bars, and floodplain microtopography). While these features are often overlooked in hydro-geomorphic modeling, they can be used as indicators of channel dynamics. The increasing availability and quality of high-resolution topography data provides new opportunities to utilize these sharp features to interpret geomorphic processes and identify critical process-boundaries. However, sophisticated and automated techniques are needed for delineation and measurement of these sharp features over spatially extensive areas (i.e., entire channel-floodplain networks). Further, these features occur at scales much smaller than the grid scale of predictive hydrologic and morphodynamic models, raising the need for sub-grid scale parameterizations, or closures. In this work we present such techniques and use the Minnesota River Basin (MRB) as a prototype system to investigate the distinct assemblages of sharp features that exist in different geomorphic environments, connect them to the processes responsible for their formation, and propose ways for incorporating them in hydro-geomorphologic modeling. The MRB is a predominantly agricultural watershed with pervasive human modifications, an accelerating hydrologic cycle, a uniquely dynamic geologic history, and severe impairments for sediment and eutrophication. The MRB channel-floodplain network exhibits an exceptionally broad range of geomorphic environments, including rapidly meandering, incising, and aggrading reaches, making it an ideal location to study the linkages between form and process. 
    Specific challenges are discussed in deriving sub-grid scale closures that implicitly account for these sharp features and developments needed for increased prediction

  10. Computer-aided breast MR image feature analysis for prediction of tumor response to chemotherapy

    SciTech Connect

    Aghaei, Faranak; Tan, Maxine; Liu, Hong; Zheng, Bin; Hollingsworth, Alan B.; Qian, Wei

    2015-11-15

    Purpose: To identify a new clinical marker based on quantitative kinetic image features analysis and assess its feasibility to predict tumor response to neoadjuvant chemotherapy. Methods: The authors assembled a dataset involving breast MR images acquired from 68 cancer patients before undergoing neoadjuvant chemotherapy. Among them, 25 patients had complete response (CR) and 43 had partial and nonresponse (NR) to chemotherapy based on the response evaluation criteria in solid tumors. The authors developed a computer-aided detection scheme to segment breast areas and tumors depicted on the breast MR images and computed a total of 39 kinetic image features from both tumor and background parenchymal enhancement regions. The authors then applied and tested two approaches to classify between CR and NR cases. The first one analyzed each individual feature and applied a simple feature fusion method that combines classification results from multiple features. The second approach tested an attribute selected classifier that integrates an artificial neural network (ANN) with a wrapper subset evaluator, which was optimized using a leave-one-case-out validation method. Results: In the pool of 39 features, 10 yielded relatively higher classification performance with the areas under receiver operating characteristic curves (AUCs) ranging from 0.61 to 0.78 to classify between CR and NR cases. Using a feature fusion method, the maximum AUC = 0.85 ± 0.05. Using the ANN-based classifier, AUC value significantly increased to 0.96 ± 0.03 (p < 0.01). Conclusions: This study demonstrated that quantitative analysis of kinetic image features computed from breast MR images acquired prechemotherapy has potential to generate a useful clinical marker in predicting tumor response to chemotherapy.
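
    The AUC values quoted in the abstract above can be read as the probability that a randomly chosen positive case receives a higher classifier score than a randomly chosen negative case. The sketch below computes AUC that way in plain Python; the function name, the scores, and the labels are made up for illustration and do not reproduce the authors' pipeline or data.

```python
# Illustrative sketch only: not the authors' implementation or data.
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier outputs for four cases (1 = CR, 0 = NR)
print(roc_auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0]))
```

    The pairwise loop is quadratic in the number of cases, which is acceptable for cohorts of this size; a rank-based formulation would be preferable for large datasets.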

  11. Comprehensible Predictive Modeling Using Regularized Logistic Regression and Comorbidity Based Features

    PubMed Central

    Stiglic, Gregor; Povalej Brzan, Petra; Fijacko, Nino; Wang, Fei; Delibasic, Boris; Kalousis, Alexandros; Obradovic, Zoran

    2015-01-01

    Different studies have demonstrated the importance of comorbidities to better understand the origin and evolution of medical complications. This study focuses on improvement of the predictive model interpretability based on simple logical features representing comorbidities. We use group lasso based feature interaction discovery followed by a post-processing step, where simple logic terms are added. In the final step, we reduce the feature set by applying lasso logistic regression to obtain a compact set of non-zero coefficients that represent a more comprehensible predictive model. The effectiveness of the proposed approach was demonstrated on a pediatric hospital discharge dataset that was used to build a readmission risk estimation model. The evaluation of the proposed method demonstrates a reduction of the initial set of features in a regression model by 72%, with a slight improvement in the Area Under the ROC Curve metric from 0.763 (95% CI: 0.755–0.771) to 0.769 (95% CI: 0.761–0.777). Additionally, our results show improvement in comprehensibility of the final predictive model using simple comorbidity based terms for logistic regression. PMID:26645087

  12. A machine-learning approach for predicting palmitoylation sites from integrated sequence-based features.

    PubMed

    Li, Liqi; Luo, Qifa; Xiao, Weidong; Li, Jinhui; Zhou, Shiwen; Li, Yongsheng; Zheng, Xiaoqi; Yang, Hua

    2017-02-01

    Palmitoylation is the covalent attachment of lipids to amino acid residues in proteins. As an important form of protein posttranslational modification, it increases the hydrophobicity of proteins, which contributes to protein transportation, organelle localization, and function, and it therefore plays an important role in a variety of cell biological processes. Identification of palmitoylation sites is necessary for understanding protein-protein interaction, protein stability, and activity. Since conventional experimental techniques to determine palmitoylation sites in proteins are both labor intensive and costly, a fast and accurate computational approach to predict palmitoylation sites from protein sequences is urgently needed. In this study, a support vector machine (SVM)-based method was proposed that integrates the PSI-BLAST profile, physicochemical properties, k-mer amino acid compositions (AACs), and k-mer pseudo AACs into the principal feature vector. A recursive feature selection scheme was subsequently implemented to single out the most discriminative features. Finally, an SVM method was implemented to predict palmitoylation sites in proteins based on the optimal features. The proposed method achieved an accuracy of 99.41% and a Matthews correlation coefficient of 0.9773 on a benchmark dataset. The result indicates the efficiency and accuracy of the method in predicting palmitoylation sites from protein sequences.
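
    The two metrics reported in the abstract above, accuracy and the Matthews correlation coefficient, are both functions of the confusion-matrix counts. The following sketch shows the standard formulas in Python; the counts are hypothetical and chosen only to illustrate the arithmetic, not to reproduce the paper's benchmark.

```python
# Illustrative sketch only: the confusion-matrix counts below are made up.
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient; returns 0.0 when it is undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom


# Hypothetical test set: 100 true sites and 100 non-sites
print(accuracy(90, 80, 20, 10), matthews_corrcoef(90, 80, 20, 10))
```

    Unlike accuracy, the coefficient stays informative when the two classes are imbalanced, which is why the two are often reported together.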

  13. Molecular Markers for Breast Cancer: Prediction on Tumor Behavior

    PubMed Central

    Banin Hirata, Bruna Karina; Oda, Julie Massayo Maeda; Losi Guembarovski, Roberta; Ariza, Carolina Batista; de Oliveira, Carlos Eduardo Coral; Watanabe, Maria Angelica Ehara

    2014-01-01

    Breast cancer is one of the most common cancers, with more than 1,300,000 cases and 450,000 deaths each year worldwide. The development of breast cancer involves a progression through intermediate stages to invasive carcinoma and finally to metastatic disease. Given the variability in clinical progression, the identification of markers that could predict tumor behavior is particularly important in breast cancer. The determination of tumor markers is a useful tool for the clinical management of cancer patients, assisting in diagnosis, staging, evaluation of therapeutic response, detection of recurrence and metastasis, and development of new treatment modalities. In this context, this review discusses the main tumor markers in breast carcinogenesis. The most well-established breast molecular markers with prognostic and/or therapeutic value, such as hormone receptors, the HER-2 oncogene, Ki-67, and p53 proteins, and the genes for hereditary breast cancer are presented. Furthermore, this review covers new molecular targets in breast cancer (CXCR4, caveolin, miRNA, and FOXP3) as promising candidates for the future development of effective, targeted therapies with lower toxicity. PMID:24591761

  14. Predicting the Occurrence of Cave-Inhabiting Fauna Based on Features of the Earth Surface Environment

    PubMed Central

    Doctor, Daniel H.; Niemiller, Matthew L.; Weary, David J.; Young, John A.; Zigler, Kirk S.

    2016-01-01

    One of the most challenging fauna to study in situ is the obligate cave fauna because of the difficulty of sampling. Cave-limited species display patchy and restricted distributions, but it is often unclear whether the observed distribution is a sampling artifact or a true restriction in range. Further, the drivers of the distribution could be local environmental conditions, such as cave humidity, or they could be associated with surface features that are surrogates for cave conditions. If surface features can be used to predict the distribution of important cave taxa, then conservation management is more easily obtained. We examined the hypothesis that the presence of major faunal groups of cave obligate species could be predicted based on features of the earth surface. Georeferenced records of cave obligate amphipods, crayfish, fish, isopods, beetles, millipedes, pseudoscorpions, spiders, and springtails within the area of Appalachian Landscape Conservation Cooperative in the eastern United States (Illinois to Virginia and New York to Alabama) were assigned to 20 x 20 km grid cells. Habitat suitability for these faunal groups was modeled using logistic regression with twenty predictor variables within each grid cell, such as percent karst, soil features, temperature, precipitation, and elevation. Models successfully predicted the presence of a group greater than 65% of the time (mean = 88%) for the presence of single grid cell endemics, and for all faunal groups except pseudoscorpions. The most common predictor variables were latitude, percent karst, and the standard deviation of the Topographic Position Index (TPI), a measure of landscape rugosity within each grid cell. The overall success of these models points to a number of important connections between the surface and cave environments, and some of these, especially soil features and topographic variability, suggest new research directions. These models should prove to be useful tools in predicting the

  15. Predicting the occurrence of cave-inhabiting fauna based on features of the earth surface environment

    USGS Publications Warehouse

    Christman, Mary C.; Doctor, Daniel H.; Niemiller, Matthew L.; Weary, David J.; Young, John A.; Zigler, Kirk S.; Culver, David C.

    2016-01-01

    One of the most challenging fauna to study in situ is the obligate cave fauna because of the difficulty of sampling. Cave-limited species display patchy and restricted distributions, but it is often unclear whether the observed distribution is a sampling artifact or a true restriction in range. Further, the drivers of the distribution could be local environmental conditions, such as cave humidity, or they could be associated with surface features that are surrogates for cave conditions. If surface features can be used to predict the distribution of important cave taxa, then conservation management is more easily obtained. We examined the hypothesis that the presence of major faunal groups of cave obligate species could be predicted based on features of the earth surface. Georeferenced records of cave obligate amphipods, crayfish, fish, isopods, beetles, millipedes, pseudoscorpions, spiders, and springtails within the area of Appalachian Landscape Conservation Cooperative in the eastern United States (Illinois to Virginia and New York to Alabama) were assigned to 20 x 20 km grid cells. Habitat suitability for these faunal groups was modeled using logistic regression with twenty predictor variables within each grid cell, such as percent karst, soil features, temperature, precipitation, and elevation. Models successfully predicted the presence of a group greater than 65% of the time (mean = 88%) for the presence of single grid cell endemics, and for all faunal groups except pseudoscorpions. The most common predictor variables were latitude, percent karst, and the standard deviation of the Topographic Position Index (TPI), a measure of landscape rugosity within each grid cell. The overall success of these models points to a number of important connections between the surface and cave environments, and some of these, especially soil features and topographic variability, suggest new research directions. These models should prove to be useful tools in predicting the

  16. Predicting the Occurrence of Cave-Inhabiting Fauna Based on Features of the Earth Surface Environment.

    PubMed

    Christman, Mary C; Doctor, Daniel H; Niemiller, Matthew L; Weary, David J; Young, John A; Zigler, Kirk S; Culver, David C

    2016-01-01

    One of the most challenging fauna to study in situ is the obligate cave fauna because of the difficulty of sampling. Cave-limited species display patchy and restricted distributions, but it is often unclear whether the observed distribution is a sampling artifact or a true restriction in range. Further, the drivers of the distribution could be local environmental conditions, such as cave humidity, or they could be associated with surface features that are surrogates for cave conditions. If surface features can be used to predict the distribution of important cave taxa, then conservation management is more easily obtained. We examined the hypothesis that the presence of major faunal groups of cave obligate species could be predicted based on features of the earth surface. Georeferenced records of cave obligate amphipods, crayfish, fish, isopods, beetles, millipedes, pseudoscorpions, spiders, and springtails within the area of Appalachian Landscape Conservation Cooperative in the eastern United States (Illinois to Virginia and New York to Alabama) were assigned to 20 x 20 km grid cells. Habitat suitability for these faunal groups was modeled using logistic regression with twenty predictor variables within each grid cell, such as percent karst, soil features, temperature, precipitation, and elevation. Models successfully predicted the presence of a group greater than 65% of the time (mean = 88%) for the presence of single grid cell endemics, and for all faunal groups except pseudoscorpions. The most common predictor variables were latitude, percent karst, and the standard deviation of the Topographic Position Index (TPI), a measure of landscape rugosity within each grid cell. The overall success of these models points to a number of important connections between the surface and cave environments, and some of these, especially soil features and topographic variability, suggest new research directions. These models should prove to be useful tools in predicting the

  17. Morphological features of IFN-γ–stimulated mesenchymal stromal cells predict overall immunosuppressive capacity

    PubMed Central

    Klinker, Matthew W.; Marklein, Ross A.; Lo Surdo, Jessica L.; Wei, Cheng-Hong

    2017-01-01

    Human mesenchymal stromal cell (MSC) lines can vary significantly in their functional characteristics, and the effectiveness of MSC-based therapeutics may be realized by finding predictive features associated with MSC function. To identify features associated with immunosuppressive capacity in MSCs, we developed a robust in vitro assay that uses principal-component analysis to integrate multidimensional flow cytometry data into a single measurement of MSC-mediated inhibition of T-cell activation. We used this assay to correlate single-cell morphological data with overall immunosuppressive capacity in a cohort of MSC lines derived from different donors and manufacturing conditions. MSC morphology after IFN-γ stimulation significantly correlated with immunosuppressive capacity and accurately predicted the immunosuppressive capacity of MSC lines in a validation cohort. IFN-γ enhanced the immunosuppressive capacity of all MSC lines, and morphology predicted the magnitude of IFN-γ–enhanced immunosuppressive activity. Together, these data identify MSC morphology as a predictive feature of MSC immunosuppressive function. PMID:28283659

  18. Biased ART: a neural architecture that shifts attention toward previously disregarded features following an incorrect prediction.

    PubMed

    Carpenter, Gail A; Gaddam, Sai Chaitanya

    2010-04-01

    Memories in Adaptive Resonance Theory (ART) networks are based on matched patterns that focus attention on those portions of bottom-up inputs that match active top-down expectations. While this learning strategy has proved successful for both brain models and applications, computational examples show that attention to early critical features may later distort memory representations during online fast learning. For supervised learning, biased ARTMAP (bARTMAP) solves the problem of over-emphasis on early critical features by directing attention away from previously attended features after the system makes a predictive error. Small-scale, hand-computed analog and binary examples illustrate key model dynamics. Two-dimensional simulation examples demonstrate the evolution of bARTMAP memories as they are learned online. Benchmark simulations show that featural biasing also improves performance on large-scale examples. One example, which predicts movie genres and is based, in part, on the Netflix Prize database, was developed for this project. Both first principles and consistent performance improvements on all simulation studies suggest that featural biasing should be incorporated by default in all ARTMAP systems. Benchmark datasets and bARTMAP code are available from the CNS Technology Lab Website: http://techlab.bu.edu/bART/.

  19. MRI texture features as biomarkers to predict MGMT methylation status in glioblastomas

    PubMed Central

    Korfiatis, Panagiotis; Kline, Timothy L.; Coufalova, Lucie; Lachance, Daniel H.; Parney, Ian F.; Carter, Rickey E.; Buckner, Jan C.; Erickson, Bradley J.

    2016-01-01

    Purpose: Imaging biomarker research focuses on discovering relationships between radiological features and histological findings. In glioblastoma patients, methylation of the O6-methylguanine methyltransferase (MGMT) gene promoter is positively correlated with an increased effectiveness of the current standard of care. In this paper, the authors investigate texture features as potential imaging biomarkers for capturing the MGMT methylation status of glioblastoma multiforme (GBM) tumors when combined with supervised classification schemes. Methods: A retrospective study of 155 GBM patients with known MGMT methylation status was conducted. Co-occurrence and run length texture features were calculated, and both support vector machines (SVMs) and random forest classifiers were used to predict MGMT methylation status. Results: The best classification system (an SVM-based classifier) had a maximum area under the receiver-operating characteristic (ROC) curve of 0.85 (95% CI: 0.78–0.91) using four texture features (correlation, energy, entropy, and local intensity) originating from the T2-weighted images, yielding, at the optimal threshold of the ROC curve, a sensitivity of 0.803 and a specificity of 0.813. Conclusions: Results show that supervised machine learning of MRI texture features can predict MGMT methylation status in preoperative GBM tumors, thus providing a new noninvasive imaging biomarker. PMID:27277032
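
    The abstract above reports sensitivity and specificity at an optimal ROC threshold without stating how that threshold was chosen. As a sketch under the assumption that the threshold maximizes Youden's J (sensitivity + specificity - 1), a common but here unconfirmed criterion, the Python below picks the cut-off and reports both rates. All names and data are hypothetical.

```python
# Illustrative sketch only: Youden's J is an assumed criterion, data are made up.
from typing import Sequence, Tuple


def sens_spec(scores: Sequence[float], labels: Sequence[int],
              threshold: float) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= threshold are called positive.

    Assumes both classes are present in `labels`.
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def youden_threshold(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Observed score that maximizes sensitivity + specificity - 1."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: sum(sens_spec(scores, labels, t)) - 1.0)


# Hypothetical classifier scores (1 = methylated, 0 = unmethylated)
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0]
best = youden_threshold(scores, labels)
print(best, sens_spec(scores, labels, best))
```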

  20. HER2 status in molecular apocrine breast cancer: associations with clinical, pathological, and molecular features

    PubMed Central

    Guo, Wenwen; Wang, Wei; Zhu, Yun; Zhu, Xiaojing; Shi, Zhongyuan; Wang, Yan

    2015-01-01

    Molecular apocrine breast cancer (MABC) is a distinct subtype of breast cancer. The purpose of this study was to investigate the relationship between HER2 status and the clinicopathologic characteristics of MABCs in a Chinese Han cohort. A cohort of 90 MABC patients was enrolled. Immunohistochemistry was performed to analyze molecular expression, and human epidermal growth factor receptor 2 (HER2) amplification was verified by fluorescence in situ hybridization (FISH). Among these 90 MABC cases, the majority of patients were premenopausal young women (median age 48 yr) with high-grade tumors. We also found that MABCs had high positive expression rates of HER2, CK8, CD44, CD166, p53 and BRCA1, an elevated Ki-67 labeling index, and a favorable prognosis. There was a significantly higher incidence of lymph node metastasis and a lower CD166-positive rate in HER2-negative patients than in HER2-positive patients (54.5% vs. 37.0%, P = 0.044 and 72.7% vs. 91.3%, P = 0.021, respectively). The CK5/6 and EGFR expression rates were significantly higher in HER2-negative cases than in HER2-positive cases, suggesting that there is overlap between MABC with the HER2-negative phenotype and basal-like breast cancer. In addition, HER2 positivity was found to be significantly associated with poor overall survival in MABCs. In conclusion, HER2 is highly expressed in MABC, and HER2 positivity could be considered a significant biomarker of poor prognosis. The results also suggest that tumor subtypes with distinct patterns of molecular expression, depending on HER2 status, are present within MABC. PMID:26339367

  1. HER2 status in molecular apocrine breast cancer: associations with clinical, pathological, and molecular features.

    PubMed

    Guo, Wenwen; Wang, Wei; Zhu, Yun; Zhu, Xiaojing; Shi, Zhongyuan; Wang, Yan

    2015-01-01

    Molecular apocrine breast cancer (MABC) is a distinct subtype of breast cancer. The purpose of this study was to investigate the relationship between HER2 status and the clinicopathologic characteristics of MABCs in a Chinese Han cohort. A cohort of 90 MABC patients was enrolled. Immunohistochemistry was performed to analyze molecular expression, and human epidermal growth factor receptor 2 (HER2) amplification was verified by fluorescence in situ hybridization (FISH). Among these 90 MABC cases, the majority of patients were premenopausal young women (median age 48 yr) with high-grade tumors. We also found that MABCs had high positive expression rates of HER2, CK8, CD44, CD166, p53 and BRCA1, an elevated Ki-67 labeling index, and a favorable prognosis. There was a significantly higher incidence of lymph node metastasis and a lower CD166-positive rate in HER2-negative patients than in HER2-positive patients (54.5% vs. 37.0%, P = 0.044 and 72.7% vs. 91.3%, P = 0.021, respectively). The CK5/6 and EGFR expression rates were significantly higher in HER2-negative cases than in HER2-positive cases, suggesting that there is overlap between MABC with the HER2-negative phenotype and basal-like breast cancer. In addition, HER2 positivity was found to be significantly associated with poor overall survival in MABCs. In conclusion, HER2 is highly expressed in MABC, and HER2 positivity could be considered a significant biomarker of poor prognosis. The results also suggest that tumor subtypes with distinct patterns of molecular expression, depending on HER2 status, are present within MABC.

  2. Accurate Prediction of One-Dimensional Protein Structure Features Using SPINE-X.

    PubMed

    Faraggi, Eshel; Kloczkowski, Andrzej

    2017-01-01

    Accurate prediction of protein secondary structure and other one-dimensional structure features is essential for accurate sequence alignment, three-dimensional structure modeling, and function prediction. SPINE-X is a software package to predict secondary structure as well as accessible surface area and the dihedral angles ϕ and ψ. For secondary structure, SPINE-X achieves an accuracy of between 81 and 84%, depending on the dataset and choice of tests. The Pearson correlation coefficient for accessible surface area prediction is 0.75, and the mean absolute errors for the ϕ and ψ dihedral angles are 20° and 33°, respectively. The source code and Linux executables for SPINE-X are available from Research and Information Systems at http://mamiris.com .
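
    The two regression metrics quoted in this abstract can be illustrated with a short Python sketch. This is not SPINE-X code: the function names and sample values are hypothetical, and treating angle errors as periodic (so that -175° and 175° differ by 10°) is an assumption that the abstract does not confirm.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def angular_error(pred_deg, true_deg):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    diff = abs(pred_deg - true_deg) % 360.0
    return min(diff, 360.0 - diff)

def mean_absolute_angular_error(preds, trues):
    """Mean of the periodic absolute errors over paired angle predictions."""
    return sum(angular_error(p, t) for p, t in zip(preds, trues)) / len(preds)

# Hypothetical values, for illustration only (not from the paper).
true_asa = [10.0, 55.0, 120.0, 80.0]
pred_asa = [20.0, 50.0, 100.0, 90.0]
true_phi = [-60.0, -120.0, 175.0]
pred_phi = [-70.0, -100.0, -175.0]

print(pearson_r(true_asa, pred_asa))
print(mean_absolute_angular_error(pred_phi, true_phi))
```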

  3. Apocrine carcinoma of the breast: A brief update on the molecular features and targetable biomarkers

    PubMed Central

    Vranic, Semir; Feldman, Rebecca; Gatalica, Zoran

    2017-01-01

    Apocrine carcinoma of the breast is a rare, primary breast cancer characterized by the apocrine morphology, estrogen receptor-negative and androgen receptor-positive profile with a frequent overexpression of Her-2/neu protein (~30%). Apart from the Her-2/neu target, advanced and/or metastatic apocrine carcinomas have limited treatment options. In this review, we briefly describe and discuss the molecular features and new theranostic biomarkers for this rare mammary malignancy. The importance of comprehensive profiling is highlighted due to synergistic and potentially antagonistic molecular events in the individual patients. PMID:28027454

  4. Feature Selection Methods for Early Predictive Biomarker Discovery Using Untargeted Metabolomic Data

    PubMed Central

    Grissa, Dhouha; Pétéra, Mélanie; Brandolini, Marion; Napoli, Amedeo; Comte, Blandine; Pujos-Guillot, Estelle

    2016-01-01

    Untargeted metabolomics is a powerful phenotyping tool for better understanding biological mechanisms involved in human pathology development and identifying early predictive biomarkers. This approach, based on multiple analytical platforms, such as mass spectrometry (MS), chemometrics and bioinformatics, generates massive and complex data that need appropriate analyses to extract biologically meaningful information. Despite the various tools available, it is still a challenge to handle such large and noisy datasets with a limited number of individuals without risking overfitting. Moreover, when the objective is the identification of early predictive markers of clinical outcome, a few years before occurrence, it becomes essential to use appropriate algorithms and workflows to discover subtle effects among this large amount of data. In this context, this work studies a workflow describing the general feature selection process, using knowledge discovery and data mining methodologies to propose advanced solutions for predictive biomarker discovery. The strategy focused on evaluating a combination of numeric-symbolic approaches for feature selection, with the objective of obtaining the best combination of metabolites producing an effective and accurate predictive model. Relying first on numerical approaches, and especially on machine learning methods (SVM-RFE, RF, RF-RFE) and on univariate statistical analyses (ANOVA), a comparative study was performed on an original metabolomic dataset and reduced subsets. As a resampling method, leave-one-out cross-validation (LOOCV) was applied to minimize the risk of overfitting. The best k features obtained with different importance scores from the combination of these approaches were compared, which made it possible to determine variable stability using Formal Concept Analysis. The results revealed the interest of RF-Gini combined with ANOVA for feature selection, as these two complementary methods allowed selecting the 48

  5. Feature Selection Methods for Early Predictive Biomarker Discovery Using Untargeted Metabolomic Data.

    PubMed

    Grissa, Dhouha; Pétéra, Mélanie; Brandolini, Marion; Napoli, Amedeo; Comte, Blandine; Pujos-Guillot, Estelle

    2016-01-01

    Untargeted metabolomics is a powerful phenotyping tool for better understanding biological mechanisms involved in human pathology development and identifying early predictive biomarkers. This approach, based on multiple analytical platforms, such as mass spectrometry (MS), chemometrics and bioinformatics, generates massive and complex data that need appropriate analyses to extract biologically meaningful information. Despite the various tools available, it is still a challenge to handle such large and noisy datasets with a limited number of individuals without risking overfitting. Moreover, when the objective is the identification of early predictive markers of clinical outcome, a few years before occurrence, it becomes essential to use appropriate algorithms and workflows to discover subtle effects among this large amount of data. In this context, this work studies a workflow describing the general feature selection process, using knowledge discovery and data mining methodologies to propose advanced solutions for predictive biomarker discovery. The strategy focused on evaluating a combination of numeric-symbolic approaches for feature selection, with the objective of obtaining the best combination of metabolites producing an effective and accurate predictive model. Relying first on numerical approaches, and especially on machine learning methods (SVM-RFE, RF, RF-RFE) and on univariate statistical analyses (ANOVA), a comparative study was performed on an original metabolomic dataset and reduced subsets. As a resampling method, leave-one-out cross-validation (LOOCV) was applied to minimize the risk of overfitting. The best k features obtained with different importance scores from the combination of these approaches were compared, which made it possible to determine variable stability using Formal Concept Analysis. The results revealed the interest of RF-Gini combined with ANOVA for feature selection, as these two complementary methods allowed selecting the 48

  6. Introduction: feature issue on optical molecular probes, imaging, and drug delivery.

    PubMed

    Campagnola, Paul; French, Paul M W; Georgakoudi, Irene; Mycek, Mary-Ann

    2014-02-01

    The editors introduce the Biomedical Optics Express feature issue "Optical Molecular Probes, Imaging, and Drug Delivery," which is associated with a Topical Meeting of the same name held at the 2013 Optical Society of America (OSA) Optics in the Life Sciences Congress in Waikoloa Beach, Hawaii, April 14-18, 2013. The international meeting focused on the convergence of optical physics, photonics technology, nanoscience, and photochemistry with drug discovery and clinical medicine. Papers in this feature issue are representative of meeting topics, including advances in microscopy, nanotechnology, and optics in cancer research.

  7. Relationship of carbohydrate molecular spectroscopic features in combined feeds to carbohydrate utilization and availability in ruminants

    NASA Astrophysics Data System (ADS)

    Zhang, Xuewei; Yu, Peiqiang

    To date, no study has examined the relationship between carbohydrate (CHO) molecular structures and nutrient availability of combined feeds in ruminants. The objective of this study was to use molecular spectroscopy to reveal the relationship between CHO molecular spectral profiles (in terms of functional groups (biomolecular, biopolymer) spectral peak area and height intensity) and CHO chemical profiles, CHO subfractions, energy values, and CHO rumen degradation kinetics of combined feeds of hulless barley with pure wheat dried distillers grains with solubles (DDGS) at five different combination ratios (hulless barley to pure wheat DDGS: 100:0, 75:25, 50:50, 25:75, 0:100). The molecular spectroscopic parameters assessed included: lignin biopolymer molecular spectra profile (peak area and height, region and baseline: ca. 1539-1504 cm⁻¹); structural carbohydrate (STCHO, peaks area region and baseline: ca. 1485-1186 cm⁻¹) mainly associated with hemi- and cellulosic compounds; cellulosic materials peak area (centered at ca. 1240 cm⁻¹ with region and baseline: ca. 1272-1186 cm⁻¹); total carbohydrate (CHO, peaks area region and baseline: ca. 1186-946 cm⁻¹). The results showed that the functional groups (biomolecular, biopolymer) in the combined feeds are sensitive to the changes of carbohydrate chemical and nutrient profiles. The changes of the CHO molecular spectroscopic features in the combined feeds were highly correlated with CHO chemical profiles, CHO subfractions, in situ CHO rumen degradation kinetics and fermentable organic matter supply. Further study is needed to investigate the possibility of using CHO molecular spectral features as a predictor to estimate nutrient availability in combined feeds for animals and to quantify their relationship.

  8. Predicting solubilisation features of ternary phase diagrams of fully dilutable lecithin linker microemulsions.

    PubMed

    Nouraei, Mehdi; Acosta, Edgar J

    2017-06-01

    Fully dilutable microemulsions (μEs), used to design self-microemulsifying delivery systems (SMEDS), are formulated as concentrate solutions containing oil and surfactants, without water. As water is added to dilute these systems, various μEs are produced (water-swollen reverse micelles, bicontinuous systems, and oil-swollen micelles), without the onset of phase separation. Currently, the formulation of dilutable μEs follows a trial-and-error approach that has had limited success. The objective of this work is to introduce the use of the hydrophilic-lipophilic-difference (HLD) and net-average-curvature (NAC) frameworks to predict the solubilisation features of ternary phase diagrams of lecithin-linker μEs and the use of these predictions to guide the formulation of dilutable μEs. To this end, the characteristic curvatures (Cc) of soybean lecithin (surfactant), glycerol monooleate (lipophilic linker) and polyglycerol caprylate (hydrophilic linker) and the equivalent alkane carbon number (EACN) of ethyl caprate (oil) were obtained via phase scans with reference surfactant-oil systems. These parameters were then used to calculate the HLD of lecithin-linkers-ethyl caprate microemulsions. The calculated HLDs were able to predict the phase transitions observed in the phase scans. The NAC was then used to fit and predict phase volumes obtained from salinity phase scans, and to predict the solubilisation features of ternary phase diagrams of the lecithin-linker formulations. The HLD-NAC predictions were reasonably accurate, and indicated that the largest region for dilutable μEs was obtained with slightly negative HLD values. The NAC framework also predicted, and explained, the changes in microemulsion properties along dilution lines.
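
    As a rough illustration of how an HLD value might be computed from Cc and EACN, the Python sketch below uses one commonly cited form of the HLD equation for ionic surfactants, HLD = ln(S) - K·EACN - αT·(T - 25) + Cc, together with a linear mole-fraction mixing rule for Cc. The abstract does not state which form the authors used, so the equation form, the default constants K and αT, and all numeric inputs are assumptions for illustration only.

```python
import math

def mixture_cc(mole_fractions, cc_values):
    """Mole-fraction-weighted characteristic curvature of a surfactant/linker mix.

    Assumes a linear mixing rule; fractions are renormalised to sum to 1.
    """
    total = sum(mole_fractions)
    return sum(x / total * cc for x, cc in zip(mole_fractions, cc_values))

def hld_ionic(salinity, eacn, cc, temp_c=25.0, k=0.17, alpha_t=0.01):
    """HLD = ln(S) - K*EACN - alpha_T*(T - 25) + Cc (assumed ionic-surfactant form).

    salinity is in g NaCl per 100 mL; k and alpha_t are typical literature
    values, used here as assumptions rather than the paper's fitted values.
    """
    return math.log(salinity) - k * eacn - alpha_t * (temp_c - 25.0) + cc

# Hypothetical inputs, not taken from the paper.
cc_mix = mixture_cc([0.5, 0.3, 0.2], [4.0, 6.0, -3.0])
hld = hld_ionic(salinity=0.9, eacn=6.0, cc=cc_mix)

# Sign convention: HLD < 0 favours oil-in-water systems, HLD > 0 water-in-oil,
# and HLD near 0 corresponds to bicontinuous microemulsions.
print(cc_mix, hld)
```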

  9. Systems Medicine: from molecular features and models to the clinic in COPD

    PubMed Central

    2014-01-01

    Background and hypothesis: Chronic Obstructive Pulmonary Disease (COPD) patients are characterized by heterogeneous clinical manifestations and patterns of disease progression. Two major factors that can be used to identify COPD subtypes are muscle dysfunction/wasting and co-morbidity patterns. We hypothesized that COPD heterogeneity is in part the result of complex interactions between several genes and pathways. We explored the possibility of using a Systems Medicine approach to identify such pathways, as well as to generate predictive computational models that may be used in clinical practice. Objective and method: Our overarching goal is to generate clinically applicable predictive models that characterize COPD heterogeneity through a Systems Medicine approach. To this end we have developed a general framework, consisting of three steps/objectives: (1) feature identification, (2) model generation and statistical validation, and (3) application and validation of the predictive models in the clinical scenario. We used muscle dysfunction and co-morbidity as test cases for this framework. Results: In the study of muscle wasting we identified relevant features (genes) by a network analysis and generated predictive models that integrate mechanistic and probabilistic models. This allowed us to characterize muscle wasting as a general de-regulation of pathway interactions. In the co-morbidity analysis we identified relevant features (genes/pathways) by the integration of gene-disease and disease-disease associations. We further present a detailed characterization of co-morbidities in COPD patients that was implemented into a predictive model. In both use cases we were able to achieve predictive modeling, but we also identified several key challenges, the most pressing being the validation and implementation into actual clinical practice. Conclusions: The results confirm the potential of the Systems Medicine approach to study complex diseases and generate clinically relevant

  10. Machine learning methods enable predictive modeling of antibody feature:function relationships in RV144 vaccinees.

    PubMed

    Choi, Ickwon; Chung, Amy W; Suscovich, Todd J; Rerks-Ngarm, Supachai; Pitisuttithum, Punnee; Nitayaphan, Sorachai; Kaewkungwal, Jaranit; O'Connell, Robert J; Francis, Donald; Robb, Merlin L; Michael, Nelson L; Kim, Jerome H; Alter, Galit; Ackerman, Margaret E; Bailey-Kellogg, Chris

    2015-04-01

    The adaptive immune response to vaccination or infection can lead to the production of specific antibodies to neutralize the pathogen or recruit innate immune effector cells for help. The non-neutralizing role of antibodies in stimulating effector cell responses may have been a key mechanism of the protection observed in the RV144 HIV vaccine trial. In an extensive investigation of a rich set of data collected from RV144 vaccine recipients, we here employ machine learning methods to identify and model associations between antibody features (IgG subclass and antigen specificity) and effector function activities (antibody dependent cellular phagocytosis, cellular cytotoxicity, and cytokine release). We demonstrate via cross-validation that classification and regression approaches can effectively use the antibody features to robustly predict qualitative and quantitative functional outcomes. This integration of antibody feature and function data within a machine learning framework provides a new, objective approach to discovering and assessing multivariate immune correlates.

  11. Efficacy of computed tomography features in predicting stage III thymic tumors

    PubMed Central

    Shen, Yan; Ye, Jianding; Fang, Wentao; Zhang, Yu; Ye, Xiaodan; Ma, Yonghong; Chen, Libo; Li, Minghua

    2017-01-01

    Accurate assessment of the invasion of intrathoracic structures by stage III thymic tumors assists their appropriate management. The present study aimed to evaluate the efficacy of computed tomography (CT) features for the prediction of stage III thymoma invasion. The pre-operative CT images of 66 patients with confirmed stage III thymic tumors were reviewed retrospectively. The CT features of invasion into the mediastinal pleura, lungs, pericardium and great vessels were analyzed, and their sensitivity, specificity, positive predictive value (PPV), negative predictive value and accuracy were calculated. For mediastinal pleural and pericardial invasion, an absence of space between the tumor and the mediastinal pleura/pericardium with mediastinal pleural/pericardial thickening and pleural/pericardial effusion exhibited a specificity and PPV of 100%, respectively. For lung invasion, a multi-lobular tumor convex to the lung with adjacent lung abnormalities exhibited a specificity and PPV of 91.2 and 81.3%, respectively. For vessel invasion, the specificity and PPV were each 100% for tumors abutting ≥50% of the vessel circumference, and for tumor oppression, deformation and occlusion of the vessel. In conclusion, recognition of the appropriate CT features can serve as a guide to invasion by stage III thymic tumors, and can facilitate the selection of appropriate pre-operative treatment. PMID:28123518
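
    The diagnostic measures named in this abstract follow directly from the counts of a 2x2 confusion table. The Python sketch below shows the standard definitions; the function name and the example counts are hypothetical and are not taken from the study, and every denominator is assumed to be nonzero.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic-accuracy measures from confusion-table counts.

    tp/fn: invaded structures called positive/negative on CT,
    tn/fp: non-invaded structures called negative/positive on CT.
    """
    return {
        "sensitivity": tp / (tp + fn),   # true positives among all invaded
        "specificity": tn / (tn + fp),   # true negatives among all non-invaded
        "ppv": tp / (tp + fp),           # invaded among all positive calls
        "npv": tn / (tn + fn),           # non-invaded among all negative calls
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    Note that a specificity and PPV of 100% both correspond to zero false positives (fp = 0), which is why the two values coincide for several of the CT signs reported above.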

  12. Protein subcellular localization prediction based on compartment-specific features and structure conservation

    PubMed Central

    Su, Emily Chia-Yu; Chiu, Hua-Sheng; Lo, Allan; Hwang, Jenn-Kang; Sung, Ting-Yi; Hsu, Wen-Lian

    2007-01-01

    Background: Protein subcellular localization is crucial for genome annotation, protein function prediction, and drug discovery. Determination of subcellular localization using experimental approaches is time-consuming; thus, computational approaches become highly desirable. Extensive studies of localization prediction have led to the development of several methods including composition-based and homology-based methods. However, their performance might be significantly degraded if homologous sequences are not detected. Moreover, methods that integrate various features could suffer from the problem of low coverage in high-throughput proteomic analyses due to the lack of information to characterize unknown proteins. Results: We propose a hybrid prediction method for Gram-negative bacteria that combines a one-versus-one support vector machine (SVM) model and a structural homology approach. The SVM model comprises a number of binary classifiers, in which biological features derived from Gram-negative bacteria translocation pathways are incorporated. In the structural homology approach, we employ secondary structure alignment for structural similarity comparison and assign the known localization of the top-ranked protein as the predicted localization of a query protein. The hybrid method achieves overall accuracy of 93.7% and 93.2% using ten-fold cross-validation on the benchmark data sets. In the assessment of the evaluation data sets, our method also attains a prediction accuracy of 84.0%, especially when testing on sequences with a low level of homology to the training data. A three-way data split procedure is also incorporated to prevent overestimation of the predictive performance. In addition, we show that the prediction accuracy should be approximately 85% for non-redundant data sets of sequence identity less than 30%. Conclusion: Our results demonstrate that biological features derived from Gram-negative bacteria translocation pathways yield a significant

  13. Prediction of subcellular location apoptosis proteins with ensemble classifier and feature selection.

    PubMed

    Gu, Quan; Ding, Yong-Sheng; Jiang, Xiao-Ying; Zhang, Tong-Liang

    2010-04-01

    Apoptosis proteins have a central role in the development and the homeostasis of an organism. These proteins are very important for understanding the mechanism of programmed cell death. The function of an apoptosis protein is closely related to its subcellular location. It is crucial to develop powerful tools to predict apoptosis protein locations, given the rapidly increasing gap between the number of known structural proteins and the number of known sequences in the protein databank. In this study, amino acid pair compositions with different spaces are used to construct feature sets for representing protein samples. A feature selection approach based on binary particle swarm optimization is applied to extract effective features. An ensemble classifier is used as the prediction engine, with the fuzzy K-nearest neighbor as the basic classifier. Each basic classifier is trained with a different feature set. Two datasets often used in prior works are selected to validate the performance of the proposed approach. The results obtained by the jackknife test are quite encouraging, indicating that the proposed method might become a potentially useful tool for predicting the subcellular location of apoptosis proteins, or at least can play a complementary role to the existing methods in the relevant areas. Supplementary information and software written in Matlab are available by contacting the corresponding author.
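
    One plausible reading of "amino acid pair compositions with different spaces" is the frequency of ordered residue pairs separated by a fixed number of intervening residues. The Python sketch below implements that reading; it is an assumption about the feature definition, not the authors' Matlab code, and the function name and example sequence are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs of the 20 standard residues, in a fixed order.
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]

def spaced_pair_composition(sequence, gap):
    """Return a 400-dimensional frequency vector of residue pairs.

    A pair is (sequence[i], sequence[i + gap + 1]), i.e. two residues with
    `gap` residues between them. Pairs containing non-standard residues
    are skipped.
    """
    counts = dict.fromkeys(PAIRS, 0)
    step = gap + 1
    total = 0
    for i in range(len(sequence) - step):
        pair = sequence[i] + sequence[i + step]
        if pair in counts:
            counts[pair] += 1
            total += 1
    return [counts[p] / total if total else 0.0 for p in PAIRS]

# Hypothetical toy sequence; each gap value yields one feature set.
feature_sets = {gap: spaced_pair_composition("ACAC", gap) for gap in (0, 1)}
print(feature_sets[0][PAIRS.index("AC")])
```

    Under this reading, each gap value produces its own 400-dimensional feature set, which matches the abstract's description of training each basic classifier on a different feature set.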

  14. Predicting DNA Methylation State of CpG Dinucleotide Using Genome Topological Features and Deep Networks.

    PubMed

    Wang, Yiheng; Liu, Tong; Xu, Dong; Shi, Huidong; Zhang, Chaoyang; Mo, Yin-Yuan; Wang, Zheng

    2016-01-22

    The hypo- or hyper-methylation of the human genome is one of the epigenetic features of leukemia. However, experimental approaches have only determined the methylation state of a small portion of the human genome. We developed deep learning based (stacked denoising autoencoders, or SdAs) software named "DeepMethyl" to predict the methylation state of DNA CpG dinucleotides using features inferred from three-dimensional genome topology (based on Hi-C) and DNA sequence patterns. We used the experimental data from immortalised myelogenous leukemia (K562) and healthy lymphoblastoid (GM12878) cell lines to train the learning models and assess prediction performance. We have tested various SdA architectures with different configurations of hidden layer(s) and amount of pre-training data and compared the performance of deep networks relative to support vector machines (SVMs). Using the methylation states of sequentially neighboring regions as one of the learning features, an SdA achieved a blind test accuracy of 89.7% for GM12878 and 88.6% for K562. When the methylation states of sequentially neighboring regions are unknown, the accuracies are 84.82% for GM12878 and 72.01% for K562. We also analyzed the contribution of genome topological features inferred from Hi-C. DeepMethyl can be accessed at http://dna.cs.usm.edu/deepmethyl/.

  15. Predicting DNA Methylation State of CpG Dinucleotide Using Genome Topological Features and Deep Networks

    NASA Astrophysics Data System (ADS)

    Wang, Yiheng; Liu, Tong; Xu, Dong; Shi, Huidong; Zhang, Chaoyang; Mo, Yin-Yuan; Wang, Zheng

    2016-01-01

    The hypo- or hyper-methylation of the human genome is one of the epigenetic features of leukemia. However, experimental approaches have only determined the methylation state of a small portion of the human genome. We developed deep learning based (stacked denoising autoencoders, or SdAs) software named “DeepMethyl” to predict the methylation state of DNA CpG dinucleotides using features inferred from three-dimensional genome topology (based on Hi-C) and DNA sequence patterns. We used the experimental data from immortalised myelogenous leukemia (K562) and healthy lymphoblastoid (GM12878) cell lines to train the learning models and assess prediction performance. We have tested various SdA architectures with different configurations of hidden layer(s) and amount of pre-training data and compared the performance of deep networks relative to support vector machines (SVMs). Using the methylation states of sequentially neighboring regions as one of the learning features, an SdA achieved a blind test accuracy of 89.7% for GM12878 and 88.6% for K562. When the methylation states of sequentially neighboring regions are unknown, the accuracies are 84.82% for GM12878 and 72.01% for K562. We also analyzed the contribution of genome topological features inferred from Hi-C. DeepMethyl can be accessed at http://dna.cs.usm.edu/deepmethyl/.

  16. Self-Adaptive MOEA Feature Selection for Classification of Bankruptcy Prediction Data

    PubMed Central

    Gaspar-Cunha, A.; Recio, G.; Costa, L.; Estébanez, C.

    2014-01-01

    Bankruptcy prediction is a vast area of finance and accounting whose importance lies in its relevance for creditors and investors in evaluating the likelihood of bankruptcy. As companies become complex, they develop sophisticated schemes to hide their real situation. In turn, making an estimation of the credit risks associated with counterparts or predicting bankruptcy becomes harder. Evolutionary algorithms have been shown to be an excellent tool to deal with complex problems in finance and economics where a large number of irrelevant features are involved. This paper provides a methodology for feature selection in classification of bankruptcy data sets using an evolutionary multiobjective approach that simultaneously minimises the number of features and maximises the classifier quality measure (e.g., accuracy). The proposed methodology makes use of self-adaptation by applying the feature selection algorithm while simultaneously optimising the parameters of the classifier used. The methodology was applied to four different sets of data. The obtained results showed the utility of using the self-adaptation of the classifier. PMID:24707201
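
    The two competing objectives described here (fewer features, higher classifier quality) are usually compared through Pareto dominance. The Python sketch below shows only that dominance relation and the resulting non-dominated set; it is not the authors' MOEA and omits the evolutionary search and the self-adaptation of classifier parameters. The candidate values are hypothetical.

```python
def dominates(a, b):
    """True if candidate a dominates candidate b.

    Each candidate is (n_features, accuracy); fewer features and higher
    accuracy are both better.
    """
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better

def pareto_front(candidates):
    """Return the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]

# Hypothetical (n_features, accuracy) results for five feature subsets.
candidates = [(5, 0.80), (10, 0.85), (10, 0.82), (20, 0.85), (3, 0.70)]
front = pareto_front(candidates)
print(front)
```

    In this toy example the subsets (10, 0.82) and (20, 0.85) are discarded because (10, 0.85) is at least as good on both objectives and strictly better on one.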

  17. Self-adaptive MOEA feature selection for classification of bankruptcy prediction data.

    PubMed

    Gaspar-Cunha, A; Recio, G; Costa, L; Estébanez, C

    2014-01-01

    Bankruptcy prediction is a vast area of finance and accounting whose importance lies in its relevance for creditors and investors in evaluating the likelihood of bankruptcy. As companies become complex, they develop sophisticated schemes to hide their real situation. In turn, making an estimation of the credit risks associated with counterparts or predicting bankruptcy becomes harder. Evolutionary algorithms have been shown to be an excellent tool to deal with complex problems in finance and economics where a large number of irrelevant features are involved. This paper provides a methodology for feature selection in classification of bankruptcy data sets using an evolutionary multiobjective approach that simultaneously minimises the number of features and maximises the classifier quality measure (e.g., accuracy). The proposed methodology makes use of self-adaptation by applying the feature selection algorithm while simultaneously optimising the parameters of the classifier used. The methodology was applied to four different sets of data. The obtained results showed the utility of using the self-adaptation of the classifier.

  18. Habitat features and predictive habitat modeling for the Colorado chipmunk in southern New Mexico

    USGS Publications Warehouse

    Rivieccio, M.; Thompson, B.C.; Gould, W.R.; Boykin, K.G.

    2003-01-01

    Two subspecies of Colorado chipmunk (state threatened and federal species of concern) occur in southern New Mexico: Tamias quadrivittatus australis in the Organ Mountains and T. q. oscuraensis in the Oscura Mountains. We developed a GIS model of potentially suitable habitat based on vegetation and elevation features, evaluated site classifications of the GIS model, and determined vegetation and terrain features associated with chipmunk occurrence. We compared GIS model classifications with actual vegetation and elevation features measured at 37 sites. At 60 sites we measured 18 habitat variables regarding slope, aspect, tree species, shrub species, and ground cover. We used logistic regression to analyze habitat variables associated with chipmunk presence/absence. All (100%) 37 sample sites (28 predicted suitable, 9 predicted unsuitable) were classified correctly by the GIS model regarding elevation and vegetation. For 28 sites predicted suitable by the GIS model, 18 sites (64%) appeared visually suitable based on habitat variables selected from logistic regression analyses, of which 10 sites (36%) were specifically predicted as suitable habitat via logistic regression. We detected chipmunks at 70% of sites deemed suitable via the logistic regression models. Shrub cover, tree density, plant proximity, presence of logs, and presence of rock outcrop were retained in the logistic model for the Oscura Mountains; litter, shrub cover, and grass cover were retained in the logistic model for the Organ Mountains. Evaluation of predictive models illustrates the need for multi-stage analyses to best judge performance. Microhabitat analyses indicate prospective needs for different management strategies between the subspecies. Sensitivities of each population of the Colorado chipmunk to natural and prescribed fire suggest that partial burnings of areas inhabited by Colorado chipmunks in southern New Mexico may be beneficial. These partial burnings may later help avoid a fire

  19. A Combination of Molecular Markers and Clinical Features Improve the Classification of Pancreatic Cysts

    PubMed Central

    Springer, Simeon; Wang, Yuxuan; Molin, Marco Dal; Masica, David L.; Jiao, Yuchen; Kinde, Isaac; Blackford, Amanda; Raman, Siva P.; Wolfgang, Christopher L.; Tomita, Tyler; Niknafs, Noushin; Douville, Christopher; Ptak, Janine; Dobbyn, Lisa; Allen, Peter J.; Klimstra, David S.; Schattner, Mark A.; Schmidt, C. Max; Yip-Schneider, Michele; Cummings, Oscar W.; Brand, Randall E.; Zeh, Herbert J.; Singhi, Aatur D.; Scarpa, Aldo; Salvia, Roberto; Malleo, Giuseppe; Zamboni, Giuseppe; Falconi, Massimo; Jang, Jin-Young; Kim, Sun-Whe; Kwon, Wooil; Hong, Seung-Mo; Song, Ki-Byung; Kim, Song Cheol; Swan, Niall; Murphy, Jean; Geoghegan, Justin; Brugge, William; Fernandez-Del Castillo, Carlos; Mino-Kenudson, Mari; Schulick, Richard; Edil, Barish H.; Adsay, Volkan; Paulino, Jorge; van Hooft, Jeanin; Yachida, Shinichi; Nara, Satoshi; Hiraoka, Nobuyoshi; Yamao, Kenji; Hijioka, Susuma; van der Merwe, Schalk; Goggins, Michael; Canto, Marcia Irene; Ahuja, Nita; Hirose, Kenzo; Makary, Martin; Weiss, Matthew J.; Cameron, John; Pittman, Meredith; Eshleman, James R.; Diaz, Luis A.; Papadopoulos, Nickolas; Kinzler, Kenneth W.; Karchin, Rachel; Hruban, Ralph H.; Vogelstein, Bert; Lennon, Anne Marie

    2016-01-01

    Background & Aims: The management of pancreatic cysts poses challenges to both patients and their physicians. We investigated whether a combination of molecular markers and clinical information could improve the classification of pancreatic cysts and management of patients. Methods: We performed a multi-center, retrospective study of 130 patients with resected pancreatic cystic neoplasms (12 serous cystadenomas, 10 solid-pseudopapillary neoplasms, 12 mucinous cystic neoplasms, and 96 intraductal papillary mucinous neoplasms). Cyst fluid was analyzed to identify subtle mutations in genes known to be mutated in pancreatic cysts (BRAF, CDKN2A, CTNNB1, GNAS, KRAS, NRAS, PIK3CA, RNF43, SMAD4, TP53 and VHL); to identify loss of heterozygosity at CDKN2A, RNF43, SMAD4, TP53, and VHL tumor suppressor loci; and to identify aneuploidy. The analyses were performed using specialized technologies for implementing and interpreting massively parallel sequencing data acquisition. An algorithm was used to select markers that could classify cyst type and grade. The accuracy of the molecular markers was compared with that of clinical markers, and with that of a combination of molecular and clinical markers. Results: We identified molecular markers and clinical features that classified cyst type with 90%–100% sensitivity and 92%–98% specificity. The molecular marker panel correctly identified 67 of the 74 patients who did not require surgery, and could therefore reduce the number of unnecessary operations by 91%. Conclusions: We identified a panel of molecular markers and clinical features that show promise for the accurate classification of cystic neoplasms of the pancreas and identification of cysts that require surgery. PMID:26253305

  20. Combinatorial modeling of chromatin features quantitatively predicts DNA replication timing in Drosophila.

    PubMed

    Comoglio, Federico; Paro, Renato

    2014-01-01

    In metazoans, each cell type follows a characteristic, spatio-temporally regulated DNA replication program. Histone modifications (HMs) and chromatin binding proteins (CBPs) are fundamental for a faithful progression and completion of this process. However, no individual HM is strictly indispensable for origin function, suggesting that HMs may act combinatorially in analogy to the histone code hypothesis for transcriptional regulation. In contrast to gene expression, however, the relationship between combinations of chromatin features and DNA replication timing has not yet been demonstrated. Here, by exploiting a comprehensive data collection consisting of 95 CBPs and HMs, we investigated their combinatorial potential for the prediction of DNA replication timing in Drosophila using quantitative statistical models. We found that while combinations of CBPs exhibit moderate predictive power for replication timing, pairwise interactions between HMs lead to accurate predictions genome-wide that can be locally further improved by CBPs. Independent feature importance and model analyses led us to derive a simplified, biologically interpretable model of the relationship between chromatin landscape and replication timing, reaching 80% of the full model accuracy using six model terms. Finally, we show that pairwise combinations of HMs are able to predict differential DNA replication timing across different cell types. All in all, our work provides support for the existence of combinatorial HM patterns for DNA replication and reveals cell-type independent key elements thereof, whose experimental investigation might help to elucidate the regulatory mode of this fundamental cellular process.

  1. Prediction of molecular mimicry candidates in human pathogenic bacteria.

    PubMed

    Doxey, Andrew C; McConkey, Brendan J

    2013-08-15

    Molecular mimicry of host proteins is a common strategy adopted by bacterial pathogens to interfere with and exploit host processes. Despite the availability of pathogen genomes, few studies have attempted to predict virulence-associated mimicry relationships directly from genomic sequences. Here, we analyzed the proteomes of 62 pathogenic and 66 non-pathogenic bacterial species, and screened for the top pathogen-specific or pathogen-enriched sequence similarities to human proteins. The screen identified approximately 100 potential mimicry relationships including well-characterized examples among the top-scoring hits (e.g., RalF, internalin, yopH, and others), with about 1/3 of predicted relationships supported by existing literature. Examination of homology to virulence factors, statistically enriched functions, and comparison with literature indicated that the detected mimics target key host structures (e.g., extracellular matrix, ECM) and pathways (e.g., cell adhesion, lipid metabolism, and immune signaling). The top-scoring and most widespread mimicry pattern detected among pathogens consisted of elevated sequence similarities to ECM proteins including collagens and leucine-rich repeat proteins. Unexpectedly, analysis of the pathogen counterparts of these proteins revealed that they have evolved independently in different species of bacterial pathogens from separate repeat amplifications. Thus, our analysis provides evidence for two classes of mimics: complex proteins such as enzymes that have been acquired by eukaryote-to-pathogen horizontal transfer, and simpler repeat proteins that have independently evolved to mimic the host ECM. Ultimately, computational detection of pathogen-specific and pathogen-enriched similarities to host proteins provides insights into potentially novel mimicry-mediated virulence mechanisms of pathogenic bacteria.

  2. Prediction of molecular mimicry candidates in human pathogenic bacteria

    PubMed Central

    Doxey, Andrew C; McConkey, Brendan J

    2013-01-01

    Molecular mimicry of host proteins is a common strategy adopted by bacterial pathogens to interfere with and exploit host processes. Despite the availability of pathogen genomes, few studies have attempted to predict virulence-associated mimicry relationships directly from genomic sequences. Here, we analyzed the proteomes of 62 pathogenic and 66 non-pathogenic bacterial species, and screened for the top pathogen-specific or pathogen-enriched sequence similarities to human proteins. The screen identified approximately 100 potential mimicry relationships including well-characterized examples among the top-scoring hits (e.g., RalF, internalin, yopH, and others), with about 1/3 of predicted relationships supported by existing literature. Examination of homology to virulence factors, statistically enriched functions, and comparison with literature indicated that the detected mimics target key host structures (e.g., extracellular matrix, ECM) and pathways (e.g., cell adhesion, lipid metabolism, and immune signaling). The top-scoring and most widespread mimicry pattern detected among pathogens consisted of elevated sequence similarities to ECM proteins including collagens and leucine-rich repeat proteins. Unexpectedly, analysis of the pathogen counterparts of these proteins revealed that they have evolved independently in different species of bacterial pathogens from separate repeat amplifications. Thus, our analysis provides evidence for two classes of mimics: complex proteins such as enzymes that have been acquired by eukaryote-to-pathogen horizontal transfer, and simpler repeat proteins that have independently evolved to mimic the host ECM. Ultimately, computational detection of pathogen-specific and pathogen-enriched similarities to host proteins provides insights into potentially novel mimicry-mediated virulence mechanisms of pathogenic bacteria. PMID:23715053

  3. Prediction models for solitary pulmonary nodules based on curvelet textural features and clinical parameters.

    PubMed

    Wang, Jing-Jing; Wu, Hai-Feng; Sun, Tao; Li, Xia; Wang, Wei; Tao, Li-Xin; Huo, Da; Lv, Ping-Xin; He, Wen; Guo, Xiu-Hua

    2013-01-01

    Lung cancer, one of the leading causes of cancer-related deaths, usually appears as solitary pulmonary nodules (SPNs) which are hard to diagnose using the naked eye. In this paper, curvelet-based textural features and clinical parameters are used with three prediction models [a multilevel model, a least absolute shrinkage and selection operator (LASSO) regression method, and a support vector machine (SVM)] to improve the diagnosis of benign and malignant SPNs. Dimensionality reduction of the original curvelet-based textural features was achieved using principal component analysis. In addition, non-conditional logistic regression was used to find clinical predictors among demographic parameters and morphological features. The results showed that, combined with 11 clinical predictors, the accuracy rates using 12 principal components were higher than those using the original curvelet-based textural features. To evaluate the models, 10-fold cross validation and back substitution were applied. The results obtained, respectively, were 0.8549 and 0.9221 for the LASSO method, 0.9443 and 0.9831 for SVM, and 0.8722 and 0.9722 for the multilevel model. All in all, it was found that using curvelet-based textural features after dimensionality reduction and using clinical predictors, the highest accuracy rate was achieved with SVM. The method may be used as an auxiliary tool to differentiate between benign and malignant SPNs in CT images.

  4. Improving model predictions for RNA interference activities that use support vector machine regression by combining and filtering features

    PubMed Central

    Peek, Andrew S

    2007-01-01

    Background RNA interference (RNAi) is a naturally occurring phenomenon that results in the suppression of a target RNA sequence utilizing a variety of possible methods and pathways. To dissect the factors that result in effective siRNA sequences a regression kernel Support Vector Machine (SVM) approach was used to quantitatively model RNA interference activities. Results Eight overall feature mapping methods were compared in their abilities to build SVM regression models that predict published siRNA activities. The primary factors in predictive SVM models are position specific nucleotide compositions. The secondary factors are position independent sequence motifs (N-grams) and guide strand to passenger strand sequence thermodynamics. Finally, the factors that are least contributory but are still predictive of efficacy are measures of intramolecular guide strand secondary structure and target strand secondary structure. Of these, the site of the 5' most base of the guide strand is the most informative. Conclusion The capacity of specific feature mapping methods and their ability to build predictive models of RNAi activity suggests a relative biological importance of these features. Some feature mapping methods are more informative in building predictive models and overall t-test filtering provides a method to remove some noisy features or make comparisons among datasets. Together, these features can yield predictive SVM regression models with increased predictive accuracy between predicted and observed activities both within datasets by cross validation, and between independently collected RNAi activity datasets. Feature filtering to remove features should be approached carefully in that it is possible to reduce feature set size without substantially reducing predictive models, but the features retained in the candidate models become increasingly distinct. 
Software to perform feature prediction and SVM training and testing on nucleic acid sequences can be found at

  5. Predicting the rupture probabilities of molecular bonds in series.

    PubMed

    Neuert, Gregor; Albrecht, Christian H; Gaub, Hermann E

    2007-08-15

    An assembly of two receptor ligand bonds in series will typically break at the weaker complex upon application of an external force. The rupture site depends highly on the binding potentials of both bonds and on the loading rate of the applied force. A model is presented that allows simulations of force-induced rupture of bonds in series at a given force and loading rate based on the natural dissociation rates k_R0 and k_S0 and the potential widths Δx_R and Δx_S of the reference and sample bonds. The model is especially useful for the analysis of differential force assay experiments. This is illustrated by experiments on molecular force balances consisting of two 30-bp oligonucleotide duplexes where k_R0, k_S0, Δx_R and Δx_S have been determined for different single nucleotide mismatches. Furthermore, prediction of the rupture site of two bonds in series is demonstrated for DNA duplexes in combination with streptavidin/biotin and anti-digoxigenin/digoxigenin, respectively.
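
    The idea of two bonds in series competing to rupture first can be illustrated with a short sketch. This is not the authors' model: it assumes a Bell-type rate k(F) = k0 * exp(F * dx / kBT) for each bond and a force that rises linearly in time, and every function name and parameter value below is hypothetical and chosen only for illustration.

```python
import math

KBT_PN_NM = 4.11  # thermal energy near room temperature, in pN*nm


def first_rupture_probability(k0_ref, dx_ref, k0_sample, dx_sample,
                              loading_rate, f_max=300.0, steps=30000):
    """Probability that the reference bond ruptures before the sample bond.

    Assumption (not from the abstract): Bell-type rates
    k(F) = k0 * exp(F * dx / kBT), with the same linearly rising force
    F = loading_rate * t acting on both bonds.
    Units: k0 in 1/s, dx in nm, force in pN, loading_rate in pN/s.
    """
    def hazard(k0, dx, force):
        # Rupture rate per unit force (1/pN) at the given force.
        return k0 * math.exp(force * dx / KBT_PN_NM) / loading_rate

    def cumulative(k0, dx, force):
        # Integral of the hazard from zero force up to `force`.
        return (k0 * KBT_PN_NM / (loading_rate * dx)
                * math.expm1(force * dx / KBT_PN_NM))

    df = f_max / steps
    total = 0.0
    prev = hazard(k0_ref, dx_ref, 0.0)  # both bonds are intact at zero force
    for i in range(1, steps + 1):
        force = i * df
        both_intact = math.exp(-cumulative(k0_ref, dx_ref, force)
                               - cumulative(k0_sample, dx_sample, force))
        cur = hazard(k0_ref, dx_ref, force) * both_intact
        total += 0.5 * (prev + cur) * df  # trapezoidal rule
        prev = cur
    return total


# Illustrative parameters only: identical bonds, then a weaker reference bond.
p_equal = first_rupture_probability(1e-3, 1.0, 1e-3, 1.0, loading_rate=1000.0)
p_weak_ref = first_rupture_probability(3e-3, 1.0, 1e-3, 1.0, loading_rate=1000.0)
```

    Under these assumptions, two identical bonds each rupture first with probability one half, and when both bonds share the same potential width the split reduces to the ratio of the natural dissociation rates.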

  6. Predicting Molecular Crowding Effects in Ion-RNA Interactions.

    PubMed

    Yu, Tao; Zhu, Yuhong; He, Zhaojian; Chen, Shi-Jie

    2016-09-01

    We develop a new statistical mechanical model to predict the molecular crowding effects in ion-RNA interactions. By considering discrete distributions of the crowders, the model can treat the main crowder-induced effects, such as the competition with ions for RNA binding, changes of electrostatic interaction due to crowder-induced changes in the dielectric environment, and changes in the nonpolar hydration state of the crowder-RNA system. To enhance the computational efficiency, we sample the crowder distribution using a hybrid approach: For crowders in the close vicinity of RNA surface, we sample their discrete distributions; for crowders in the bulk solvent away from the RNA surface, we use a continuous mean-field distribution for the crowders. Moreover, using the tightly bound ion (TBI) model, we account for ion fluctuation and correlation effects in the calculation for ion-RNA interactions. Applications of the model to a variety of simple RNA structures such as RNA helices show a crowder-induced increase in free energy and decrease in ion binding. Such crowding effects tend to contribute to the destabilization of RNA structure. Further analysis indicates that these effects are associated with the crowder-ion competition in RNA binding and the effective decrease in the dielectric constant. This simple ion effect model may serve as a useful framework for modeling more realistic crowders with larger, more complex RNA structures.

  7. Prediction of microRNAs involved in immune system diseases through network based features.

    PubMed

    Prabahar, Archana; Natarajan, Jeyakumar

    2017-01-01

    MicroRNAs are a class of small non-coding regulatory RNA molecules that modulate the expression of several genes at post-transcriptional level and play a vital role in disease pathogenesis. Recent research shows that a range of miRNAs are involved in the regulation of immunity and their deregulation results in immune mediated diseases such as cancer, inflammation and autoimmune diseases. Computational discovery of these immune miRNAs using a set of specific features is highly desirable. In the current investigation, we present an SVM-based classification system which uses a set of novel network based topological and motif features in addition to the baseline sequential and structural features to predict immune specific miRNAs from other non-immune miRNAs. The classifier was trained and tested on a balanced set of equal number of positive and negative examples to show the discriminative power of our network features. Experimental results show that our approach achieves an accuracy of 90.2% and outperforms the classification accuracy of 63.2% reported using the traditional miRNA sequential and structural features. The proposed classifier was further validated with two immune disease sub-class datasets related to multiple sclerosis microarray data and psoriasis RNA-seq data with higher accuracy. These results indicate that our classifier which uses network and motif features along with sequential and structural features will lead to significant improvement in classifying immune miRNAs and hence can be applied to identify other specific classes of miRNAs as an extensible miRNA classification system.

  8. Predictive Ensemble Decoding of Acoustical Features Explains Context-Dependent Receptive Fields

    PubMed Central

    Mesgarani, Nima; Deneve, Sophie

    2016-01-01

    A primary goal of auditory neuroscience is to identify the sound features extracted and represented by auditory neurons. Linear encoding models, which describe neural responses as a function of the stimulus, have been primarily used for this purpose. Here, we provide theoretical arguments and experimental evidence in support of an alternative approach, based on decoding the stimulus from the neural response. We used a Bayesian normative approach to predict the responses of neurons detecting relevant auditory features, despite ambiguities and noise. We compared the model predictions to recordings from the primary auditory cortex of ferrets and found that: (1) the decoding filters of auditory neurons resemble the filters learned from the statistics of speech sounds; (2) the decoding model captures the dynamics of responses better than a linear encoding model of similar complexity; and (3) the decoding model accounts for the accuracy with which the stimulus is represented in neural activity, whereas the linear encoding model performs very poorly. Most importantly, our model predicts that neuronal responses are fundamentally shaped by “explaining away,” a divisive competition between alternative interpretations of the auditory scene. SIGNIFICANCE STATEMENT Neural responses in the auditory cortex are dynamic, nonlinear, and hard to predict. Traditionally, encoding models have been used to describe neural responses as a function of the stimulus. However, in addition to external stimulation, neural activity is strongly modulated by the responses of other neurons in the network. We hypothesized that auditory neurons aim to collectively decode their stimulus. In particular, a stimulus feature that is decoded (or explained away) by one neuron is not explained by another. We demonstrated that this novel Bayesian decoding model is better at capturing the dynamic responses of cortical neurons in ferrets. Whereas the linear encoding model poorly reflects selectivity of neurons

  9. Selection of key sequence-based features for prediction of essential genes in 31 diverse bacterial species

    PubMed Central

    Liu, Xiao; Wang, Bao-Jin; Xu, Luo; Tang, Hong-Ling; Xu, Guo-Qing

    2017-01-01

    Genes that are indispensable for survival are essential genes. Many features have been proposed for computational prediction of essential genes. In this paper, the least absolute shrinkage and selection operator method was used to screen key sequence-based features related to gene essentiality. To assess the effects, the selected features were used to predict the essential genes from 31 bacterial species based on a support vector machine classifier. For all 31 bacterial objects (21 Gram-negative objects and ten Gram-positive objects), the features in the three datasets were reduced from 57, 59, and 58, to 40, 37, and 38, respectively, without loss of prediction accuracy. Results showed that some features were redundant for gene essentiality, so could be eliminated from future analyses. The selected features contained more complex (or key) biological information for gene essentiality, and could be of use in related research projects, such as gene prediction, synthetic biology, and drug design. PMID:28358836

  10. Sequence-Based Prediction of RNA-Binding Proteins Using Random Forest with Minimum Redundancy Maximum Relevance Feature Selection

    PubMed Central

    Ma, Xin; Guo, Jing; Sun, Xiao

    2015-01-01

    The prediction of RNA-binding proteins is one of the most challenging problems in computational biology. Although some studies have investigated this problem, the accuracy of prediction is still not sufficient. In this study, a highly accurate method was developed to predict RNA-binding proteins from amino acid sequences using random forests with the minimum redundancy maximum relevance (mRMR) method, followed by incremental feature selection (IFS). We incorporated conjoint triad features and three novel features: binding propensity (BP), nonbinding propensity (NBP), and evolutionary information combined with physicochemical properties (EIPP). The results showed that these novel features have important roles in improving the performance of the predictor. Using the mRMR-IFS method, our predictor achieved the best performance (86.62% accuracy and 0.737 Matthews correlation coefficient). High prediction accuracy and successful prediction performance suggested that our method can be a useful approach to identify RNA-binding proteins from sequence information. PMID:26543860
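
    The two performance measures quoted above, accuracy and the Matthews correlation coefficient (MCC), are both computed from the four cells of a binary confusion matrix. A minimal sketch using the standard definitions; the function names are illustrative, and the study's own confusion-matrix counts are not given in the abstract, so none are reproduced here.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Ranges from -1 (total disagreement) through 0 (no better than chance)
    to +1 (perfect prediction). Returns 0.0 when a marginal total is zero,
    a common convention for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom
```

    Unlike accuracy, the MCC uses all four cells symmetrically, which is why it is often reported alongside accuracy when the two classes differ in size.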

  11. Sub-resolution assist feature (SRAF) printing prediction using logistic regression

    NASA Astrophysics Data System (ADS)

    Tan, Chin Boon; Koh, Kar Kit; Zhang, Dongqing; Foong, Yee Mei

    2015-03-01

    In optical proximity correction (OPC), the sub-resolution assist feature (SRAF) has been used to enhance the process window of main structures. However, the printing of SRAF on wafer is undesirable as this may adversely degrade the overall process yield if it is transferred into the final pattern. A reasonably accurate prediction model is needed during OPC to ensure that the SRAF placement and size have no risk of SRAF printing. Current common practice in OPC is either using the main OPC model or model threshold adjustment (MTA) solution to predict the SRAF printing. This paper studies the feasibility of SRAF printing prediction using logistic regression (LR). Logistic regression is a probabilistic classification model that gives discrete binary outputs after receiving sufficient input variables from SRAF printing conditions. In the application of SRAF printing prediction, the binary outputs can be treated as 1 for SRAF-Printing and 0 for No-SRAF-Printing. The experimental work was performed using a 20nm line/space process layer. The results demonstrate that SRAF printing prediction using the LR approach is more accurate than the MTA solution. Overall error rates as low as 2% in calibration and 5% in verification were achieved by the LR approach, compared with 6% in calibration and 15% in verification for the MTA solution. In addition, the performance of the LR approach was found to be relatively independent and consistent across different resist image planes compared to the MTA solution.
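
    The 1/0 labelling described above can be sketched with a one-feature logistic regression fitted by plain gradient descent. This is a generic illustration, not the paper's model: the single input feature, the toy values and all function names are made up for the example, and the abstract does not say which input variables or fitting procedure the authors used.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.5, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean log-loss with respect to w and b.
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def predict(w, b, x, threshold=0.5):
    """Return 1 (SRAF-Printing) or 0 (No-SRAF-Printing)."""
    return 1 if sigmoid(w * x + b) >= threshold else 0


# Made-up toy data: one normalized, hypothetical feature per SRAF
# and whether that SRAF printed (1) or not (0).
xs = [0.0, 0.1, 0.2, 0.8, 0.9, 1.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(xs, ys)
```

    The model itself outputs a probability; the discrete 1/0 decision comes from thresholding it, so the threshold is a tunable choice rather than part of the fit.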

  12. A Toxicogenomic Approach for the Prediction of Murine Hepatocarcinogenesis Using Ensemble Feature Selection

    PubMed Central

    Eichner, Johannes; Kossler, Nadine; Wrzodek, Clemens; Kalkuhl, Arno; Bach Toft, Dorthe; Ostenfeldt, Nina; Richard, Virgile; Zell, Andreas

    2013-01-01

    The current strategy for identifying the carcinogenicity of drugs involves the 2-year bioassay in male and female rats and mice. As this assay is cost-intensive and time-consuming there is a high interest in developing approaches for the screening and prioritization of drug candidates in preclinical safety evaluations. Predictive models based on toxicogenomics investigations after short-term exposure have shown their potential for assessing the carcinogenic risk. In this study, we investigated a novel method for the evaluation of toxicogenomics data based on ensemble feature selection in conjunction with bootstrapping in order to derive reproducible and characteristic multi-gene signatures. This method was evaluated on a microarray dataset containing global gene expression data from liver samples of both male and female mice. The dataset was generated by the IMI MARCAR consortium and included gene expression profiles of genotoxic and nongenotoxic hepatocarcinogens obtained after treatment of CD-1 mice for 3 or 14 days. We developed predictive models based on gene expression data of both sexes and the models were employed for predicting the carcinogenic class of diverse compounds. Comparing the predictivity of our multi-gene signatures against signatures from literature, we demonstrated that by incorporating our gene sets as features slightly higher accuracy is on average achieved by a representative set of state-of-the-art supervised learning methods. The constructed models were also used for the classification of Cyproterone acetate (CPA), Wy-14643 (WY) and Thioacetamide (TAA), whose primary mechanism of carcinogenicity is controversially discussed. Based on the extracted mouse liver gene expression patterns, CPA would be predicted as a nongenotoxic compound. In contrast, both WY and TAA would be classified as genotoxic mouse hepatocarcinogens. PMID:24040119

  13. Quantitative Description of a Protein Fitness Landscape Based on Molecular Features

    PubMed Central

    Meini, María-Rocío; Tomatis, Pablo E.; Weinreich, Daniel M.; Vila, Alejandro J.

    2015-01-01

    Understanding the driving forces behind protein evolution requires the ability to correlate the molecular impact of mutations with organismal fitness. To address this issue, we employ here metallo-β-lactamases as a model system, which are Zn(II) dependent enzymes that mediate antibiotic resistance. We present a study of all the possible evolutionary pathways leading to a metallo-β-lactamase variant optimized by directed evolution. By studying the activity, stability and Zn(II) binding capabilities of all mutants in the preferred evolutionary pathways, we show that this local fitness landscape is strongly conditioned by epistatic interactions arising from the pleiotropic effect of mutations in the different molecular features of the enzyme. Activity and stability assays in purified enzymes do not provide explanatory power. Instead, measurement of these molecular features in an environment resembling the native one provides an accurate description of the observed antibiotic resistance profile. We report that optimization of Zn(II) binding abilities of metallo-β-lactamases during evolution is more critical than stabilization of the protein to enhance fitness. A global analysis of these parameters allows us to connect genotype with fitness based on quantitative biochemical and biophysical parameters. PMID:25767204

  14. Quantitative Description of a Protein Fitness Landscape Based on Molecular Features.

    PubMed

    Meini, María-Rocío; Tomatis, Pablo E; Weinreich, Daniel M; Vila, Alejandro J

    2015-07-01

    Understanding the driving forces behind protein evolution requires the ability to correlate the molecular impact of mutations with organismal fitness. To address this issue, we employ here metallo-β-lactamases as a model system, which are Zn(II) dependent enzymes that mediate antibiotic resistance. We present a study of all the possible evolutionary pathways leading to a metallo-β-lactamase variant optimized by directed evolution. By studying the activity, stability and Zn(II) binding capabilities of all mutants in the preferred evolutionary pathways, we show that this local fitness landscape is strongly conditioned by epistatic interactions arising from the pleiotropic effect of mutations in the different molecular features of the enzyme. Activity and stability assays in purified enzymes do not provide explanatory power. Instead, measurement of these molecular features in an environment resembling the native one provides an accurate description of the observed antibiotic resistance profile. We report that optimization of Zn(II) binding abilities of metallo-β-lactamases during evolution is more critical than stabilization of the protein to enhance fitness. A global analysis of these parameters allows us to connect genotype with fitness based on quantitative biochemical and biophysical parameters.

  15. INTEGRATIVE ANALYSIS FOR LUNG ADENOCARCINOMA PREDICTS MORPHOLOGICAL FEATURES ASSOCIATED WITH GENETIC VARIATIONS*

    PubMed Central

    WANG, CHAO; SU, HAI; YANG, LIN; HUANG, KUN

    2016-01-01

    Lung cancer is one of the most deadly cancers and lung adenocarcinoma (LUAD) is the most common histological type of lung cancer. However, LUAD is highly heterogeneous due to genetic differences as well as phenotypic differences such as cellular and tissue morphology. In this paper, we systematically examine the relationships between histological features and gene transcription. Specifically, we calculated 283 morphological features from histology images for 201 LUAD patients from TCGA project and identified the morphological feature with strong correlation with patient outcome. We then modeled the morphology feature using multiple co-expressed gene clusters using Lasso-regression. Many of the gene clusters are highly associated with genetic variations, specifically DNA copy number variations, implying that genetic variations play important roles in the development of cancer morphology. As far as we know, our finding is the first to directly link the genetic variations and functional genomics to LUAD histology. These observations will lead to new insight on lung cancer development and potential new integrative biomarkers for predicting patient prognosis and response to treatments. PMID:27896964

  16. Respiratory trace feature analysis for the prediction of respiratory-gated PET quantification

    NASA Astrophysics Data System (ADS)

    Wang, Shouyi; Bowen, Stephen R.; Chaovalitwongse, W. Art; Sandison, George A.; Grabowski, Thomas J.; Kinahan, Paul E.

    2014-02-01

    The benefits of respiratory gating in quantitative PET/CT vary tremendously between individual patients. Respiratory pattern is among many patient-specific characteristics that are thought to play an important role in gating-induced imaging improvements. However, the quantitative relationship between patient-specific characteristics of respiratory pattern and improvements in quantitative accuracy from respiratory-gated PET/CT has not been well established. If such a relationship could be estimated, then patient-specific respiratory patterns could be used to prospectively select appropriate motion compensation during image acquisition on a per-patient basis. This study was undertaken to develop a novel statistical model that predicts quantitative changes in PET/CT imaging due to respiratory gating. Free-breathing static FDG-PET images without gating and respiratory-gated FDG-PET images were collected from 22 lung and liver cancer patients on a PET/CT scanner. PET imaging quality was quantified with peak standardized uptake value (SUVpeak) over lesions of interest. Relative differences in SUVpeak between static and gated PET images were calculated to indicate quantitative imaging changes due to gating. A comprehensive multidimensional extraction of the morphological and statistical characteristics of respiratory patterns was conducted, resulting in 16 features that characterize representative patterns of a single respiratory trace. The six most informative features were subsequently extracted using a stepwise feature selection approach. The multiple-regression model was trained and tested based on a leave-one-subject-out cross-validation. The predicted quantitative improvements in PET imaging achieved an accuracy higher than 90% using a criterion with a dynamic error-tolerance range for SUVpeak values. 
The results of this study suggest that our prediction framework could be applied to determine which patients would likely benefit from respiratory motion compensation

  17. Molecular Size and Separability Features of Pea Cell Wall Polysaccharides

    PubMed Central

    Talbott, Lawrence D.; Ray, Peter M.

    1992-01-01

    Relative molecular size distributions of pectic and hemicellulosic polysaccharides of pea (Pisum sativum cv Alaska) third internode primary walls were determined by gel filtration chromatography. Pectic polyuronides have a peak molecular mass of about 1100 kilodaltons, relative to dextran standards. This peak may be partly an aggregate of smaller molecular units, because demonstrable aggregation occurred when samples were concentrated by evaporation. About 86% of the neutral sugars (mostly arabinose and galactose) in the pectin cofractionate with polyuronide in gel filtration chromatography and diethylaminoethyl-cellulose chromatography and appear to be attached covalently to polyuronide chains, probably as constituents of rhamnogalacturonans. However, at least 60% of the wall's arabinan/galactan is not linked covalently to the bulk of its rhamnogalacturonan, either glycosidically or by ester links, but occurs in the hemicellulose fraction, accompanied by negligible uronic acid, and has a peak molecular mass of about 1000 kilodaltons. Xyloglucan, the other principal hemicellulosic polymer, has a peak molecular mass of about 30 kilodaltons (with a secondary, usually minor, peak of approximately 300 kilodaltons) and is mostly not linked glycosidically either to pectic polyuronides or to arabinogalactan. The relatively narrow molecular mass distributions of these polymers suggest mechanisms of co- or postsynthetic control of hemicellulose chain length by the cell. Although the macromolecular features of the mentioned polymers individually agree generally with those shown in the widely disseminated sycamore cell primary wall model, the matrix polymers seem to be associated mostly noncovalently rather than in the covalently interlinked meshwork postulated by that model. 
Xyloglucan and arabinan/galactan may form tightly and more loosely bound layers, respectively, around the cellulose microfibrils, the outer layer interacting with pectic rhamnogalacturonans that occupy

  18. Energy Minimization of Molecular Features Observed on the (110) Face of Lysozyme Crystals

    NASA Technical Reports Server (NTRS)

    Perozzo, Mary A.; Konnert, John H.; Li, Huayu; Nadarajah, Arunan; Pusey, Marc

    1999-01-01

    Molecular dynamics and energy minimization have been carried out using the program XPLOR to check the plausibility of a model lysozyme crystal surface. The molecular features of the (110) face of lysozyme were observed using atomic force microscopy (AFM). A model of the crystal surface was constructed using the PDB file 193L, and was used to simulate an AFM image. Molecule translations, van der Waals radii, and assumed AFM tip shape were adjusted to maximize the correlation coefficient between the experimental and simulated images. The highest degree of correlation (0.92) was obtained with the molecules displaced over 6 Å from their positions within the bulk of the crystal. The quality of this starting model, the extent of energy minimization, and the correlation coefficient between the final model and the experimental data will be discussed.

  19. “Feature Detection” vs. “Predictive Coding” Models of Plant Behavior

    PubMed Central

    Calvo, Paco; Baluška, František; Sims, Andrew

    2016-01-01

    In this article we consider the possibility that plants exhibit anticipatory behavior, a mark of intelligence. If plants are able to anticipate and respond accordingly to varying states of their surroundings, as opposed to merely responding online to environmental contingencies, then such capacity may be in principle testable, and subject to empirical scrutiny. Our main thesis is that adaptive behavior can only take place by way of a mechanism that predicts the environmental sources of sensory stimulation. We propose to test for anticipation in plants experimentally by contrasting two empirical hypotheses: “feature detection” and “predictive coding.” We spell out what these contrasting hypotheses consist of by way of illustration from the animal literature, and consider how to transfer the rationale involved to the plant literature. PMID:27757094

  20. Observations of the interstellar ice grain feature in the Taurus molecular clouds

    SciTech Connect

    Whittet, D.C.B.; Bode, H.F.; Longmore, A.J.; Baines, D.W.T.; Evans, A.

    1983-01-01

    Although water ice was originally proposed as a major constituent of the interstellar grain population (e.g. Oort and van de Hulst, 1946), the advent of infrared astronomy has shown that the expected absorption due to O-H stretching vibrations at 3 μm is elusive. Observations have in fact revealed that the carrier of this feature is apparently restricted to regions deep within dense molecular clouds (Merrill et al., 1976; Willner et al., 1982). However, the exact carrier of this feature is still controversial, and many questions remain as to the conditions required for its appearance. It is also uncertain whether it is restricted to circumstellar shells, rather than the general cloud medium. Detailed discussion of the 3 μm band properties is given elsewhere in this volume. 15 references, 4 figures.

  1. Adult primary pulmonary primitive neuroectodermal tumor: molecular features and translational opportunities.

    PubMed

    Andrei, Mirela; Cramer, Stewart F; Kramer, Zachary B; Zeidan, Amer; Faltas, Bishoy

    2013-02-01

    Primitive neuroectodermal tumors (PNET) arising directly from the lung are very rare but particularly aggressive neoplasms. We report a case of a 31-y-old man with primary pulmonary neuroectodermal tumor. We review the clinical as well as pathological features. As typical for these tumors, the diagnosis was initially delayed in our patient and prognosis was poor despite aggressive surgical resection, postoperative chemotherapy and local irradiation. Recent biological insights have revealed unique chromosomal translocations crucial to the pathogenesis of these tumors, most notably the EWS-FLI-1 translocation. We provide an overview of the molecular features of the Ewing Sarcoma Family of Tumors (ESFT) including PNET and their potential implications for therapeutic targeting.

  2. Pulmonary ground-glass opacity: computed tomography features, histopathology and molecular pathology

    PubMed Central

    Gao, Jian-Wei; Rizzo, Stefania; Ma, Li-Hong; Qiu, Xiang-Yu; Warth, Arne; Seki, Nobuhiko; Hasegawa, Mizue; Zou, Jia-Wei; Li, Qian; Femia, Marco

    2017-01-01

    The incidence of pulmonary ground-glass opacity (GGO) lesions is increasing as a result of the widespread use of multislice spiral computed tomography (CT) and the low-dose CT screening for lung cancer detection. Besides benign lesions, GGOs can be a specific type of lung adenocarcinomas or their preinvasive lesions. Evaluation of pulmonary GGO and investigation of the correlation between CT imaging features and lung adenocarcinoma subtypes or driver genes can be helpful in confirming the diagnosis and in guiding the clinical management. Our review focuses on the pathologic characteristics of GGO detected at CT, involving histopathology and molecular pathology.

  3. Stargardt disease: clinical features, molecular genetics, animal models and therapeutic options

    PubMed Central

    Tanna, Preena; Strauss, Rupert W; Fujinami, Kaoru; Michaelides, Michel

    2017-01-01

    Stargardt disease (STGD1; MIM 248200) is the most prevalent inherited macular dystrophy and is associated with disease-causing sequence variants in the gene ABCA4. Significant advances have been made over the last 10 years in our understanding of both the clinical and molecular features of STGD1, and also the underlying pathophysiology, which has culminated in ongoing and planned human clinical trials of novel therapies. The aims of this review are to describe the detailed phenotypic and genotypic characteristics of the disease, conventional and novel imaging findings, current knowledge of animal models and pathogenesis, and the multiple avenues of intervention being explored. PMID:27491360

  4. Transient protein-protein interface prediction: datasets, features, algorithms, and the RAD-T predictor

    PubMed Central

    2014-01-01

    Background: Transient protein-protein interactions (PPIs), which underlie most biological processes, are a prime target for therapeutic development. Immense progress has been made towards computational prediction of PPIs using methods such as protein docking and sequence analysis. However, docking generally requires high resolution structures of both of the binding partners and sequence analysis requires that a significant number of recurrent patterns exist for the identification of a potential binding site. Researchers have turned to machine learning to overcome some of the other methods’ restrictions by generalising interface sites with sets of descriptive features. Best practices for dataset generation, features, and learning algorithms have not yet been identified or agreed upon, and an analysis of the overall efficacy of machine learning based PPI predictors is due, in order to highlight potential areas for improvement. Results: The presence of unknown interaction sites as a result of limited knowledge about protein interactions in the testing set dramatically reduces prediction accuracy. Greater accuracy in labelling the data by enforcing higher interface site rates per domain resulted in an average 44% improvement across multiple machine learning algorithms. A set of 10 biologically unrelated proteins that were consistently predicted with high accuracy emerged through our analysis. We identify seven features with the most predictive power over multiple datasets and machine learning algorithms. Through our analysis, we created a new predictor, RAD-T, that outperforms existing non-structurally specializing machine learning protein interface predictors, with an average 59% increase in MCC score on a dataset with a high number of interactions. Conclusion: Current methods of evaluating machine-learning based PPI predictors tend to undervalue their performance, which may be artificially decreased by the presence of unidentified interaction sites. Changes to

  5. Toll-Like Receptor 7 Agonists: Chemical Feature Based Pharmacophore Identification and Molecular Docking Studies

    PubMed Central

    Sun, Lidan; Zhang, Liangren; Sun, Gang; Wang, Zhanli; Yu, Yongchun

    2013-01-01

    Chemical feature based pharmacophore models were generated for Toll-like receptor 7 (TLR7) agonists using the HypoGen algorithm, which is implemented in the Discovery Studio software. Several methods and tools used in the validation of the pharmacophore model are presented. The first hypothesis, Hypo1, was considered to be the best pharmacophore model; it consists of four features: one hydrogen bond acceptor, one hydrogen bond donor, and two hydrophobic features. In addition, homology modeling and molecular docking studies were employed to probe the intermolecular interactions between TLR7 and its agonists. The results further confirmed the reliability of the pharmacophore model. The obtained pharmacophore model (Hypo1) was then employed as a query to screen the Traditional Chinese Medicine Database (TCMD) for other potential lead compounds. One hit was identified as a potent TLR7 agonist, which has antiviral activity against hepatitis virus in vitro. Therefore, our current work provides confidence for the utility of the selected chemical feature based pharmacophore model to design novel TLR7 agonists with desired biological activity. PMID:23526932

  6. Molecular features of interaction between VEGFA and anti-angiogenic drugs used in retinal diseases: a computational approach

    PubMed Central

    Platania, Chiara B. M.; Di Paola, Luisa; Leggio, Gian M.; Romano, Giovanni L.; Drago, Filippo; Salomone, Salvatore; Bucolo, Claudio

    2015-01-01

    Anti-angiogenic agents are biological drugs used for treatment of retinal neovascular degenerative diseases. In this study, we aimed at in silico analysis of interaction of vascular endothelial growth factor A (VEGFA), the main mediator of angiogenesis, with binding domains of anti-angiogenic agents used for treatment of retinal diseases, such as ranibizumab, bevacizumab and aflibercept. The analysis of anti-VEGF/VEGFA complexes was carried out by means of protein-protein docking and molecular dynamics (MD) coupled to molecular mechanics-Poisson Boltzmann Surface Area (MM-PBSA) calculation. Molecular dynamics simulation was further analyzed by protein contact networks. Rough energetic evaluation with protein-protein docking scores revealed that aflibercept/VEGFA complex was characterized by electrostatic stabilization, whereas ranibizumab and bevacizumab complexes were stabilized by Van der Waals (VdW) energy term; these results were confirmed by MM-PBSA. Comparison of MM-PBSA predicted energy terms with experimental binding parameters reported in literature indicated that the high association rate (Kon) of aflibercept to VEGFA was consistent with high stabilizing electrostatic energy. On the other hand, the relatively low experimental dissociation rate (Koff) of ranibizumab may be attributed to lower conformational fluctuations of the ranibizumab/VEGFA complex, higher number of contacts and hydrogen bonds in comparison to bevacizumab and aflibercept. Thus, the anti-angiogenic agents have been found to be considerably different both in terms of molecular interactions and stabilizing energy. Characterization of such features can improve the design of novel biological drugs potentially useful in clinical practice. PMID:26578958

  7. Molecular features of secondary vascular tissue regeneration after bark girdling in Populus.

    PubMed

    Zhang, Jing; Gao, Ge; Chen, Jia-Jia; Taylor, Gail; Cui, Ke-Ming; He, Xin-Qiang

    2011-12-01

    Regeneration is a common strategy for plants to repair damage to their tissue after attacks from other organisms or physical assaults. However, how differentiating cells acquire regenerative competence and rebuild the pattern of new tissues remains largely unknown. Using anatomical observation and microarray analysis, we investigated the morphological process and molecular features of secondary vascular tissue regeneration after bark girdling in trees. After bark girdling, new phloem and cambium regenerate from differentiating xylem cells and rebuild secondary vascular tissue pattern within 1 month. Differentiating xylem cells acquire regenerative competence through epigenetic regulation and cell cycle re-entry. The xylem developmental program was blocked, whereas the phloem or cambium program was activated, resulting in the secondary vascular tissue pattern re-establishment. Phytohormones play important roles in vascular tissue regeneration. We propose a model describing the molecular features of secondary vascular tissue regeneration after bark girdling in trees. It provides information for understanding mechanisms of tissue regeneration and pattern formation of the secondary vascular tissues in plants.

  8. MELANCHOLIC DEPRESSION PREDICTION BY IDENTIFYING REPRESENTATIVE FEATURES IN METABOLIC AND MICROARRAY PROFILES WITH MISSING VALUES

    PubMed Central

    Nie, Zhi; Yang, Tao; Liu, Yashu; Lin, Binbin; Li, Qingyang; Narayan, Vaibhav A; Wittenberg, Gayle; Ye, Jieping

    2014-01-01

    Recent studies have revealed that melancholic depression, one major subtype of depression, is closely associated with the concentration of some metabolites and biological functions of certain genes and pathways. Meanwhile, recent advances in biotechnologies have allowed us to collect a large amount of genomic data, e.g., metabolites and microarray gene expression. With such a huge amount of information available, one approach that can give us new insights into the understanding of the fundamental biology underlying melancholic depression is to build disease status prediction models using classification or regression methods. However, the existence of strong empirical correlations, e.g., those exhibited by genes sharing the same biological pathway in microarray profiles, tremendously limits the performance of these methods. Furthermore, the occurrence of missing values which are ubiquitous in biomedical applications further complicates the problem. In this paper, we hypothesize that the problem of missing values might in some way benefit from the correlation between the variables and propose a method to learn a compressed set of representative features through an adapted version of sparse coding which is capable of identifying correlated variables and addressing the issue of missing values simultaneously. An efficient algorithm is also developed to solve the proposed formulation. We apply the proposed method on metabolic and microarray profiles collected from a group of subjects consisting of both patients with melancholic depression and healthy controls. Results show that the proposed method can not only produce meaningful clusters of variables but also generate a set of representative features that achieve superior classification performance over those generated by traditional clustering and data imputation techniques. In particular, on both datasets, we found that in comparison with the competing algorithms, the representative features learned by the proposed

  9. Melancholic depression prediction by identifying representative features in metabolic and microarray profiles with missing values.

    PubMed

    Nie, Zhi; Yang, Tao; Liu, Yashu; Li, Qingyang; Narayan, Vaibhav A; Wittenberg, Gayle; Ye, Jieping

    2015-01-01

    Recent studies have revealed that melancholic depression, one major subtype of depression, is closely associated with the concentration of some metabolites and biological functions of certain genes and pathways. Meanwhile, recent advances in biotechnologies have allowed us to collect a large amount of genomic data, e.g., metabolites and microarray gene expression. With such a huge amount of information available, one approach that can give us new insights into the understanding of the fundamental biology underlying melancholic depression is to build disease status prediction models using classification or regression methods. However, the existence of strong empirical correlations, e.g., those exhibited by genes sharing the same biological pathway in microarray profiles, tremendously limits the performance of these methods. Furthermore, the occurrence of missing values which are ubiquitous in biomedical applications further complicates the problem. In this paper, we hypothesize that the problem of missing values might in some way benefit from the correlation between the variables and propose a method to learn a compressed set of representative features through an adapted version of sparse coding which is capable of identifying correlated variables and addressing the issue of missing values simultaneously. An efficient algorithm is also developed to solve the proposed formulation. We apply the proposed method on metabolic and microarray profiles collected from a group of subjects consisting of both patients with melancholic depression and healthy controls. Results show that the proposed method can not only produce meaningful clusters of variables but also generate a set of representative features that achieve superior classification performance over those generated by traditional clustering and data imputation techniques. In particular, on both datasets, we found that in comparison with the competing algorithms, the representative features learned by the proposed

  10. Machine Learning Approaches for Integrating Clinical and Imaging Features in LLD Classification and Response Prediction

    PubMed Central

    Patel, Meenal J.; Andreescu, Carmen; Price, Julie C.; Edelman, Kathryn L.; Reynolds, Charles F.; Aizenstein, Howard J.

    2015-01-01

    Objective: Currently, depression diagnosis relies primarily on behavioral symptoms and signs, and treatment is guided by trial and error instead of evaluating associated underlying brain characteristics. Unlike past studies, we attempted to estimate accurate prediction models for late-life depression diagnosis and treatment response using multiple machine learning methods with inputs of multi-modal imaging and non-imaging whole brain and network-based features. Methods: Late-life depression patients (medicated post-recruitment) [n=33] and elderly non-depressed individuals [n=35] were recruited. Their demographics and cognitive ability scores were recorded, and brain characteristics were acquired using multi-modal magnetic resonance imaging pre-treatment. Linear and nonlinear learning methods were tested for estimating accurate prediction models. Results: A learning method called alternating decision trees estimated the most accurate prediction models for late-life depression diagnosis (87.27% accuracy) and treatment response (89.47% accuracy). The diagnosis model included measures of age, mini-mental state examination score, and structural imaging (e.g. whole brain atrophy and global white matter hyperintensity burden). The treatment response model included measures of structural and functional connectivity. Conclusions: Combinations of multi-modal imaging and/or non-imaging measures may help better predict late-life depression diagnosis and treatment response. As a preliminary observation, we speculate the results may also suggest that different underlying brain characteristics defined by multi-modal imaging measures, rather than region-based differences, are associated with depression versus depression recovery, since to our knowledge this is the first depression study to accurately predict both using the same approach. These findings may help better understand late-life depression and identify preliminary steps towards personalized late-life depression treatment

  11. Computational intelligence models to predict porosity of tablets using minimum features

    PubMed Central

    Khalid, Mohammad Hassan; Kazemi, Pezhman; Perez-Gandarillas, Lucia; Michrafy, Abderrahim; Szlęk, Jakub; Jachowicz, Renata; Mendyk, Aleksander

    2017-01-01

    The effects of different formulations and manufacturing process conditions on the physical properties of a solid dosage form are of importance to the pharmaceutical industry. It is vital to have in-depth understanding of the material properties and governing parameters of its processes in response to different formulations. Understanding the mentioned aspects will allow tighter control of the process, leading to implementation of quality-by-design (QbD) practices. Computational intelligence (CI) offers an opportunity to create empirical models that can be used to describe the system and predict future outcomes in silico. CI models can help explore the behavior of input parameters, unlocking deeper understanding of the system. This research endeavor presents CI models to predict the porosity of tablets created by roll-compacted binary mixtures, which were milled and compacted under systematically varying conditions. CI models were created using tree-based methods, artificial neural networks (ANNs), and symbolic regression trained on an experimental data set and screened using root-mean-square error (RMSE) scores. The experimental data were composed of proportion of microcrystalline cellulose (MCC) (in percentage), granule size fraction (in micrometers), and die compaction force (in kilonewtons) as inputs and porosity as an output. The resulting models show impressive generalization ability, with ANNs (normalized root-mean-square error [NRMSE] =1%) and symbolic regression (NRMSE =4%) as the best-performing methods, also exhibiting reliable predictive behavior when presented with a challenging external validation data set (best achieved symbolic regression: NRMSE =3%). Symbolic regression demonstrates the transition from the black box modeling paradigm to more transparent predictive models. Predictive performance and feature selection behavior of CI models hints at the most important variables within this factor space. PMID:28138223

  12. Computational intelligence models to predict porosity of tablets using minimum features.

    PubMed

    Khalid, Mohammad Hassan; Kazemi, Pezhman; Perez-Gandarillas, Lucia; Michrafy, Abderrahim; Szlęk, Jakub; Jachowicz, Renata; Mendyk, Aleksander

    2017-01-01

    The effects of different formulations and manufacturing process conditions on the physical properties of a solid dosage form are of importance to the pharmaceutical industry. It is vital to have in-depth understanding of the material properties and governing parameters of its processes in response to different formulations. Understanding the mentioned aspects will allow tighter control of the process, leading to implementation of quality-by-design (QbD) practices. Computational intelligence (CI) offers an opportunity to create empirical models that can be used to describe the system and predict future outcomes in silico. CI models can help explore the behavior of input parameters, unlocking deeper understanding of the system. This research endeavor presents CI models to predict the porosity of tablets created by roll-compacted binary mixtures, which were milled and compacted under systematically varying conditions. CI models were created using tree-based methods, artificial neural networks (ANNs), and symbolic regression trained on an experimental data set and screened using root-mean-square error (RMSE) scores. The experimental data were composed of proportion of microcrystalline cellulose (MCC) (in percentage), granule size fraction (in micrometers), and die compaction force (in kilonewtons) as inputs and porosity as an output. The resulting models show impressive generalization ability, with ANNs (normalized root-mean-square error [NRMSE] =1%) and symbolic regression (NRMSE =4%) as the best-performing methods, also exhibiting reliable predictive behavior when presented with a challenging external validation data set (best achieved symbolic regression: NRMSE =3%). Symbolic regression demonstrates the transition from the black box modeling paradigm to more transparent predictive models. Predictive performance and feature selection behavior of CI models hints at the most important variables within this factor space.
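
    The NRMSE figures quoted in the abstract above can be illustrated with a minimal sketch. The abstract does not state which normalization was used, so dividing the RMSE by the range of the observed values is an assumption here, and the function name `nrmse` and the sample porosity values are hypothetical, not data from the study.

```python
import math


def nrmse(observed, predicted):
    """Root-mean-square error divided by the range of the observed values.

    Range normalization is an assumption; other conventions divide by the
    mean or the standard deviation of the observations instead.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean of the squared residuals, then its square root (RMSE).
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    rmse = math.sqrt(mse)
    span = max(observed) - min(observed)
    if span == 0:
        raise ValueError("observed values have zero range")
    return rmse / span


# Hypothetical porosity values (percent), for illustration only.
example = nrmse([10.0, 20.0, 30.0], [12.0, 18.0, 30.0])
```

    Multiplying the returned fraction by 100 gives a percentage comparable in form to the values reported in the abstract.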

  13. In silico predictive studies of mAHR congener binding using homology modelling and molecular docking.

    PubMed

    Panda, Roshni; Cleave, A Suneetha Susan; Suresh, P K

    2014-09-01

    The aryl hydrocarbon receptor (AHR) is a principal xenobiotic nuclear receptor responsible for the early events involved in the transcription of a complex set of genes comprising the CYP450 gene family. In the present computational study, homology modelling and molecular docking were carried out with the objective of predicting the relationship between the binding efficiency and the lipophilicity of different polychlorinated biphenyl (PCB) congeners and the AHR in silico. A homology model of the murine AHR was constructed by several automated servers and assessed by PROCHECK, ERRAT, VERIFY3D and WHAT IF. The resulting model of the AHR by MODWEB was used to carry out molecular docking of 36 PCB congeners using the PatchDock server. The lipophilicity of the congeners was predicted using the XLOGP3 tool. The results suggest that lipophilicity influences the binding energy scores and is positively correlated with them. Score and Log P were correlated with r = +0.506 at the p = 0.01 level. In addition, the number of chlorine (Cl) atoms and Log P were highly correlated, with r = +0.900 at the p = 0.01 level. The number of Cl atoms and the scores also showed a moderate positive correlation of r = +0.481 at the p = 0.01 level. To the best of our knowledge, this is the first study employing PatchDock in the docking of AHR to the environmentally deleterious congeners and attempting to correlate structural features of the AHR with its biochemical properties with regard to PCBs. The results of this study are consistent with those of other computational studies reported in the previous literature, which suggest that a combination of docking, scoring and ranking organic pollutants could be a possible predictive tool for investigating ligand-mediated toxicity, for their subsequent validation using wet lab-based studies.
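
    The r values reported above are correlation coefficients between paired quantities such as docking score and Log P. As a minimal sketch, assuming these are Pearson coefficients (the abstract does not name the statistic), the calculation looks like the following. The function name `pearson_r` and the sample numbers are hypothetical and are not taken from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical Log P values and docking scores, for illustration only.
log_p = [4.0, 5.0, 6.0, 7.0]
scores = [4100.0, 4300.0, 4250.0, 4600.0]
r = pearson_r(log_p, scores)
```

    A value of r near +1 indicates that the two quantities rise together; significance testing, as in the p = 0.01 levels quoted above, is a separate step not shown here.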

  14. Infrared images of reflection nebulae and Orion's bar: Fluorescent molecular hydrogen and the 3.3 micron feature

    NASA Technical Reports Server (NTRS)

    Burton, Michael G.; Moorhouse, Alan; Brand, P. W. J. L.; Roche, Patrick F.; Geballe, T. R.

    1989-01-01

    Images were obtained of the (fluorescent) molecular hydrogen 1-0 S(1) line, and of the 3.3 micron emission feature, in Orion's Bar and three reflection nebulae. The emission from these species appears to come from the same spatial locations in all sources observed. This suggests that the 3.3 micron feature is excited by the same energetic UV-photons which cause the molecular hydrogen to fluoresce.

  15. DNABP: Identification of DNA-Binding Proteins Based on Feature Selection Using a Random Forest and Predicting Binding Residues

    PubMed Central

    Guo, Jing; Sun, Xiao

    2016-01-01

    DNA-binding proteins are fundamentally important in cellular processes. Several computational-based methods have been developed to improve the prediction of DNA-binding proteins in previous years. However, insufficient work has been done on the prediction of DNA-binding proteins from protein sequence information. In this paper, a novel predictor, DNABP (DNA-binding proteins), was designed to predict DNA-binding proteins using the random forest (RF) classifier with a hybrid feature. The hybrid feature contains two types of novel sequence features, which reflect information about the conservation of physicochemical properties of the amino acids, and the binding propensity of DNA-binding residues and non-binding propensities of non-binding residues. The comparisons with each feature demonstrated that these two novel features contributed most to the improvement in predictive ability. Furthermore, to improve the prediction performance of the DNABP model, feature selection using the minimum redundancy maximum relevance (mRMR) method combined with incremental feature selection (IFS) was carried out during the model construction. The results showed that the DNABP model could achieve 86.90% accuracy, 83.76% sensitivity, 90.03% specificity and a Matthews correlation coefficient of 0.727. High prediction accuracy and performance comparisons with previous research suggested that DNABP could be a useful approach to identify DNA-binding proteins from sequence information. The DNABP web server system is freely available at http://www.cbi.seu.edu.cn/DNABP/. PMID:27907159
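
    The four performance figures quoted above (accuracy, sensitivity, specificity and Matthews correlation coefficient) all follow from a binary confusion matrix. The sketch below uses the standard definitions; the function name `confusion_metrics` and the counts passed to it are hypothetical and are not the DNABP results.

```python
import math


def confusion_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the MCC is taken as 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts, for illustration only.
metrics = confusion_metrics(tp=8, fp=2, tn=7, fn=3)
```

    Unlike accuracy, the MCC uses all four cells of the confusion matrix, which is why it is often reported alongside sensitivity and specificity when classes are imbalanced.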

  16. Prediction of molecular properties including symmetry from quantum-based molecular structural formulas, VIF.

    PubMed

    Alia, Joseph D; Vlaisavljevich, Bess; Abbot, Matthew; Warneke, Hallie; Mastin, Tyson

    2008-10-09

    Structurally covariant valency interaction formulas, VIF, gain chemical significance by comparison with resonance structures and natural bond orbital, NBO, bonding schemes and at the same time allow for additional prediction such as symmetry of ring systems and destabilization of electron pairs with respect to reference energy of -1/2 Eh. Comparisons are based on three chemical interpretations of Sinanoğlu's theory of structural covariance: (1) sets of structurally covariant quantum structural formulas, VIF, are interpreted as the same quantum operator represented in linearly related basis frames; (2) structurally covariant VIF pictures are interpreted as sets of molecular species with similar energy; and (3) the same VIF picture can be interpreted as different quantum operators, one-electron density or Hamiltonian, for example. According to these three interpretations, bond pair, lone pair, and free radical electrons understood in terms of a localized orbital representation are recognized as having energies above, below, or equal to a predetermined reference, frequently -1/2 Eh. The probable position of electron pairs and radical electrons is predicted. The selectivity of concerted ring closures in allyl anion and cation is described. Symmetries of conjugated ring systems are predicted according to their numbers of pi-electrons and spin-multiplicity. The pi-distortivity of benzene is predicted. The 3c/2e- H-bridging bonds in diborane are derived in a natural way according to the notion that the bridging bonds will have delocalizing interactions between them consistent with results of the NBO method. Key chemical bonding motifs are described using VIF. These include 2c/1e-, 2c/2e-, 2c/3e-, 3c/2e-, 3c/3e-, 3c/4e-, 4n antiaromatic, and 4n+2 aromatic bonding systems. Some common organic functional groups are represented as VIF pictures and because these pictures can be interpreted simultaneously as one-electron density and Hamiltonian operators, the valence shell

  17. Importance of Molecular Features of Non–Small Cell Lung Cancer for Choice of Treatment

    PubMed Central

    Moran, Cesar

    2011-01-01

    Lung cancer is the leading cause of cancer-related deaths in the United States. Approximately 85% of lung cancer is categorized as non–small cell lung cancer, and traditionally, non–small cell lung cancer has been treated with surgery, radiation, and chemotherapy. Targeted agents that inhibit the epidermal growth factor receptor pathway have been developed and integrated into the treatment regimens in non–small cell lung cancer. Currently, approved epidermal growth factor receptor inhibitors include the tyrosine kinase inhibitors erlotinib and gefitinib. Molecular determinants, such as epidermal growth factor receptor–activating mutations, have been associated with response to epidermal growth factor receptor tyrosine kinase inhibitors and may be used to guide treatment choices in patients with non–small cell lung cancer. Thus, treatment choice for patients with non–small cell lung cancer depends on molecular features of tumors; however, improved techniques are required to increase the specificity and efficiency of molecular profiling so that these methods can be incorporated into routine clinical practice. This review provides an overview of how genetic analysis is currently used to direct treatment choices in non–small cell lung cancer. PMID:21514411

  18. Time Score: A New Feature for Link Prediction in Social Networks

    NASA Astrophysics Data System (ADS)

    Munasinghe, Lankeshwara; Ichise, Ryutaro

    Link prediction in social networks, such as friendship networks and coauthorship networks, has recently attracted a great deal of attention. There have been numerous attempts to address the problem of link prediction through diverse approaches. In the present paper, we focus on the temporal behavior of the link strength, particularly the relationship between the time stamps of interactions or links and the temporal behavior of link strength and how link strength affects future link evolution. Most previous studies have not sufficiently discussed either the impact of time stamps of the interactions or time stamps of the links on link evolution. The gap between the current time and the time stamps of the interactions or links is also important to link evolution. In the present paper, we introduce a new time-aware feature, referred to as time score, that captures the important aspects of time stamps of interactions and the temporality of the link strengths. We also analyze the effectiveness of time score with different parameter settings for different network data sets. The results of the analysis revealed that the time score was sensitive to different networks and different time measures. We applied time score to two social network data sets, namely, Facebook friendship network data set and a coauthorship network data set. The results revealed a significant improvement in predicting future links.

  19. Ribonucleotide reductases reveal novel viral diversity and predict biological and ecological features of unknown marine viruses

    PubMed Central

    Sakowski, Eric G.; Munsell, Erik V.; Hyatt, Mara; Kress, William; Williamson, Shannon J.; Nasko, Daniel J.; Polson, Shawn W.; Wommack, K. Eric

    2014-01-01

    Virioplankton play a crucial role in aquatic ecosystems as top-down regulators of bacterial populations and agents of horizontal gene transfer and nutrient cycling. However, the biology and ecology of virioplankton populations in the environment remain poorly understood. Ribonucleotide reductases (RNRs) are ancient enzymes that reduce ribonucleotides to deoxyribonucleotides and thus prime DNA synthesis. Composed of three classes according to O2 reactivity, RNRs can be predictive of the physiological conditions surrounding DNA synthesis. RNRs are universal among cellular life, common within viral genomes and virioplankton shotgun metagenomes (viromes), and estimated to occur within >90% of the dsDNA virioplankton sampled in this study. RNRs occur across diverse viral groups, including all three morphological families of tailed phages, making these genes attractive for studies of viral diversity. Differing patterns in virioplankton diversity were clear from RNRs sampled across a broad oceanic transect. The most abundant RNRs belonged to novel lineages of podoviruses infecting α-proteobacteria, a bacterial class critical to oceanic carbon cycling. RNR class was predictive of phage morphology among cyanophages and RNR distribution frequencies among cyanophages were largely consistent with the predictions of the “kill the winner–cost of resistance” model. RNRs were also identified for the first time to our knowledge within ssDNA viromes. These data indicate that RNR polymorphism provides a means of connecting the biological and ecological features of virioplankton populations. PMID:25313075

  20. Functional connectivity classification of autism identifies highly predictive brain features but falls short of biomarker standards

    PubMed Central

    Plitt, Mark; Barnes, Kelly Anne; Martin, Alex

    2014-01-01

    Objectives: Autism spectrum disorders (ASD) are diagnosed based on early-manifesting clinical symptoms, including markedly impaired social communication. We assessed the viability of resting-state functional MRI (rs-fMRI) connectivity measures as diagnostic biomarkers for ASD and investigated which connectivity features are predictive of a diagnosis. Methods: Rs-fMRI scans from 59 high-functioning males with ASD and 59 age- and IQ-matched typically developing (TD) males were used to build a series of machine learning classifiers. Classification features were obtained using 3 sets of brain regions. Another set of classifiers was built from participants' scores on behavioral metrics. An additional age- and IQ-matched cohort of 178 individuals (89 ASD; 89 TD) from the Autism Brain Imaging Data Exchange (ABIDE) open-access dataset (http://fcon_1000.projects.nitrc.org/indi/abide/) was included for replication. Results: High classification accuracy was achieved through several rs-fMRI methods (peak accuracy 76.67%). However, classification via behavioral measures consistently surpassed rs-fMRI classifiers (peak accuracy 95.19%). The class probability estimates, P(ASD|fMRI data), from brain-based classifiers significantly correlated with scores on a measure of social functioning, the Social Responsiveness Scale (SRS), as did the most informative features from 2 of the 3 sets of brain-based features. The most informative connections predominantly originated from regions strongly associated with social functioning. Conclusions: While individuals can be classified as having ASD with statistically significant accuracy from their rs-fMRI scans alone, this method falls short of biomarker standards. Classification methods provided further evidence that ASD functional connectivity is characterized by dysfunction of large-scale functional networks, particularly those involved in social information processing. PMID:25685703

  1. The Prediction of Biological Activity Using Molecular Connectivity Indices.

    DTIC Science & Technology

    1986-04-23

    toxicities of 15 organotin compounds against Daphnia magna . -18- This study confirmed that molecular topology can be employed to model the behavior...parameter correlation of the toxicity of polycyclic aromatic hydrocarbons in Daphnia Pulex with 0XV: -log LC50 = 0.5346 OXV - 7.004 (r = 0.9972, n...CRC Press, Boca Raton, Florida, * 1983, chap. 4, pp. 105-140. 8. L.B. Kier and L.H. Hall, Molecular Connectivity in Chemistry and Drug * Research

  2. Larval description of Drusus bosnicus Klapálek 1899 (Trichoptera: Limnephilidae), with distributional, molecular and ecological features

    PubMed Central

    KUČINIĆ, MLADEN; PREVIŠIĆ, ANA; GRAF, WOLFRAM; MIHOCI, IVA; ŠOUFEK, MARIN; STANIĆ-KOŠTROMAN, SVJETLANA; LELO, SUVAD; VITECEK, SIMON; WARINGER, JOHANN

    2016-01-01

    In this study we present morphological, molecular and ecological features of the last instar larvae of Drusus bosnicus, with data on the distribution of this species in Bosnia and Herzegovina. Also included are the most important diagnostic features enabling separation of larvae of D. bosnicus from larvae of the other European Drusinae and Trichoptera species. PMID:26249056

  3. Automatic Recognition of Solar Features for Developing Data Driven Prediction Models of Solar Activity and Space Weather

    DTIC Science & Technology

    2013-05-01

    Aschwanden, M. J. 2005, Physics of the Solar Corona . An Introduction with Problems and Solutions (2nd edition), ed. Aschwanden, M. J. Balasubramaniam, K...AFRL-OSR-VA-TR-2013-0020 Automatic Recognition of Solar Features for Developing Data Driven Prediction Models of Solar Activity...Automatic Recognition of Solar Features for Developing Data Driven Prediction Models of Solar Activity and Space Weather 5a. CONTRACT NUMBER FA9550-09

  4. Pre-transplantation minimal residual disease with cytogenetic and molecular diagnostic features improves risk stratification in acute myeloid leukemia

    PubMed Central

    Oran, Betül; Jorgensen, Jeff L.; Marin, David; Wang, Sa; Ahmed, Sairah; Alousi, Amin M.; Andersson, Borje S.; Bashir, Qaiser; Bassett, Roland; Lyons, Genevieve; Chen, Julianne; Rezvani, Katy; Popat, Uday; Kebriaei, Partow; Patel, Keyur; Rondon, Gabriela; Shpall, Elizabeth J.; Champlin, Richard E.

    2017-01-01

    Our aim was to improve outcome prediction after allogeneic hematopoietic stem cell transplantation in acute myeloid leukemia by combining cytogenetic and molecular data at diagnosis with minimal residual disease assessment by multicolor flow-cytometry at transplantation. Patients with acute myeloid leukemia in first complete remission in whom minimal residual disease was assessed at transplantation were included and categorized according to the European LeukemiaNet classification. The primary outcome was 1-year relapse incidence after transplantation. Of 152 patients eligible, 48 had minimal residual disease at the time of their transplant. Minimal residual disease-positive patients were older, required more therapy to achieve first remission, were more likely to have incomplete recovery of blood counts and had more adverse risk features by cytogenetics. Relapse incidence at 1 year was higher in patients with minimal residual disease (32.6% versus 14.4%, P=0.002). Leukemia-free survival (43.6% versus 64%, P=0.007) and overall survival (48.8% versus 66.9%, P=0.008) rates were also inferior in patients with minimal residual disease. In multivariable analysis, minimal residual disease status at transplantation independently predicted 1-year relapse incidence, identifying a subgroup of intermediate-risk patients, according to the European LeukemiaNet classification, with a particularly poor outcome. Assessment of minimal residual disease at transplantation in combination with cytogenetic and molecular findings provides powerful independent prognostic information in acute myeloid leukemia, lending support to the incorporation of minimal residual disease detection to refine risk stratification and develop a more individualized approach during hematopoietic stem cell transplantation. PMID:27540139

  5. Delta-radiomics features for the prediction of patient outcomes in non-small cell lung cancer.

    PubMed

    Fave, Xenia; Zhang, Lifei; Yang, Jinzhong; Mackin, Dennis; Balter, Peter; Gomez, Daniel; Followill, David; Jones, Aaron Kyle; Stingo, Francesco; Liao, Zhongxing; Mohan, Radhe; Court, Laurence

    2017-04-03

    Radiomics is the use of quantitative imaging features extracted from medical images to characterize tumor pathology or heterogeneity. Features measured at pretreatment have successfully predicted patient outcomes in numerous cancer sites. This project was designed to determine whether radiomics features measured from non-small cell lung cancer (NSCLC) change during therapy and whether those features (delta-radiomics features) can improve prognostic models. Features were calculated from pretreatment and weekly intra-treatment computed tomography images for 107 patients with stage III NSCLC. Pretreatment images were used to determine feature-specific image preprocessing. Linear mixed-effects models were used to identify features that changed significantly with dose-fraction. Multivariate models were built for overall survival, distant metastases, and local recurrence using only clinical factors, clinical factors and pretreatment radiomics features, and clinical factors, pretreatment radiomics features, and delta-radiomics features. All of the radiomics features changed significantly during radiation therapy. For overall survival and distant metastases, pretreatment compactness improved the c-index. For local recurrence, pretreatment imaging features were not prognostic, while texture-strength measured at the end of treatment significantly stratified high- and low-risk patients. These results suggest radiomics features change due to radiation therapy and their values at the end of treatment may be indicators of tumor response.

  6. Predicting the Poaceae pollen season: six month-ahead forecasting and identification of relevant features.

    PubMed

    Navares, Ricardo; Aznarte, José Luis

    2017-04-01

    In this paper, we approach the problem of predicting the concentrations of Poaceae pollen which define the main pollination season in the city of Madrid. A classification-based approach, based on a computational intelligence model (random forests), is applied to forecast the dates on which risk concentration levels are to be observed. Unlike previous works, the proposal extends the range of forecasting horizons up to 6 months ahead. Furthermore, the proposed model allows the most influential factors for each horizon to be determined, making no assumptions about the significance of the weather features. The performance of the proposed model shows it to be a successful tool for allergy patients in preventing and minimizing exposure to risky pollen concentrations and for researchers to gain deeper insight into the factors driving the pollination season.

  7. Predicting the Poaceae pollen season: six month-ahead forecasting and identification of relevant features

    NASA Astrophysics Data System (ADS)

    Navares, Ricardo; Aznarte, José Luis

    2016-09-01

    In this paper, we approach the problem of predicting the concentrations of Poaceae pollen which define the main pollination season in the city of Madrid. A classification-based approach, based on a computational intelligence model (random forests), is applied to forecast the dates on which risk concentration levels are to be observed. Unlike previous works, the proposal extends the range of forecasting horizons up to 6 months ahead. Furthermore, the proposed model allows the most influential factors for each horizon to be determined, making no assumptions about the significance of the weather features. The performance of the proposed model shows it to be a successful tool for allergy patients in preventing and minimizing exposure to risky pollen concentrations and for researchers to gain deeper insight into the factors driving the pollination season.

  8. Sequence features accurately predict genome-wide MeCP2 binding in vivo

    PubMed Central

    Rube, H. Tomas; Lee, Wooje; Hejna, Miroslav; Chen, Huaiyang; Yasui, Dag H.; Hess, John F.; LaSalle, Janine M.; Song, Jun S.; Gong, Qizhi

    2016-01-01

    Methyl-CpG binding protein 2 (MeCP2) is critical for proper brain development and expressed at near-histone levels in neurons, but the mechanism of its genomic localization remains poorly understood. Using high-resolution MeCP2-binding data, we show that DNA sequence features alone can predict binding with 88% accuracy. Integrating MeCP2 binding and DNA methylation in a probabilistic graphical model, we demonstrate that previously reported genome-wide association with methylation is in part due to MeCP2's affinity to GC-rich chromatin, a result replicated using published data. Furthermore, MeCP2 co-localizes with nucleosomes. Finally, MeCP2 binding downstream of promoters correlates with increased expression in Mecp2-deficient neurons. PMID:27008915

  9. Molecular dissection of colorectal cancer in pre-clinical models identifies biomarkers predicting sensitivity to EGFR inhibitors

    PubMed Central

    Schütte, Moritz; Risch, Thomas; Abdavi-Azar, Nilofar; Boehnke, Karsten; Schumacher, Dirk; Keil, Marlen; Yildiriman, Reha; Jandrasits, Christine; Borodina, Tatiana; Amstislavskiy, Vyacheslav; Worth, Catherine L.; Schweiger, Caroline; Liebs, Sandra; Lange, Martin; Warnatz, Hans-Jörg; Butcher, Lee M.; Barrett, James E.; Sultan, Marc; Wierling, Christoph; Golob-Schwarzl, Nicole; Lax, Sigurd; Uranitsch, Stefan; Becker, Michael; Welte, Yvonne; Regan, Joseph Lewis; Silvestrov, Maxine; Kehler, Inge; Fusi, Alberto; Kessler, Thomas; Herwig, Ralf; Landegren, Ulf; Wienke, Dirk; Nilsson, Mats; Velasco, Juan A.; Garin-Chesa, Pilar; Reinhard, Christoph; Beck, Stephan; Schäfer, Reinhold; Regenbrecht, Christian R. A.; Henderson, David; Lange, Bodo; Haybaeck, Johannes; Keilholz, Ulrich; Hoffmann, Jens; Lehrach, Hans; Yaspo, Marie-Laure

    2017-01-01

    Colorectal carcinoma represents a heterogeneous entity, with only a fraction of the tumours responding to available therapies, requiring a better molecular understanding of the disease in precision oncology. To address this challenge, the OncoTrack consortium recruited 106 CRC patients (stages I–IV) and developed a pre-clinical platform generating a compendium of drug sensitivity data totalling >4,000 assays testing 16 clinical drugs on patient-derived in vivo and in vitro models. This large biobank of 106 tumours, 35 organoids and 59 xenografts, with extensive omics data comparing donor tumours and derived models provides a resource for advancing our understanding of CRC. Models recapitulate many of the genetic and transcriptomic features of the donors, but defined less complex molecular sub-groups because of the loss of human stroma. Linking molecular profiles with drug sensitivity patterns identifies novel biomarkers, including a signature outperforming RAS/RAF mutations in predicting sensitivity to the EGFR inhibitor cetuximab. PMID:28186126

  10. Assessment of two mammographic density related features in predicting near-term breast cancer risk

    NASA Astrophysics Data System (ADS)

    Zheng, Bin; Sumkin, Jules H.; Zuley, Margarita L.; Wang, Xingwei; Klym, Amy H.; Gur, David

    2012-02-01

    In order to establish a personalized breast cancer screening program, it is important to develop risk models that have high discriminatory power in predicting the likelihood of a woman developing an imaging detectable breast cancer in near-term (e.g., <3 years after a negative examination in question). In epidemiology-based breast cancer risk models, mammographic density is considered the second highest breast cancer risk factor (second to woman's age). In this study we explored a new feature, namely bilateral mammographic density asymmetry, and investigated the feasibility of predicting near-term screening outcome. The database consisted of 343 negative examinations, of which 187 depicted cancers that were detected during the subsequent screening examination and 155 that remained negative. We computed the average pixel value of the segmented breast areas depicted on each cranio-caudal view of the initial negative examinations. We then computed the mean and difference mammographic density for paired bilateral images. Using woman's age, subjectively rated density (BIRADS), and computed mammographic density related features we compared classification performance in estimating the likelihood of detecting cancer during the subsequent examination using areas under the ROC curves (AUC). The AUCs were 0.63+/-0.03, 0.54+/-0.04, 0.57+/-0.03, 0.68+/-0.03 when using woman's age, BIRADS rating, computed mean density and difference in computed bilateral mammographic density, respectively. Performance increased to 0.62+/-0.03 and 0.72+/-0.03 when we fused mean and difference in density with woman's age. The results suggest that, in this study, bilateral mammographic tissue density is a significantly stronger (p<0.01) risk indicator than both woman's age and mean breast density.
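
    The two computed density features in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the list-of-rows image representation, the boolean masks and the sample pixel values are all hypothetical. The abstract says only that the average pixel value of each segmented breast area was computed and then combined into a mean and a difference for paired bilateral images; taking the absolute difference is an assumption.

```python
def mean_pixel_value(image, mask):
    """Average pixel value over the segmented breast area (where mask is True)."""
    values = [
        pixel
        for image_row, mask_row in zip(image, mask)
        for pixel, inside in zip(image_row, mask_row)
        if inside
    ]
    if not values:
        raise ValueError("mask selects no pixels")
    return sum(values) / len(values)


def bilateral_density_features(left_avg, right_avg):
    """Return (mean density, density difference) for a left/right image pair.

    The absolute difference is an assumption; the source does not say
    whether the difference is signed.
    """
    mean_density = (left_avg + right_avg) / 2
    density_difference = abs(left_avg - right_avg)
    return mean_density, density_difference


# Hypothetical 2x3 "images"; the masks mark the segmented breast area.
left_image = [[10, 20, 30], [40, 50, 60]]
right_image = [[20, 20, 0], [40, 60, 0]]
breast_mask = [[True, True, False], [True, True, False]]

left_avg = mean_pixel_value(left_image, breast_mask)    # (10+20+40+50)/4
right_avg = mean_pixel_value(right_image, breast_mask)  # (20+20+40+60)/4
mean_density, density_difference = bilateral_density_features(left_avg, right_avg)
```

    In a sketch like this, the per-pair features would then be passed, alone or together with age, to whatever classifier is used to estimate the AUC.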

  11. Body Composition Features Predict Overall Survival in Patients With Hepatocellular Carcinoma

    PubMed Central

    Singal, Amit G; Zhang, Peng; Waljee, Akbar K; Ananthakrishnan, Lakshmi; Parikh, Neehar D; Sharma, Pratima; Barman, Pranab; Krishnamurthy, Venkataramu; Wang, Lu; Wang, Stewart C; Su, Grace L

    2016-01-01

    Objectives: Existing prognostic models for patients with hepatocellular carcinoma (HCC) have limitations. Analytic morphomics, a novel process to measure body composition using computational image-processing algorithms, may offer further prognostic information. The aim of this study was to develop and validate a prognostic model for HCC patients using body composition features and objective clinical information. Methods: Using computed tomography scans from a cohort of HCC patients at the VA Ann Arbor Healthcare System between January 2006 and December 2013, we developed a prognostic model using analytic morphomics and routine clinical data based on multivariate Cox regression and regularization methods. We assessed model performance using C-statistics and validated predicted survival probabilities. We validated model performance in an external cohort of HCC patients from Parkland Hospital, a safety-net health system in Dallas County. Results: The derivation cohort consisted of 204 HCC patients (20.1% Barcelona Clinic Liver Cancer classification (BCLC) 0/A), and the validation cohort had 225 patients (22.2% BCLC 0/A). The analytic morphomics model had good prognostic accuracy in the derivation cohort (C-statistic 0.80, 95% confidence interval (CI) 0.71–0.89) and external validation cohort (C-statistic 0.75, 95% CI 0.68–0.82). The accuracy of the analytic morphomics model was significantly higher than that of TNM and BCLC staging systems in derivation (P<0.001 for both) and validation (P<0.001 for both) cohorts. For calibration, mean absolute errors in predicted 1-year survival probabilities were 5.3% (90% quantile of 7.5%) and 7.6% (90% quantile of 12.5%) in the derivation and validation cohorts, respectively. Conclusion: Body composition features, combined with readily available clinical data, can provide valuable prognostic information for patients with newly diagnosed HCC. PMID:27228403
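
    For readers unfamiliar with the C-statistic reported here, the following is a minimal sketch of a concordance index for censored survival data (in the style of Harrell's C). It is not taken from the study: the function name, the tie handling and the toy data are assumptions chosen for illustration. A pair of patients counts as comparable when the one with the shorter follow-up time actually had the event; the pair is concordant when that patient also has the higher predicted risk.

```python
def concordance_index(times, events, risk_scores):
    """Fraction of comparable patient pairs ordered correctly by risk score.

    times       -- observed follow-up times
    events      -- 1 if the event (e.g. death) was observed, 0 if censored
    risk_scores -- model output; higher means higher predicted risk
    Ties in risk score count as half-concordant (an assumption of this sketch).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Comparable only if patient i had the event strictly before time j.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: four patients, the third one censored.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.4, 0.6, 0.1]

c_statistic = concordance_index(times, events, risk_scores)  # 4 of 5 pairs concordant
```

    A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported 0.80 and 0.75 should be read.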

  12. Using theory and simulation to link molecular features of nanoscale fillers to morphology in polymer nanocomposites

    NASA Astrophysics Data System (ADS)

    Jayaraman, Arthi; Martin, Tyler

    2014-03-01

    Polymer nanocomposites are a class of materials that consist of a polymer matrix embedded with nanoscale fillers or additives that enhance the inherent properties of the matrix polymer. To engineer polymer nanocomposites for specific applications with target macroscopic properties (e.g. photovoltaics, photonics, automobile parts) it is important to have design rules that relate molecular features to equilibrium morphology of the composite. In the first part of the talk I will present our recent theory and simulation work on composites containing polymer grafted nanoparticles, showing how polydispersity in graft and matrix polymers (physical heterogeneity) can be used to stabilize dispersion of the nanoparticles within a polymer matrix. In the second part of the talk I will present our recent work linking block-copolymer functionalization to the nanoparticle location in a polymer matrix consisting of homopolymer blends.

  13. Molecular effective coverage surface area of optical clearing agents for predicting optical clearing potential

    NASA Astrophysics Data System (ADS)

    Feng, Wei; Ma, Ning; Zhu, Dan

    2015-03-01

    Improving methods for predicting the performance of optical clearing agents has an important impact on tissue optical clearing techniques. Molecular dynamics simulation is one of the most convincing and simplest approaches for predicting the optical clearing potential (OCP) of agents, by analyzing the hydrogen bonds, hydrogen bridges and hydrogen bridge types that form between agents and collagen. However, these analysis methods still have problems, such as the analysis of cyclic molecules, owing to molecular conformation. In this study, a molecular effective coverage surface area based on molecular dynamics simulation is proposed to predict the potential of optical clearing agents. Several typical cyclic molecules (fructose and glucose) and chain molecules (sorbitol and xylitol) were analyzed by calculating their molecular effective coverage surface area, hydrogen bonds, hydrogen bridges and hydrogen bridge types. To verify this analysis method, the optical clearing efficacy of in vitro skin samples was measured, using a 1951 USAF resolution test target, after 25 min of immersion in solutions of fructose, glucose, sorbitol and xylitol at a concentration of 3.5 M. The experimental results agree with the predictions from molecular effective coverage surface area. Comparison with the other parameters shows that molecular effective coverage surface area performs better in predicting the OCP of agents.

  14. Search performance is better predicted by tileability than presence of a unique basic feature

    PubMed Central

    Chang, Honghua; Rosenholtz, Ruth

    2016-01-01

    Traditional models of visual search such as feature integration theory (FIT; Treisman & Gelade, 1980), have suggested that a key factor determining task difficulty consists of whether or not the search target contains a “basic feature” not found in the other display items (distractors). Here we discriminate between such traditional models and our recent texture tiling model (TTM) of search (Rosenholtz, Huang, Raj, Balas, & Ilie, 2012b), by designing new experiments that directly pit these models against each other. Doing so is nontrivial, for two reasons. First, the visual representation in TTM is fully specified, and makes clear testable predictions, but its complexity makes getting intuitions difficult. Here we elucidate a rule of thumb for TTM, which enables us to easily design new and interesting search experiments. FIT, on the other hand, is somewhat ill-defined and hard to pin down. To get around this, rather than designing totally new search experiments, we start with five classic experiments that FIT already claims to explain: T among Ls, 2 among 5s, Q among Os, O among Qs, and an orientation/luminance-contrast conjunction search. We find that fairly subtle changes in these search tasks lead to significant changes in performance, in a direction predicted by TTM, providing definitive evidence in favor of the texture tiling model as opposed to traditional views of search. PMID:27548090

  15. Beyond intensity: Spectral features effectively predict music-induced subjective arousal.

    PubMed

    Gingras, Bruno; Marin, Manuela M; Fitch, W Tecumseh

    2014-01-01

    Emotions in music are conveyed by a variety of acoustic cues. Notably, the positive association between sound intensity and arousal has particular biological relevance. However, although amplitude normalization is a common procedure used to control for intensity in music psychology research, direct comparisons between emotional ratings of original and amplitude-normalized musical excerpts are lacking. In this study, 30 nonmusicians retrospectively rated the subjective arousal and pleasantness induced by 84 six-second classical music excerpts, and an additional 30 nonmusicians rated the same excerpts normalized for amplitude. Following the cue-redundancy and Brunswik lens models of acoustic communication, we hypothesized that arousal and pleasantness ratings would be similar for both versions of the excerpts, and that arousal could be predicted effectively by other acoustic cues besides intensity. Although the difference in mean arousal and pleasantness ratings between original and amplitude-normalized excerpts correlated significantly with the amplitude adjustment, ratings for both sets of excerpts were highly correlated and shared a similar range of values, thus validating the use of amplitude normalization in music emotion research. Two acoustic parameters, spectral flux and spectral entropy, accounted for 65% of the variance in arousal ratings for both sets, indicating that spectral features can effectively predict arousal. Additionally, we confirmed that amplitude-normalized excerpts were adequately matched for loudness. Overall, the results corroborate our hypotheses and support the cue-redundancy and Brunswik lens models.

  16. Somatic molecular changes and histo-pathological features of colorectal cancer in Tunisia

    PubMed Central

    Aissi, Sana; Buisine, Marie Pierre; Zerimech, Farid; Kourda, Nadia; Moussa, Amel; Manai, Mohamed; Porchet, Nicole

    2013-01-01

    AIM: To determine correlations between family history, clinical features and mutational status of genes involved in the progression of colorectal cancer (CRC). METHODS: Histo-pathological features and molecular changes [KRAS, BRAF and CTNNB1 gene mutations, microsatellite instability (MSI) phenotype, expression of mismatch repair (MMR) and mucin (MUC) 5AC proteins, mutation and expression analysis of TP53, MLH1 promoter hypermethylation analysis] were examined in a series of 51 unselected Tunisian CRC patients, 10 of whom had a proven or probable hereditary disease, in search of new tumor markers for CRC susceptibility in Tunisian patients. RESULTS: As expected, MSI and MMR expression loss were associated with the presence of familial CRC (75% vs 9%, P < 0.001). However, no significant associations were detected between personal or familial cancer history and KRAS (codons 12 and 13) or TP53 (exons 4-9) alterations. A significant inverse relationship was observed between the presence of MSI and TP53 accumulation (10.0% vs 48.8%, P = 0.0335) in CRC tumors, suggesting different molecular pathways to CRC that in turn may reflect different environmental exposures. Interestingly, MUC5AC expression was significantly associated with the presence of MSI (46.7% vs 8.3%, P = 0.0039), MMR expression loss (46.7% vs 8.3%, P = 0.0039) and the presence of familial CRC (63% vs 23%, P = 0.039). CONCLUSION: These findings suggest that MUC5AC expression analysis may be useful in the screening of Tunisian patients with high risk of CRC. PMID:23983431

  17. Spatial Habitat Features Derived from Multiparametric Magnetic Resonance Imaging Data Are Associated with Molecular Subtype and 12-Month Survival Status in Glioblastoma Multiforme

    PubMed Central

    Lee, Joonsang; Narang, Shivali; Martinez, Juan; Rao, Ganesh; Rao, Arvind

    2015-01-01

    One of the most common and aggressive malignant brain tumors is glioblastoma multiforme. Despite multimodality treatment such as radiation therapy and chemotherapy (temozolomide: TMZ), the median survival of glioblastoma patients is less than 15 months. In this study, we investigated the association between measures of spatial diversity derived from spatial point pattern analysis of multiparametric magnetic resonance imaging (MRI) data with molecular status as well as 12-month survival in glioblastoma. We obtained 27 measures of spatial proximity (diversity) via spatial point pattern analysis of multiparametric T1 post-contrast and T2 fluid-attenuated inversion recovery MRI data. These measures were used to predict 12-month survival status (≤12 or >12 months) in 74 glioblastoma patients. Kaplan-Meier with receiver operating characteristic analyses was used to assess the relationship between derived spatial features and 12-month survival status as well as molecular subtype status in patients with glioblastoma. Kaplan-Meier survival analysis revealed that 14 spatial features were capable of stratifying overall survival in a statistically significant manner. For prediction of 12-month survival status based on these diversity indices, sensitivity and specificity were 0.86 and 0.64, respectively. The area under the receiver operating characteristic curve and the accuracy were 0.76 and 0.75, respectively. For prediction of molecular subtype status, the proneural subtype showed the highest accuracy (0.93) among all molecular subtypes based on receiver operating characteristic analysis. We find that measures of spatial diversity from point pattern analysis of intensity habitats from T1 post-contrast and T2 fluid-attenuated inversion recovery images are associated with both tumor subtype status and 12-month survival status and may therefore be useful indicators of patient prognosis, in addition to providing potential guidance for molecularly-targeted therapies in

  18. Wetland features and landscape context predict the risk of wetland habitat loss.

    PubMed

    Gutzwiller, Kevin J; Flather, Curtis H

    2011-04-01

    Wetlands generally provide significant ecosystem services and function as important harbors of biodiversity. To ensure that these habitats are conserved, an efficient means of identifying wetlands at risk of conversion is needed, especially in the southern United States where the rate of wetland loss has been highest in recent decades. We used multivariate adaptive regression splines to develop a model to predict the risk of wetland habitat loss as a function of wetland features and landscape context. Fates of wetland habitats from 1992 to 1997 were obtained from the National Resources Inventory for the U.S. Forest Service's Southern Region, and land-cover data were obtained from the National Land Cover Data. We randomly selected 70% of our 40 617 observations to build the model (n = 28 432), and randomly divided the remaining 30% of the data into five Test data sets (n = 2437 each). The wetland and landscape variables that were important in the model, and their relative contributions to the model's predictive ability (100 = largest, 0 = smallest), were land-cover/ land-use of the surrounding landscape (100.0), size and proximity of development patches within 570 m (39.5), land ownership (39.1), road density within 570 m (37.5), percent woody and herbaceous wetland cover within 570 m (27.8), size and proximity of development patches within 5130 m (25.7), percent grasslands/herbaceous plants and pasture/hay cover within 5130 m (21.7), wetland type (21.2), and percent woody and herbaceous wetland cover within 1710 m (16.6). For the five Test data sets, Kappa statistics (0.40, 0.50, 0.52, 0.55, 0.56; P < 0.0001), area-under-the-receiver-operating-curve (AUC) statistics (0.78, 0.82, 0.83, 0.83, 0.84; P < 0.0001), and percent correct prediction of wetland habitat loss (69.1, 80.4, 81.7, 82.3, 83.1) indicated the model generally had substantial predictive ability across the South. 
Policy analysts and land-use planners can use the model and associated maps to prioritize
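
    The split sizes and the Kappa statistic reported in this abstract can be illustrated with a short sketch. It is not the authors' code: the function names are hypothetical, the rounding rule for the 70% split is an assumption that happens to reproduce the reported counts, and Kappa is computed in its standard Cohen form, which the abstract does not spell out.

```python
def split_sizes(n_total, train_fraction, n_test_sets):
    """Return (training size, size of each test set).

    Assumes the training size is rounded to the nearest integer and the
    remainder is divided evenly; this rounding rule is an assumption.
    """
    n_train = round(n_total * train_fraction)
    n_test_each = (n_total - n_train) // n_test_sets
    return n_train, n_test_each


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between two label lists, corrected for chance."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    labels = set(y_true) | set(y_pred)
    # Observed agreement: fraction of cases where prediction matches truth.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Expected agreement if labels were assigned independently at the
    # observed marginal rates.
    expected = sum((y_true.count(c) / n) * (y_pred.count(c) / n) for c in labels)
    if expected == 1.0:
        return 1.0  # degenerate case: only one label is ever used
    return (observed - expected) / (1.0 - expected)


# Sizes reported in the abstract: 40 617 observations, 70% for model building,
# and the remainder divided into five test sets.
print(split_sizes(40617, 0.7, 5))

# Toy labels (1 = habitat lost, 0 = retained); made-up values for illustration.
print(cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0]))
```

    Unlike percent correct, Kappa discounts the agreement expected by chance, which may be why the abstract reports both.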

  19. Evaluating stability of histomorphometric features across scanner and staining variations: predicting biochemical recurrence from prostate cancer whole slide images

    NASA Astrophysics Data System (ADS)

    Leo, Patrick; Lee, George; Madabhushi, Anant

    2016-03-01

    Quantitative histomorphometry (QH) is the process of computerized extraction of features from digitized tissue slide images. Typically these features are used in machine learning classifiers to predict disease presence, behavior and outcome. Successful robust classifiers require features that both discriminate between classes of interest and are stable across data from multiple sites. Feature stability may be compromised by variation in slide staining and scanning procedures. These laboratory specific variables include dye batch, slice thickness and the whole slide scanner used to digitize the slide. The key therefore is to be able to identify features that are not only discriminating between the classes of interest (e.g. cancer and non-cancer or biochemical recurrence and non-recurrence) but also features that will not wildly fluctuate on slides representing the same tissue class but from across multiple different labs and sites. While there have been some recent efforts at understanding feature stability in the context of radiomics applications (i.e. feature analysis of radiographic images), relatively few attempts have been made at studying the trade-off between feature stability and discriminability for histomorphometric and digital pathology applications. In this paper we present two new measures, preparation-induced instability score (PI) and latent instability score (LI), to quantify feature instability across and within datasets. Dividing PI by LI yields a ratio for how often a feature for a specific tissue class (e.g. low grade prostate cancer) is different between datasets from different sites versus what would be expected from random chance alone. Using this ratio we seek to quantify feature vulnerability to variations in slide preparation and digitization. Since our goal is to identify stable QH features we evaluate these features for their stability and thus inclusion in machine learning based classifiers in a use case involving prostate cancer

  20. PREDICTING FIFTEEN-YEAR CANCER-SPECIFIC MORTALITY BASED ON THE PATHOLOGICAL FEATURES OF PROSTATE CANCER

    PubMed Central

    Eggener, Scott E.; Scardino, Peter T.; Walsh, Patrick C.; Han, Misop; Partin, Alan W.; Trock, Bruce J.; Feng, Zhaoyong; Wood, David P.; Eastham, James A.; Yossepowitch, Ofer; Rabah, Danny M.; Kattan, Michael W.; Yu, Changhong; Klein, Eric A.; Stephenson, Andrew J.

    2014-01-01

    Purpose Long-term prostate cancer-specific mortality (PCSM) after radical prostatectomy is poorly defined in the era of widespread screening. An understanding of the treated natural history of screen-detected cancers and the pathological risk factors for PCSM is needed for treatment decision-making. Methods Using Fine and Gray competing risk regression analysis, the clinical and pathological data and follow-up information of 11,521 patients treated by radical prostatectomy at four academic centers from 1987 to 2005 were modeled to predict PCSM. The model was validated on 12,389 patients treated at a separate institution during the same period. Results The overall 15-year PCSM was 7%. Primary and secondary pathological Gleason grade 4–5 (P < 0.001 for both), seminal vesicle invasion (P < 0.001), and year of surgery (P = 0.002) were significant predictors of PCSM. A nomogram predicting 15-year PCSM based on standard pathological parameters was accurate and discriminating with an externally-validated concordance index of 0.92. Stratified by patient age, 15-year PCSM for Gleason score ≤ 6, 3+4, 4+3, and 8–10 ranged from 0.2–1.2%, 4.2–6.5%, 6.6–11%, and 26–37%, respectively. The 15-year PCSM risks ranged from 0.8–1.5%, 2.9–10%, 15–27%, and 22–30% for organ-confined cancer, extraprostatic extension, seminal vesicle invasion, and lymph node metastasis, respectively. Only 3 of 9557 patients with organ-confined, Gleason score ≤ 6 cancers have died from prostate cancer. Conclusions The presence of poorly differentiated cancer and seminal vesicle invasion are the prime determinants of PCSM after radical prostatectomy. The risk of PCSM can be predicted with unprecedented accuracy once the pathological features of prostate cancer are known. PMID:21239008
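
    To give a sense of what a concordance index measures, the sketch below computes Harrell's C for right-censored data. This is a simplified illustration under stated assumptions: it ignores competing risks (the study itself used Fine and Gray regression), the abstract does not say which concordance estimator was used, and the function name and toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when subject i had the event and did so
    before subject j's follow-up time. Ties in risk score count as half.
    Competing risks are NOT handled in this simplified version.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: follow-up in years, event flag (1 = died of disease, 0 = censored),
# and a predicted risk. All values are made up for illustration.
times = [1, 2, 3]
events = [1, 1, 0]
print(concordance_index(times, events, [0.9, 0.5, 0.1]))  # perfectly ordered
print(concordance_index(times, events, [0.1, 0.5, 0.9]))  # perfectly reversed
```

    A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which puts the reported 0.92 in context.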

  1. Molecular crosstalk between tumour and brain parenchyma instructs histopathological features in glioblastoma

    PubMed Central

    Bougnaud, Sébastien; Golebiewska, Anna; Oudin, Anaïs; Keunen, Olivier; Harter, Patrick N.; Mäder, Lisa; Azuaje, Francisco; Fritah, Sabrina; Stieber, Daniel; Kaoma, Tony; Vallar, Laurent; Brons, Nicolaas H.C.; Daubon, Thomas; Miletic, Hrvoje; Sundstrøm, Terje; Herold-Mende, Christel; Mittelbronn, Michel; Bjerkvig, Rolf; Niclou, Simone P.

    2016-01-01

    The histopathological and molecular heterogeneity of glioblastomas represents a major obstacle for effective therapies. Glioblastomas do not develop autonomously, but evolve in a unique environment that adapts to the growing tumour mass and contributes to the malignancy of these neoplasms. Here, we show that patient-derived glioblastoma xenografts generated in the mouse brain from organotypic spheroids reproducibly give rise to three different histological phenotypes: (i) a highly invasive phenotype with an apparent normal brain vasculature, (ii) a highly angiogenic phenotype displaying microvascular proliferation and necrosis and (iii) an intermediate phenotype combining features of invasion and vessel abnormalities. These phenotypic differences were visible during early phases of tumour development suggesting an early instructive role of tumour cells on the brain parenchyma. Conversely, we found that tumour-instructed stromal cells differentially influenced tumour cell proliferation and migration in vitro, indicating a reciprocal crosstalk between neoplastic and non-neoplastic cells. We did not detect any transdifferentiation of tumour cells into endothelial cells. Cell type-specific transcriptomic analysis of tumour and endothelial cells revealed a strong phenotype-specific molecular conversion between the two cell types, suggesting co-evolution of tumour and endothelial cells. Integrative bioinformatic analysis confirmed the reciprocal crosstalk between tumour and microenvironment and suggested a key role for TGFβ1 and extracellular matrix proteins as major interaction modules that shape glioblastoma progression. These data provide novel insight into tumour-host interactions and identify novel stroma-specific targets that may play a role in combinatorial treatment strategies against glioblastoma. PMID:27049916

  2. Programmatic features of aging originating in development: aging mechanisms beyond molecular damage?

    PubMed Central

    de Magalhães, João Pedro

    2012-01-01

    The idea that aging follows a predetermined sequence of events, a program, has been discredited by most contemporary authors. Instead, aging is largely thought to occur due to the accumulation of various forms of molecular damage. Recent work employing functional genomics now suggests that, indeed, certain facets of mammalian aging may follow predetermined patterns encoded in the genome as part of developmental processes. It appears that genetic programs coordinating some aspects of growth and development persist into adulthood and may become detrimental. This link between development and aging may occur due to regulated processes, including through the action of microRNAs and epigenetic mechanisms. Taken together with other results, in particular from worms, these findings provide evidence that some aging changes are not primarily a result of a build-up of stochastic damage but are rather a product of regulated processes. These processes are interpreted as forms of antagonistic pleiotropy, the product of a “shortsighted watchmaker,” and thus do not assume aging evolved for a purpose. Overall, it appears that the genome does, indeed, contain specific instructions that drive aging in animals, a radical shift in our perception of the aging process.—de Magalhães, J. P. Programmatic features of aging originating in development: aging mechanisms beyond molecular damage? PMID:22964300

  3. Racial Differences in Esophageal Squamous Cell Carcinoma: Incidence and Molecular Features

    PubMed Central

    Zhou, Kai; Yang, Liguang

    2017-01-01

    The incidence and histological type of esophageal cancer are highly variable depending on geographic location and race/ethnicity. Here we sought to determine whether racial differences exist in the molecular features of esophageal cancer. We first confirmed that the incidence rate of esophageal adenocarcinoma (EA) was higher in Whites than in Asians and Blacks, while the incidence of esophageal squamous cell carcinoma (ESCC) was highest in Asians. Then we compared the genome-wide somatic mutations, methylation, and gene expression to identify differential genes by race. The mutation frequencies of some genes in the same pathway showed opposite differences between Asian and White patients, but their functional effects on the pathway may be consistent. The global patterns of methylation and expression were similar, which reflected the common characteristics of ESCC tumors from different populations. A small number of genes had significant differences between Asians and Whites. More interestingly, the racial differences of COL11A1 were consistent across multiple molecular levels, with higher mutation frequency, higher methylation, and lower expression in White patients. This indicated that COL11A1 might play important roles in ESCC, especially in the White population. Additional studies are needed to further explore their functions in esophageal cancer. PMID:28393072

  4. Computer extracted texture features on T2w MRI to predict biochemical recurrence following radiation therapy for prostate cancer

    NASA Astrophysics Data System (ADS)

    Ginsburg, Shoshana B.; Rusu, Mirabela; Kurhanewicz, John; Madabhushi, Anant

    2014-03-01

    In this study we explore the ability of a novel machine learning approach, in conjunction with computer-extracted features describing prostate cancer morphology on pre-treatment MRI, to predict whether a patient will develop biochemical recurrence within ten years of radiation therapy. Biochemical recurrence, which is characterized by a rise in serum prostate-specific antigen (PSA) of at least 2 ng/mL above the nadir PSA, is associated with increased risk of metastasis and prostate cancer-related mortality. Currently, risk of biochemical recurrence is predicted by the Kattan nomogram, which incorporates several clinical factors to predict the probability of recurrence-free survival following radiation therapy (but has limited prediction accuracy). Semantic attributes on T2w MRI, such as the presence of extracapsular extension and seminal vesicle invasion and surrogate measurements of tumor size, have also been shown to be predictive of biochemical recurrence risk. While the correlation between biochemical recurrence and factors like tumor stage, Gleason grade, and extracapsular spread are well-documented, it is less clear how to predict biochemical recurrence in the absence of extracapsular spread and for small tumors fully contained in the capsule. Computer-extracted texture features, which quantitatively describe tumor micro-architecture and morphology on MRI, have been shown to provide clues about a tumor's aggressiveness. However, while computer-extracted features have been employed for predicting cancer presence and grade, they have not been evaluated in the context of predicting risk of biochemical recurrence. This work seeks to evaluate the role of computer-extracted texture features in predicting risk of biochemical recurrence on a cohort of sixteen patients who underwent pre-treatment 1.5 Tesla (T) T2w MRI. We extract a combination of first-order statistical, gradient, co-occurrence, and Gabor wavelet features from T2w MRI.
To identify which of these

  5. The associations between mast cell infiltration, clinical features and molecular types of invasive breast cancer

    PubMed Central

    Tang, Xiaoqiao; Zhang, Yifen; Huang, Tao

    2016-01-01

    Associations between mast cell infiltration and the clinical features and known molecular profile of breast cancer remain unclear. Differences in mast cell distribution were evaluated in 219 patients with no special type of invasive carcinoma, grouped by age, maximum tumor diameter, histological type, lymph node metastasis, and the expression of estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor 2 (HER-2) and nuclear protein Ki67. The mast cell density (MCD) in patients younger than 50 years old was significantly higher than that in patients with age ≥ 50. The MCD in ER or PR positive patients was significantly higher than MCD in ER or PR negative patients. The MCD in patients with Ki67 ≤ 14% was also significantly higher than MCD in patients with Ki67 > 14%. The MCD of patients with invasive ductal carcinoma was significantly higher than MCD of patients with invasive lobular carcinoma. No significant difference in MCD was found to be associated with maximum tumor diameter, lymph node metastasis and HER-2. Further analysis found that MCD was significantly higher in patients after neo-adjuvant chemotherapy. Differences in mast cell distribution are widespread among patients with distinct clinical features; clarifying the role of mast cells in breast cancer needs further research with a detailed and reasonable classification. PMID:27835573

  6. Clinical, Pathological, and Molecular Features of Lung Adenocarcinomas with AXL Expression

    PubMed Central

    Suda, Kenichi; Shimizu, Shigeki; Sakai, Kazuko; Mizuuchi, Hiroshi; Tomizawa, Kenji; Takemoto, Toshiki; Nishio, Kazuto; Mitsudomi, Tetsuya

    2016-01-01

    The receptor tyrosine kinase AXL is a member of the Tyro3-Axl-Mer receptor tyrosine kinase subfamily. AXL affects several cellular functions, including growth and migration. AXL aberration is reportedly a marker for poor prognosis and treatment resistance in various cancers. In this study, we analyzed clinical, pathological, and molecular features of AXL expression in lung adenocarcinomas (LADs). We examined 161 LAD specimens from patients who underwent pulmonary resections. When AXL protein expression was quantified (0, 1+, 2+, 3+) according to immunohistochemical staining intensity, results were 0: 35%; 1+: 20%; 2+: 37%; and 3+: 7% for the 161 samples. AXL expression status did not correlate with clinical features, including smoking status and pathological stage. However, patients whose specimens showed strong AXL expression (3+) had markedly poorer prognoses than other groups (P = 0.0033). Strong AXL expression was also significantly associated with downregulation of E-cadherin (P = 0.025) and CD44 (P = 0.0010). In addition, 9 of 12 specimens with strong AXL expression had driver gene mutations (6 with EGFR, 2 with KRAS, 1 with ALK). In conclusion, we found that strong AXL expression in surgically resected LADs was a predictor of poor prognosis. LADs with strong AXL expression were characterized by mesenchymal status, higher expression of stem-cell-like markers, and frequent driver gene mutations. PMID:27100677

  7. Molecular definition of a region of chromosome 21 that causes features of the Down syndrome phenotype

    PubMed Central

    Korenberg, Julie R.; Kawashima, Hiroko; Pulst, Stefan-M.; Ikeuchi, T.; Ogasawara, N.; Yamamoto, K.; Schonberg, Steven A.; West, Ruth; Allen, Leland; Magenis, Ellen; Ikawa, K.; Taniguchi, N.; Epstein, Charles J.

    1990-01-01

    Down syndrome (DS) is a major cause of mental retardation and heart disease. Although it is usually caused by the presence of an extra chromosome 21, a subset of the diagnostic features may be caused by the presence of only band 21q22. We now present evidence that significantly narrows the chromosomal region responsible for several of the phenotypic features of DS. We report a molecular and cytogenetic analysis of a three-generation family containing four individuals with clinical DS as manifested by the characteristic facial appearance, endocardial cushion defect, mental retardation, and probably dermatoglyphic changes. Autoradiograms of quantitative Southern blots of DNAs from two affected sisters, their carrier father, and a normal control were analyzed after hybridization with two to six unique DNA sequences regionally mapped on chromosome 21. These include cDNA probes for the genes for CuZn-superoxide dismutase (SOD1) mapping in 21q22.1 and for the amyloid precursor protein (APP) mapping in 21q11.2-21.05, in addition to six probes for single-copy sequences: D21S46 in 21q11.2-21.05, D21S47 and SF57 in 21q22.1-22.3, and D21S39, D21S42, and D21S43 in 21q22.3. All sequences located in 21q22.3 were present in three copies in the affected individuals, whereas those located proximal to this region were present in only two copies. In the carrier father, all DNA sequences were present in only two copies. Cytogenetic analysis of affected individuals employing R and G banding of prometaphase preparations combined with in situ hybridization revealed a translocation of the region from very distal 21q22.1 to 21qter to chromosome 4q. Except for a possible phenotypic contribution from the deletion of chromosome band 4q35, these data provide a molecular definition of the minimal region of chromosome 21 which, when duplicated, generates the facial features, heart defect, a component of the mental retardation, and probably several of the dermatoglyphic changes of DS. 
This region

  8. Predicting the biomechanical strength of proximal femur specimens with bone mineral density features and support vector regression

    NASA Astrophysics Data System (ADS)

    Huber, Markus B.; Yang, Chien-Chun; Carballido-Gamio, Julio; Bauer, Jan S.; Baum, Thomas; Nagarajan, Mahesh B.; Eckstein, Felix; Lochmüller, Eva; Majumdar, Sharmila; Link, Thomas M.; Wismüller, Axel

    2012-03-01

    To improve the clinical assessment of osteoporotic hip fracture risk, recent computer-aided diagnosis systems explore new approaches to estimate the local trabecular bone quality beyond bone density alone to predict femoral bone strength. In this context, statistical bone mineral density (BMD) features extracted from multi-detector computed tomography (MDCT) images of proximal femur specimens and different function approximation methods were compared in their ability to predict the biomechanical strength. MDCT scans were acquired in 146 proximal femur specimens harvested from human cadavers. The femurs' failure load (FL) was determined through biomechanical testing. An automated volume of interest (VOI)-fitting algorithm was used to define a consistent volume in the femoral head of each specimen. In these VOIs, the trabecular bone was represented by statistical moments of the BMD distribution and by pairwise spatial occurrence of BMD values using the gray-level co-occurrence (GLCM) approach. A linear multi-regression analysis (MultiReg) and a support vector regression algorithm with a linear kernel (SVRlin) were used to predict the FL from the image feature sets. The prediction performance was measured by the root mean square error (RMSE) for each image feature on independent test sets; in addition the coefficient of determination R2 was calculated. The best prediction result was obtained with a GLCM feature set using SVRlin, which had the lowest prediction error (RMSE = 1.040+/-0.143, R2 = 0.544) and which was significantly lower than the standard approach of using BMD.mean and MultiReg (RMSE = 1.093+/-0.133, R2 = 0.490, p<0.0001). The combined sets including BMD.mean and GLCM features had a similar or slightly lower performance than using only GLCM features. The results indicate that the performance of high-dimensional BMD features extracted from MDCT images in predicting the biomechanical strength of proximal femur specimens can be significantly improved by
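
    The two performance measures named in this abstract, RMSE and the coefficient of determination, can be written out in a few lines. This is a minimal sketch: the function names and toy values are hypothetical, and R2 is computed here as one minus the ratio of residual to total sum of squares, which is an assumption because the abstract does not state which definition was used.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination as 1 - SS_res / SS_tot (assumed definition)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy failure loads and predictions; made-up numbers, not study data.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]
print(rmse(measured, predicted))
print(r_squared(measured, predicted))
```

    A lower RMSE and a higher R2 both indicate a better fit, which matches the direction of the comparison between the SVRlin and MultiReg results quoted above.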

  9. Attentional Selection Can Be Predicted by Reinforcement Learning of Task-relevant Stimulus Features Weighted by Value-independent Stickiness.

    PubMed

    Balcarras, Matthew; Ardid, Salva; Kaping, Daniel; Everling, Stefan; Womelsdorf, Thilo

    2016-02-01

    Attention includes processes that evaluate stimulus relevance, select the most relevant stimulus against less relevant stimuli, and bias choice behavior toward the selected information. It is not clear how these processes interact. Here, we captured these processes in a reinforcement learning framework applied to a feature-based attention task that required macaques to learn and update the value of stimulus features while ignoring nonrelevant sensory features, locations, and action plans. We found that value-based reinforcement learning mechanisms could account for feature-based attentional selection and choice behavior but required a value-independent stickiness selection process to explain selection errors while at asymptotic behavior. By comparing different reinforcement learning schemes, we found that trial-by-trial selections were best predicted by a model that only represents expected values for the task-relevant feature dimension, with nonrelevant stimulus features and action plans having only a marginal influence on covert selections. These findings show that attentional control subprocesses can be described by (1) the reinforcement learning of feature values within a restricted feature space that excludes irrelevant feature dimensions, (2) a stochastic selection process on feature-specific value representations, and (3) value-independent stickiness toward previous feature selections akin to perseveration in the motor domain. We speculate that these three mechanisms are implemented by distinct but interacting brain circuits and that the proposed formal account of feature-based stimulus selection will be important to understand how attentional subprocesses are implemented in primate brain networks.
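
    The three mechanisms listed at the end of this abstract can be sketched in a generic textbook form: a delta-rule value update, a softmax selection over feature values, and a bonus for the previously selected feature that does not depend on value. The abstract gives no equations, so everything below is an assumption rather than the authors' model; the function names and the parameters `learning_rate`, `inverse_temperature`, and `stickiness` are hypothetical.

```python
import math


def update_value(value, reward, learning_rate):
    """Delta-rule update: move the value toward the received reward."""
    return value + learning_rate * (reward - value)


def choice_probabilities(values, previous_choice, inverse_temperature, stickiness):
    """Softmax over feature values plus a value-independent stickiness bonus.

    The bonus is added only to the feature selected on the previous trial,
    regardless of how valuable that feature currently is.
    """
    logits = []
    for index, value in enumerate(values):
        bonus = stickiness if index == previous_choice else 0.0
        logits.append(inverse_temperature * value + bonus)
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(logits)
    exps = [math.exp(logit - top) for logit in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Two features with equal learned value; feature 0 was chosen last trial.
print(choice_probabilities([0.5, 0.5], 0, 1.0, 0.0))  # no stickiness
print(choice_probabilities([0.5, 0.5], 0, 1.0, 1.0))  # stickiness favors feature 0
print(update_value(0.0, 1.0, 0.1))
```

    With equal values, any preference for the previously chosen feature comes entirely from the stickiness term, which is the sense in which it is value-independent.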

  10. Anion pairs in room temperature ionic liquids predicted by molecular dynamics simulation, verified by spectroscopic characterization

    SciTech Connect

    Schwenzer, Birgit; Kerisit, Sebastien N.; Vijayakumar, M.

    2014-01-01

    Molecular-level spectroscopic analyses of an aprotic and a protic room-temperature ionic liquid, BMIM OTf and BMIM HSO4, respectively, have been carried out with the aim of verifying molecular dynamics simulations that predict anion pair formation in these fluid structures. Fourier-transform infrared spectroscopy, Raman spectroscopy and nuclear magnetic resonance spectroscopy of various nuclei support the theoretically-determined average molecular arrangements.

  11. Mutation spectrum of TP53 gene predicts clinicopathological features and survival of gastric cancer

    PubMed Central

    Tahara, Tomomitsu; Shibata, Tomoyuki; Okamoto, Yasuyuki; Yamazaki, Jumpei; Kawamura, Tomohiko; Horiguchi, Noriyuki; Okubo, Masaaki; Nakano, Naoko; Ishizuka, Takamitsu; Nagasaka, Mitsuo; Nakagawa, Yoshihito; Ohmiya, Naoki

    2016-01-01

    Background and aim The TP53 gene is frequently mutated in gastric cancer (GC), but reports of its relationship with clinicopathological features and prognosis are conflicting. Here, we screened the TP53 mutation spectrum of 214 GC patients in relation to their clinicopathological features and prognosis. Results TP53 nonsilent mutations were detected in 80 cases (37.4%), frequently occurring as C:G to T:A single nucleotide transitions at 5′-CpG-3′ sites. TP53 mutations occurred more frequently in the differentiated histologic type than in the undifferentiated type in the early stage (48.6% vs. 7%, P=0.0006), while the mutations correlated with venous invasion in the advanced stage (47.7% vs. 20.7%, P=0.04). A subset of GC with TP53 hot spot mutations (R175, G245, R248, R273, R282) presented significantly worse overall survival and recurrence-free survival compared to others (both P=0.001). Methods Matched biopsies from GC and adjacent tissues from 214 patients were used for the experiment. All coding regions of the TP53 gene (exon 2 to exon 11) were examined using Sanger sequencing. Conclusion Our data suggest that GC with TP53 mutations seems to develop as the differentiated histologic type and show aggressive biological behavior such as venous invasion. Moreover, our data emphasize the importance of discriminating TP53 hot spot mutations (R175, G245, R248, R273, R282) to predict worse overall survival and recurrence-free survival of GC patients. PMID:27323394

  12. Two-step feature selection for predicting survival time of patients with metastatic castrate resistant prostate cancer

    PubMed Central

    Shiga, Motoki

    2016-01-01

    Metastatic castrate resistant prostate cancer (mCRPC) is the major cause of death in prostate cancer patients. Even though some options for treatment of mCRPC have been developed, the most effective therapies remain unclear. Thus finding key patient clinical variables related to mCRPC is an important issue for understanding the disease progression mechanism of mCRPC and clinical decision making for these patients. The Prostate Cancer DREAM Challenge is a crowd-based competition to tackle this essential challenge using new large clinical datasets. This paper proposes an effective procedure for predicting global risks and survival times of these patients, aimed at sub-challenges 1a and 1b of the Prostate Cancer DREAM Challenge. The procedure implements a two-step feature selection procedure, which first implements sparse feature selection for numerical clinical variables and statistical hypothesis testing of differences between survival curves caused by categorical clinical variables, and then implements a forward feature selection to narrow the list of informative features. Using Cox’s proportional hazards model with these selected features, this method predicted global risk and survival time of patients using a linear model whose input is a median time computed from the hazard model. The challenge results demonstrated that the proposed procedure outperforms the state-of-the-art model by correctly selecting more informative features on both the global risk prediction and the survival time prediction. PMID:27990267
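
    The second step described in this abstract, forward feature selection, follows a generic greedy pattern that can be sketched as below. This is an illustration of the general technique only: the paper's actual scoring (based on a Cox proportional hazards model) is not reproduced, and the `score` callable, the function name, and the toy weights are hypothetical stand-ins.

```python
def forward_select(candidates, score, max_features=None):
    """Greedy forward selection.

    Starting from an empty set, repeatedly add the candidate feature that
    most improves score(selected_features); stop when no candidate improves
    the score or when max_features is reached.
    """
    selected = []
    best_score = score(selected)
    remaining = list(candidates)
    while remaining and (max_features is None or len(selected) < max_features):
        # Score every one-feature extension of the current selection.
        trials = [(score(selected + [feature]), feature) for feature in remaining]
        top_score, top_feature = max(trials, key=lambda pair: pair[0])
        if top_score <= best_score:
            break  # no remaining feature improves the score
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected, best_score


# Toy scoring function: each feature contributes a fixed, made-up amount.
# In practice the score would be a validated model performance measure.
weights = {"a": 3, "b": 2, "c": -1}


def toy_score(features):
    return sum(weights[f] for f in features)


print(forward_select(["a", "b", "c"], toy_score))
```

    Because the search is greedy, it evaluates far fewer subsets than an exhaustive search but may miss combinations of features that are only useful together.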

  13. Identification of critical chemical features for Aurora kinase-B inhibitors using Hip-Hop, virtual screening and molecular docking

    NASA Astrophysics Data System (ADS)

    Sakkiah, Sugunadevi; Thangapandian, Sundarapandian; John, Shalini; Lee, Keun Woo

    2011-01-01

    This study was performed to find the selective chemical features for Aurora kinase-B inhibitors using methods such as Hip-Hop, virtual screening, homology modeling, molecular dynamics and docking. The best hypothesis, Hypo1, was validated against a wide-ranging test set containing the selective inhibitors of Aurora kinase-B. Homology modeling and molecular dynamics studies were carried out to perform the molecular docking studies. The best hypothesis, Hypo1, was used as a 3D query to screen the chemical databases. The screened molecules from the databases were sorted based on ADME and drug-like properties. The selective hit compounds were docked and the hydrogen bond interactions with the critical amino acids present in Aurora kinase-B were compared with the chemical features present in Hypo1. Finally, we suggest that the chemical features present in Hypo1 are vital for a molecule to inhibit the Aurora kinase-B activity.

  14. Mining SOM expression portraits: feature selection and integrating concepts of molecular function

    PubMed Central

    2012-01-01

    Background Self-organizing maps (SOM) enable the straightforward portraying of high-dimensional data of large sample collections in terms of sample-specific images. The analysis of their texture provides so-called spot-clusters of co-expressed genes which require subsequent significance filtering and functional interpretation. We address feature selection in terms of the gene ranking problem and the interpretation of the obtained spot-related lists using concepts of molecular function. Results Different expression scores based either on simple fold change-measures or on regularized Student’s t-statistics are applied to spot-related gene lists and compared with special emphasis on the error characteristics of microarray expression data. The spot-clusters are analyzed using different methods of gene set enrichment analysis with the focus on overexpression and/or overrepresentation of predefined sets of genes. Metagene-related overrepresentation of selected gene sets was mapped into the SOM images to assign gene function to different regions. Alternatively we estimated set-related overexpression profiles over all samples studied using a gene set enrichment score. It was also applied to the spot-clusters to generate lists of enriched gene sets. We used the tissue body index data set, a collection of expression data of human tissues, as an illustrative example. We found that tissue-related spots typically contain enriched populations of gene sets that correspond well to molecular processes in the respective tissues. In addition, we display special sets of housekeeping genes and of consistently weakly and highly expressed genes using SOM data filtering. Conclusions The presented methods allow the comprehensive downstream analysis of SOM-transformed expression data in terms of cluster-related gene lists and enriched gene sets for functional interpretation.
SOM clustering implies the ability to define either new gene sets using selected SOM spots or to verify and/or to amend existing

  15. Features of exciton dynamics in molecular nanoclusters (J-aggregates): Exciton self-trapping (Review Article)

    NASA Astrophysics Data System (ADS)

    Malyukin, Yu. V.; Sorokin, A. V.; Semynozhenko, V. P.

    2016-06-01

    We present thoroughly analyzed experimental results that demonstrate the anomalous manifestation of the exciton self-trapping effect, which is already well-known in bulk crystals, in ordered molecular nanoclusters called J-aggregates. Weakly-coupled one-dimensional (1D) molecular chains are the main structural feature of J-aggregates, wherein the electron excitations are manifested as 1D Frenkel excitons. According to the continuum theory of Rashba-Toyozawa, J-aggregates can have only self-trapped excitons, because 1D excitons must adhere to barrier-free self-trapping at any exciton-phonon coupling constant g = ɛLR/2β, wherein ɛLR is the lattice relaxation energy, and 2β is the half-width of the exciton band. In contrast, very often only the luminescence of free, mobile excitons would manifest in experiments involving J-aggregates. Using the Urbach rule to analyze the low-frequency region of the low-temperature exciton absorption spectra has shown that J-aggregates can have both a weak (g < 1) and a strong (g > 1) exciton-phonon coupling. Moreover, it is experimentally demonstrated that under certain conditions, the J-aggregate excited state can have both free and self-trapped excitons, i.e., we establish the existence of a self-trapping barrier for 1D Frenkel excitons. We demonstrate and analyze the reasons behind the anomalous existence of both free and self-trapped excitons in J-aggregates, and demonstrate how exciton self-trapping efficiency can be managed in J-aggregates by varying the values of g, which is fundamentally impossible in bulk crystals. We discuss how the exciton self-trapping phenomenon can be used as an alternate interpretation of the wide band emission of some J-aggregates, which has thus far been explained by the strongly localized exciton model.

  16. STAT3 Expression, Molecular Features, Inflammation Patterns and Prognosis in a Database of 724 Colorectal Cancers

    PubMed Central

    Morikawa, Teppei; Baba, Yoshifumi; Yamauchi, Mai; Kuchiba, Aya; Nosho, Katsuhiko; Shima, Kaori; Tanaka, Noriko; Huttenhower, Curtis; Frank, David A.; Fuchs, Charles S.; Ogino, Shuji

    2010-01-01

    Purpose: STAT3 (signal transducer and activator of transcription 3) is a transcription factor that is constitutively activated in some cancers. STAT3 appears to play crucial roles in cell proliferation and survival, angiogenesis, tumor-promoting inflammation and suppression of anti-tumor host immune response in the tumor microenvironment. Although the STAT3 signaling pathway is a potential drug target, clinical, pathologic, molecular or prognostic features of STAT3-activated colorectal cancer remain uncertain. Experimental Design: Utilizing a database of 724 colon and rectal cancer cases, we evaluated phosphorylated STAT3 (p-STAT3) expression by immunohistochemistry. A Cox proportional hazards model was used to compute the mortality hazard ratio (HR), adjusting for clinical, pathologic and molecular features, including microsatellite instability (MSI), the CpG island methylator phenotype (CIMP), LINE-1 methylation, 18q loss of heterozygosity, TP53 (p53), CTNNB1 (β-catenin), JC virus T-antigen, and KRAS, BRAF, and PIK3CA mutations. Results: Among the 724 tumors, 131 (18%) showed high-level p-STAT3 expression (p-STAT3-high), 244 (34%) showed low-level expression (p-STAT3-low), and the remaining 349 (48%) were negative for p-STAT3. p-STAT3 overexpression was associated with significantly higher colorectal cancer-specific mortality [log-rank p=0.0020; univariate HR (p-STAT3-high vs. p-STAT3-negative) 1.85, 95% confidence interval (CI) 1.30–2.63, Ptrend =0.0005; multivariate HR, 1.61, 95% CI 1.11–2.34, Ptrend =0.015]. p-STAT3 expression was positively associated with peritumoral lymphocytic reaction (multivariate odds ratio 3.23; 95% CI, 1.89–5.53; p<0.0001). p-STAT3 expression was not associated with MSI, CIMP, or LINE-1 hypomethylation. Conclusions: STAT3 activation in colorectal cancer is associated with adverse clinical outcome, supporting its potential roles as a prognostic biomarker and a chemoprevention and/or therapeutic target. PMID:21310826

  17. Clinical Features, Outcomes, and Molecular Characteristics of Community- and Health Care-Associated Staphylococcus lugdunensis Infections

    PubMed Central

    Yeh, Chun-Fu; Chang, Shih-Cheng; Cheng, Chun-Wen; Lin, Jung-Fu; Liu, Tsui-Ping

    2016-01-01

    Staphylococcus lugdunensis is a major cause of aggressive endocarditis, but it is also responsible for a broad spectrum of infections. The differences in clinical and molecular characteristics between community-associated (CA) and health care-associated (HA) S. lugdunensis infections have remained unclear. We performed a retrospective study of S. lugdunensis infections between 2003 and 2014 to compare the clinical and molecular characteristics of CA and HA isolates. We collected 129 S. lugdunensis isolates in total: 81 (62.8%) HA isolates and 48 (37.2%) CA isolates. HA infections were more frequent than CA infections in children (16.0% versus 4.2%, respectively; P = 0.041) and the elderly (38.3% versus 14.6%, respectively; P = 0.004). The CA isolates were more likely to cause skin and soft tissue infections (85.4% versus 19.8%, respectively; P < 0.001). HA isolates were more frequently responsible for bacteremia of unknown origin (34.6% versus 4.2%, respectively; P < 0.001) and for catheter-related bacteremia (12.3% versus 0%, respectively; P = 0.011) than CA isolates. Fourteen-day mortality was higher for HA infections than for CA infections (11.1% versus 0%, respectively). A higher proportion of the HA isolates than of the CA isolates were resistant to penicillin (76.5% versus 52.1%, respectively; P = 0.004) and oxacillin (32.1% versus 2.1%, respectively; P < 0.001). Two major clonal complexes (CC1 and CC3) were identified. Sequence type 41 (ST41) was the most common sequence type identified (29.5%). The proportion of ST38 isolates was higher for HA than for CA infections (33.3% versus 12.5%, respectively; P = 0.009). These isolates were of staphylococcal cassette chromosome mec element (SCCmec) type IV, V, or Vt. HA and CA S. lugdunensis infections differ in terms of their clinical features, outcome, antibiotic susceptibilities, and molecular characteristics. PMID:27225402

  18. Molecular pathway activation features of pediatric acute myeloid leukemia (AML) and acute lymphoblast leukemia (ALL) cells.

    PubMed

    Petrov, Ivan; Suntsova, Maria; Mutorova, Olga; Sorokin, Maxim; Garazha, Andrew; Ilnitskaya, Elena; Spirin, Pavel; Larin, Sergey; Kovalchuk, Olga; Prassolov, Vladimir; Zhavoronkov, Alex; Roumiantsev, Alexander; Buzdin, Anton

    2016-11-19

    Acute lymphoblast leukemia (ALL) is characterized by overproduction of immature white blood cells in the bone marrow. ALL is most common in childhood and has a high (>80%) cure rate. In contrast, acute myeloid leukemia (AML) has a far greater mortality rate than ALL and most commonly affects older adults. However, AML is a leading cause of childhood cancer mortality. In this study, we compare gene expression and molecular pathway activation patterns in three normal blood, seven pediatric ALL and seven pediatric AML bone marrow samples. We identified 172/94 and 148/31 characteristic gene expression/pathway activation signatures, clearly distinguishing pediatric ALL and AML cells, respectively, from the normal blood. The pediatric AML and ALL cells differed by 139/34 gene expression/pathway activation biomarkers. For the 30 adult AML and 17 normal blood samples, we found 132/33 gene expression/pathway AML-specific features, of which only 7/2 were common to the adult and pediatric AML and, therefore, age-independent. At the pathway level, we found more differences than similarities between the adult and pediatric forms. These findings suggest that the adult and pediatric AMLs may require different treatment strategies.

  19. Cryptosporidiosis in HIV/AIDS Patients in Kenya: Clinical Features, Epidemiology, Molecular Characterization and Antibody Responses

    PubMed Central

    Wanyiri, Jane W.; Kanyi, Henry; Maina, Samuel; Wang, David E.; Steen, Aaron; Ngugi, Paul; Kamau, Timothy; Waithera, Tabitha; O'Connor, Roberta; Gachuhi, Kimani; Wamae, Claire N.; Mwamburi, Mkaya; Ward, Honorine D.

    2014-01-01

    We investigated the epidemiological and clinical features of cryptosporidiosis, the molecular characteristics of infecting species and serum antibody responses to three Cryptosporidium-specific antigens in human immunodeficiency virus (HIV)/acquired immunodeficiency syndrome (AIDS) patients in Kenya. Cryptosporidium was the most prevalent enteric pathogen and was identified in 56 of 164 (34%) HIV/AIDS patients, including 25 of 70 (36%) with diarrhea and 31 of 94 (33%) without diarrhea. Diarrhea in patients exclusively infected with Cryptosporidium was significantly associated with the number of children per household, contact with animals, and water treatment. Cryptosporidium hominis was the most prevalent species and the most prevalent subtype family was Ib. Patients without diarrhea had significantly higher serum IgG levels to Chgp15, Chgp40 and Cp23, and higher fecal IgA levels to Chgp15 and Chgp40, than those with diarrhea, suggesting that antibody responses to these antigens may be associated with protection from diarrhea and supporting further investigation of these antigens as vaccine candidates. PMID:24865675

  20. Molecular features in complex environment: Cooperative team players during excited state bond cleavage

    PubMed Central

    Thallmair, Sebastian; Roos, Matthias K.; de Vivie-Riedle, Regina

    2016-01-01

    Photoinduced bond cleavage is often employed for the generation of highly reactive carbocations in solution and to study their reactivity. Diphenylmethyl derivatives are prominent precursors in polar and moderately polar solvents like acetonitrile or dichloromethane. Depending on the leaving group, the photoinduced bond cleavage occurs on a femtosecond to picosecond time scale and typically leads to two distinguishable products, the desired diphenylmethyl cations (Ph2CH+) and, as a competing by-product, the diphenylmethyl radicals (Ph2CH•). Conical intersections are the chief suspects for such ultrafast branching processes. We show for two typical examples, the neutral diphenylmethylchloride (Ph2CH–Cl) and the charged diphenylmethyltriphenylphosphonium ions (Ph2CH−PPh3+), that the role of the conical intersections depends not only on the molecular features but also on the interplay with the environment. It turns out to differ significantly between the two precursors. Our analysis is based on quantum chemical and quantum dynamical calculations. For comparison, we use ultrafast transient absorption measurements. In the case of Ph2CH–Cl, we can directly connect the observed signals to two early three-state and two-state conical intersections, both close to the Franck-Condon region. In the case of Ph2CH−PPh3+, dynamic solvent effects are needed to activate a two-state conical intersection at larger distances along the reaction coordinate. PMID:26958588

  1. Molecular pathway activation features of pediatric acute myeloid leukemia (AML) and acute lymphoblast leukemia (ALL) cells

    PubMed Central

    Petrov, Ivan; Suntsova, Maria; Mutorova, Olga; Sorokin, Maxim; Garazha, Andrew; Ilnitskaya, Elena; Spirin, Pavel; Larin, Sergey; Zhavoronkov, Alex; Kovalchuk, Olga; Prassolov, Vladimir; Roumiantsev, Alexander; Buzdin, Anton

    2016-01-01

    Acute lymphoblast leukemia (ALL) is characterized by overproduction of immature white blood cells in the bone marrow. ALL is most common in childhood and has a high (>80%) cure rate. In contrast, acute myeloid leukemia (AML) has a far greater mortality rate than ALL and most commonly affects older adults. However, AML is a leading cause of childhood cancer mortality. In this study, we compare gene expression and molecular pathway activation patterns in three normal blood, seven pediatric ALL and seven pediatric AML bone marrow samples. We identified 172/94 and 148/31 characteristic gene expression/pathway activation signatures, clearly distinguishing pediatric ALL and AML cells, respectively, from the normal blood. The pediatric AML and ALL cells differed by 139/34 gene expression/pathway activation biomarkers. For the 30 adult AML and 17 normal blood samples, we found 132/33 gene expression/pathway AML-specific features, of which only 7/2 were common to the adult and pediatric AML and, therefore, age-independent. At the pathway level, we found more differences than similarities between the adult and pediatric forms. These findings suggest that the adult and pediatric AMLs may require different treatment strategies. PMID:27870639

  2. Association of Fusobacterium species in pancreatic cancer tissues with molecular features and prognosis.

    PubMed

    Mitsuhashi, Kei; Nosho, Katsuhiko; Sukawa, Yasutaka; Matsunaga, Yasutaka; Ito, Miki; Kurihara, Hiroyoshi; Kanno, Shinichi; Igarashi, Hisayoshi; Naito, Takafumi; Adachi, Yasushi; Tachibana, Mami; Tanuma, Tokuma; Maguchi, Hiroyuki; Shinohara, Toshiya; Hasegawa, Tadashi; Imamura, Masafumi; Kimura, Yasutoshi; Hirata, Koichi; Maruyama, Reo; Suzuki, Hiromu; Imai, Kohzoh; Yamamoto, Hiroyuki; Shinomura, Yasuhisa

    2015-03-30

    Recently, bacterial infection causing periodontal disease has attracted considerable attention as a risk factor for pancreatic cancer. Fusobacterium species is an oral bacterial group of the human microbiome. Some evidence suggests that Fusobacterium species promote colorectal cancer development; however, no previous studies have reported the association between Fusobacterium species and pancreatic cancer. Therefore, we examined whether Fusobacterium species exist in pancreatic cancer tissue. Using a database of 283 patients with pancreatic ductal adenocarcinoma (PDAC), we tested cancer tissue specimens for Fusobacterium species. We also tested the specimens for KRAS, NRAS, BRAF and PIK3CA mutations and measured microRNA-21 and microRNA-31. In addition, we assessed epigenetic alterations, including CpG island methylator phenotype (CIMP). Our data showed an 8.8% detection rate of Fusobacterium species in pancreatic cancers; however, tumor Fusobacterium status was not associated with any clinical and molecular features. In contrast, in multivariate Cox regression analysis, compared with the Fusobacterium species-negative group, we observed significantly higher cancer-specific mortality rates in the positive group (p = 0.023). In conclusion, Fusobacterium species were detected in pancreatic cancer tissue. Tumor Fusobacterium species status is independently associated with a worse prognosis of pancreatic cancer, suggesting that Fusobacterium species may be a prognostic biomarker of pancreatic cancer.

  3. Molecular features and toxicological properties of four common pesticides, acetamiprid, deltamethrin, chlorpyriphos and fipronil.

    PubMed

    Taillebois, Emiliane; Alamiddine, Zakaria; Brazier, Christine; Graton, Jérôme; Laurent, Adèle D; Thany, Steeve H; Le Questel, Jean-Yves

    2015-04-01

    Structural features and selected physicochemical properties of four common pesticides: acetamiprid (neonicotinoid), chlorpyriphos (organophosphate insecticide), deltamethrin (pyrethroid) and fipronil (phenylpyrazole) have been investigated by Density Functional Theory quantum chemical calculations. The highly flexible character of these insecticides is revealed by the numerous conformers obtained, located within a 20 kJ mol⁻¹ range in the gas phase. In line with this trend, a redistribution of the energetic minima is observed in water medium. Molecular electrostatic potential calculations provide a ranking of the potential interaction sites of the four insecticides. The theoretical studies reported in the present work are completed by comparative toxicological assays against three aphid strains. The same toxicity order is revealed for the two susceptible strains Myzus persicae 4106A and Acyrthosiphon pisum LSR1: acetamiprid>fipronil>deltamethrin>chlorpyriphos. In the resistant strain M. persicae 1300145, the toxicity order is modified: acetamiprid>fipronil>chlorpyriphos>deltamethrin. Interestingly, strain 1300145, which is known to be resistant to neonicotinoids, is also less sensitive to deltamethrin, chlorpyriphos and fipronil.

  4. Robust prediction of B-factor profile from sequence using two-stage SVR based on random forest feature selection.

    PubMed

    Pan, Xiao-Yong; Shen, Hong-Bin

    2009-01-01

    The B-factor, which measures the uncertainty in the position of an atom within a crystal structure, is highly correlated with protein internal motion. Although the rapid progress of structural biology in recent years makes more accurate protein structures available than ever, with the avalanche of new protein sequences emerging during the post-genomic era, the gap between the known protein sequences and the known protein structures becomes wider and wider. It is urgent to develop automated methods to predict the B-factor profile directly from amino acid sequences, so that they can be used in a timely manner for basic research. In this article, we propose a novel approach, called PredBF, to predict the real value of the B-factor. We first extract both global and local features from the protein sequences as well as their evolution information; random forest feature selection is then applied to rank their importance, and the most important features are input to a two-stage support vector regression (SVR) for prediction, where the initial predicted outputs from the first SVR are further input to the second-layer SVR for final refinement. Our results reveal that a systematic analysis of the importance of different features provides deep insight into their different contributions and is necessary for developing effective B-factor prediction tools. The two-layer SVR prediction model designed in this study further enhanced the robustness of predicting the B-factor profile. As a web server, PredBF is freely available at: http://www.csbio.sjtu.edu.cn/bioinf/PredBF for academic use.

  5. Quantitative MRI radiomics in the prediction of molecular classifications of breast cancer subtypes in the TCGA/TCIA data set.

    PubMed

    Li, Hui; Zhu, Yitan; Burnside, Elizabeth S; Huang, Erich; Drukker, Karen; Hoadley, Katherine A; Fan, Cheng; Conzen, Suzanne D; Zuley, Margarita; Net, Jose M; Sutton, Elizabeth; Whitman, Gary J; Morris, Elizabeth; Perou, Charles M; Ji, Yuan; Giger, Maryellen L

    2016-01-01

    Using quantitative radiomics, we demonstrate that computer-extracted magnetic resonance (MR) image-based tumor phenotypes can be predictive of the molecular classification of invasive breast cancers. Radiomics analysis was performed on 91 MRIs of biopsy-proven invasive breast cancers from National Cancer Institute's multi-institutional TCGA/TCIA. Immunohistochemistry molecular classification was performed including estrogen receptor, progesterone receptor, human epidermal growth factor receptor 2, and for 84 cases, the molecular subtype (normal-like, luminal A, luminal B, HER2-enriched, and basal-like). Computerized quantitative image analysis included: three-dimensional lesion segmentation, phenotype extraction, and leave-one-case-out cross validation involving stepwise feature selection and linear discriminant analysis. The performance of the classifier model for molecular subtyping was evaluated using receiver operating characteristic analysis. The computer-extracted tumor phenotypes were able to distinguish between molecular prognostic indicators; area under the ROC curve values of 0.89, 0.69, 0.65, and 0.67 in the tasks of distinguishing between ER+ versus ER-, PR+ versus PR-, HER2+ versus HER2-, and triple-negative versus others, respectively. Statistically significant associations between tumor phenotypes and receptor status were observed. More aggressive cancers are likely to be larger in size with more heterogeneity in their contrast enhancement. Even after controlling for tumor size, a statistically significant trend was observed within each size group (P = 0.04 for lesions ≤ 2 cm; P = 0.02 for lesions >2 to ≤5 cm) as with the entire data set (P-value = 0.006) for the relationship between enhancement texture (entropy) and molecular subtypes (normal-like, luminal A, luminal B, HER2-enriched, basal-like). In conclusion, computer-extracted image phenotypes show promise for high-throughput discrimination of breast cancer subtypes and may yield a

  6. Quantitative MRI radiomics in the prediction of molecular classifications of breast cancer subtypes in the TCGA/TCIA data set

    PubMed Central

    Li, Hui; Zhu, Yitan; Burnside, Elizabeth S; Huang, Erich; Drukker, Karen; Hoadley, Katherine A; Fan, Cheng; Conzen, Suzanne D; Zuley, Margarita; Net, Jose M; Sutton, Elizabeth; Whitman, Gary J; Morris, Elizabeth; Perou, Charles M; Ji, Yuan; Giger, Maryellen L

    2016-01-01

    Using quantitative radiomics, we demonstrate that computer-extracted magnetic resonance (MR) image-based tumor phenotypes can be predictive of the molecular classification of invasive breast cancers. Radiomics analysis was performed on 91 MRIs of biopsy-proven invasive breast cancers from National Cancer Institute’s multi-institutional TCGA/TCIA. Immunohistochemistry molecular classification was performed including estrogen receptor, progesterone receptor, human epidermal growth factor receptor 2, and for 84 cases, the molecular subtype (normal-like, luminal A, luminal B, HER2-enriched, and basal-like). Computerized quantitative image analysis included: three-dimensional lesion segmentation, phenotype extraction, and leave-one-case-out cross validation involving stepwise feature selection and linear discriminant analysis. The performance of the classifier model for molecular subtyping was evaluated using receiver operating characteristic analysis. The computer-extracted tumor phenotypes were able to distinguish between molecular prognostic indicators; area under the ROC curve values of 0.89, 0.69, 0.65, and 0.67 in the tasks of distinguishing between ER+ versus ER−, PR+ versus PR−, HER2+ versus HER2−, and triple-negative versus others, respectively. Statistically significant associations between tumor phenotypes and receptor status were observed. More aggressive cancers are likely to be larger in size with more heterogeneity in their contrast enhancement. Even after controlling for tumor size, a statistically significant trend was observed within each size group (P = 0.04 for lesions ≤ 2 cm; P = 0.02 for lesions >2 to ≤5 cm) as with the entire data set (P-value = 0.006) for the relationship between enhancement texture (entropy) and molecular subtypes (normal-like, luminal A, luminal B, HER2-enriched, basal-like). In conclusion, computer-extracted image phenotypes show promise for high-throughput discrimination of breast cancer subtypes and may yield a

  7. Prediction of troponin-T degradation using color image texture features in 10d aged beef longissimus steaks.

    PubMed

    Sun, X; Chen, K J; Berg, E P; Newman, D J; Schwartz, C A; Keller, W L; Maddock Carlin, K R

    2014-02-01

    The objective was to use digital color image texture features to predict troponin-T degradation in beef. Image texture features, including 88 gray level co-occurrence texture features, 81 two-dimension fast Fourier transformation texture features, and 48 Gabor wavelet filter texture features, were extracted from color images of beef strip steaks (longissimus dorsi, n = 102) aged for 10d obtained using a digital camera and additional lighting. Steaks were designated degraded or not-degraded based on troponin-T degradation determined on d 3 and d 10 postmortem by immunoblotting. Statistical analysis (STEPWISE regression model) and artificial neural network (support vector machine model, SVM) methods were designed to classify protein degradation. The d 3 and d 10 STEPWISE models were 94% and 86% accurate, respectively, while the d 3 and d 10 SVM models were 63% and 71%, respectively, in predicting protein degradation in aged meat. STEPWISE and SVM models based on image texture features show potential to predict troponin-T degradation in meat.
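
    As an illustration of what a gray level co-occurrence texture feature is, the sketch below builds a normalized co-occurrence matrix for a tiny grayscale image and derives three common statistics from it. It is a minimal sketch, not the study's pipeline: the 4x4 image, the single horizontal offset, and the function names `glcm` and `texture_features` are hypothetical, and the abstract does not say which of its 88 co-occurrence features correspond to these three.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized gray level co-occurrence matrix for one pixel offset.

    image  : list of rows of integer gray levels in [0, levels)
    dx, dy : column and row offset of the neighbor pixel
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    # Convert pair counts into joint probabilities.
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Contrast, energy and homogeneity of a normalized co-occurrence matrix."""
    n = len(p)
    cells = [(i, j) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(p[i][j] * (i - j) ** 2 for i, j in cells),
        "energy": sum(p[i][j] ** 2 for i, j in cells),
        "homogeneity": sum(p[i][j] / (1 + abs(i - j)) for i, j in cells),
    }


# Hypothetical 4x4 image with four gray levels (0..3).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(image, levels=4)       # horizontal neighbor, one pixel to the right
features = texture_features(p)
print(features)
```

    In practice such statistics would be computed per color channel and for several offsets, which is one way a feature count in the dozens can arise.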

  8. Toward Understanding the Size Dependence of Shape Features for Predicting Spiculation in Lung Nodules for Computer-Aided Diagnosis.

    PubMed

    Niehaus, Ron; Raicu, Daniela Stan; Furst, Jacob; Armato, Samuel

    2015-12-01

    We analyze the importance of shape features for predicting spiculation ratings assigned by radiologists to lung nodules in computed tomography (CT) scans. Using the Lung Image Database Consortium (LIDC) data and classification models based on decision trees, we demonstrate that the importance of several shape features increases disproportionately relative to other image features with increasing size of the nodule. Our shape-based classification results show an area under the receiver operating characteristic (ROC) curve of 0.65 when classifying spiculation for small nodules and an area of 0.91 for large nodules, resulting in a 26% difference in classification performance using shape features. An analysis of the results illustrates that this change in performance is driven by features that measure boundary complexity, which perform well for large nodules but perform relatively poorly and do no better than other features for small nodules. For large nodules, the roughness of the segmented boundary maps well to the semantic concept of spiculation. For small nodules, measuring directly the complexity of hard segmentations does not yield good results for predicting spiculation due to limits imposed by spatial resolution and the uncertainty in boundary location. Therefore, a wider range of features, including shape, texture, and intensity features, are needed to predict spiculation ratings for small nodules. A further implication is that the efficacy of shape features for a particular classifier used to create computer-aided diagnosis systems depends on the distribution of nodule sizes in the training and testing sets, which may not be consistent across different research studies.
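
    Several records in this list report an area under the ROC curve. The sketch below shows one standard way to compute it for a binary task, using the identity that the AUC equals the probability that a randomly chosen positive case scores higher than a randomly chosen negative one (ties count one half). The scores and labels are hypothetical and are not taken from any of the studies; the function name `roc_auc` is likewise an illustration.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    scores : classifier outputs, higher means "more likely positive"
    labels : 1 for positive cases, 0 for negative cases
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores for six cases (three positive, three negative).
scores = [0.9, 0.8, 0.35, 0.6, 0.4, 0.2]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
print(round(auc, 3))
```

    The pairwise loop is quadratic in the number of cases, which is fine for cohorts of this size; a rank-based formulation gives the same value for larger data sets.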

  9. Contribution of Sequence Motif, Chromatin State, and DNA Structure Features to Predictive Models of Transcription Factor Binding in Yeast.

    PubMed

    Tsai, Zing Tsung-Yeh; Shiu, Shin-Han; Tsai, Huai-Kuang

    2015-08-01

    Transcription factor (TF) binding is determined by the presence of specific sequence motifs (SM) and chromatin accessibility, where the latter is influenced by both chromatin state (CS) and DNA structure (DS) properties. Although SM, CS, and DS have been used to predict TF binding sites, a predictive model that jointly considers CS and DS has not been developed to predict either TF-specific binding or general binding properties of TFs. Using budding yeast as a model, we found that machine learning classifiers trained with either CS or DS features alone perform better in predicting TF-specific binding compared to SM-based classifiers. In addition, simultaneously considering CS and DS further improves the accuracy of the TF binding predictions, indicating the highly complementary nature of these two properties. The contributions of SM, CS, and DS features to binding site predictions differ greatly between TFs, allowing TF-specific predictions and potentially reflecting different TF binding mechanisms. In addition, a "TF-agnostic" predictive model based on three DNA "intrinsic properties" (in silico predicted nucleosome occupancy, major groove geometry, and dinucleotide free energy) that can be calculated from genomic sequences alone has performance that rivals the model incorporating experiment-derived data. This intrinsic property model allows prediction of binding regions not only across TFs, but also across DNA-binding domain families with distinct structural folds. Furthermore, these predicted binding regions can help identify TF binding sites that have a significant impact on target gene expression. Because the intrinsic property model allows prediction of binding regions across DNA-binding domain families, it is TF agnostic and likely describes general binding potential of TFs. Thus, our findings suggest that it is feasible to establish a TF agnostic model for identifying functional regulatory regions in potentially any sequenced genome.

  10. Contribution of Sequence Motif, Chromatin State, and DNA Structure Features to Predictive Models of Transcription Factor Binding in Yeast

    PubMed Central

    Tsai, Zing Tsung-Yeh; Shiu, Shin-Han; Tsai, Huai-Kuang

    2015-01-01

    Transcription factor (TF) binding is determined by the presence of specific sequence motifs (SM) and chromatin accessibility, where the latter is influenced by both chromatin state (CS) and DNA structure (DS) properties. Although SM, CS, and DS have been used to predict TF binding sites, a predictive model that jointly considers CS and DS has not been developed to predict either TF-specific binding or general binding properties of TFs. Using budding yeast as a model, we found that machine learning classifiers trained with either CS or DS features alone perform better in predicting TF-specific binding compared to SM-based classifiers. In addition, simultaneously considering CS and DS further improves the accuracy of the TF binding predictions, indicating the highly complementary nature of these two properties. The contributions of SM, CS, and DS features to binding site predictions differ greatly between TFs, allowing TF-specific predictions and potentially reflecting different TF binding mechanisms. In addition, a "TF-agnostic" predictive model based on three DNA "intrinsic properties" (in silico predicted nucleosome occupancy, major groove geometry, and dinucleotide free energy) that can be calculated from genomic sequences alone has performance that rivals the model incorporating experiment-derived data. This intrinsic property model allows prediction of binding regions not only across TFs, but also across DNA-binding domain families with distinct structural folds. Furthermore, these predicted binding regions can help identify TF binding sites that have a significant impact on target gene expression. Because the intrinsic property model allows prediction of binding regions across DNA-binding domain families, it is TF agnostic and likely describes general binding potential of TFs. Thus, our findings suggest that it is feasible to establish a TF agnostic model for identifying functional regulatory regions in potentially any sequenced genome. PMID:26291518

  11. Predicted Molecular Effects of Sequence Variants Link to System Level of Disease

    PubMed Central

    Bromberg, Yana; Rost, Burkhard

    2016-01-01

    Developments in experimental and computational biology are advancing our understanding of how protein sequence variation impacts molecular protein function. However, the leap from the micro level of molecular function to the macro level of the whole organism, e.g. disease, remains barred. Here, we present new results emphasizing earlier work that suggested some links from molecular function to disease. We focused on non-synonymous single nucleotide variants, also referred to as single amino acid variants (SAVs). Building upon OMIA (Online Mendelian Inheritance in Animals), we introduced a curated set of 117 disease-causing SAVs in animals. Methods optimized to capture effects upon molecular function often correctly predict human (OMIM) and animal (OMIA) Mendelian disease-causing variants. We also predicted effects of human disease-causing variants in the mouse model, i.e. we put OMIM SAVs into mouse orthologs. Overall, fewer variants were predicted with effect in the model organism than in the original organism. Our results, along with other recent studies, demonstrate that predictions of molecular effects capture some important aspects of disease. Thus, in silico methods focusing on the micro level of molecular function can help to understand the macro system level of disease. PMID:27536940

  12. TU-C-17A-10: Patient Features Based Dosimetric Pareto Front Prediction In Esophagus Cancer Radiotherapy

    SciTech Connect

    Wang, J; Zhao, K; Peng, J; Hu, W; Jin, X

    2014-06-15

    Purpose: The purpose of this study is to assess the feasibility of dosimetric Pareto front (PF) prediction based on patient anatomic and dosimetric parameters for esophagus cancer patients. Methods: Sixty esophagus patients in our institution were enrolled in this study. A total of 2920 IMRT plans were created to generate the PF for each patient. On average, each patient had 48 plans. The anatomic and dosimetric features were extracted from those plans. The mean lung dose (MLD), mean heart dose (MHD), spinal cord max dose and PTV homogeneity index (PTVHI) were recorded for each plan. Principal component analysis (PCA) was used to extract overlap volume histogram (OVH) features between the PTV and other critical organs. The full dataset was separated into two parts, a training dataset and a validation dataset. The prediction outcomes were the MHD and MLD for the current study. The Spearman rank correlation coefficient was used to evaluate the correlation between the anatomical features and dosimetric features. The PF was fit by the stepwise multiple regression method. Cross-validation was used to evaluate the model. Results: The mean prediction error of the MHD was 465 cGy with 100 repetitions. The most correlated factors were the first principal components of the OVH between heart and PTV, and the overlap between heart and PTV in the Z-axis. The mean prediction error of the MLD was 195 cGy. The most correlated factors were the first principal components of the OVH between lung and PTV, and the overlap between lung and PTV in the Z-axis. Conclusion: It is feasible to use patients' anatomic and dosimetric features to generate a predicted PF. Additional samples and further studies are required to obtain a better prediction model.
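
    The Spearman rank correlation mentioned in the Methods can be sketched as the Pearson correlation of the two rank vectors, with tied values sharing their average rank. The code below is a minimal illustration under that definition; the overlap and dose values are hypothetical and are not data from the study, and the function names are chosen here for illustration only.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0.0 or sy == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values: heart/PTV overlap along Z (cm) and mean heart dose (cGy).
overlap_cm = [1.0, 2.5, 3.0, 4.2, 5.1]
mean_heart_dose_cgy = [310, 280, 520, 450, 700]
rho = spearman(overlap_cm, mean_heart_dose_cgy)
print(round(rho, 3))
```

    Because only ranks enter the calculation, the coefficient is insensitive to the units of either quantity and to any monotonic rescaling of them.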

  13. Computing Molecular Signatures as Optima of a Bi-Objective Function: Method and Application to Prediction in Oncogenomics

    PubMed Central

    Gardeux, Vincent; Chelouah, Rachid; Wanderley, Maria F Barbosa; Siarry, Patrick; Braga, Antônio P; Reyal, Fabien; Rouzier, Roman; Pusztai, Lajos; Natowicz, René

    2015-01-01

    BACKGROUND Filter feature selection methods compute molecular signatures by selecting subsets of genes in the ranking of a valuation function. The motivations for the choice of valuation function are almost always clearly stated, but those for selecting the genes according to their ranking are hardly ever explicit. METHOD We addressed the computation of molecular signatures by searching the optima of a bi-objective function whose solution space was the set of all possible molecular signatures, i.e., the set of subsets of genes. The two objectives were the size of the signature (to be minimized) and the interclass distance induced by the signature (to be maximized). RESULTS We showed that: 1) the convex combination of the two objectives had exactly n optimal non-empty signatures where n was the number of genes, 2) the n optimal signatures were nested, and 3) the optimal signature of size k was the subset of k top ranked genes that contributed the most to the interclass distance. We applied our feature selection method on five public datasets in oncology, and assessed the prediction performances of the optimal signatures as input to the diagonal linear discriminant analysis (DLDA) classifier. They were at the same level or better than the best-reported ones. The predictions were robust, and the signatures were almost always significantly smaller. We studied in more detail the performances of our predictive modeling on two breast cancer datasets to predict the response to a preoperative chemotherapy: the performances were higher than the previously reported ones, the signatures were three times smaller (11 versus 30 gene signatures), and the member genes of the signature were known to be involved in the response to chemotherapy. CONCLUSIONS Defining molecular signatures as the optima of a bi-objective function that combined the signature size and the interclass distance was well founded and efficient for prediction in oncogenomics. The complexity of the computation
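
    The nesting result stated in this record can be checked on a toy case. The sketch below assumes, purely for illustration, that the interclass distance is a sum of independent per-gene contributions; the abstract does not give the actual distance, and the names `top_k_signature`, `objective` and `best_signature_bruteforce`, the weight `lam` and the contribution values are all hypothetical.

```python
from itertools import combinations
from typing import Iterable, List, Set


def top_k_signature(contrib: List[float], k: int) -> Set[int]:
    """Indices of the k genes with the largest contribution to the distance."""
    order = sorted(range(len(contrib)), key=lambda i: contrib[i], reverse=True)
    return set(order[:k])


def objective(subset: Iterable[int], contrib: List[float], lam: float) -> float:
    """Convex combination: reward interclass distance, penalize signature size."""
    subset = list(subset)
    distance = sum(contrib[i] for i in subset)  # additive distance (assumption)
    return lam * distance - (1.0 - lam) * len(subset)


def best_signature_bruteforce(contrib: List[float], lam: float) -> Set[int]:
    """Exhaustively search all non-empty gene subsets (only viable for tiny n)."""
    n = len(contrib)
    candidates = (s for k in range(1, n + 1) for s in combinations(range(n), k))
    return set(max(candidates, key=lambda s: objective(s, contrib, lam)))


# Made-up per-gene contributions to the interclass distance.
contrib = [0.9, 0.1, 0.5, 0.3]
best = best_signature_bruteforce(contrib, lam=0.7)
print(best, best == top_k_signature(contrib, len(best)))
```

    Under this additive assumption, varying `lam` only changes how many top-ranked genes are kept, which is why the optimal signatures form a nested family.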

  14. Experimental indication of a naphthalene-base molecular aggregate for the carrier of the 2175 angstroms interstellar extinction feature

    NASA Technical Reports Server (NTRS)

    Beegle, L. W.; Wdowiak, T. J.; Robinson, M. S.; Cronin, J. R.; McGehee, M. D.; Clemett, S. J.; Gillette, S.

    1997-01-01

    Experiments where the simple polycyclic aromatic hydrocarbon (PAH) naphthalene (C10H8) is subjected to the energetic environment of a plasma have resulted in the synthesis of a molecular aggregate that has ultraviolet spectral characteristics that suggest it provides insight into the nature of the carrier of the 2175 angstroms interstellar extinction feature and may be a laboratory analog. Ultraviolet, visible, infrared, and mass spectroscopy, along with gas chromatography, indicate that it is a molecular aggregate in which an aromatic double ring ("naphthalene") structural base serves as the electron "box" chromophore that gives rise to the envelope of the 2175 angstroms feature. This chromophore can also provide the peak of the feature or function as a mantle in concert with another peak provider such as graphite. The molecular base/chromophore manifests itself both as a structural component of an alkyl-aromatic polymer and as a substructure of hydrogenated PAH species. Its spectral and molecular characteristics are consistent with what is generally expected for a complex molecular aggregate that has a role as an interstellar constituent.

  15. Screening features to improve the class prediction of acute myeloid leukemia and myelodysplastic syndrome.

    PubMed

    Li, Kaishi; Yang, Meixue; Sablok, Gaurav; Fan, Jianping; Zhou, Fengfeng

    2013-01-10

    After more than three decades of intensive investigation, the mechanisms underpinning the pathogenesis of myelodysplastic syndrome (MDS) and acute myeloid leukemia (AML) remain largely uncharacterized, and their diagnosis relies heavily on subjective factors. Recently, gene expression profiling showed significant improvement in classifying some subtypes of AML, but the power of such models to discriminate MDS from AML is still in its infancy. Feature selection plays an important role in the classification of samples on the basis of gene expression profiles. Our hypothesis is that a better choice of features could improve the classification of diseased and normal stage samples, and that feature screening could produce feature sets with better accuracies and the lowest number of embedded features. The observed results suggest that feature selection is an essential step in biomedical data mining models based on gene expression profiles.

  16. The molecular features of uncoupling protein 1 support a conventional mitochondrial carrier-like mechanism.

    PubMed

    Crichton, Paul G; Lee, Yang; Kunji, Edmund R S

    2017-03-01

    Uncoupling protein 1 (UCP1) is an integral membrane protein found in the mitochondrial inner membrane of brown adipose tissue, and facilitates the process of non-shivering thermogenesis in mammals. Its activation by fatty acids, which overcomes its inhibition by purine nucleotides, leads to an increase in the proton conductance of the inner mitochondrial membrane, short-circuiting the mitochondrion to produce heat rather than ATP. Despite 40 years of intense research, the underlying molecular mechanism of UCP1 is still under debate. The protein belongs to the mitochondrial carrier family of transporters, which have recently been shown to utilise a domain-based alternating-access mechanism, cycling between a cytoplasmic and matrix state to transport metabolites across the inner membrane. Here, we review the protein properties of UCP1 and compare them to those of mitochondrial carriers. UCP1 has the same structural fold as other mitochondrial carriers and, in contrast to past claims, is a monomer, binding one purine nucleotide and three cardiolipin molecules tightly. The protein has a single substrate binding site, which is similar to those of the dicarboxylate and oxoglutarate carriers, but also contains a proton binding site and several hydrophobic residues. As found in other mitochondrial carriers, UCP1 has two conserved salt bridge networks on either side of the central cavity, which regulate access to the substrate binding site in an alternating way. The conserved domain structures and mobile inter-domain interfaces are consistent with an alternating access mechanism too. In conclusion, UCP1 has retained all of the key features of mitochondrial carriers, indicating that it operates by a conventional carrier-like mechanism.

  17. Fiber-specific molecular features of tumors induced in rat peritoneum.

    PubMed

    Unfried, K; Roller, M; Pott, F; Friemann, J; Dehnen, W

    1997-09-01

    Molecular markers such as mutational spectra or mRNA expression patterns may give some indication of the mechanisms of carcinogenesis induced by fibers and other carcinogens. In our study, tumors were induced by application of crocidolite asbestos or benzo[a]pyrene (B[a]P) to rat peritoneum. DNA and RNA of these tumors were subjected to analysis of point mutations and to investigation of mRNA expression patterns. With both assays we found typical features depending on the type of carcinogen applied. The analysis of point mutations in the tumor suppressor gene p53 revealed mutations in the B[a]P-induced tumors. However, in the tumors induced by crocidolite asbestos that were of the same tumor type as those induced by B[a]P, mutations in p53 were not detectable. Every mutation detected on the DNA level causes an amino acid substitution within one of the functional domains of the tumor suppressor protein. Therefore, these mutations seem to be of biological relevance for tumor progression and indicate a difference in the carcinogenesis regarding the type of the carcinogenic substance. An additional specificity of crocidolite-induced tumors was detectable by analyzing the mRNA expression of the tumor suppressor gene WT1, which is known to be expressed in human mesothelial and mesothelioma cells. A relatively high amount of WT1 mRNA was measured by quantitative competitive reverse transcription-polymerase chain reaction using RNA extracted from crocidolite-induced tumors. However, WT1 seems to be expressed on a rather low level in tumors induced by B[a]P.

  18. Molecular genetic predictive testing for Alzheimer's disease: deliberations and preliminary recommendations.

    PubMed

    Lennox, A; Karlinsky, H; Meschino, W; Buchanan, J A; Percy, M E; Berg, J M

    1994-01-01

    Forty-one participants representing diverse professional backgrounds attended a workshop on genetic predictive testing for familial Alzheimer's disease (FAD) on January 23, 1993 at Surrey Place Centre in Toronto, Canada. Rapidly emerging molecular genetic findings in AD indicate that predictive testing is now technologically feasible for selected individuals, although defining eligibility criteria remains problematic. Legal, ethical, biomedical, and psychosocial issues related to establishing predictive testing programs for AD were discussed at the workshop. This article reflects these discussions, provides the current biomedical background for them and examines the Huntington's disease (HD) predictive testing experience. Observations concerning molecular genetic predictive testing for AD in light of its genetic heterogeneity and clinical characteristics, such as usual later age of onset than HD, are presented. It is proposed that predictive testing for AD can now be cautiously offered in a research setting primarily according to the recommendations contained within the Ethical Issues Policy Statement on Huntington's Disease Molecular Genetics Predictive Test. However, in their application to AD, some points in the statement are considered to require emphasis, modification, or currently to be of uncertain applicability. This represents an initial step in an on-going process of debate concerning AD that will be required as new advances occur in genetic and clinical research and in bioethics.

  19. Solubility curves and nucleation rates from molecular dynamics for polymorph prediction - moving beyond lattice energy minimization.

    PubMed

    Parks, Conor; Koswara, Andy; DeVilbiss, Frank; Tung, Hsien-Hsin; Nere, Nandkishor K; Bordawekar, Shailendra; Nagy, Zoltan K; Ramkrishna, Doraiswami

    2017-02-15

    Current polymorph prediction methods, known as lattice energy minimization, seek to determine the crystal lattice with the lowest potential energy, rendering them unable to predict solvent dependent metastable form crystallization. Facilitated by embarrassingly parallel, multiple replica, large-scale molecular dynamics simulations, we report on a new method concerned with predicting crystal structures using the kinetics and solubility of the low energy polymorphs predicted by lattice energy minimization. The proposed molecular dynamics simulation methodology provides several new predictions to the field of crystallization. (1) The methodology is shown to correctly predict the kinetic preference for β-glycine nucleation in water relative to α- and γ-glycine. (2) Analysis of nanocrystal melting temperatures shows that γ-glycine nanocrystals have melting temperatures up to 20 K lower than either α- or β-glycine. This provides a striking explanation of how an energetically unstable classical nucleation theory (CNT) transition state complex leads to kinetic inaccessibility of γ-glycine in water, despite being the thermodynamically preferred polymorph predicted by lattice energy minimization. (3) The methodology also predicts polymorph-specific solubility curves, where the α-glycine solubility curve is reproduced to within 19% error, over a 45 K temperature range, using nothing but atomistic-level information provided from nucleation simulations. (4) Finally, the methodology produces the correct solubility ranking of β- > α-glycine. In this work, we demonstrate how the methodology supplements lattice energy minimization with molecular dynamics nucleation simulations to give the correct polymorph prediction, at different length scales, when lattice energy minimization alone would incorrectly predict the formation of γ-glycine in water from the ranking of lattice energies. Thus, lattice energy minimization optimization algorithms are supplemented with the necessary solvent

  20. Application of computer-extracted breast tissue texture features in predicting false-positive recalls from screening mammography

    NASA Astrophysics Data System (ADS)

    Ray, Shonket; Choi, Jae Y.; Keller, Brad M.; Chen, Jinbo; Conant, Emily F.; Kontos, Despina

    2014-03-01

    Mammographic texture features have been shown to have value in breast cancer risk assessment. Previous models have also been developed that use computer-extracted mammographic features of breast tissue complexity to predict the risk of false-positive (FP) recall from breast cancer screening with digital mammography. This work details a novel locally adaptive parenchymal texture analysis algorithm that identifies and extracts mammographic features of local parenchymal tissue complexity potentially relevant for false-positive biopsy prediction. This algorithm has two important aspects: (1) the adaptive nature of automatically determining an optimal number of regions of interest (ROIs) in the image and each ROI's corresponding size based on the parenchymal tissue distribution over the whole breast region and (2) characterizing both the local and global mammographic appearances of the parenchymal tissue that could provide more discriminative information for FP biopsy risk prediction. Preliminary results show that this locally adaptive texture analysis algorithm, in conjunction with logistic regression, can predict the likelihood of false-positive biopsy with an ROC performance value of AUC=0.92 (p<0.001) with a 95% confidence interval [0.77, 0.94]. Significant texture feature predictors (p<0.05) included contrast, sum variance and difference average. Sensitivity for false-positives was 51% at the 100% cancer detection operating point. Although preliminary, clinical implications of using prediction models incorporating these texture features may include the future development of better tools and guidelines regarding personalized breast cancer screening recommendations. Further studies are warranted to prospectively validate our findings in larger screening populations and evaluate their clinical utility.

  1. Andic soil features and debris flows in Italy. New perspective towards prediction

    NASA Astrophysics Data System (ADS)

    Scognamiglio, Solange; Calcaterra, Domenico; Iamarino, Michela; Langella, Giuliano; Orefice, Nadia; Vingiani, Simona; Terribile, Fabio

    2016-04-01

    Debris flows are dangerous hazards causing fatalities and damage. Previous works have demonstrated that the materials involved in debris flows in Campania (southern Italy) are soils classified as Andosols. These soils have peculiar chemical and physical properties which make them fertile but also vulnerable to landslides. In Italy, andic soil properties are found both in volcanic and non-volcanic mountain ecosystems (VME and NVME). Here, we focused on the assessment of the main chemical and physical properties of the soils in the detachment areas of eight debris flows that occurred in NVME of Italy in the last 70 years. Such landslides were selected by consulting the official Italian geodatabase (IFFI Project). Andic properties (by means of ammonium oxalate extractable Fe, Si and Al forms for the calculation of Alo+1/2Feo) were also evaluated and a comparison with soils of VME was performed to assess possible common features. Landslide source areas were characterised by slope gradients ranging from 25° to 50° and lithological heterogeneity of the bedrock. The soils showed similar features, i.e. all were very deep and had a moderately thick topsoil with a high organic carbon (OC) content decreasing regularly with depth. The cation exchange capacity trend was generally consistent with the OC and the pH varied from extremely to slightly acid, but increased with depth. Furthermore, the soils had high water retention values both at saturation (0.63 to 0.78 cm3 cm-3) and in the dryer part of the water retention curve, and displayed a prevalent loamy texture. Such properties denote the chemical and physical fertility of the investigated ecosystems. The values of Alo+1/2Feo indicated that the soils had vitric or andic features and can be classified as Andosols. The comparison between NVME soils and those of VME showed similar depth, thickness of soil horizons, and family texture, whereas soil pH, degree of development of andic properties and allophane content were higher for VME soils.
Such

  2. Criminal recidivism among juvenile offenders: testing the incremental and predictive validity of three measures of psychopathic features.

    PubMed

    Douglas, Kevin S; Epstein, Monica E; Poythress, Norman G

    2008-10-01

    We studied the predictive, comparative, and incremental validity of three measures of psychopathic features (Psychopathy Checklist: Youth Version [PCL:YV]; Antisocial Process Screening Device [APSD]; Childhood Psychopathy Scale [CPS]) vis-à-vis criminal recidivism among 83 delinquent youth within a truly prospective design. Bivariate and multivariate analyses (Cox proportional hazard analyses) showed that of the three measures, the CPS was most consistently related to most types of recidivism in comparison to the other measures. However, incremental validity analyses demonstrated that all of the predictive effects for the measures of psychopathic features disappeared after conceptually relevant covariates (i.e., substance use, conduct disorder, young age, past property crime) were included in multivariate predictive models. Implications for the limits of these measures in applied juvenile justice assessment are discussed.

  3. Prediction of the Fate of Organic Compounds in the Environment From Their Molecular Properties: A Review

    PubMed Central

    Mamy, Laure; Patureau, Dominique; Barriuso, Enrique; Bedos, Carole; Bessac, Fabienne; Louchart, Xavier; Martin-laurent, Fabrice; Miege, Cecile; Benoit, Pierre

    2015-01-01

    A comprehensive review was conducted of quantitative structure-activity relationships (QSAR) that allow the fate of organic compounds in the environment to be predicted from their molecular properties. The considered processes were water dissolution, dissociation, volatilization, retention on soils and sediments (mainly adsorption and desorption), degradation (biotic and abiotic), and absorption by plants. A total of 790 equations involving 686 structural molecular descriptors are reported to estimate 90 environmental parameters related to these processes. A significant number of equations was found for the dissociation process (pKa), water dissolution or hydrophobic behavior (especially through the KOW parameter), adsorption to soils and biodegradation. A lack of QSAR was observed for estimating desorption or potential of transfer to water. Among the 686 molecular descriptors, five were found to be dominant in the 790 collected equations and the most generic ones: four quantum-chemical descriptors, the energy of the highest occupied molecular orbital (EHOMO) and the energy of the lowest unoccupied molecular orbital (ELUMO), polarizability (α) and dipole moment (μ), and one constitutional descriptor, the molecular weight. Keeping in mind that combining descriptors belonging to different categories (constitutional, topological, quantum-chemical) improved QSAR performance, these descriptors should be considered for the development of new QSAR, for further predictions of environmental parameters. This review also helps identify the relevant QSAR equations to predict the fate of a wide diversity of compounds in the environment. PMID:25866458

  4. A grading system combining architectural features and mitotic count predicts recurrence in stage I lung adenocarcinoma.

    PubMed

    Kadota, Kyuichi; Suzuki, Kei; Kachala, Stefan S; Zabor, Emily C; Sima, Camelia S; Moreira, Andre L; Yoshizawa, Akihiko; Riely, Gregory J; Rusch, Valerie W; Adusumilli, Prasad S; Travis, William D

    2012-08-01

    The International Association for the Study of Lung Cancer (IASLC)/American Thoracic Society (ATS)/European Respiratory Society (ERS) has recently proposed a new lung adenocarcinoma classification. We investigated whether nuclear features can stratify prognostic subsets. Slides of 485 stage I lung adenocarcinoma patients were reviewed. We evaluated nuclear diameter, nuclear atypia, nuclear/cytoplasmic ratio, chromatin pattern, prominence of nucleoli, intranuclear inclusions, mitotic count/10 high-power fields (HPFs) or 2.4 mm(2), and atypical mitoses. Tumors were classified into histologic subtypes according to the IASLC/ATS/ERS classification and grouped by architectural grade into low (adenocarcinoma in situ, minimally invasive adenocarcinoma, or lepidic predominant), intermediate (papillary or acinar), and high (micropapillary or solid). Log-rank tests and Cox regression models evaluated the ability of clinicopathologic factors to predict recurrence-free probability. In univariate analyses, nuclear diameter (P=0.007), nuclear atypia (P=0.006), mitotic count (P<0.001), and atypical mitoses (P<0.001) were significant predictors of recurrence. The recurrence-free probability of patients with high mitotic count (≥5/10 HPF: n=175) was the lowest (5-year recurrence-free probability=73%), followed by intermediate (2-4/10 HPF: n=106, 80%), and low (0-1/10 HPF: n=204, 91%, P<0.001). Combined architectural/mitotic grading system stratified patient outcomes (P<0.001): low grade (low architectural grade with any mitotic count and intermediate architectural grade with low mitotic count: n=201, 5-year recurrence-free probability=92%), intermediate grade (intermediate architectural grade with intermediate-high mitotic counts: n=206, 78%), and high grade (high architectural grade with any mitotic count: n=78, 68%). The advantage of adding mitotic count to architectural grade is in stratifying patients with intermediate architectural grade into two prognostically

  5. Towards the improved discovery and design of functional peptides: common features of diverse classes permit generalized prediction of bioactivity.

    PubMed

    Mooney, Catherine; Haslam, Niall J; Pollastri, Gianluca; Shields, Denis C

    2012-01-01

    The conventional wisdom is that certain classes of bioactive peptides have specific structural features that endow their particular functions. Accordingly, predictions of bioactivity have focused on particular subgroups, such as antimicrobial peptides. We hypothesized that bioactive peptides may share more general features, and assessed this by contrasting the predictive power of existing antimicrobial predictors as well as a novel general predictor, PeptideRanker, across different classes of peptides. We observed that existing antimicrobial predictors had reasonable predictive power to identify peptides of certain other classes, i.e. toxin and venom peptides. We trained two general predictors of peptide bioactivity, one focused on short peptides (4-20 amino acids) and one focused on long peptides (> 20 amino acids). These general predictors had performance that was typically as good as, or better than, that of specific predictors. We noted some striking differences in the features of short peptide and long peptide predictions, in particular, high scoring short peptides favour phenylalanine. This is consistent with the hypothesis that short and long peptides have different functional constraints, perhaps reflecting the difficulty for typical short peptides in supporting independent tertiary structure. We conclude that there are general shared features of bioactive peptides across different functional classes, indicating that computational prediction may accelerate the discovery of novel bioactive peptides and aid in the improved design of existing peptides, across many functional classes. An implementation of the predictive method, PeptideRanker, may be used to identify among a set of peptides those that may be more likely to be bioactive.

  6. Prediction of Selected Physical and Mechanical Properties of a Telechelic Polybenzoxazine by Molecular Simulation

    PubMed Central

    Wan Hassan, Wan Aminah; Hamerton, Ian; Howlin, Brendan J.

    2013-01-01

    Molecular simulation is becoming an important tool for both understanding polymeric structures and predicting their physical and mechanical properties. In this study, temperature ramped molecular dynamics simulations are used to predict two physical properties (i.e., glass transition temperature and thermal degradation temperature) of a previously synthesised and published telechelic benzoxazine. Plots of simulated density versus temperature show decreases in density within the same temperature range as experimental values for the thermal degradation. The predicted value for the thermal degradation temperature for the cured polybenzoxazine based on the telechelic polyetherketone (PEK) monomer was ca. 400°C, in line with the experimental thermal degradation temperature range of 450°C to 500°C. Mechanical properties of both the unmodified PEK and the telechelic benzoxazines are simulated and compared to experimental values (where available). The introduction of the benzoxazine moieties is predicted to increase the elastic moduli in line with the increase of crosslinking in the system. PMID:23577206

  7. Predicting molecular scale skin-effect in electrochemical impedance due to anomalous subdiffusion mediated adsorption phenomenon

    NASA Astrophysics Data System (ADS)

    Kushagra, Arindam

    2016-02-01

    Anomalous subdiffusion governs processes on a molecular scale that are not energetically driven. This paper proposes a model to predict the response of electrochemical impedance to such a diffusion process. Previous works used fractional calculus to predict the impedance behaviour in response to anomalous diffusion. Here, we have developed an expression which predicts the skin-effect, marked by an increase in the impedance with increasing frequency, in this regime. Negative inductances have also been predicted as a consequence of the inertial response of adsorbed species upon application of frequency-mediated perturbations. This might help researchers working on impedimetric sensors to choose the working frequency, and those working on batteries to choose their parameters likewise. This work sheds some light on the molecular mechanisms governing the impedance under frequency-based perturbations such as electromagnetic waves (microwaves to ionizing radiation) and in charge storage devices such as batteries.

  8. Web-based cheminformatics and molecular property prediction tools supporting drug design and development at Novartis.

    PubMed

    Ertl, P; Mühlbacher, J; Rohde, B; Selzer, P

    2003-01-01

    Web-based tools offer many advantages for processing chemical information, most notably ease of use and high interactivity. Therefore more and more pharmaceutical companies are using web technology to deliver sophisticated molecular processing tools directly to the desks of their chemists, to assist them in the process of designing and developing new drugs. In this paper, the web-based cheminformatics system developed at Novartis and currently used by more than a thousand users is described. The system allows various molecular modeling and molecular processing tasks, including the calculation of molecular and substituent properties, property-based virtual screening, visualization of molecules, bioisosteric design, diversity analysis, and support of combinatorial chemistry. The methodology to calculate various molecular properties relevant to drug design is described, including the prediction of intestinal absorption, blood-brain barrier penetration, efflux, and water solubility. Information about the web technology used is also provided.

  9. A highly accurate protein structural class prediction approach using auto cross covariance transformation and recursive feature elimination.

    PubMed

    Li, Xiaowei; Liu, Taigang; Tao, Peiying; Wang, Chunhua; Chen, Lanming

    2015-12-01

    Structural class characterizes the overall folding type of a protein or its domain. Many methods have been proposed to improve the prediction accuracy of protein structural class in recent years, but it is still a challenge for the low-similarity sequences. In this study, we introduce a feature extraction technique based on auto cross covariance (ACC) transformation of position-specific score matrix (PSSM) to represent a protein sequence. Then support vector machine-recursive feature elimination (SVM-RFE) is adopted to select top K features according to their importance and these features are input to a support vector machine (SVM) to conduct the prediction. Performance evaluation of the proposed method is performed using the jackknife test on three low-similarity datasets, i.e., D640, 1189 and 25PDB. By means of this method, the overall accuracies of 97.2%, 96.2%, and 93.3% are achieved on these three datasets, which are higher than those of most existing methods. This suggests that the proposed method could serve as a very cost-effective tool for predicting protein structural class especially for low-similarity datasets.
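
    The feature extraction step described in this record can be sketched as follows. The code uses one common formulation of the auto cross covariance transform (mean-centred products of PSSM columns at a given lag, averaged over the sequence); whether the authors used exactly this normalization is an assumption, and the names `column_means`, `covariance` and `acc_features`, as well as the toy matrix, are hypothetical. The SVM-RFE selection and SVM classification steps are not shown.

```python
from typing import List

Matrix = List[List[float]]  # L rows (residues) x n columns (PSSM scores)


def column_means(pssm: Matrix) -> List[float]:
    """Average score of each PSSM column over the whole sequence."""
    length = len(pssm)
    return [sum(row[i] for row in pssm) / length for i in range(len(pssm[0]))]


def covariance(pssm: Matrix, i1: int, i2: int, lag: int, means: List[float]) -> float:
    """Covariance between column i1 at position j and column i2 at position j + lag."""
    length = len(pssm)
    total = sum(
        (pssm[j][i1] - means[i1]) * (pssm[j + lag][i2] - means[i2])
        for j in range(length - lag)
    )
    return total / (length - lag)


def acc_features(pssm: Matrix, max_lag: int) -> List[float]:
    """Concatenate auto (i1 == i2) and cross (i1 != i2) covariances for lags 1..max_lag.

    Requires max_lag to be smaller than the number of rows.
    """
    means = column_means(pssm)
    n = len(pssm[0])
    features: List[float] = []
    for lag in range(1, max_lag + 1):
        features.extend(covariance(pssm, i, i, lag, means) for i in range(n))
        features.extend(
            covariance(pssm, i1, i2, lag, means)
            for i1 in range(n)
            for i2 in range(n)
            if i1 != i2
        )
    return features


# Toy 4-residue, 2-column matrix standing in for a real PSSM.
toy_pssm = [[1.0, 0.0], [3.0, 0.0], [5.0, 0.0], [7.0, 0.0]]
print(acc_features(toy_pssm, max_lag=1))
```

    With a standard 20-column PSSM, this yields 400 values per lag, so the vector length is fixed regardless of sequence length, which is what makes it usable as classifier input.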

  10. Predicting and replacing the pathological Gleason grade with automated gland ring morphometric features from immunofluorescent prostate cancer images.

    PubMed

    Khan, Faisal M; Scott, Richard; Donovan, Michael; Fernandez, Gerardo

    2017-04-01

    The Gleason grade is the most common architectural and morphological assessment of prostate cancer severity and prognosis. There have been numerous algorithms developed to approximate and duplicate the Gleason scoring system, mostly developed in standard H&E brightfield microscopy. Immunofluorescence (IF) image analysis of tissue pathology has recently been proven to be robust in developing prognostic assessments of disease, particularly in prostate cancer. We leverage a method of segmenting gland rings in IF images for predicting the pathological Gleason, both the clinical and the image specific grades, which may not necessarily be the same. We combine these measures with nuclear specific characteristics. In 324 images from 324 patients, our individual features correlate well univariately with the Gleason grades and in a multivariate setting have an accuracy of 85% in predicting the Gleason grade. Additionally, these features correlate strongly with clinical progression outcomes [concordance index (CI) of 0.89], significantly outperforming the clinical Gleason grades (CI of 0.78). Finally, in multivariate models for multiple prostate cancer progression endpoints, replacing the Gleason with these features results in equivalent or improved performances. This work presents the first assessment of morphological gland unit features from IF images for predicting the Gleason grade, and even replacing it in prostate cancer prognostics.

  11. Collision cross section prediction of deprotonated phenolics in a travelling-wave ion mobility spectrometer using molecular descriptors and chemometrics.

    PubMed

    Gonzales, Gerard Bryan; Smagghe, Guy; Coelus, Sofie; Adriaenssens, Dieter; De Winter, Karel; Desmet, Tom; Raes, Katleen; Van Camp, John

    2016-06-14

    The combination of ion mobility and mass spectrometry (MS) affords significant improvements over conventional MS/MS, especially in the characterization of isomeric metabolites due to the differences in their collision cross sections (CCS). Experimentally obtained CCS values are typically matched with theoretical CCS values from Trajectory Method (TM) and/or Projection Approximation (PA) calculations. In this paper, predictive models for CCS of deprotonated phenolics were developed using molecular descriptors and chemometric tools: stepwise multiple linear regression (SMLR), principal components regression (PCR), and partial least squares regression (PLS). A total of 102 molecular descriptors were generated and reduced to 28 after employing a feature selection tool, composed of mass, topological descriptors, Jurs descriptors and shadow indices. Therefore, the generated models considered the effects of mass, 3D conformation and partial charge distribution on CCS, which are the main parameters for either TM or PA (only 3D conformation) calculations. All three techniques yielded highly predictive models for both the training (R(2)SMLR = 0.9911; R(2)PCR = 0.9917; R(2)PLS = 0.9918) and validation datasets (R(2)SMLR = 0.9489; R(2)PCR = 0.9761; R(2)PLS = 0.9760). Also, the high cross-validated R(2) values indicate that the generated models are robust and highly predictive (Q(2)SMLR = 0.9859; Q(2)PCR = 0.9748; Q(2)PLS = 0.9760). The predictions were also very comparable to the results from TM calculations using modified mobcal (N2). Most importantly, this method offered a rapid (<10 min) alternative to TM calculations without compromising predictive ability. These methods could therefore be used in routine analysis and could be easily integrated into metabolite identification platforms.
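
    The R(2) and Q(2) statistics reported in this abstract can be illustrated with a small Python sketch. It assumes the usual definitions (R(2) = 1 - SS_res/SS_tot on fitted values, Q(2) computed the same way from leave-one-out predictions) and fits a straight line on a single hypothetical descriptor. The abstract does not state the cross-validation scheme, and the authors' models use many descriptors, so this is a simplified stand-in rather than their procedure.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, preds):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def q_squared_loo(xs, ys):
    """Cross-validated R^2: each point is predicted by a model
    fitted without that point (leave-one-out)."""
    preds = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(slope * xs[i] + intercept)
    return r_squared(ys, preds)


# Made-up toy values for one hypothetical descriptor and response.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
response = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept = fit_line(descriptor, response)
r2 = r_squared(response, [slope * x + intercept for x in descriptor])
q2 = q_squared_loo(descriptor, response)
```

    For least-squares fits Q(2) cannot exceed R(2), so a Q(2) close to R(2), as in the values quoted above, suggests the model is not simply memorizing the training set.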

  12. Supervised multi-view canonical correlation analysis (sMVCCA): integrating histologic and proteomic features for predicting recurrent prostate cancer.

    PubMed

    Lee, George; Singanamalli, Asha; Wang, Haibo; Feldman, Michael D; Master, Stephen R; Shih, Natalie N C; Spangler, Elaine; Rebbeck, Timothy; Tomaszewski, John E; Madabhushi, Anant

    2015-01-01

    In this work, we present a new methodology to facilitate prediction of recurrent prostate cancer (CaP) following radical prostatectomy (RP) via the integration of quantitative image features and protein expression in the excised prostate. Creating a fused predictor from high-dimensional data streams is challenging because the classifier must 1) account for the "curse of dimensionality" problem, which hinders classifier performance when the number of features exceeds the number of patient studies and 2) balance potential mismatches in the number of features across different channels to avoid classifier bias towards channels with more features. Our new data integration methodology, supervised Multi-view Canonical Correlation Analysis (sMVCCA), aims to integrate infinite views of high-dimensional data to provide more amenable data representations for disease classification. Additionally, we demonstrate sMVCCA using Spearman's rank correlation which, unlike Pearson's correlation, can account for nonlinear correlations and outliers. Forty CaP patients with pathological Gleason scores 6-8 were considered for this study. 21 of these men revealed biochemical recurrence (BCR) following RP, while 19 did not. For each patient, 189 quantitative histomorphometric attributes and 650 protein expression levels were extracted from the primary tumor nodule. The fused histomorphometric/proteomic representation via sMVCCA combined with a random forest classifier predicted BCR with a mean AUC of 0.74 and a maximum AUC of 0.9286. We found sMVCCA to perform statistically significantly (p < 0.05) better than comparative state-of-the-art data fusion strategies for predicting BCR. Furthermore, Kaplan-Meier analysis demonstrated improved BCR-free survival prediction for the sMVCCA-fused classifier as compared to histology or proteomic features alone.
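
    The contrast this abstract draws between Spearman's and Pearson's correlation can be seen in a short Python sketch. It computes Spearman's coefficient as Pearson's coefficient applied to average ranks, which is the standard definition. It is not part of sMVCCA itself, and the helper names and toy data are assumptions for illustration.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# A monotonic but nonlinear relationship (toy data).
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
rho = spearman(xs, ys)   # rank agreement is perfect
r = pearson(xs, ys)      # linear agreement is not
```

    Because only the ordering of the values matters, Spearman's coefficient captures monotonic nonlinear relationships and is less sensitive to extreme values than Pearson's.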

  13. Computer-aided global breast MR image feature analysis for prediction of tumor response to chemotherapy: performance assessment

    NASA Astrophysics Data System (ADS)

    Aghaei, Faranak; Tan, Maxine; Hollingsworth, Alan B.; Zheng, Bin; Cheng, Samuel

    2016-03-01

    Dynamic contrast-enhanced breast magnetic resonance imaging (DCE-MRI) has been used increasingly in breast cancer diagnosis and assessment of cancer treatment efficacy. In this study, we applied a computer-aided detection (CAD) scheme to automatically segment breast regions depicted on MR images and used the kinetic image features computed from the global breast MR images acquired before neoadjuvant chemotherapy to build a new quantitative model to predict the response of breast cancer patients to the chemotherapy. To assess the performance and robustness of this new prediction model, an image dataset involving breast MR images acquired from 151 cancer patients before undergoing neoadjuvant chemotherapy was retrospectively assembled and used. Among them, 63 patients had a "complete response" (CR) to chemotherapy, in which the enhanced contrast levels inside the tumor volume (pre-treatment) were reduced to the level of the normal enhanced background parenchymal tissues (post-treatment), while 88 patients had a "partial response" (PR), in which the high contrast enhancement remained in the tumor regions after treatment. We analyzed the correlation among the 22 global kinetic image features and then selected a set of 4 optimal features. Applying an artificial neural network trained with the fusion of these 4 kinetic image features, the prediction model yielded an area under the ROC curve (AUC) of 0.83+/-0.04. This study demonstrated that fusion of kinetic image features computed from global breast MR images, which avoids tumor segmentation that is often difficult and unreliable, can also generate a useful clinical marker for predicting the efficacy of chemotherapy.

  14. Prediction of residue-residue contact matrix for protein-protein interaction with Fisher score features and deep learning.

    PubMed

    Du, Tianchuan; Liao, Li; Wu, Cathy H; Sun, Bilin

    2016-11-01

    Protein-protein interactions play essential roles in many biological processes. Acquiring knowledge of the residue-residue contact information of two interacting proteins is not only helpful in annotating functions for proteins, but also critical for structure-based drug design. The prediction of the protein residue-residue contact matrix of the interfacial regions is challenging. In this work, we introduced deep learning techniques (specifically, stacked autoencoders) to build deep neural network models to tackle the residue-residue contact prediction problem. In tandem with interaction profile Hidden Markov Models, which were used first to extract Fisher score features from protein sequences, stacked autoencoders were deployed to extract and learn hidden abstract features. The deep learning model showed significant improvement over the traditional machine learning model, Support Vector Machines (SVM), with the overall accuracy increased by 15% from 65.40% to 80.82%. We showed that the stacked autoencoders could extract novel features, which can be utilized by deep neural networks and other classifiers to enhance learning, out of the Fisher score features. It is further shown that deep neural networks have significant advantages over SVM in making use of the newly extracted features.

  15. A machine learning approach to investigate the relationship between shape features and numerically predicted risk of ascending aortic aneurysm.

    PubMed

    Liang, Liang; Liu, Minliang; Martin, Caitlin; Elefteriades, John A; Sun, Wei

    2017-04-06

    Geometric features of the aorta are linked to patient risk of rupture in the clinical decision to electively repair an ascending aortic aneurysm (AsAA). Previous approaches have focused on the relationship between intuitive geometric features (e.g., diameter and curvature) and wall stress. This work investigates the feasibility of a machine learning approach to establish the linkages between shape features and FEA-predicted AsAA rupture risk, and it may serve as a faster surrogate for FEA associated with long simulation time and numerical convergence issues. This method consists of four main steps: (1) constructing a statistical shape model (SSM) from clinical 3D CT images of AsAA patients; (2) generating a dataset of representative aneurysm shapes and obtaining FEA-predicted risk scores defined as systolic pressure divided by rupture pressure (rupture is determined by a threshold criterion); (3) establishing the relationship between shape features and risk by using classifiers and regressors; and (4) evaluating this relationship in cross-validation. The results show that SSM parameters can be used as strong shape features to make predictions of risk scores consistent with FEA, which lead to an average risk classification accuracy of 95.58% by using support vector machine and an average regression error of 0.0332 by using support vector regression, while intuitive geometric features have relatively weak performance. Compared to FEA, this machine learning approach is orders of magnitude faster. In our future studies, material properties and inhomogeneous thickness will be incorporated into the models and learning algorithms, which may lead to a practical system for clinical applications.

  16. Excess thermodynamic properties of chainlike mixtures. II. Self-associating systems: predictions from soft-SAFT and molecular simulation

    NASA Astrophysics Data System (ADS)

    Blas, Felipe J.

    The excess thermodynamic behaviour of self-associating binary mixtures of chainlike molecules is studied using modified statistical associating fluid theory, the so-called soft-SAFT equation of state. The chainlike molecules are described as Lennard-Jones spherical segments tangentially bonded together. The associating Lennard-Jones chains are modelled considering additional embedded off-centre square-well bonding sites. This model, which accounts explicitly for the most important microscopic features of real non-associating and associating chainlike molecules, such as repulsive and attractive forces between chemical groups, the connectivity of the segments to form the chains and the specific interactions (association), is also solved using the Monte Carlo molecular simulation technique. Comparisons between theoretical predictions and simulation results for selected mixtures are made in order to assess the adequacy of the theory in predicting excess properties. Agreement between simulation and soft-SAFT predictions indicates that the theory is able to provide a good description of the major excess properties. The theory is used also to study the effect of the molecular parameters on the excess properties of self-associating binary mixtures, with particular emphasis on the effect of association (including the bonding energy and number of associating sites) and chain length. The thermodynamic behaviour of these systems is governed by a delicate interplay between two important effects: the bond breaking of the structure formed by the associating molecules and the interstitial accommodation of the non-associating chains within the branched multimeric structure of the associating fluid. The theory is able to explain qualitatively the most salient features of the excess properties in real systems, including positive, negative and sigmoidal shape behaviour. After an in depth analysis of the effect of the association and chain length, an application of soft-SAFT that

  17. Prediction of protein structural classes for low-similarity sequences using reduced PSSM and position-based secondary structural features.

    PubMed

    Wang, Junru; Wang, Cong; Cao, Jiajia; Liu, Xiaoqing; Yao, Yuhua; Dai, Qi

    2015-01-10

    Many efficient methods have been proposed to advance protein structural class prediction, but there are still some challenges where additional insight or technology is needed for low-similarity sequences. In this work, we devised a new prediction method for low-similarity datasets using reduced PSSM and position-based secondary structural features. We evaluated the proposed method with four experiments and compared it with the available competing prediction methods. The results indicate that the proposed method achieved the best performance among the evaluated methods, with overall accuracy 3-5% higher than the existing best-performing method. This paper also found that the reduced alphabets with size 13 simplify PSSM structures efficiently while preserving their maximal information. This understanding can be used to design more powerful prediction methods for protein structural class.

  18. Features of Knowledge Building in Biology: Understanding Undergraduate Students' Ideas about Molecular Mechanisms

    ERIC Educational Resources Information Center

    Southard, Katelyn; Wince, Tyler; Meddleton, Shanice; Bolger, Molly S.

    2016-01-01

    Research has suggested that teaching and learning in molecular and cellular biology (MCB) is difficult. We used a new lens to understand undergraduate reasoning about molecular mechanisms: the knowledge-integration approach to conceptual change. Knowledge integration is the dynamic process by which learners acquire new ideas, develop connections…

  19. Machine learning for molecular scattering dynamics: Gaussian Process models for improved predictions of molecular collision observables

    NASA Astrophysics Data System (ADS)

    Krems, Roman; Cui, Jie; Li, Zhiying

    2016-05-01

    We show how statistical learning techniques based on kriging (Gaussian Process regression) can be used for improving the predictions of classical and/or quantum scattering theory. In particular, we show how Gaussian Process models can be used for: (i) efficient non-parametric fitting of multi-dimensional potential energy surfaces without the need to fit ab initio data with analytical functions; (ii) obtaining scattering observables as functions of individual PES parameters; (iii) using classical trajectories to interpolate quantum results; (iv) extrapolation of scattering observables from one molecule to another; (v) obtaining scattering observables with error bars reflecting the inherent inaccuracy of the underlying potential energy surfaces. We argue that the application of Gaussian Process models to quantum scattering calculations may potentially elevate the theoretical predictions to the same level of certainty as the experimental measurements and can be used to identify the role of individual atoms in determining the outcome of collisions of complex molecules. We will show examples and discuss the applications of Gaussian Process models to improving the predictions of scattering theory relevant for the cold molecules research field. Work supported by NSERC of Canada.
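
    Points (i) and (v) of this abstract, non-parametric fitting and predictions that carry error bars, can be sketched with a tiny one-dimensional Gaussian Process regression in pure Python. The squared-exponential kernel, the length scale, the noise term, and the toy data are all assumptions chosen for illustration. The abstract does not specify the authors' kernel or implementation.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def gp_predict(x_train, y_train, x_new, length_scale=1.0, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at x_new."""
    n = len(x_train)
    k_train = [[rbf(x_train[i], x_train[j], length_scale)
                + (noise if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    k_star = [rbf(x_new, xi, length_scale) for xi in x_train]
    weights = solve(k_train, y_train)
    mean = sum(k * w for k, w in zip(k_star, weights))
    reduction = sum(k * v for k, v in zip(k_star, solve(k_train, k_star)))
    variance = max(rbf(x_new, x_new, length_scale) - reduction, 0.0)
    return mean, variance


# Toy training data standing in for computed energies or observables.
x_obs = [0.0, 1.0, 2.0, 3.0]
y_obs = [0.0, 1.0, 4.0, 9.0]
mean_near, var_near = gp_predict(x_obs, y_obs, 2.0)
mean_far, var_far = gp_predict(x_obs, y_obs, 100.0)
```

    Near the training points the predicted variance is small, while far from them it returns to the prior variance, which is how a Gaussian Process supplies the error bars the abstract refers to.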

  20. Molecular modeling as a predictive tool for the development of solid dispersions.

    PubMed

    Maniruzzaman, Mohammed; Pang, Jiayun; Morgan, David J; Douroumis, Dennis

    2015-04-06

    In this study molecular modeling is introduced as a novel approach for the development of pharmaceutical solid dispersions. A computational model based on quantum mechanical (QM) calculations was used to predict the miscibility of various drugs in various polymers by predicting the binding strength between the drug and dimeric form of the polymer. The drug/polymer miscibility was also estimated by using traditional approaches such as Van Krevelen/Hoftyzer and Bagley solubility parameters or Flory-Huggins interaction parameter in comparison to the molecular modeling approach. The molecular modeling studies predicted successfully the drug-polymer binding energies and the preferable site of interaction between the functional groups. The drug-polymer miscibility and the physical state of bulk materials, physical mixtures, and solid dispersions were determined by thermal analysis (DSC/MTDSC) and X-ray diffraction. The produced solid dispersions were analyzed by X-ray photoelectron spectroscopy (XPS), which confirmed not only the exact type of the intermolecular interactions between the drug-polymer functional groups but also the binding strength by estimating the N coefficient values. The findings demonstrate that QM-based molecular modeling is a powerful tool to predict the strength and type of intermolecular interactions in a range of drug/polymeric systems for the development of solid dispersions.

  1. PREDICTION OF MOLECULAR PROPERTIES WITH MID-INFRARED SPECTRA AND INTERFEROGRAMS

    EPA Science Inventory

    We have built infrared spectroscopy-based partial least squares (PLS) models for molecular polarizabilities using a 97 member training set and a 59 member independent prediction set. These 156 compounds span a very wide range of chemical structure. Our goal was to use this well...

  2. Predicting Ki67% expression from DCE-MR images of breast tumors using textural kinetic features in tumor habitats

    NASA Astrophysics Data System (ADS)

    Chaudhury, Baishali; Zhou, Mu; Farhidzadeh, Hamidreza; Goldgof, Dmitry B.; Hall, Lawrence O.; Gatenby, Robert A.; Gillies, Robert J.; Weinfurtner, Robert J.; Drukteinis, Jennifer S.

    2016-03-01

    The use of Ki67% expression, a cell proliferation marker, as a predictive and prognostic factor has been widely studied in the literature. Yet its usefulness is limited due to inconsistent cut off scores for Ki67% expression, subjective differences in its assessment in various studies, and spatial variation in expression, which makes it difficult to reproduce as a reliable independent prognostic factor. Previous studies have shown that there are significant spatial variations in Ki67% expression, which may limit its clinical prognostic utility after core biopsy. These variations are most evident when examining the periphery of the tumor vs. the core. To date, prediction of Ki67% expression from quantitative image analysis of DCE-MRI is very limited. This work presents a novel computer aided diagnosis framework to use textural kinetics to (i) predict the ratio of periphery Ki67% expression to core Ki67% expression, and (ii) predict Ki67% expression from individual tumor habitats. The pilot cohort consists of T1 weighted fat saturated DCE-MR images from 17 patients. Support vector regression with a radial basis function was used for predicting the Ki67% expression and ratios. The initial results show that texture features from individual tumor habitats are more predictive of the Ki67% expression ratio and spatial Ki67% expression than features from the whole tumor. The Ki67% expression ratio could be predicted with a root mean square error (RMSE) of 1.67%. Quantitative image analysis of DCE-MRI using textural kinetic habitats has the potential to be used as a non-invasive method for predicting Ki67 percentage and ratio, thus more accurately reporting high Ki67 expression for patient prognosis.
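
    The root mean square error quoted in this abstract is a standard measure of regression accuracy. A minimal Python sketch, with a hypothetical function name and toy inputs that are not study data, is:

```python
import math


def rmse(observed, predicted):
    """Root mean square error between paired observations and predictions."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Toy example: errors of 3 and 4 units on two cases.
toy_rmse = rmse([0.0, 0.0], [3.0, 4.0])
```

    RMSE is expressed in the same units as the predicted quantity, so the reported 1.67% is in the units of the predicted Ki67% expression ratio.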

  3. Quantifying predictability in a model with statistical features of the atmosphere

    PubMed Central

    Kleeman, Richard; Majda, Andrew J.; Timofeyev, Ilya

    2002-01-01

    The Galerkin truncated inviscid Burgers equation has recently been shown by the authors to be a simple model with many degrees of freedom, with many statistical properties similar to those occurring in dynamical systems relevant to the atmosphere. These properties include long time-correlated, large-scale modes of low frequency variability and short time-correlated “weather modes” at smaller scales. The correlation scaling in the model extends over several decades and may be explained by a simple theory. Here a thorough analysis of the nature of predictability in the idealized system is developed by using a theoretical framework developed by R.K. This analysis is based on a relative entropy functional that has been shown elsewhere by one of the authors to measure the utility of statistical predictions precisely. The analysis is facilitated by the fact that most relevant probability distributions are approximately Gaussian if the initial conditions are assumed to be so. Rather surprisingly this holds for both the equilibrium (climatological) and nonequilibrium (prediction) distributions. We find that in most cases the absolute difference in the first moments of these two distributions (the “signal” component) is the main determinant of predictive utility variations. Contrary to conventional belief in the ensemble prediction area, the dispersion of prediction ensembles is generally of secondary importance in accounting for variations in utility associated with different initial conditions. This conclusion has potentially important implications for practical weather prediction, where traditionally most attention has focused on dispersion and its variability. PMID:12429863
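
    The split of predictive utility into a "signal" part and a dispersion part can be made concrete for the Gaussian case the abstract relies on. The Python sketch below uses the standard closed form for the relative entropy between two one-dimensional Gaussians and separates it into the two terms. The restriction to one dimension, the function name, and the example numbers are assumptions made for illustration.

```python
import math


def gaussian_relative_entropy(mean_pred, var_pred, mean_clim, var_clim):
    """Relative entropy of a Gaussian prediction distribution with respect
    to a Gaussian climatological distribution, returned as
    (signal, dispersion). Their sum is the total relative entropy."""
    # Signal: shift of the prediction mean away from the climatological
    # mean, measured in units of climatological variance.
    signal = 0.5 * (mean_pred - mean_clim) ** 2 / var_clim
    # Dispersion: contribution from the change in spread alone.
    dispersion = 0.5 * (math.log(var_clim / var_pred)
                        + var_pred / var_clim - 1.0)
    return signal, dispersion


# Prediction with a shifted mean but climatological spread.
shifted = gaussian_relative_entropy(1.0, 1.0, 0.0, 1.0)
# Prediction with the climatological mean but half the variance.
narrowed = gaussian_relative_entropy(0.0, 0.5, 0.0, 1.0)
```

    In this toy comparison the mean shift contributes 0.5 while halving the variance contributes only about 0.1, the same qualitative ordering the authors report for their model.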

  4. Morphological and molecular features of the mammalian olfactory sensory neuron axons: What makes these axons so special?

    PubMed

    Nedelec, Stéphane; Dubacq, Caroline; Trembleau, Alain

    2005-03-01

    The main organization and gross morphology of the mammalian olfactory primary pathway, from the olfactory epithelium to the olfactory bulb, has been initially characterized using classical anatomical and ultrastructural approaches. During the last fifteen years, essentially thanks to the cloning of the odorant receptor genes, and to the characterization of a number of molecules expressed by the olfactory sensory neuron axons and their environment, significant new insights have been gained into the understanding of the development and adult functioning of this system. In the course of these genetic, biochemical and neuroanatomical studies, however, several molecular and structural features were uncovered that appear somehow to be unique to these axons. For example, these axons express odorant receptors in their terminal segment, and transport several mRNA species and at least two transcription factors. In the present paper, we review these unusual structural and molecular features and speculate about their possible functions in the development and maintenance of the olfactory system.

  5. Comparison of Algorithms for Prediction of Protein Structural Features from Evolutionary Data.

    PubMed

    Bywater, Robert P

    2016-01-01

    Proteins have many functions and predicting these is still one of the major challenges in theoretical biophysics and bioinformatics. Foremost amongst these functions is the need to fold correctly thereby allowing the other genetically dictated tasks that the protein has to carry out to proceed efficiently. In this work, some earlier algorithms for predicting protein domain folds are revisited and they are compared with more recently developed methods. In dealing with intractable problems such as fold prediction, when different algorithms show convergence onto the same result there is every reason to take all algorithms into account such that a consensus result can be arrived at. In this work it is shown that the application of different algorithms in protein structure prediction leads to results that do not converge as such but rather they collude in a striking and useful way that has never been considered before.

  6. The molecular structural features controlling stickiness in cooked rice, a major palatability determinant.

    PubMed

    Li, Hongyan; Fitzgerald, Melissa A; Prakash, Sangeeta; Nicholson, Timothy M; Gilbert, Robert G

    2017-03-06

    The stickiness of cooked rice is important for eating quality and consumer acceptance. The first molecular understanding of stickiness is obtained from leaching and molecular structural characteristics during cooking. Starch is a highly branched glucose polymer. We find (i) the molecular size of leached amylopectin is 30 times smaller than that of native amylopectin while (ii) that of leached amylose is 5 times smaller than that of native amylose, (iii) the chain-length distribution (CLD: the number of monomer units in a chain on the branched polymer) of leached amylopectin is similar to native amylopectin while (iv) the CLD of leached amylose is much narrower than that of the native amylose, and (v) mainly amylopectin, not amylose, leaches out of the granule and rice kernel during cooking. Stickiness is found to increase with decreasing amylose content in the whole grain, and, in the leachate, with increasing total amount of amylopectin, the proportion of short amylopectin chains, and amylopectin molecular size. Molecular adhesion mechanisms are put forward to explain this result. This molecular structural mechanism provides a new tool for rice breeders to select cultivars with desirable palatability by quantifying the components and molecular structure of leached starch.

  7. The molecular structural features controlling stickiness in cooked rice, a major palatability determinant

    PubMed Central

    Li, Hongyan; Fitzgerald, Melissa A.; Prakash, Sangeeta; Nicholson, Timothy M.; Gilbert, Robert G.

    2017-01-01

    The stickiness of cooked rice is important for eating quality and consumer acceptance. The first molecular understanding of stickiness is obtained from leaching and molecular structural characteristics during cooking. Starch is a highly branched glucose polymer. We find (i) the molecular size of leached amylopectin is 30 times smaller than that of native amylopectin while (ii) that of leached amylose is 5 times smaller than that of native amylose, (iii) the chain-length distribution (CLD: the number of monomer units in a chain on the branched polymer) of leached amylopectin is similar to native amylopectin while (iv) the CLD of leached amylose is much narrower than that of the native amylose, and (v) mainly amylopectin, not amylose, leaches out of the granule and rice kernel during cooking. Stickiness is found to increase with decreasing amylose content in the whole grain, and, in the leachate, with increasing total amount of amylopectin, the proportion of short amylopectin chains, and amylopectin molecular size. Molecular adhesion mechanisms are put forward to explain this result. This molecular structural mechanism provides a new tool for rice breeders to select cultivars with desirable palatability by quantifying the components and molecular structure of leached starch. PMID:28262830

  8. The molecular structural features controlling stickiness in cooked rice, a major palatability determinant

    NASA Astrophysics Data System (ADS)

    Li, Hongyan; Fitzgerald, Melissa A.; Prakash, Sangeeta; Nicholson, Timothy M.; Gilbert, Robert G.

    2017-03-01

    The stickiness of cooked rice is important for eating quality and consumer acceptance. The first molecular understanding of stickiness is obtained from leaching and molecular structural characteristics during cooking. Starch is a highly branched glucose polymer. We find (i) the molecular size of leached amylopectin is 30 times smaller than that of native amylopectin while (ii) that of leached amylose is 5 times smaller than that of native amylose, (iii) the chain-length distribution (CLD: the number of monomer units in a chain on the branched polymer) of leached amylopectin is similar to native amylopectin while (iv) the CLD of leached amylose is much narrower than that of the native amylose, and (v) mainly amylopectin, not amylose, leaches out of the granule and rice kernel during cooking. Stickiness is found to increase with decreasing amylose content in the whole grain, and, in the leachate, with increasing total amount of amylopectin, the proportion of short amylopectin chains, and amylopectin molecular size. Molecular adhesion mechanisms are put forward to explain this result. This molecular structural mechanism provides a new tool for rice breeders to select cultivars with desirable palatability by quantifying the components and molecular structure of leached starch.

  9. Novel features of early burst suppression predict outcome after birth asphyxia

    PubMed Central

    Iyer, Kartik K; Roberts, James A; Metsäranta, Marjo; Finnigan, Simon; Breakspear, Michael; Vanhatalo, Sampsa

    2014-01-01

    Burst suppression patterns in the electroencephalogram are a reliable marker of recent severe brain insult. Here we analyze statistical properties of bursts occurring in 20 electroencephalographic recordings acquired from hypothermic asphyxic newborns in the hours immediately following birth. We show that the distributions of burst area and duration in these acute data predict later clinical outcome in both structural neuroimaging and neurodevelopment. Our findings indicate the first early electroencephalographic metrics that offer outcome prediction in asphyxic neonates undergoing hypothermia treatment. PMID:25356399

  10. TargetCrys: protein crystallization prediction by fusing multi-view features with two-layered SVM.

    PubMed

    Hu, Jun; Han, Ke; Li, Yang; Yang, Jing-Yu; Shen, Hong-Bin; Yu, Dong-Jun

    2016-11-01

    The accurate prediction of whether a protein will crystallize plays a crucial role in improving the success rate of protein crystallization projects. A common critical problem in the development of machine-learning-based protein crystallization predictors is how to effectively utilize protein features extracted from different views. In this study, we aimed to improve the efficiency of fusing multi-view protein features by proposing a new two-layered SVM (2L-SVM) which switches the feature-level fusion problem to a decision-level fusion problem: the SVMs in the 1st layer of the 2L-SVM are trained on each of the multi-view feature sets; then, the outputs of the 1st layer SVMs, which are the "intermediate" decisions made based on the respective feature sets, are further ensembled by a 2nd layer SVM. Based on the proposed 2L-SVM, we implemented a sequence-based protein crystallization predictor called TargetCrys. Experimental results on several benchmark datasets demonstrated the efficacy of the proposed 2L-SVM for fusing multi-view features. We also compared TargetCrys with existing sequence-based protein crystallization predictors and demonstrated that the proposed TargetCrys outperformed most of the existing predictors and is competitive with the state-of-the-art predictors. The TargetCrys webserver and datasets used in this study are freely available for academic use at: http://csbio.njust.edu.cn/bioinf/TargetCrys .

  11. Prediction of clathrate structure type and guest position by molecular mechanics.

    PubMed

    Fleischer, Everly B; Janda, Kenneth C

    2013-05-16

    The clathrate hydrates occur in various types in which the number, size, and shape of the various cages differ. Usually the clathrate type of a specific guest is predicted by the size and shape of the molecular guest. We have developed a methodology to determine the clathrate type employing molecular mechanics with the MMFF force field employing a strategy to calculate the energy of formation of the clathrate from the sum of the guest/cage energies. The clathrate type with the most negative (most stable) energy of formation would be the type predicted (we mainly focused on type I, type II, or bromine type). This strategy allows for a calculation to predict the clathrate type for any cage guest in a few minutes on a laptop computer. It proved successful in predicting the clathrate structure for 46 out of 47 guest molecules. The molecular mechanics calculations also provide a prediction of the guest position within the cage and clathrate structure. These predictions are generally consistent with the X-ray and neutron diffraction studies. By supplementing the diffraction study with molecular mechanics, we gain a more detailed insight regarding the details of the structure. We have also compared MM calculations to studies of the multiple occupancy of the cages. Finally, we present a density functional calculation that demonstrates that the inside of the clathrate cages has a relatively uniform and low electrostatic potential in comparison with the outside oxygen and hydrogen atoms. This implies that van der Waals forces will usually be dominant in the guest-cage interactions.
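
    The selection rule described in this abstract, summing the guest/cage energies for each candidate structure and choosing the most negative total, can be sketched in a few lines of Python. The function name and the energy values are hypothetical placeholders. The abstract does not give the authors' energies or say how individual cages are weighted, so this shows only the final comparison step.

```python
def predict_clathrate_type(guest_cage_energies):
    """Pick the structure type whose summed guest/cage energies are
    lowest (most negative, i.e. most stable).

    guest_cage_energies -- dict mapping a structure type to the list of
    guest/cage interaction energies assigned to that structure.
    Returns (best_type, totals_by_type).
    """
    totals = {name: sum(energies)
              for name, energies in guest_cage_energies.items()}
    best = min(totals, key=totals.get)
    return best, totals


# Placeholder energies in arbitrary units, not results from the paper.
candidates = {
    "type I": [-5.0, -6.5],
    "type II": [-4.0, -9.0],
    "bromine type": [-3.5, -7.0],
}
best_type, totals = predict_clathrate_type(candidates)
```

    With these placeholder numbers the totals are -11.5, -13.0 and -10.5, so the sketch selects type II.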

  12. Applying quantitative adiposity feature analysis models to predict benefit of bevacizumab-based chemotherapy in ovarian cancer patients

    NASA Astrophysics Data System (ADS)

    Wang, Yunzhi; Qiu, Yuchen; Thai, Theresa; More, Kathleen; Ding, Kai; Liu, Hong; Zheng, Bin

    2016-03-01

    How to rationally identify epithelial ovarian cancer (EOC) patients who will benefit from bevacizumab or other antiangiogenic therapies is a critical issue in EOC treatments. The motivation of this study is to quantitatively measure adiposity features from CT images and investigate the feasibility of predicting potential benefit of EOC patients with or without receiving bevacizumab-based chemotherapy treatment using multivariate statistical models built based on quantitative adiposity image features. A dataset involving CT images from 59 advanced EOC patients was included. Among them, 32 patients received maintenance bevacizumab after primary chemotherapy and the remaining 27 patients did not. We developed a computer-aided detection (CAD) scheme to automatically segment subcutaneous fat areas (SFA) and visceral fat areas (VFA) and then extracted 7 adiposity-related quantitative features. Three multivariate data analysis models (linear regression, logistic regression and Cox proportional hazards regression) were each applied to investigate the potential association between the model-generated prediction results and the patients' progression-free survival (PFS) and overall survival (OS). The results show that using all 3 statistical models, a statistically significant association was detected between the model-generated results and both of the two clinical outcomes in the group of patients receiving maintenance bevacizumab (p<0.01), while there was no significant association for either PFS or OS in the group of patients not receiving maintenance bevacizumab. Therefore, this study demonstrated the feasibility of using statistical prediction models based on quantitative adiposity-related CT image features to generate a new clinical marker and predict the clinical outcome of EOC patients receiving maintenance bevacizumab-based chemotherapy.

  13. Automatic Recognition of Solar Features for Developing Data Driven Prediction Models of Solar Activity and Space Weather

    DTIC Science & Technology

    2012-07-06

    Ephemeral Brightening,” 2nd ATST-East Workshop in Solar Physics: Magnetic Fields from the Photosphere to the Corona, Washington D.C., Mar 2012. [6... AFRL-RV-PS-TR-2012-0133. Automatic Recognition of Solar Features for Developing Data Driven Prediction Models of... Solar Activity and Space Weather. Jason Jackiewicz, New Mexico State University, Department of Astronomy, PO Box 30001, MSC 4500, Las

  14. Induction of CaSR expression circumvents the molecular features of malignant CaSR null colon cancer cells.

    PubMed

    Singh, Navneet; Chakrabarty, Subhas

    2013-11-15

    We recently reported on the isolation and characterization of calcium sensing receptor (CaSR) null human colon cancer cells (Singh et al., Int J Cancer 2013; 132: 1996-2005). CaSR null cells possess a myriad of molecular features that are linked to a highly malignant and drug resistant phenotype of colon cancer. The CaSR null phenotype can be maintained in defined human embryonic stem cell culture medium. We now show that the CaSR null cells can be induced to differentiate in conventional culture medium and regain the expression of CaSR, with a concurrent reversal of the cellular and molecular features associated with the null phenotype. These features include cellular morphology, expression of colon cancer stem cell markers, expression of survivin and thymidylate synthase and sensitivity to fluorouracil. Other features include the expression of epithelial mesenchymal transition linked molecules and transcription factors, oncogenic miRNAs and tumor suppressive molecule and miRNA. With the exception of cancer stem cell markers, the reversal of molecular features, upon the induction of CaSR expression, is directly linked to the expression and function of CaSR because blocking CaSR induction by shRNA circumvented such reversal. We further report that methylation and demethylation of the CaSR gene promoter underlie CaSR expression. Due to the malignant nature of the CaSR null cells, inclusion of the CaSR null phenotype in disease management may improve on the mortality of this disease. Because CaSR is a robust promoter of differentiation and mediates its action through diverse mechanisms and pathways, inactivation of CaSR may serve as a new paradigm in colon carcinogenesis.

  15. A Bayesian network approach to predicting nest presence of the federally-threatened piping plover (Charadrius melodus) using barrier island features

    USGS Publications Warehouse

    Gieder, Katherina D.; Karpanty, Sarah M.; Fraser, James D.; Catlin, Daniel H.; Gutierrez, Benjamin T.; Plant, Nathaniel G.; Turecek, Aaron M.; Thieler, E. Robert

    2014-01-01

    Sea-level rise and human development pose significant threats to shorebirds, particularly for species that utilize barrier island habitat. The piping plover (Charadrius melodus) is a federally-listed shorebird that nests on barrier islands and rapidly responds to changes in its physical environment, making it an excellent species with which to model how shorebird species may respond to habitat change related to sea-level rise and human development. The uncertainty and complexity in predicting sea-level rise, the responses of barrier island habitats to sea-level rise, and the responses of species to sea-level rise and human development necessitate a modelling approach that can link species to the physical habitat features that will be altered by changes in sea level and human development. We used a Bayesian network framework to develop a model that links piping plover nest presence to the physical features of their nesting habitat on a barrier island that is impacted by sea-level rise and human development, using three years of data (1999, 2002, and 2008) from Assateague Island National Seashore in Maryland. Our model performance results showed that we were able to successfully predict nest presence given a wide range of physical conditions within the model’s dataset. We found that model predictions were more successful when the range of physical conditions included in model development was varied rather than when those physical conditions were narrow. We also found that all model predictions had fewer false negatives (nests predicted to be absent when they were actually present in the dataset) than false positives (nests predicted to be present when they were actually absent in the dataset), indicating that our model correctly predicted nest presence better than nest absence. These results indicated that our approach of using a Bayesian network to link specific physical features to nest presence will be useful for modelling impacts of sea-level rise- or human

  16. Predicting error in detecting mammographic masses among radiology trainees using statistical models based on BI-RADS features

    SciTech Connect

    Grimm, Lars J.; Ghate, Sujata V.; Yoon, Sora C.; Kim, Connie; Kuzmiak, Cherie M.; Mazurowski, Maciej A.

    2014-03-15

    Purpose: The purpose of this study is to explore Breast Imaging-Reporting and Data System (BI-RADS) features as predictors of individual errors made by trainees when detecting masses in mammograms. Methods: Ten radiology trainees and three expert breast imagers reviewed 100 mammograms, each comprising bilateral mediolateral oblique and craniocaudal views, on a research workstation. The cases consisted of normal and biopsy proven benign and malignant masses. For cases with actionable abnormalities, the experts recorded breast (density and axillary lymph nodes) and mass (shape, margin, and density) features according to the BI-RADS lexicon, as well as the abnormality location (depth and clock face). For each trainee, a user-specific multivariate model was constructed to predict the trainee's likelihood of error based on BI-RADS features. The performance of the models was assessed using area under the receiver operating characteristic curve (AUC). Results: Despite the variability in errors between different trainees, the individual models were able to predict the likelihood of error for the trainees with a mean AUC of 0.611 (range: 0.502–0.739, 95% Confidence Interval: 0.543–0.680, p < 0.002). Conclusions: Patterns in detection errors for mammographic masses made by radiology trainees can be modeled using BI-RADS features. These findings may have potential implications for the development of future educational materials that are personalized to individual trainees.
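
    The AUC used to assess these models equals the probability that a randomly chosen error case receives a higher predicted likelihood of error than a randomly chosen non-error case, with ties counted as one half. A minimal sketch of that pairwise definition, using invented scores rather than study data and a hypothetical function name:

```python
def auc(scores_pos, scores_neg):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Invented model outputs for one trainee: predicted likelihood of error.
missed_cases = [0.9, 0.6, 0.4]          # cases the trainee got wrong
detected_cases = [0.5, 0.3, 0.2, 0.4]   # cases the trainee got right

print(auc(missed_cases, detected_cases))
```

    An AUC of 0.5 corresponds to chance-level ranking, which puts the reported mean of 0.611 in context.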

  17. Downstream Antisense Transcription Predicts Genomic Features That Define the Specific Chromatin Environment at Mammalian Promoters

    PubMed Central

    Lavender, Christopher A.; Hoffman, Jackson A.; Trotter, Kevin W.; Gilchrist, Daniel A.; Bennett, Brian D.; Burkholder, Adam B.; Fargo, David C.; Archer, Trevor K.

    2016-01-01

    Antisense transcription is a prevalent feature at mammalian promoters. Previous studies have primarily focused on antisense transcription initiating upstream of genes. Here, we characterize promoter-proximal antisense transcription downstream of gene transcription start sites in human breast cancer cells, investigating the genomic context of downstream antisense transcription. We find extensive correlations between antisense transcription and features associated with the chromatin environment at gene promoters. Antisense transcription downstream of promoters is widespread, with antisense transcription initiation observed within 2 kb of 28% of gene transcription start sites. Antisense transcription initiates between nucleosomes regularly positioned downstream of these promoters. The nucleosomes between gene and downstream antisense transcription start sites carry histone modifications associated with active promoters, such as H3K4me3 and H3K27ac. This region is bound by chromatin remodeling and histone modifying complexes including SWI/SNF subunits and HDACs, suggesting that antisense transcription or resulting RNA transcripts contribute to the creation and maintenance of a promoter-associated chromatin environment. Downstream antisense transcription overlays additional regulatory features, such as transcription factor binding, DNA accessibility, and the downstream edge of promoter-associated CpG islands. These features suggest an important role for antisense transcription in the regulation of gene expression and the maintenance of a promoter-associated chromatin environment. PMID:27487356

  18. Molecular characterization of an 11q interstitial deletion in a patient with the clinical features of Jacobsen syndrome.

    PubMed

    Wenger, Sharon L; Grossfeld, Paul D; Siu, Benjamin L; Coad, James E; Keller, Frank G; Hummel, Marybeth

    2006-04-01

    The 11q terminal deletion disorder or Jacobsen syndrome is a contiguous gene disorder. It is characterized by psychomotor retardation, cardiac defects, blood dyscrasias (Paris-Trousseau syndrome) and craniofacial anomalies. We report on a female patient with an approximately 10 Mb interstitial deletion with many of the features of Jacobsen syndrome: a congenital heart defect, dysmorphic features, developmental delay, and Paris-Trousseau syndrome. The karyotype of the patient is 46,XX,del(11)(q24.1q24.3). The interstitial deletion was confirmed using FISH probes for distal 11q, and the breakpoints were characterized by microarray analysis. This is the first molecularly characterized interstitial deletion in a patient with the clinical features of Jacobsen syndrome. The deletion includes FLI-1, but not JAM-3, which will help to determine the critical genes involved in this syndrome.

  19. Molecular Biomarkers for Prediction of Targeted Therapy Response in Metastatic Breast Cancer: Trick or Treat?

    PubMed Central

    Toss, Angela; Venturelli, Marta; Peterle, Chiara; Piacentini, Federico; Cascinu, Stefano; Cortesi, Laura

    2017-01-01

    In recent years, the study of genomic alterations and protein expression involved in the pathways of breast cancer carcinogenesis has provided an increasing number of targets for drug development in the setting of metastatic breast cancer (e.g., trastuzumab, everolimus, palbociclib), significantly improving the prognosis of this disease. These drugs target specific molecular abnormalities that confer a survival advantage to cancer cells. On this basis, emerging evidence from clinical trials has provided increasing proof that the genetic landscape of any tumor may dictate its sensitivity or resistance profile to specific agents, and some studies have already shown that tumors treated with therapies matched to their molecular alterations obtain higher objective response rates and longer survival. Predictive molecular biomarkers may optimize the selection of effective therapies, thus reducing treatment costs and side effects. This review offers an overview of the main molecular pathways involved in breast carcinogenesis, the targeted therapies developed to inhibit these pathways, the principal mechanisms of resistance and, finally, the molecular biomarkers that, to date, have been demonstrated in clinical trials to predict response or resistance to targeted treatments in metastatic breast cancer. PMID:28054957

  20. Quality Assessment of Predicted Protein Models Using Energies Calculated by the Fragment Molecular Orbital Method.

    PubMed

    Simoncini, David; Nakata, Hiroya; Ogata, Koji; Nakamura, Shinichiro; Zhang, Kam Yj

    2015-02-01

    Protein structure prediction directly from sequences is a very challenging problem in computational biology. One of the most successful approaches employs stochastic conformational sampling to search an empirically derived energy function landscape for the global energy minimum state. Due to the errors in the empirically derived energy function, the lowest energy conformation may not be the best model. We have evaluated the use of energy calculated by the fragment molecular orbital method (FMO energy) to assess the quality of predicted models and its ability to identify the best model among an ensemble of predicted models. The fragment molecular orbital method implemented in GAMESS was used to calculate the FMO energy of predicted models. When tested on eight protein targets, we found that the model ranking based on FMO energies is better than that based on empirically derived energies when there is sufficient diversity among these models. This model diversity can be estimated prior to the FMO energy calculations. Our result demonstrates that the FMO energy calculated by the fragment molecular orbital method is a practical and promising measure for the assessment of protein model quality and the selection of the best protein model among many generated.

  1. Machine learning identification of EEG features predicting working memory performance in schizophrenia and healthy adults

    PubMed Central

    Johannesen, Jason K.; Bi, Jinbo; Jiang, Ruhua; Kenney, Joshua G.; Chen, Chi-Ming A.

    2016-01-01

    Background: With millisecond-level resolution, electroencephalographic (EEG) recording provides a sensitive tool to assay neural dynamics of human cognition. However, selection of EEG features used to answer experimental questions is typically determined a priori. The utility of machine learning was investigated as a computational framework for extracting the most relevant features from EEG data empirically. Methods: Schizophrenia (SZ; n = 40) and healthy community (HC; n = 12) subjects completed a Sternberg Working Memory Task (SWMT) during EEG recording. EEG was analyzed to extract 5 frequency components (theta1, theta2, alpha, beta, gamma) at 4 processing stages (baseline, encoding, retention, retrieval) and 3 scalp sites (frontal-Fz, central-Cz, occipital-Oz) separately for correctly and incorrectly answered trials. The 1-norm support vector machine (SVM) method was used to build EEG classifiers of SWMT trial accuracy (correct vs. incorrect; Model 1) and diagnosis (HC vs. SZ; Model 2). External validity of SVM models was examined in relation to neuropsychological test performance and diagnostic classification using conventional regression-based analyses. Results: SWMT performance was significantly reduced in SZ (p < .001). Model 1 correctly classified trial accuracy at 84% in HC, and at 74% when cross-validated in SZ data. Frontal gamma at encoding and central theta at retention provided highest weightings, accounting for 76% of variance in SWMT scores and 42% variance in neuropsychological test performance across samples. Model 2 identified frontal theta at baseline and frontal alpha during retrieval as primary classifiers of diagnosis, providing 87% classification accuracy as a discriminant function. Conclusions: EEG features derived by SVM are consistent with literature reports of gamma’s role in memory encoding, engagement of theta during memory retention, and elevated resting low-frequency activity in schizophrenia. Tests of model performance and cross

  2. Borderline Personality Features in Students: the Predicting Role of Schema, Emotion Regulation, Dissociative Experience and Suicidal Ideation

    PubMed Central

    Sajadi, Seyede Fateme; Arshadi, Nasrin; Zargar, Yadolla; Mehrabizade Honarmand, Mahnaz; Hajjari, Zahra

    2015-01-01

    Background: Numerous studies have suggested that early maladaptive schemas and emotional dysregulation form the defining core of borderline personality disorder. Many studies have also found a strong association between the diagnosis of borderline personality and the occurrence of suicide ideation and dissociative symptoms. Objectives: The present study was designed to investigate the relationship between borderline personality features and schema, emotion regulation, dissociative experiences and suicidal ideation among high school students in Shiraz City, Iran. Patients and Methods: In this descriptive correlational study, 300 students (150 boys and 150 girls) were selected from the high schools in Shiraz, Iran, using multi-stage random sampling. Data were collected using the Borderline Personality Features Scale for Children, the Young Schema Questionnaire-Short Form, the Difficulties in Emotion Regulation Scale (DERS), the Dissociative Experiences Scale and the Beck Scale for Suicide Ideation. Data were analyzed using the Pearson correlation coefficient and multivariate regression analysis. Results: The results showed a significant positive correlation between schema, emotion regulation, dissociative experiences and suicide ideation with borderline personality features. Moreover, the results of multivariate regression analysis suggested that among the studied variables, schema was the strongest predictor of borderline features (P < 0.001). Conclusions: The findings of this study are in accordance with findings from previous studies, and generally show a meaningful association between schema, emotion regulation, dissociative experiences, and suicide ideation with borderline personality features. PMID:26401490

  3. Measuring the successes and deficiencies of constant pH molecular dynamics: A blind prediction study

    PubMed Central

    Williams, Sarah L; Blachly, Patrick G; McCammon, J Andrew

    2011-01-01

    A constant pH molecular dynamics method has been used in the blind prediction of pKa values of titratable residues in wild type and mutated structures of the Staphylococcal nuclease (SNase) protein. The predicted values have been subsequently compared to experimental values provided by the laboratory of García-Moreno. CpHMD performs well in predicting the pKa of solvent-exposed residues. For residues in the protein interior, the CpHMD method encounters some difficulties in reaching convergence and predicting the pKa values for residues having strong interactions with neighboring residues. These results show the need to accurately and sufficiently sample conformational space in order to obtain pKa values consistent with experimental results. PMID:22072520

  4. Isolation of a Latimeria menadoensis heat shock protein 70 (Lmhsp70) that has all the features of an inducible gene and encodes a functional molecular chaperone.

    PubMed

    Modisakeng, Keoagile W; Jiwaji, Meesbah; Pesce, Eva-Rachele; Robert, Jacques; Amemiya, Chris T; Dorrington, Rosemary A; Blatch, Gregory L

    2009-08-01

    Molecular chaperones facilitate the correct folding of other proteins, and heat shock proteins form one of the major classes of molecular chaperones. Heat shock protein 70 (Hsp70) has been extensively studied, and shown to be critically important for cellular protein homeostasis in almost all prokaryotic and eukaryotic systems studied to date. Since there have been very limited studies conducted on coelacanth chaperones, the main objective of this study was to genetically and biochemically characterize a coelacanth Hsp70. We have successfully isolated an Indonesian coelacanth (L. menadoensis) hsp70 gene, Lmhsp70, and found that it contained an intronless coding region and a potential upstream regulatory region. Lmhsp70 encoded a typical Hsp70 based on conserved structural and functional features, and the predicted upstream regulatory region was found to contain six potential promoter elements, and three potential heat shock elements (HSEs). The intronless nature of the coding region and the presence of HSEs suggested that Lmhsp70 was stress-inducible. Phylogenetic analyses provided further evidence that Lmhsp70 was probably inducible, and that it branched as a clade intermediate between bony fish and tetrapods. Recombinant LmHsp70 was successfully overproduced, purified and found to be functional using ATPase activity assays. Taken together, these data provide evidence for the first time that the coelacanth encodes a functional molecular chaperone system.

  5. Predicting molecular formulas of fragment ions with isotope patterns in tandem mass spectra.

    PubMed

    Zhang, Jingfen; Gao, Wen; Cai, Jinjin; He, Simin; Zeng, Rong; Chen, Runsheng

    2005-01-01

    A number of different approaches have been proposed to predict elemental component formulas (or molecular formulas) of molecular ions in low and medium resolution mass spectra. Most of them rely on isotope patterns, enumerate all possible formulas for an ion, and exclude certain formulas violating chemical constraints. However, these methods do not generalize well to the component prediction of fragment ions in tandem mass spectra. In this paper, a new method, FFP (Fragment ion Formula Prediction), is presented to predict elemental component formulas of fragment ions. In the FFP method, the prediction of the best formulas is converted into the minimization of the distance between theoretical and observed isotope patterns. A novel local search model is then proposed to generate a set of candidate formulas efficiently. After the search, FFP applies a new multiconstraint filtering to exclude as many invalid and improbable formulas as possible. FFP is experimentally compared with the previous enumeration methods, and shown to outperform them significantly. The results of this paper can help to improve the reliability of de novo identification of peptide sequences.
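
    The core idea, choosing the formula whose theoretical isotope pattern lies closest to the observed one, can be illustrated with a short sketch. It is not the FFP algorithm: the local search and multiconstraint filtering are omitted, the candidate list is fixed by hand, Euclidean distance is an assumed metric, and the "observed" pattern is simulated from one candidate. Isotope abundances are rounded natural abundances; all names are hypothetical.

```python
import math

# Rounded natural isotope abundances by nominal mass offset (+0, +1, +2).
ISOTOPES = {
    "C": [0.9893, 0.0107],
    "H": [0.999885, 0.000115],
    "N": [0.99636, 0.00364],
    "O": [0.99757, 0.00038, 0.00205],
}


def isotope_pattern(formula, n_peaks=4):
    """Normalized intensities of the M, M+1, ... peaks for e.g. {"C": 5, "H": 8, "O": 2}."""
    pattern = [1.0]
    for element, count in formula.items():
        for _ in range(count):
            new = [0.0] * (len(pattern) + len(ISOTOPES[element]) - 1)
            for i, p in enumerate(pattern):
                for j, a in enumerate(ISOTOPES[element]):
                    new[i + j] += p * a
            pattern = new[:n_peaks]  # higher peaks never feed back into lower ones
    pattern += [0.0] * (n_peaks - len(pattern))
    total = sum(pattern)
    return [p / total for p in pattern]


def distance(p, q):
    """Euclidean distance between two isotope patterns (an assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def best_formula(observed, candidates):
    """Candidate whose theoretical pattern is closest to the observed pattern."""
    return min(candidates,
               key=lambda f: distance(isotope_pattern(f, len(observed)), observed))


# Four candidates sharing nominal mass 100.
candidates = [
    {"C": 4, "H": 8, "N": 2, "O": 1},
    {"C": 5, "H": 8, "O": 2},
    {"C": 6, "H": 12, "O": 1},
    {"C": 5, "H": 12, "N": 2},
]

# Simulated observation: the second candidate's pattern rounded to 4 decimals.
observed = [round(v, 4) for v in isotope_pattern(candidates[1])]
best = best_formula(observed, candidates)
print(best)
```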

  6. Predicting the molecular shape of polysaccharides from dynamic interactions with water.

    PubMed

    Almond, Andrew; Sheehan, John K

    2003-04-01

    How simple monosaccharides, once polymerized, become the basis for structural materials remains a mystery. A framework is developed to investigate the role of water in the emergence of dynamic structure in polysaccharides, using the important beta(1-->4) linkage as an example. This linkage is studied within decasaccharide fragments of cellulose, chitin, mannan, xylan, and hyaluronan, using molecular simulations in the presence of explicit water solvent. Although cellulose, mannan, chitin, and xylan are chemically similar, their intramolecular hydrogen-bond dynamics and interaction with water are predicted to differ. Cellulose, mannan, and chitin favor relatively static intramolecular hydrogen bonds, xylan prefers dynamic water bridges, and multiple water configurations are predicted at the beta(1-->4) linkages of hyaluronan. With such a variety of predicted dynamics, the hypothesis that the beta(1-->4) linkage is stabilized by intramolecular hydrogen bonds was rejected. Instead, it is proposed that favored molecular configurations are consistent with maximum rotamer and water degrees of freedom, explaining observations made previously by X-ray diffraction. Furthermore, polysaccharides predicted to be conformationally restricted in simulations (cellulose, chitin, and mannan) prefer the solid state in reality, even as oligosaccharides. Those predicted to be more flexible (xylan and hyaluronan) are known to be soluble, even as high polymers. Therefore an intriguing correlation between chemical composition, water organization, polymer properties, and biological function is proposed.

  7. WeGET: predicting new genes for molecular systems by weighted co-expression

    PubMed Central

    Szklarczyk, Radek; Megchelenbrink, Wout; Cizek, Pavel; Ledent, Marie; Velemans, Gonny; Szklarczyk, Damian; Huynen, Martijn A.

    2016-01-01

    We have developed the Weighted Gene Expression Tool and database (WeGET, http://weget.cmbi.umcn.nl) for the prediction of new genes of a molecular system by correlated gene expression. WeGET utilizes a compendium of 465 human and 560 murine gene expression datasets that have been collected from multiple tissues under a wide range of experimental conditions. It exploits this abundance of expression data by assigning a high weight to datasets in which the known genes of a molecular system are harmoniously up- and down-regulated. WeGET ranks new candidate genes by calculating their weighted co-expression with that system. A weighted rank is calculated for human genes and their mouse orthologs. Then, an integrated gene rank and p-value are computed using a rank-order statistic. We applied our method to predict novel genes that have a high degree of co-expression with Gene Ontology terms and pathways from KEGG and Reactome. For each query set we provide a list of predicted novel genes, the computed weights of the transcription datasets used, and the cell and tissue types that contributed to the final predictions. The performance for each query set is assessed by 10-fold cross-validation. Finally, users can use WeGET to predict novel genes that co-express with a custom query set. PMID:26582928
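
    The weighting idea can be illustrated with a small sketch. The formulas below are assumptions made for illustration and are not taken from WeGET: a dataset's weight is the mean pairwise Pearson correlation among the known genes (floored at zero), and a candidate's score is the weighted average of its mean correlation with the known genes. The expression values and gene names are invented.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def dataset_weight(dataset, known):
    """Mean pairwise correlation among known genes, floored at zero (assumed rule)."""
    pairs = [(g, h) for i, g in enumerate(known) for h in known[i + 1:]]
    mean_r = sum(pearson(dataset[g], dataset[h]) for g, h in pairs) / len(pairs)
    return max(0.0, mean_r)


def weighted_coexpression(datasets, known, candidate):
    """Weighted average of the candidate's mean correlation with the known genes."""
    num = den = 0.0
    for d in datasets:
        w = dataset_weight(d, known)
        r = sum(pearson(d[candidate], d[k]) for k in known) / len(known)
        num += w * r
        den += w
    return num / den if den else 0.0


# Invented data: known genes A and B move together in ds1 but not in ds2.
ds1 = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8],
       "X": [1.1, 2.0, 3.2, 3.9], "Y": [4, 1, 3, 2]}
ds2 = {"A": [1, 2, 3, 4], "B": [4, 3, 2, 1],
       "X": [2, 1, 4, 3], "Y": [1, 2, 3, 4]}
datasets, known = [ds1, ds2], ["A", "B"]

scores = {c: weighted_coexpression(datasets, known, c) for c in ("X", "Y")}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```

    In this toy case ds2 receives zero weight because the known genes are anti-correlated there, so only ds1 drives the ranking.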

  8. Reactive oxygen species–associated molecular signature predicts survival in patients with sepsis

    PubMed Central

    Zhou, Tong; Wang, Ting; Slepian, Marvin J.; Garcia, Joe G. N.; Hecker, Louise

    2016-01-01

    Sepsis-related multiple organ dysfunction syndrome is a leading cause of death in intensive care units. There is overwhelming evidence that oxidative stress plays a significant role in the pathogenesis of sepsis-associated multiple organ failure; however, reactive oxygen species (ROS)–associated biomarkers and/or diagnostics that define mortality or predict survival in sepsis are lacking. Lung or peripheral blood gene expression analysis has gained increasing recognition as a potential prognostic and/or diagnostic tool. The objective of this study was to identify ROS-associated biomarkers predictive of survival in patients with sepsis. In-silico analyses of expression profiles allowed the identification of a 21-gene ROS-associated molecular signature that predicts survival in sepsis patients. Importantly, this signature performed well in a validation cohort consisting of sepsis patients aggregated from distinct patient populations recruited from different sites. Our signature outperforms randomly generated signatures containing the same number of genes. Our findings further validate the critical role of ROS in the pathogenesis of sepsis and provide a novel gene signature that predicts survival in sepsis patients. These results also highlight the utility of peripheral blood molecular signatures as biomarkers for predicting mortality risk in patients with sepsis, which could facilitate the development of personalized therapies. PMID:27252846

  9. Yeast prions and human prion-like proteins: sequence features and prediction methods.

    PubMed

    Cascarina, Sean M; Ross, Eric D

    2014-06-01

    Prions are self-propagating infectious protein isoforms. A growing number of prions have been identified in yeast, each resulting from the conversion of soluble proteins into an insoluble amyloid form. These yeast prions have served as a powerful model system for studying the causes and consequences of prion aggregation. Remarkably, a number of human proteins containing prion-like domains, defined as domains with compositional similarity to yeast prion domains, have recently been linked to various human degenerative diseases, including amyotrophic lateral sclerosis. This suggests that the lessons learned from yeast prions may help in understanding these human diseases. In this review, we examine what has been learned about the amino acid sequence basis for prion aggregation in yeast, and how this information has been used to develop methods to predict aggregation propensity. We then discuss how this information is being applied to understand human disease, and the challenges involved in applying yeast prediction methods to higher organisms.

  10. Establishing whether the structural feature controlling the mechanical properties of starch films is molecular or crystalline.

    PubMed

    Li, Ming; Xie, Fengwei; Hasjim, Jovin; Witt, Torsten; Halley, Peter J; Gilbert, Robert G

    2015-03-06

    The effects of molecular and crystalline structures on the tensile mechanical properties of thermoplastic starch (TPS) films from waxy, normal, and high-amylose maize were investigated. Starch structural variations were obtained through extrusion and hydrothermal treatment (HTT). The molecular and crystalline structures were characterized using size-exclusion chromatography and X-ray diffractometry, respectively. TPS from high-amylose maize showed higher elongation at break and tensile strength than those from normal maize and waxy maize starches when processed with 40% plasticizer. Within the same amylose content, the mechanical properties were not affected by amylopectin molecular size or the crystallinity of TPS prior to HTT. This lack of correlation between the molecular size, crystallinity and mechanical properties may be due to the dominant effect of the plasticizer on the mechanical properties. Further crystallization of normal maize TPS by HTT increased the tensile strength and Young's modulus, while decreasing the elongation at break. The results suggest that the crystallinity from the remaining ungelatinized starch granules has a less significant effect on the mechanical properties than that resulting from starch recrystallization, possibly due to a stronger network from leached-out amylose surrounding the remaining starch granules.

  11. Genetic features of Huntington disease in Cuban population: implications for phenotype, epidemiology and predictive testing.

    PubMed

    Vázquez-Mojena, Yaimeé; Laguna-Salvia, Leonides; Laffita-Mesa, José M; González-Zaldívar, Yanetza; Almaguer-Mederos, Luis E; Rodríguez-Labrada, Roberto; Almaguer-Gotay, Dennis; Zayas-Feria, Pedro; Velázquez-Pérez, Luis

    2013-12-15

    Huntington disease (HD) is the most frequent polyglutamine disorder, with variable worldwide prevalence. Although some Latin American populations have been studied, HD prevalence in the Cuban population remains unknown. In order to characterize the disease in Cuba, the relative frequency of HD was determined by studying 130 patients with chorea and 63 unrelated healthy controls, with emphasis on the molecular epidemiology of the disease. Sixty-two patients with chorea belonging to 16 unrelated families carried a pathological CAG expansion in the HTT gene, ranging from 39 to 67 repeats. Eighty-three percent of them came from the eastern region of the country. A significant inverse correlation between age at onset and expanded CAG repeats was seen. Intermediate alleles, a putative source of de novo mutation, accounted for 4.8% and 3.97% in affected individuals and controls, respectively. This study represents the largest molecular characterization of Huntington disease in the Cuban population. These results may have significant implications for an understanding of the disease, its diagnosis and prognosis in Cuban patients, giving health professionals the tools to implement confirmatory genetic testing, pre-symptomatic testing and clinical trials in this population.

  12. Text as data: using text-based features for proteins representation and for computational prediction of their characteristics.

    PubMed

    Shatkay, Hagit; Brady, Scott; Wong, Andrew

    2015-03-01

    The current era of large-scale biology is characterized by a fast-paced growth in the number of sequenced genomes and, consequently, by a multitude of identified proteins whose function has yet to be determined. Simultaneously, any known or postulated information concerning genes and proteins is part of the ever-growing published scientific literature, which is expanding at a rate of over a million new publications per year. Computational tools that attempt to automatically predict and annotate protein characteristics, such as function and localization patterns, are being developed along with systems that aim to support the process via text mining. Most work on protein characterization focuses on features derived directly from protein sequence data. Protein-related work that does aim to utilize the literature typically concentrates on extracting specific facts (e.g., protein interactions) from text. In the past few years we have taken a different route, treating the literature as a source of text-based features, which can be employed just as sequence-based protein-features were used in earlier work, for predicting protein subcellular location and possibly also function. We discuss here in detail the overall approach, along with results from work we have done in this area demonstrating the value of this method and its potential use.

  13. Stability, surface features, and atom leaching of palladium nanoparticles: toward prediction of catalytic functionality.

    PubMed

    Ramezani-Dakhel, Hadi; Mirau, Peter A; Naik, Rajesh R; Knecht, Marc R; Heinz, Hendrik

    2013-04-21

    Surfactant-stabilized metal nanoparticles have shown promise as catalysts although specific surface features and their influence on catalytic performance have not been well understood. We quantify the thermodynamic stability, the facet composition of the surface, and distinct atom types that affect rates of atom leaching for a series of twenty near-spherical Pd nanoparticles of 1.8 to 3.1 nm size using computational models. Cohesive energies indicate higher stability of certain particles that feature an approximate 60/20/20 ratio of {111}, {100}, and {110} facets while less stable particles exhibit widely variable facet composition. Unique patterns of atom types on the surface cause apparent differences in binding energies and changes in reactivity. Estimates of the relative rate of atom leaching as a function of particle size were obtained by the summation of Boltzmann-weighted binding energies over all surface atoms. Computed leaching rates are in good qualitative correlation with the measured catalytic activity of peptide-stabilized Pd nanoparticles of the same shape and size in Stille coupling reactions. The agreement supports rate-controlling contributions by atom leaching in the presence of reactive substrates. The computational approach provides a pathway to estimate the catalytic activity of metal nanostructures of engineered shape and size, and possible further refinements are described.
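
    The summation of Boltzmann-weighted binding energies mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name, the sign convention (a positive energy needed to detach an atom), the temperature, and the binding-energy values are all assumptions chosen for illustration.

```python
import math

# Gas constant in kcal/(mol*K), used as kT per mole.
R_KCAL = 0.0019872041

def relative_leaching_rate(binding_energies, temperature=298.15):
    """Sum Boltzmann factors exp(-E_b / RT) over all surface atoms.

    binding_energies: hypothetical detachment energies in kcal/mol,
    one per surface atom (positive means bound). Assumed convention.
    """
    rt = R_KCAL * temperature
    return sum(math.exp(-e / rt) for e in binding_energies)

# Hypothetical particles: A has a few weakly bound surface atoms, B does not.
particle_a = [20.0, 22.0, 25.0, 25.0]
particle_b = [24.0, 25.0, 25.0, 26.0]

ratio = relative_leaching_rate(particle_a) / relative_leaching_rate(particle_b)
print(f"A leaches about {ratio:.0f} times faster than B under these assumptions")
```

    Because the weights are exponential, a handful of weakly bound atoms dominates the sum, which is consistent with the abstract's emphasis on distinct surface atom types.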

  14. Predicting bacteriophage proteins located in host cell with feature selection technique.

    PubMed

    Ding, Hui; Liang, Zhi-Yong; Guo, Feng-Biao; Huang, Jian; Chen, Wei; Lin, Hao

    2016-04-01

    A bacteriophage is a virus that can infect a bacterium. The fate of an infected bacterium is determined by the bacteriophage proteins located in the host cell. Thus, reliably identifying bacteriophage proteins located in the host cell is extremely important to understand their functions and discover potential anti-bacterial drugs. In this paper, a computational method was developed to recognize bacteriophage proteins located in host cells based only on their amino acid sequences. The analysis of variance (ANOVA) combined with incremental feature selection (IFS) was proposed to optimize the feature set. Using jackknife cross-validation, our method can discriminate between bacteriophage proteins located in a host cell and bacteriophage proteins not located in a host cell with a maximum overall accuracy of 84.2%, and can further classify bacteriophage proteins located in the host cell cytoplasm and in host cell membranes with a maximum overall accuracy of 92.4%. To enhance the value of the practical applications of the method, we built a web server called PHPred (〈http://lin.uestc.edu.cn/server/PHPred〉). We believe that PHPred will become a powerful tool to study bacteriophage proteins located in host cells and to guide related drug discovery.
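
    The ANOVA ranking, incremental feature selection, and jackknife evaluation named in this abstract can be sketched as follows. This is a minimal illustration in plain Python, not the PHPred code: the nearest-centroid classifier is a stand-in (the abstract does not name the classifier), and the function names and toy data are hypothetical.

```python
def f_score(values, labels):
    """One-way ANOVA F statistic of a single feature across the classes."""
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    n, k = len(values), len(groups)
    grand = sum(values) / n
    between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups.values())
    within = sum((v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return (between / (k - 1)) / (within / (n - k))

def jackknife_accuracy(X, y, cols):
    """Leave-one-out accuracy of a nearest-centroid classifier on columns cols."""
    correct = 0
    for i in range(len(X)):
        sums, counts = {}, {}
        for j, (row, lab) in enumerate(zip(X, y)):
            if j == i:
                continue  # hold out sample i
            vec = sums.setdefault(lab, [0.0] * len(cols))
            for t, c in enumerate(cols):
                vec[t] += row[c]
            counts[lab] = counts.get(lab, 0) + 1

        def dist(lab):
            # Squared distance from the held-out sample to a class centroid.
            return sum((X[i][c] - sums[lab][t] / counts[lab]) ** 2
                       for t, c in enumerate(cols))

        if min(sums, key=dist) == y[i]:
            correct += 1
    return correct / len(X)

def incremental_feature_selection(X, y):
    """Rank features by F score, then grow the subset one feature at a time."""
    ranked = sorted(range(len(X[0])),
                    key=lambda c: f_score([row[c] for row in X], y),
                    reverse=True)
    best_cols, best_acc = [], -1.0
    for size in range(1, len(ranked) + 1):
        acc = jackknife_accuracy(X, y, ranked[:size])
        if acc > best_acc:  # keep the smallest subset reaching the best accuracy
            best_cols, best_acc = ranked[:size], acc
    return best_cols, best_acc

# Toy data: column 0 separates the classes, column 1 is noise.
X = [[0.1, 5.0], [0.2, 1.0], [0.0, 3.0], [0.9, 4.0], [1.0, 2.0], [1.1, 5.5]]
y = [0, 0, 0, 1, 1, 1]
best_cols, best_acc = incremental_feature_selection(X, y)
print(best_cols, best_acc)
```

    The strict comparison in the selection loop keeps the smallest subset that reaches the highest accuracy, which is one reasonable reading of "optimizing the feature set"; other tie-breaking rules are possible.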

  15. Molecular self-organization: Predicting the pattern diversity and lowest energy state of competing ordering motifs

    NASA Astrophysics Data System (ADS)

    Hermann, B. A.; Rohr, C.; Balbás Gambra, M.; Malecki, A.; Malarek, M. S.; Frey, E.; Franosch, T.

    2010-10-01

    Self-organized monolayers of highly flexible Fréchet dendrons were deposited on graphite surfaces by solution casting. Scanning tunneling microscopy (STM) reveals an unprecedented variety of patterns with up to seven stable hierarchical ordering motifs, allowing us to use these molecules as a versatile model system. The essential molecular properties determined by molecular mechanics simulations are condensed to a coarse-grained interaction-site model of various chain configurations. In a Monte Carlo approach with random starting configurations, the experimental pattern diversity can be reproduced in all facets of the local and global ordering. Based on an energy analysis of the Monte Carlo and molecular mechanics modeling, the thermodynamically most stable pattern is predicted and shown to coincide with the pattern which dominates the STM images after several hours or upon moderate heating.

  16. Musical emotions: predicting second-by-second subjective feelings of emotion from low-level psychoacoustic features and physiological measurements.

    PubMed

    Coutinho, Eduardo; Cangelosi, Angelo

    2011-08-01

    We contend that the structure of affect elicited by music is largely dependent on dynamic temporal patterns in low-level music structural parameters. In support of this claim, we have previously provided evidence that spatiotemporal dynamics in psychoacoustic features resonate with two psychological dimensions of affect underlying judgments of subjective feelings: arousal and valence. In this article we extend our previous investigations in two respects. First, we focus on the emotions experienced rather than perceived while listening to music. Second, we evaluate the extent to which peripheral feedback in music can account for the predicted emotional responses, that is, the role of physiological arousal in determining the intensity and valence of musical emotions. Akin to our previous findings, we show that a significant part of the listeners' reported emotions can be predicted from a set of six psychoacoustic features: loudness, pitch level, pitch contour, tempo, texture, and sharpness. Furthermore, the accuracy of those predictions is improved with the inclusion of physiological cues, namely skin conductance and heart rate. The interdisciplinary work presented here provides a new methodology to the field of music and emotion research based on the combination of computational and experimental work, which aids the analysis of the emotional responses to music while offering a platform for the abstract representation of those complex relationships. Future developments may aid specific areas, such as psychology and music therapy, by providing coherent descriptions of the emotional effects of specific music stimuli.

  17. Physical re-examination of parameters on a molecular collisions-based diffusion model for diffusivity prediction in polymers.

    PubMed

    Ohashi, Hidenori; Tamaki, Takanori; Yamaguchi, Takeo

    2011-12-29

    Molecular collisions, which are the microscopic origin of molecular diffusive motion, are affected by both the molecular surface area and the distance between molecules. Their product can be regarded as the free space around a penetrant molecule, defined as the "shell-like free volume", and can be taken as a characteristic of molecular collisions. On the basis of this notion, a new diffusion theory has been developed. The model can predict molecular diffusivity in polymeric systems using only well-defined single-component parameters of molecular volume, molecular surface area, free volume, and pre-exponential factors. From the physical description of the model, the body that actually moves and the surface with which neighboring molecules collide are the volume and the surface area of the penetrant molecular core, respectively. In the present study, a semiempirical quantum chemical calculation was used to calculate both of these parameters. The model and the newly developed parameters offer fairly good predictive ability.

  18. Machine Learning Predictions of Molecular Properties: Accurate Many-Body Potentials and Nonlocality in Chemical Space.

    PubMed

    Hansen, Katja; Biegler, Franziska; Ramakrishnan, Raghunathan; Pronobis, Wiktor; von Lilienfeld, O Anatole; Müller, Klaus-Robert; Tkatchenko, Alexandre

    2015-06-18

    Simultaneously accurate and efficient prediction of molecular properties throughout chemical compound space is a critical ingredient toward rational compound design in chemical and pharmaceutical industries. Aiming toward this goal, we develop and apply a systematic hierarchy of efficient empirical methods to estimate atomization and total energies of molecules. These methods range from a simple sum over atoms, to addition of bond energies, to pairwise interatomic force fields, reaching to the more sophisticated machine learning approaches that are capable of describing collective interactions between many atoms or bonds. In the case of equilibrium molecular geometries, even simple pairwise force fields demonstrate prediction accuracy comparable to benchmark energies calculated using density functional theory with hybrid exchange-correlation functionals; however, accounting for the collective many-body interactions proves to be essential for approaching the “holy grail” of chemical accuracy of 1 kcal/mol for both equilibrium and out-of-equilibrium geometries. This remarkable accuracy is achieved by a vectorized representation of molecules (so-called Bag of Bonds model) that exhibits strong nonlocality in chemical space. In addition, the same representation allows us to predict accurate electronic properties of molecules, such as their polarizability and molecular frontier orbital energies.
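
    The vectorized "Bag of Bonds" representation mentioned in this abstract can be sketched in a few lines of Python. The abstract does not define the descriptor, so this follows the form in which it is commonly described (pairwise Coulomb terms Z_i*Z_j/r_ij grouped by element pair, sorted within each bag, and zero-padded); treat that form, the function names, the omission of any single-atom terms, and the approximate water geometry as assumptions.

```python
import math
from collections import defaultdict

def bag_of_bonds(charges, coords):
    """Group pairwise terms Z_i*Z_j/r_ij by element pair, sorted descending."""
    bags = defaultdict(list)
    for i in range(len(charges)):
        for j in range(i + 1, len(charges)):
            r = math.dist(coords[i], coords[j])
            key = tuple(sorted((charges[i], charges[j])))
            bags[key].append(charges[i] * charges[j] / r)
    return {key: sorted(vals, reverse=True) for key, vals in bags.items()}

def to_vector(bags, bag_sizes):
    """Concatenate bags in a fixed key order, zero-padding each to its size."""
    vec = []
    for key in sorted(bag_sizes):
        vals = bags.get(key, [])
        vec.extend(vals + [0.0] * (bag_sizes[key] - len(vals)))
    return vec

# Illustrative water molecule: nuclear charges and rough coordinates (angstrom).
charges = [8, 1, 1]
coords = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
bags = bag_of_bonds(charges, coords)

# Bag sizes would be fixed by the largest molecule in a dataset (assumed here).
bag_sizes = {(1, 1): 1, (1, 8): 2, (8, 8): 1}
vec = to_vector(bags, bag_sizes)
print(vec)
```

    Sorting within each bag makes the vector independent of atom ordering, and the fixed bag sizes give every molecule in a dataset a vector of the same length, so that standard kernel or regression methods can be applied.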

  19. Machine learning predictions of molecular properties: Accurate many-body potentials and nonlocality in chemical space

    DOE PAGES

    Hansen, Katja; Biegler, Franziska; Ramakrishnan, Raghunathan; ...

    2015-06-04

    Simultaneously accurate and efficient prediction of molecular properties throughout chemical compound space is a critical ingredient toward rational compound design in chemical and pharmaceutical industries. Aiming toward this goal, we develop and apply a systematic hierarchy of efficient empirical methods to estimate atomization and total energies of molecules. These methods range from a simple sum over atoms, to addition of bond energies, to pairwise interatomic force fields, reaching to the more sophisticated machine learning approaches that are capable of describing collective interactions between many atoms or bonds. In the case of equilibrium molecular geometries, even simple pairwise force fields demonstrate prediction accuracy comparable to benchmark energies calculated using density functional theory with hybrid exchange-correlation functionals; however, accounting for the collective many-body interactions proves to be essential for approaching the “holy grail” of chemical accuracy of 1 kcal/mol for both equilibrium and out-of-equilibrium geometries. This remarkable accuracy is achieved by a vectorized representation of molecules (so-called Bag of Bonds model) that exhibits strong nonlocality in chemical space. The same representation allows us to predict accurate electronic properties of molecules, such as their polarizability and molecular frontier orbital energies.

  20. Machine learning predictions of molecular properties: Accurate many-body potentials and nonlocality in chemical space

    SciTech Connect

    Hansen, Katja; Biegler, Franziska; Ramakrishnan, Raghunathan; Pronobis, Wiktor; von Lilienfeld, O. Anatole; Müller, Klaus-Robert; Tkatchenko, Alexandre

    2015-06-04

    Simultaneously accurate and efficient prediction of molecular properties throughout chemical compound space is a critical ingredient toward rational compound design in chemical and pharmaceutical industries. Aiming toward this goal, we develop and apply a systematic hierarchy of efficient empirical methods to estimate atomization and total energies of molecules. These methods range from a simple sum over atoms, to addition of bond energies, to pairwise interatomic force fields, reaching to the more sophisticated machine learning approaches that are capable of describing collective interactions between many atoms or bonds. In the case of equilibrium molecular geometries, even simple pairwise force fields demonstrate prediction accuracy comparable to benchmark energies calculated using density functional theory with hybrid exchange-correlation functionals; however, accounting for the collective many-body interactions proves to be essential for approaching the “holy grail” of chemical accuracy of 1 kcal/mol for both equilibrium and out-of-equilibrium geometries. This remarkable accuracy is achieved by a vectorized representation of molecules (so-called Bag of Bonds model) that exhibits strong nonlocality in chemical space. The same representation allows us to predict accurate electronic properties of molecules, such as their polarizability and molecular frontier orbital energies.

  1. A comparison of molecular dynamics and diffuse interface model predictions of Lennard-Jones fluid evaporation

    SciTech Connect

    Barbante, Paolo; Frezzotti, Aldo; Gibelli, Livio

    2014-12-09

    The unsteady evaporation of a thin planar liquid film is studied by molecular dynamics simulations of a Lennard-Jones fluid. The obtained results are compared with the predictions of a diffuse interface model in which capillary Korteweg contributions are added to hydrodynamic equations, in order to obtain a unified description of the liquid bulk, liquid-vapor interface and vapor region. Particular care has been taken in constructing a diffuse interface model matching the thermodynamic and transport properties of the Lennard-Jones fluid. The comparison of diffuse interface model and molecular dynamics results shows that, although good agreement is obtained in equilibrium conditions, remarkable deviations of diffuse interface model predictions from the reference molecular dynamics results are observed in the simulation of liquid film evaporation. It is also observed that molecular dynamics results are in good agreement with preliminary results obtained from a composite model which describes the liquid film by a standard hydrodynamic model and the vapor by the Boltzmann equation. The two mathematical models are connected by kinetic boundary conditions assuming a unit evaporation coefficient.

  2. Structural features of binary mixtures of supercritical CO2 with polar entrainers by molecular dynamics simulation

    NASA Astrophysics Data System (ADS)

    Gurina, D. L.; Antipova, M. L.; Petrenko, V. E.

    2013-10-01

    Computer simulations of supercritical carbon dioxide and its mixtures with the polar cosolvents water, methanol, and ethanol (cosolvent concentration, 0.125 mole fraction) at T = 318 K and ρ = 0.7 g/cm³ are performed. Atom-atom radial distribution functions are calculated by classical molecular dynamics, while the probability distributions of relative orientation of CO2 molecules in the first and second coordination spheres describing the geometry of the nearest environment of CO2 molecules and the trajectories of cosolvent molecules are found using Car-Parrinello molecular dynamics. Based on the latter, conclusions regarding the structure and interactions of polar entrainers in their mixtures with supercritical CO2 are drawn. It is shown that the microstructure of carbon dioxide varies only slightly upon the introduction of cosolvents.

  3. Characteristic Features of Molecular Structure and Packing of Organopolysilanes with Asymmetric Side Chains

    NASA Astrophysics Data System (ADS)

    Furukawa, Shoji; Ohta, Hidetaka

    2005-01-01

    The molecular structure and packing of poly(methyl ethyl silane), [(CH3)Si(C2H5)]n, and poly(methyl n-propyl silane), [(CH3)Si(C3H7)]n, have been examined by the X-ray diffraction method. For poly(methyl ethyl silane), several configurations are possible for the arrangement of the C2H5 group, whereas the C3H7 groups stretch along one equivalent direction for poly(methyl n-propyl silane). In both cases, the molecular structure and packing are mostly determined by the intramolecular steric hindrance and van der Waals interaction between side chains, which is the same as that of polysilanes with symmetric side chains.

  4. Surgical anatomy, radiological features, and molecular biology of the lumbar intervertebral discs.

    PubMed

    Ghannam, Malik; Jumah, Fareed; Mansour, Shaden; Samara, Amjad; Alkhdour, Saja; Alzuabi, Muayad A; Aker, Loai; Adeeb, Nimer; Massengale, Justin; Oskouian, Rod J; Shane Tubbs, R

    2017-03-01

    The intervertebral disc (IVD) is a joint unique in structure and functions. Lying between adjacent vertebrae, it provides both the primary support and the elasticity required for the spine to move stably. Various aspects of the IVD have long been studied by researchers seeking a better understanding of its dynamics, aging, and subsequent disorders. In this article, we review the surgical anatomy, imaging modalities, and molecular biology of the lumbar IVD. Clin. Anat. 30:251-266, 2017. © 2017 Wiley Periodicals, Inc.

  5. Molecular features distinguish ten neuronal types in the mouse superficial superior colliculus.

    PubMed

    Byun, Haewon; Kwon, Soohyun; Ahn, Hee-Jeong; Liu, Hong; Forrest, Douglas; Demb, Jonathan B; Kim, In-Jung

    2016-08-01

    The superior colliculus (SC) is a midbrain center involved in controlling head and eye movements in response to inputs from multiple sensory modalities. Visual inputs arise from both the retina and visual cortex and converge onto the superficial layer of the SC (sSC). Neurons in the sSC send information to deeper layers of the SC and to thalamic nuclei that modulate visually guided behaviors. Presently, our understanding of sSC neurons is impeded by a lack of molecular markers that define specific cell types. To better understand the identity and organization of sSC neurons, we took a systematic approach to investigate gene expression within four molecular families: transcription factors, cell adhesion molecules, neuropeptides, and calcium binding proteins. Our analysis revealed 12 molecules with distinct expression patterns in mouse sSC: cadherin 7, contactin 3, netrin G2, cadherin 6, protocadherin 20, retinoid-related orphan receptor β, brain-specific homeobox/POU domain protein 3b, Ets variant gene 1, substance P, somatostatin, vasoactive intestinal polypeptide, and parvalbumin. Double labeling experiments, by either in situ hybridization or immunostaining, demonstrated that the 12 molecular markers collectively define 10 different sSC neuronal types. The characteristic positions of these cell types divide the sSC into four distinct layers. The 12 markers identified here will serve as valuable tools to examine molecular mechanisms that regulate development of sSC neuronal types. These markers could also be used to examine the connections between specific cell types that form retinocollicular, corticocollicular, or colliculothalamic pathways. J. Comp. Neurol. 524:2300-2321, 2016. © 2016 Wiley Periodicals, Inc.

  6. De novo PHIP-predicted deleterious variants are associated with developmental delay, intellectual disability, obesity, and dysmorphic features

    PubMed Central

    Webster, Emily; Cho, Megan T.; Alexander, Nora; Desai, Sonal; Naidu, Sakkubai; Bekheirnia, Mir Reza; Lewis, Andrea; Retterer, Kyle; Juusola, Jane; Chung, Wendy K.

    2016-01-01

    Using whole-exome sequencing, we have identified novel de novo heterozygous pleckstrin homology domain-interacting protein (PHIP) variants that are predicted to be deleterious, including a frameshift deletion, in two unrelated patients with common clinical features of developmental delay, intellectual disability, anxiety, hypotonia, poor balance, obesity, and dysmorphic features. A nonsense mutation in PHIP has previously been associated with similar clinical features. Patients with microdeletions of 6q14.1, including PHIP, have a similar phenotype of developmental delay, intellectual disability, hypotonia, and obesity, suggesting that the phenotype of our patients is a result of loss-of-function mutations. PHIP produces multiple protein products, such as PHIP1 (also known as DCAF14), PHIP, and NDRP. PHIP1 is one of the multiple substrate receptors of the proteolytic CUL4-DDB1 ubiquitin ligase complex. CUL4B deficiency has been associated with intellectual disability, central obesity, muscle wasting, and dysmorphic features. The overlapping phenotype associated with CUL4B deficiency suggests that PHIP mutations cause disease through disruption of the ubiquitin ligase pathway. PMID:27900362

  7. Molecular features determining different partitioning patterns of papain and bromelain in aqueous two-phase systems.

    PubMed

    Rocha, Maria Victoria; Nerli, Bibiana Beatriz

    2013-10-01

    The partitioning patterns of papain (PAP) and bromelain (BR), two well-known cysteine proteases, in polyethyleneglycol/sodium citrate aqueous two-phase systems (ATPSs) were determined. Polyethyleneglycols (PEGs) of different molecular weights (600, 1000, 2000, 4600 and 8000) were assayed. Thermodynamic characterization of the partitioning process, spectroscopy measurements and computational calculations of protein surface properties were also carried out in order to explain their differential partitioning behavior. PAP was observed to be displaced to the salt-enriched phase in all the assayed systems, with partition coefficient (KpPAP) values between 0.2 and 0.9, while BR exhibited a high affinity for the polymer phase in systems formed by PEGs of low molecular weight (600 and 1000), with partition coefficient (KpBR) values close to 3. KpBR values were higher than KpPAP values in all cases. This difference could be assigned neither to the charge nor to the size of the partitioned biomolecules, since PAP and BR possess similar molecular weight (23,000) and isoelectric point (9.60). The presence of highly exposed tryptophans and positively charged residues (Lys, Arg and His) in the BR molecule would be responsible for a charge transfer interaction between PEG and the protein and, therefore, the uneven distribution of BR in these systems.

  8. Features of Knowledge Building in Biology: Understanding Undergraduate Students’ Ideas about Molecular Mechanisms

    PubMed Central

    Southard, Katelyn; Wince, Tyler; Meddleton, Shanice; Bolger, Molly S.

    2016-01-01

    Research has suggested that teaching and learning in molecular and cellular biology (MCB) is difficult. We used a new lens to understand undergraduate reasoning about molecular mechanisms: the knowledge-integration approach to conceptual change. Knowledge integration is the dynamic process by which learners acquire new ideas, develop connections between ideas, and reorganize and restructure prior knowledge. Semistructured, clinical think-aloud interviews were conducted with introductory and upper-division MCB students. Interviews included a written conceptual assessment, a concept-mapping activity, and an opportunity to explain the biomechanisms of DNA replication, transcription, and translation. Student reasoning patterns were explored through mixed-method analyses. Results suggested that students must sort mechanistic entities into appropriate mental categories that reflect the nature of MCB mechanisms and that conflation between these categories is common. We also showed how connections between molecular mechanisms and their biological roles are part of building an integrated knowledge network as students develop expertise. We observed differences in the nature of connections between ideas related to different forms of reasoning. Finally, we provide a tentative model for MCB knowledge integration and suggest its implications for undergraduate learning. PMID:26931398

  9. Clinical and Laboratory Features of the Nocardia spp. Based on Current Molecular Taxonomy

    PubMed Central

    Brown-Elliott, Barbara A.; Brown, June M.; Conville, Patricia S.; Wallace, Richard J.

    2006-01-01

    The recent explosion of newly described species of Nocardia results from the impact in the last decade of newer molecular technology, including PCR restriction enzyme analysis and 16S rRNA sequencing. These molecular techniques have revolutionized the identification of the nocardiae by providing rapid and accurate identification of recognized nocardiae and, at the same time, revealing new species and a number of yet-to-be-described species. There are currently more than 30 species of nocardiae of human clinical significance, with the majority of isolates being N. nova complex, N. abscessus, N. transvalensis complex, N. farcinica, N. asteroides type VI (N. cyriacigeorgica), and N. brasiliensis. These species cause a wide variety of diseases and have variable drug susceptibilities. Accurate identification often requires referral to a reference laboratory with molecular capabilities, as many newer species are genetically distinct from established species yet have few or no distinguishing phenotypic characteristics. Correct identification is important in deciding the clinical relevance of a species and in the clinical management and treatment of patients with nocardial disease. This review characterizes the currently known pathogenic species of Nocardia, including clinical disease, drug susceptibility, and methods of identification. PMID:16614249

  10. Features of Knowledge Building in Biology: Understanding Undergraduate Students' Ideas about Molecular Mechanisms.

    PubMed

    Southard, Katelyn; Wince, Tyler; Meddleton, Shanice; Bolger, Molly S

    2016-01-01

    Research has suggested that teaching and learning in molecular and cellular biology (MCB) is difficult. We used a new lens to understand undergraduate reasoning about molecular mechanisms: the knowledge-integration approach to conceptual change. Knowledge integration is the dynamic process by which learners acquire new ideas, develop connections between ideas, and reorganize and restructure prior knowledge. Semistructured, clinical think-aloud interviews were conducted with introductory and upper-division MCB students. Interviews included a written conceptual assessment, a concept-mapping activity, and an opportunity to explain the biomechanisms of DNA replication, transcription, and translation. Student reasoning patterns were explored through mixed-method analyses. Results suggested that students must sort mechanistic entities into appropriate mental categories that reflect the nature of MCB mechanisms and that conflation between these categories is common. We also showed how connections between molecular mechanisms and their biological roles are part of building an integrated knowledge network as students develop expertise. We observed differences in the nature of connections between ideas related to different forms of reasoning. Finally, we provide a tentative model for MCB knowledge integration and suggest its implications for undergraduate learning.

  11. Correlation spectroscopy and molecular dynamics simulations to study the structural features of proteins.

    PubMed

    Varriale, Antonio; Marabotti, Anna; Mei, Giampiero; Staiano, Maria; D'Auria, Sabato

    2013-01-01

    In this work, we used a combination of fluorescence correlation spectroscopy (FCS) and molecular dynamics (MD) simulation methodologies to acquire structural information on pH-induced unfolding of the maltotriose-binding protein from Thermus thermophilus (MalE2). FCS has emerged as a powerful technique for characterizing the dynamics of molecules and is used to study molecular diffusion on timescales of microseconds and longer. Our results showed that, at constant temperature, the protein diffusion coefficient decreased from 84±4 µm²/s to 44±3 µm²/s when the pH was changed from 7.0 to 4.0. An even more marked decrease of the MalE2 diffusion coefficient (to 31±3 µm²/s) was registered when the pH was raised from 7.0 to 10.0. Given the size of MalE2 (a monomeric protein with a molecular weight of 43 kDa) and its globular native shape, the values of 44 µm²/s and 31 µm²/s could be ascribed to deformations of the protein structure, which enhance its propensity to form aggregates at extreme pH values. The obtained fluorescence correlation data, corroborated by circular dichroism, fluorescence emission and light-scattering experiments, are discussed together with the MD simulation results.
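
    To make the reported diffusion coefficients easier to interpret, the following Python sketch converts them to apparent hydrodynamic radii with the Stokes-Einstein relation, R_h = k_B T / (6 π η D). The relation is standard, but its use here is our illustration rather than the authors' analysis; the temperature (298.15 K) and the viscosity of water (0.89 mPa·s) are assumptions, since the abstract states neither.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def hydrodynamic_radius_nm(d_um2_per_s, temperature_k=298.15, viscosity_pa_s=0.00089):
    """Apparent Stokes-Einstein radius (nm) for a diffusion coefficient in µm²/s.

    Temperature and viscosity defaults are assumed values, not from the abstract.
    """
    d_m2_per_s = d_um2_per_s * 1e-12  # convert µm²/s to m²/s
    radius_m = K_B * temperature_k / (6 * math.pi * viscosity_pa_s * d_m2_per_s)
    return radius_m * 1e9

# Diffusion coefficients reported in the abstract at pH 7.0, 4.0 and 10.0.
for ph, d in [(7.0, 84.0), (4.0, 44.0), (10.0, 31.0)]:
    print(f"pH {ph}: D = {d} µm²/s, apparent R_h = {hydrodynamic_radius_nm(d):.1f} nm")
```

    Because the radius scales as 1/D, the drop from 84 to 31 µm²/s corresponds to an apparent radius roughly 2.7 times larger, which fits the abstract's interpretation in terms of structural deformation and aggregation rather than a compact monomer.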

  12. Predicting Essential Genes and Proteins Based on Machine Learning and Network Topological Features: A Comprehensive Review

    PubMed Central

    Zhang, Xue; Acencio, Marcio Luis; Lemke, Ney

    2016-01-01

    Essential proteins/genes are indispensable to the survival or reproduction of an organism, and the deletion of such essential proteins will result in lethality or infertility. The identification of essential genes is very important not only for understanding the minimal requirements for survival of an organism, but also for finding human disease genes and new drug targets. Experimental methods for identifying essential genes are costly, time-consuming, and laborious. With the accumulation of sequenced genome data and high-throughput experimental data, many computational methods for identifying essential proteins have been proposed, which are useful complements to experimental methods. In this review, we present the state-of-the-art methods for identifying essential genes and proteins based on machine learning and network topological features, point out the progress and limitations of current methods, and discuss the challenges and directions for further research. PMID:27014079

  13. Integrating in silico prediction methods, molecular docking, and molecular dynamics simulation to predict the impact of ALK missense mutations in structural perspective.

    PubMed

    Doss, C George Priya; Chakraborty, Chiranjib; Chen, Luonan; Zhu, Hailong

    2014-01-01

    Over the past decade, advancements in next-generation sequencing technology have placed personalized genomic medicine on the horizon. Classifying disease-causing mutations in complex diseases as pathogenic or neutral remains a major task and is even impossible in the structural context because the experiments required are time-consuming and expensive. Among the various disease-causing mutations, single nucleotide polymorphisms (SNPs) play a vital role in defining an individual's susceptibility to disease and drug response. Understanding the genotype-phenotype relationship through SNPs is the first and most important step in drug research and development. Detailed understanding of the effect of SNPs on patient drug response is a key factor in the establishment of personalized medicine. In this paper, we present a computational pipeline for the SNP-centred study of anaplastic lymphoma kinase (ALK) that applies in silico prediction methods, molecular docking, and molecular dynamics simulation approaches. Combining computational methods provides a way to understand the impact of deleterious mutations in altering protein drug targets and eventually leading to variable patient drug response. We hope this rapid and cost-effective pipeline will also serve as a bridge connecting clinicians and in silico resources in tailoring treatments to patients' specific genotypes.

  14. Molecular-Scale Features that Govern the Effects of O-Glycosylation on a Carbohydrate-Binding Module

    DOE PAGES

    Guan, Xiaoyang; Chaffey, Patrick K.; Zeng, Chen; ...

    2015-09-21

    Protein glycosylation is a ubiquitous post-translational modification in all kingdoms of life. Despite its importance in molecular and cellular biology, the molecular-level ramifications of O-glycosylation on biomolecular structure and function remain elusive. Here, we took a small model glycoprotein and changed the glycan structure and size, amino acid residues near the glycosylation site, and glycosidic linkage while monitoring any corresponding changes to physical stability and cellulose binding affinity. The results of this study reveal the collective importance of all the studied features in controlling the most pronounced effects of O-glycosylation in this system. This study suggests the possibility of designing proteins with multiple improved properties by simultaneously varying the structures of O-glycans and amino acids local to the glycosylation site.

  15. Molecular-Scale Features that Govern the Effects of O-Glycosylation on a Carbohydrate-Binding Module

    SciTech Connect

    Guan, Xiaoyang; Chaffey, Patrick K.; Zeng, Chen; Greene, Eric R.; Chen, Liqun; Drake, Matthew R.; Chen, Claire; Groobman, Ari; Resch, Michael G.; Himmel, Michael E.; Beckham, Gregg T.; Tan, Zhongping

    2015-09-21

    Protein glycosylation is a ubiquitous post-translational modification in all kingdoms of life. Despite its importance in molecular and cellular biology, the molecular-level ramifications of O-glycosylation on biomolecular structure and function remain elusive. Here, we took a small model glycoprotein and changed the glycan structure and size, amino acid residues near the glycosylation site, and glycosidic linkage while monitoring any corresponding changes to physical stability and cellulose binding affinity. The results of this study reveal the collective importance of all the studied features in controlling the most pronounced effects of O-glycosylation in this system. This study suggests the possibility of designing proteins with multiple improved properties by simultaneously varying the structures of O-glycans and amino acids local to the glycosylation site.

  16. HybridGO-Loc: mining hybrid features on gene ontology for predicting subcellular localization of multi-location proteins.

    PubMed

    Wan, Shibiao; Mak, Man-Wai; Kung, Sun-Yuan

    2014-01-01

    Protein subcellular localization prediction, as an essential step to elucidate the functions in vivo of proteins and identify drug targets, has been extensively studied in previous decades. Instead of only determining subcellular localization of single-label proteins, recent studies have focused on predicting both single- and multi-location proteins. Computational methods based on Gene Ontology (GO) have been demonstrated to be superior to methods based on other features. However, existing GO-based methods focus on the occurrences of GO terms and disregard their relationships. This paper proposes a multi-label subcellular-localization predictor, namely HybridGO-Loc, that leverages not only the GO term occurrences but also the inter-term relationships. This is achieved by hybridizing the GO frequencies of occurrences and the semantic similarity between GO terms. Given a protein, a set of GO terms is retrieved by searching against the gene ontology database, using the accession numbers of homologous proteins obtained via BLAST search as the keys. The frequency of GO occurrences and semantic similarity (SS) between GO terms are used to formulate frequency vectors and semantic similarity vectors, respectively, which are subsequently hybridized to construct fusion vectors. An adaptive-decision based multi-label support vector machine (SVM) classifier is proposed to classify the fusion vectors. Experimental results based on recent benchmark datasets and a new dataset containing novel proteins show that the proposed hybrid-feature predictor significantly outperforms predictors based on individual GO features as well as other state-of-the-art predictors. For readers' convenience, the HybridGO-Loc server, which is for predicting virus or plant proteins, is available online at http://bioinfo.eie.polyu.edu.hk/HybridGoServer/.
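
    The idea of combining a GO frequency vector with a semantic similarity vector can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the term identifiers ("T1" to "T3"), the toy ancestor sets, the Jaccard-style similarity, and the concatenation used to "hybridize" the two vectors are hypothetical stand-ins, since the abstract does not state which semantic similarity measure or fusion rule HybridGO-Loc actually uses.

```python
from collections import Counter

# Hypothetical toy ontology: each term maps to the set of itself plus its ancestors.
ANCESTORS = {
    "T1": {"root", "T1"},
    "T2": {"root", "T1", "T2"},
    "T3": {"root", "T3"},
}

def toy_similarity(a, b):
    """Jaccard overlap of ancestor sets (an assumed stand-in for a real SS measure)."""
    shared = ANCESTORS[a] & ANCESTORS[b]
    union = ANCESTORS[a] | ANCESTORS[b]
    return len(shared) / len(union)

def frequency_vector(protein_terms, vocabulary):
    """Count how often each vocabulary term was retrieved for the protein."""
    counts = Counter(protein_terms)
    return [float(counts[t]) for t in vocabulary]

def similarity_vector(protein_terms, vocabulary, sim):
    """For each vocabulary term, keep its best similarity to any retrieved term."""
    return [max((sim(t, p) for p in protein_terms), default=0.0) for t in vocabulary]

def fusion_vector(protein_terms, vocabulary, sim):
    """Concatenate both views (one simple, assumed way to fuse them)."""
    return (frequency_vector(protein_terms, vocabulary)
            + similarity_vector(protein_terms, vocabulary, sim))

vocabulary = ["T1", "T2", "T3"]
protein_terms = ["T2", "T2"]  # terms retrieved for one hypothetical protein
fused = fusion_vector(protein_terms, vocabulary, toy_similarity)
print(fused)
```

    In this toy case the frequency half records only the term that was retrieved, while the similarity half also gives partial credit to related terms, which is the kind of inter-term information the abstract says occurrence-only methods disregard.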

  17. Systematic analysis of non-structural protein features for the prediction of PTM function potential by artificial neural networks.

    PubMed

    Dewhurst, Henry M; Torres, Matthew P

    2017-01-01

    Post-translational modifications (PTMs) provide an extensible framework for regulation of protein behavior beyond the diversity represented within the genome alone. While the rate of identification of PTMs has rapidly increased in recent years, our knowledge of PTM functionality encompasses less than 5% of this data. We previously developed SAPH-ire (Structural Analysis of PTM Hotspots) for the prioritization of eukaryotic PTMs based on function potential of discrete modified alignment positions (MAPs) in a set of 8 protein families. A proteome-wide expansion of the dataset to all families of PTM-bearing, eukaryotic proteins with a representational crystal structure and the application of artificial neural network (ANN) models demonstrated the broader applicability of this approach. Although structural features of proteins have been repeatedly demonstrated to be predictive of PTM functionality, the availability of adequately resolved 3D structures in the Protein Data Bank (PDB) limits the scope of these methods. In order to bridge this gap and capture the larger set of PTM-bearing proteins without an available, homologous structure, we explored all available MAP features as ANN inputs to identify predictive models that do not rely on 3D protein structural data. This systematic, algorithmic approach explores 8 available input features in exhaustive combinations (247 models; size 2-8). To control for potential bias in random sampling for holdback in training sets, we iterated each model across 100 randomized, sample training and testing sets, yielding 24,700 individual ANNs. The size of the analyzed dataset and iterative generation of ANNs represents the largest and most thorough investigation of predictive models for PTM functionality to date. Comparison of input layer combinations allows us to quantify ANN performance with a high degree of confidence and subsequently select a top-ranked, robust fit model which highlights 3,687 MAPs, including 10,933 PTMs with a high
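
    The model counts quoted in this abstract follow from simple combinatorics, which the short sketch below reproduces. The feature names are placeholders (the abstract does not list the 8 input features); only the numbers 8, 2-8 and 100 come from the text.

```python
from itertools import combinations

N_FEATURES = 8          # number of available input features (from the abstract)
N_RESAMPLES = 100       # randomized training/testing splits per model (from the abstract)

# Placeholder names: the real feature names are not given in the abstract.
features = [f"feature_{i}" for i in range(1, N_FEATURES + 1)]

# Every subset of the features containing between 2 and 8 members.
models = [
    subset
    for size in range(2, N_FEATURES + 1)
    for subset in combinations(features, size)
]

n_models = len(models)
n_anns = n_models * N_RESAMPLES
print(n_models, n_anns)  # 247 24700
```

    Equivalently, 2^8 = 256 subsets minus the empty set and the 8 single-feature subsets leaves 247 models, and 247 x 100 resamples gives 24,700 networks.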

  18. Systematic analysis of non-structural protein features for the prediction of PTM function potential by artificial neural networks

    PubMed Central

    2017-01-01

    Post-translational modifications (PTMs) provide an extensible framework for regulation of protein behavior beyond the diversity represented within the genome alone. While the rate of identification of PTMs has rapidly increased in recent years, our knowledge of PTM functionality encompasses less than 5% of this data. We previously developed SAPH-ire (Structural Analysis of PTM Hotspots) for the prioritization of eukaryotic PTMs based on function potential of discrete modified alignment positions (MAPs) in a set of 8 protein families. A proteome-wide expansion of the dataset to all families of PTM-bearing, eukaryotic proteins with a representational crystal structure and the application of artificial neural network (ANN) models demonstrated the broader applicability of this approach. Although structural features of proteins have been repeatedly demonstrated to be predictive of PTM functionality, the availability of adequately resolved 3D structures in the Protein Data Bank (PDB) limits the scope of these methods. In order to bridge this gap and capture the larger set of PTM-bearing proteins without an available, homologous structure, we explored all available MAP features as ANN inputs to identify predictive models that do not rely on 3D protein structural data. This systematic, algorithmic approach explores 8 available input features in exhaustive combinations (247 models; size 2–8). To control for potential bias in random sampling for holdback in training sets, we iterated each model across 100 randomized, sample training and testing sets—yielding 24,700 individual ANNs. The size of the analyzed dataset and iterative generation of ANNs represents the largest and most thorough investigation of predictive models for PTM functionality to date. Comparison of input layer combinations allows us to quantify ANN performance with a high degree of confidence and subsequently select a top-ranked, robust fit model which highlights 3,687 MAPs, including 10,933 PTMs with a

  19. HybridGO-Loc: Mining Hybrid Features on Gene Ontology for Predicting Subcellular Localization of Multi-Location Proteins

    PubMed Central

    Wan, Shibiao; Mak, Man-Wai; Kung, Sun-Yuan

    2014-01-01

    Protein subcellular localization prediction, as an essential step to elucidate the functions in vivo of proteins and identify drug targets, has been extensively studied in previous decades. Instead of only determining subcellular localization of single-label proteins, recent studies have focused on predicting both single- and multi-location proteins. Computational methods based on Gene Ontology (GO) have been demonstrated to be superior to methods based on other features. However, existing GO-based methods focus on the occurrences of GO terms and disregard their relationships. This paper proposes a multi-label subcellular-localization predictor, namely HybridGO-Loc, that leverages not only the GO term occurrences but also the inter-term relationships. This is achieved by hybridizing the GO frequencies of occurrences and the semantic similarity between GO terms. Given a protein, a set of GO terms is retrieved by searching against the gene ontology database, using the accession numbers of homologous proteins obtained via BLAST search as the keys. The frequency of GO occurrences and semantic similarity (SS) between GO terms are used to formulate frequency vectors and semantic similarity vectors, respectively, which are subsequently hybridized to construct fusion vectors. An adaptive-decision based multi-label support vector machine (SVM) classifier is proposed to classify the fusion vectors. Experimental results based on recent benchmark datasets and a new dataset containing novel proteins show that the proposed hybrid-feature predictor significantly outperforms predictors based on individual GO features as well as other state-of-the-art predictors. For readers' convenience, the HybridGO-Loc server, which is for predicting virus or plant proteins, is available online at http://bioinfo.eie.polyu.edu.hk/HybridGoServer/. PMID:24647341

  20. Role of electrostatic potential in the in silico prediction of molecular bioactivation and mutagenesis.

    PubMed

    Ford, Kevin A

    2013-04-01

    Electrostatic potential (ESP) is a useful physicochemical property of a molecule that provides insights into inter- and intramolecular associations, as well as prediction of likely sites of electrophilic and nucleophilic metabolic attack. Knowledge of sites of metabolic attack is of paramount importance in DMPK research since drugs frequently fail in clinical trials due to the formation of bioactivated metabolites which are often difficult to measure experimentally due to their reactive nature and relatively short half-lives. Computational chemistry methods have proven invaluable in recent years as a means to predict and study bioactivated metabolites without the need for chemical syntheses, or testing on experimental animals. Additional molecular properties (heat of formation, heat of solvation and E(LUMO) - E(HOMO)) are discussed in this paper as complementary indicators of the behavior of metabolites in vivo. Five diverse examples are presented (acetaminophen, aniline/phenylamines, imidacloprid, nefazodone and vinyl chloride) which illustrate the utility of this multidimensional approach in predicting bioactivation, and in each case the predicted data agreed with experimental data described in the scientific literature. A further example of the usefulness of calculating ESP, in combination with the molecular properties mentioned above, is provided by an examination of the use of these parameters in providing an explanation for the sites of nucleophilic attack of the nucleic acid cytosine. Exploration of sites of nucleophilic attack of nucleic acids is important as adducts of DNA have the potential to result in mutagenesis.

  1. Assessment of a Four-View Mammographic Image Feature Based Fusion Model to Predict Near-Term Breast Cancer Risk.

    PubMed

    Tan, Maxine; Pu, Jiantao; Cheng, Samuel; Liu, Hong; Zheng, Bin

    2015-10-01

    The purpose of this study was to develop and assess a new quantitative four-view mammographic image feature based fusion model to predict the near-term breast cancer risk of individual women after a negative screening mammography examination of interest. The dataset included fully-anonymized mammograms acquired on 870 women with two sequential full-field digital mammography examinations. For each woman, the first "prior" examination in the series was interpreted as negative (not recalled) during the original image reading. In the second "current" examination, 430 women were diagnosed with pathology verified cancers and 440 remained negative ("cancer-free"). For each of four bilateral craniocaudal and mediolateral oblique view images of left and right breasts, we computed and analyzed eight groups of global mammographic texture and tissue density image features. A risk prediction model based on three artificial neural networks was developed to fuse image features computed from two bilateral views of four images. The risk model performance was tested using a ten-fold cross-validation method and a number of performance evaluation indices including the area under the receiver operating characteristic curve (AUC) and odds ratio (OR). The highest AUC = 0.725 ± 0.026 was obtained when the model was trained by gray-level run length statistics texture features computed on dense breast regions, which was significantly higher than the AUC values achieved using the model trained by only two bilateral one-view images (p < 0.02). The adjustable OR values monotonically increased from 1.0 to 11.8 as model-generated risk score increased. The regression analysis of OR values also showed a significantly increasing trend in slope (p < 0.01). As a result, this preliminary study demonstrated that a new four-view mammographic image feature based risk model could provide useful and supplementary image information to help predict the near-term breast cancer risk.
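
    For readers less familiar with the AUC reported here, a minimal sketch of one standard way to compute it is given below: the AUC equals the fraction of (positive, negative) case pairs in which the positive case receives the higher risk score, with ties counted as one half. The scores are made-up illustrative values, not data from the study, and the study's own evaluation used ten-fold cross-validation, which this sketch does not reproduce.

```python
def auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts the share of positive/negative pairs ranked correctly;
    a tie contributes half a correctly ranked pair.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical risk scores for cancer cases and cancer-free cases.
cancer_scores = [0.9, 0.7, 0.4]
negative_scores = [0.6, 0.3, 0.2]
example_auc = auc(cancer_scores, negative_scores)
print(round(example_auc, 3))  # 0.889 (8 of 9 pairs ranked correctly)
```

    An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives some context for the reported value of 0.725.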

  2. Predicting the excess solubility of acetanilide, acetaminophen, phenacetin, benzocaine, and caffeine in binary water/ethanol mixtures via molecular simulation.

    PubMed

    Paluch, Andrew S; Parameswaran, Sreeja; Liu, Shuai; Kolavennu, Anasuya; Mobley, David L

    2015-01-28

    We present a general framework to predict the excess solubility of small molecular solids (such as pharmaceutical solids) in binary solvents via molecular simulation free energy calculations at infinite dilution with conventional molecular models. The present study used molecular dynamics with the General AMBER Force Field to predict the excess solubility of acetanilide, acetaminophen, phenacetin, benzocaine, and caffeine in binary water/ethanol solvents. The simulations are able to predict the existence of solubility enhancement and the results are in good agreement with available experimental data. The accuracy of the predictions in addition to the generality of the method suggests that molecular simulations may be a valuable design tool for solvent selection in drug development processes.

  3. Predicting the excess solubility of acetanilide, acetaminophen, phenacetin, benzocaine, and caffeine in binary water/ethanol mixtures via molecular simulation

    NASA Astrophysics Data System (ADS)

    Paluch, Andrew S.; Parameswaran, Sreeja; Liu, Shuai; Kolavennu, Anasuya; Mobley, David L.

    2015-01-01

    We present a general framework to predict the excess solubility of small molecular solids (such as pharmaceutical solids) in binary solvents via molecular simulation free energy calculations at infinite dilution with conventional molecular models. The present study used molecular dynamics with the General AMBER Force Field to predict the excess solubility of acetanilide, acetaminophen, phenacetin, benzocaine, and caffeine in binary water/ethanol solvents. The simulations are able to predict the existence of solubility enhancement and the results are in good agreement with available experimental data. The accuracy of the predictions in addition to the generality of the method suggests that molecular simulations may be a valuable design tool for solvent selection in drug development processes.

  4. Assessment and Validation of Machine Learning Methods for Predicting Molecular Atomization Energies.

    PubMed

    Hansen, Katja; Montavon, Grégoire; Biegler, Franziska; Fazli, Siamac; Rupp, Matthias; Scheffler, Matthias; von Lilienfeld, O Anatole; Tkatchenko, Alexandre; Müller, Klaus-Robert

    2013-08-13

    The accurate and reliable prediction of properties of molecules typically requires computationally intensive quantum-chemical calculations. Recently, machine learning techniques applied to ab initio calculations have been proposed as an efficient approach for describing the energies of molecules in their given ground-state structure throughout chemical compound space (Rupp et al. Phys. Rev. Lett. 2012, 108, 058301). In this paper we outline a number of established machine learning techniques and investigate the influence of the molecular representation on the methods' performance. The best methods achieve prediction errors of 3 kcal/mol for the atomization energies of a wide variety of molecules. Rationales for this performance improvement are given together with pitfalls and challenges when applying machine learning approaches to the prediction of quantum-mechanical observables.

  5. Enhancing Predictive Accuracy of Cardiac Autonomic Neuropathy Using Blood Biochemistry Features and Iterative Multitier Ensembles.

    PubMed

    Abawajy, Jemal; Kelarev, Andrei; Chowdhury, Morshed U; Jelinek, Herbert F

    2016-01-01

    Blood biochemistry attributes form an important class of tests, routinely collected several times per year for many patients with diabetes. The objective of this study is to investigate the role of blood biochemistry for improving the predictive accuracy of the diagnosis of cardiac autonomic neuropathy (CAN) progression. Blood biochemistry contributes to CAN, and so it is a causative factor that can provide additional power for the diagnosis of CAN especially in the absence of a complete set of Ewing tests. We introduce automated iterative multitier ensembles (AIME) and investigate their performance in comparison to base classifiers and standard ensemble classifiers for blood biochemistry attributes. AIME incorporate diverse ensembles into several tiers simultaneously and combine them into one automatically generated integrated system so that one ensemble acts as an integral part of another ensemble. We carried out extensive experimental analysis using large datasets from the diabetes screening research initiative (DiScRi) project. The results of our experiments show that several blood biochemistry attributes can be used to supplement the Ewing battery for the detection of CAN in situations where one or more of the Ewing tests cannot be completed because of the individual difficulties faced by each patient in performing the tests. The results show that AIME provide higher accuracy as a multitier CAN classification paradigm. The best predictive accuracy of 99.57% has been obtained by the AIME combining decorate on top tier with bagging on middle tier based on random forest. Practitioners can use these findings to increase the accuracy of CAN diagnosis.

  6. Predicting bioconcentration of chemicals into vegetation from soil or air using the molecular connectivity index

    SciTech Connect

    Dowdy, D.L.; McKone, T.E.; Hsieh, D.P.H.

    1995-12-31

    Bioconcentration factors (BCFs) are the ratio of chemical concentration found in an exposed organism (in this case a plant) to the concentration in an air or soil exposure medium. The authors examine here the use of molecular connectivity indices (MCIs) as quantitative structure-activity relationships (QSARs) for predicting BCFs for organic chemicals between plants and air or soil. The authors compare the reliability of the octanol-air partition coefficient (K_oa) to the MC based prediction method for predicting plant/air partition coefficients. The authors also compare the reliability of the octanol/water partition coefficient (K_ow) to the MC based prediction method for predicting plant/soil partition coefficients. The results here indicate that, relative to the use of K_ow or K_oa as predictors of BCFs, the MC can substantially increase the reliability with which BCFs can be estimated. The authors find that the MC provides a relatively precise and accurate method for predicting the potential biotransfer of a chemical from environmental media into plants. In addition, the MC is much faster and more cost effective than direct measurements.
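
    The definition given in the first sentence of this abstract can be written compactly as follows. The symbols C_plant and C_medium are notation introduced here for illustration and do not appear in the record.

```latex
\mathrm{BCF} = \frac{C_{\text{plant}}}{C_{\text{medium}}},
\qquad \text{where the exposure medium is air or soil.}
```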

  7. Improved individualized prediction of schizophrenia in subjects at familial high risk, based on neuroanatomical data, schizotypal and neurocognitive features.

    PubMed

    Zarogianni, Eleni; Storkey, Amos J; Johnstone, Eve C; Owens, David G C; Lawrie, Stephen M

    2017-03-01

    To date, there are no reliable markers for predicting onset of schizophrenia in individuals at high risk (HR). Substantial promise is, however, shown by a variety of pattern classification approaches to neuroimaging data. Here, we examined the predictive accuracy of support vector machine (SVM) in later diagnosing schizophrenia, at a single-subject level, using a cohort of HR individuals drawn from multiply affected families and a combination of neuroanatomical, schizotypal and neurocognitive variables. Baseline structural magnetic resonance imaging (MRI), schizotypal and neurocognitive data from 17 HR subjects, who subsequently developed schizophrenia and a matched group of 17 HR subjects who did not make the transition, yet had psychotic symptoms, were included in the analysis. We employed recursive feature elimination (RFE), in a nested cross-validation scheme to identify the most significant predictors of disease transition and enhance diagnostic performance. Classification accuracy was 94% when a self-completed measure of schizotypy, a declarative memory test and structural MRI data were combined into a single learning algorithm; higher than when either quantitative measure was used alone. The discriminative neuroanatomical pattern involved gray matter volume differences in frontal, orbito-frontal and occipital lobe regions bilaterally as well as parts of the superior, medial temporal lobe and cerebellar regions. Our findings suggest that an early SVM-based prediction of schizophrenia is possible and can be improved by combining schizotypal and neurocognitive features with neuroanatomical variables. However, our predictive model needs to be tested by classifying a new, independent HR cohort in order to estimate its validity.

  8. Phenytoin-Induced Gingival Overgrowth: A Review of the Molecular, Immune, and Inflammatory Features

    PubMed Central

    Corrêa, Jôice Dias; Queiroz-Junior, Celso Martins; Costa, José Eustáquio; Teixeira, Antônio Lúcio; Silva, Tarcilia Aparecida

    2011-01-01

    Gingival overgrowth (GO) is a side effect associated with some distinct classes of drugs, such as anticonvulsants, immunosuppressants, and calcium channel blockers. GO is characterized by the accumulation of extracellular matrix in gingival connective tissues, particularly collagenous components, with varying degrees of inflammation. One of the main drugs associated with GO is the antiepileptic phenytoin, which affects gingival tissues by altering extracellular matrix metabolism. Nevertheless, the pathogenesis of such drug-induced GO remains clouded by some contradictory findings. This paper aims to present the most relevant studies regarding the molecular, immune, and inflammatory aspects of phenytoin-induced gingival overgrowth. PMID:21991476

  9. Predictive value of combined clinically diagnosed bruxism and occlusal features for TMJ pain.

    PubMed

    Manfredini, Daniele; Peretta, Redento; Guarda-Nardini, Luca; Ferronato, Giuseppe

    2010-04-01

    Several studies have shown a decreased role for occlusion in the etiology of temporomandibular disorders (TMD). Nonetheless, it may be hypothesized that occlusion acts as a modulator through which bruxism activities may cause damage to the stomatognathic structures. To test this hypothesis, a logistic regression model was created with the inclusion of clinically diagnosed bruxism and eight occlusal features as potential predictors for temporomandibular joint (TMJ) pain in a sample of 276 consecutive TMD patients. The final logit showed that the percentage of the total log likelihood for TMJ pain explained by the significant factors was small and amounted to 13.2%, with unacceptable levels of sensitivity (16.4%). The parameters overbite ≥ 4 mm combined with clinically diagnosed bruxism [OR (odds ratio) 4.62], overjet ≥ 5 mm (OR 2.83), and asymmetrical molar relationship combined with clinically diagnosed bruxism (OR 2.77) were those with the highest odds for disease, even though none of those values was significant with respect to confidence intervals. Thus, the hypothesis under evaluation has to be rejected. It is possible that future studies with a higher discriminatory power for the different bruxism activities might be indicated to get deeper into the analysis of the potential mechanisms through which occlusion may play a role, even if small, in the etiology of the different TMD.
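
    The remark that an odds ratio can be large yet not significant "with respect to confidence intervals" is easier to see with a small sketch. In logistic regression the odds ratio for a predictor is the exponential of its coefficient, and a Wald 95% interval is obtained by exponentiating the coefficient plus or minus 1.96 standard errors. The odds ratio 4.62 is taken from the abstract; the standard error of 0.9 is a purely hypothetical value chosen for illustration, as the abstract reports no standard errors or intervals.

```python
import math

def odds_ratio_with_ci(beta, std_err, z=1.96):
    """Return (OR, lower, upper) for a logistic regression coefficient.

    Uses the Wald interval exp(beta +/- z * std_err); z = 1.96 gives ~95%.
    """
    return (
        math.exp(beta),
        math.exp(beta - z * std_err),
        math.exp(beta + z * std_err),
    )

beta = math.log(4.62)       # coefficient implied by the reported OR of 4.62
hypothetical_se = 0.9       # ASSUMED value, not reported in the abstract

or_value, ci_low, ci_high = odds_ratio_with_ci(beta, hypothetical_se)
includes_one = ci_low < 1.0 < ci_high
print(round(or_value, 2), includes_one)
```

    When the interval contains 1, as it does under this assumed standard error, the data are compatible with no association, so a high point estimate alone does not establish a predictor as significant.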

  10. Plasma etching of high-resolution features in a fullerene molecular resist

    NASA Astrophysics Data System (ADS)

    Manyam, J.; Manickam, M.; Preece, J. A.; Palmer, R. E.; Robinson, A. P. G.

    2011-04-01

    As resist films become thinner, so as to reduce problems of aspect ratio related pattern collapse at high-resolution, it is becoming increasingly difficult to transfer patterns with useful aspect ratio by directly etching the resist. It has become common to use the photoresist to pattern an intermediate hardmask, which then protects the silicon substrate during etching, allowing useful aspect ratios but adding process complexity. We have previously described a fullerene based electron beam lithography resist capable of 20 nm halfpitch and 12 nm sparse features, at a sensitivity of less than 10 μC/cm² at 20 keV. The fullerene resist has high etch durability - comparable to that of commercial novolac resists - and has previously demonstrated an etch selectivity of 3:1 to silicon using electron cyclotron resonance microwave plasma etching with SF6. Here a study of the capabilities of this resist when using Inductively Coupled Plasma etching is presented. Line-space patterns with half-pitches in the range 25 nm to 100 nm, together with sparse features (~20 nm linewidth on a 200 nm pitch) were produced in ~30 nm thick resist films using electron beam lithography, and transferred to silicon using an inductively coupled plasma etcher. Several combinations of SF6, CF4, CHF3 and C4F8 process gases were explored. Etch selectivity and anisotropy were studied as a range of etching parameters, such as ICP and RF power, gas flow rate, pressure and temperature were varied. Etch selectivities in excess of 9:1 were demonstrated. Techniques for minimizing aspect ratio dependent etching effects in dense features, including the use of ashing or high etching pressures were also examined.

  11. Using Molecular Mechanics to Predict Bulk Material Properties of Fibronectin Fibers

    PubMed Central

    Bradshaw, Mark J.; Cheung, Man C.; Ehrlich, Daniel J.; Smith, Michael L.

    2012-01-01

    The structural proteins of the extracellular matrix (ECM) form fibers with finely tuned mechanical properties matched to the time scales of cell traction forces. Several proteins such as fibronectin (Fn) and fibrin undergo molecular conformational changes that extend the proteins and are believed to be a major contributor to the extensibility of bulk fibers. The dynamics of these conformational changes have been thoroughly explored since the advent of single molecule force spectroscopy and molecular dynamics simulations but remarkably, these data have not been rigorously applied to the understanding of the time dependent mechanics of bulk ECM fibers. Using measurements of protein density within fibers, we have examined the influence of dynamic molecular conformational changes and the intermolecular arrangement of Fn within fibers on the bulk mechanical properties of Fn fibers. Fibers were simulated as molecular strands with architectures that promote either equal or disparate molecular loading under conditions of constant extension rate. Measurements of protein concentration within micron scale fibers using deep ultraviolet transmission microscopy allowed the simulations to be scaled appropriately for comparison to in vitro measurements of fiber mechanics as well as providing estimates of fiber porosity and water content, suggesting Fn fibers are approximately 75% solute. Comparing the properties predicted by single molecule measurements to in vitro measurements of Fn fibers showed that domain unfolding is sufficient to predict the high extensibility and nonlinear stiffness of Fn fibers with surprising accuracy, with disparately loaded fibers providing the best fit to experiment. This work shows the promise of this microstructural modeling approach for understanding Fn fiber properties, which is generally applicable to other ECM fibers, and could be further expanded to tissue scale by incorporating these simulated fibers into three dimensional network models. 

  12. Molecular features of the prazosin molecule required for activation of Transport-P.

    PubMed

    da Silva, Joaquim Fernando Mendes; Walters, Marcus; Al-Damluji, Saad; Ganellin, C Robin

    2008-08-01

    Closely related structural analogues of prazosin have been synthesised and tested for inhibition and activation of Transport-P in order to identify the structural features of the prazosin molecule that appear to be necessary for activation of Transport-P. So far, all the compounds tested are less active than prazosin. It is shown that the structure of prazosin appears to be very specific for the activation. Only quinazolines have been found to activate, and the presence of the 6,7-dimethoxy and 4-amino groups appears to be critically important.

  13. Mouse Grueneberg ganglion neurons share molecular and functional features with C. elegans amphid neurons

    PubMed Central

    Brechbühl, Julien; Moine, Fabian; Broillet, Marie-Christine

    2013-01-01

    The mouse Grueneberg ganglion (GG) is an olfactory subsystem located at the tip of the nose close to the entry of the naris. It comprises neurons that are both sensitive to cold temperature and play an important role in the detection of alarm pheromones (APs). This chemical modality may be essential for species survival. Interestingly, GG neurons display an atypical mammalian olfactory morphology with neurons bearing deeply invaginated cilia mostly covered by ensheathing glial cells. We had previously noticed their morphological resemblance with the chemosensory amphid neurons found in the anterior region of the head of Caenorhabditis elegans (C. elegans). We demonstrate here further molecular and functional similarities. Thus, we found an orthologous expression of molecular signaling elements that was furthermore restricted to similar specific subcellular localizations. Calcium imaging also revealed a ligand selectivity for the methylated thiazole odorants that amphid neurons are known to detect. Cellular responses from GG neurons evoked by chemical or temperature stimuli were also partially cGMP-dependent. In addition, we found that, although behaviors depending on temperature sensing in the mouse, such as huddling and thermotaxis did not implicate the GG, the thermosensitivity modulated the chemosensitivity at the level of single GG neurons. Thus, the striking similarities with the chemosensory amphid neurons of C. elegans conferred to the mouse GG neurons unique multimodal sensory properties. PMID:24367309

  14. Crypto-rhombomeres of the mouse medulla oblongata, defined by molecular and morphological features.

    PubMed

    Tomás-Roca, Laura; Corral-San-Miguel, Rubén; Aroca, Pilar; Puelles, Luis; Marín, Faustino

    2016-03-01

    The medulla oblongata is the caudal portion of the vertebrate hindbrain. It contains major ascending and descending fiber tracts as well as several motor and interneuron populations, including neural centers that regulate the visceral functions and the maintenance of bodily homeostasis. In the avian embryo, it has been proposed that the primordium of this region is subdivided into five segments or crypto-rhombomeres (r7-r11), which were defined according to either their parameric position relative to intersomitic boundaries (Cambronero and Puelles, in J Comp Neurol 427:522-545, 2000) or a stepped expression of Hox genes (Marín et al., in Dev Biol 323:230-247, 2008). In the present work, we examine the implied similar segmental organization of the mouse medulla oblongata. To this end, we analyze the expression pattern of Hox genes from groups 3 to 8, comparing them to the expression of given cytoarchitectonic and molecular markers, from mid-gestational to perinatal stages. As a result of this approach, we conclude that the mouse medulla oblongata is segmentally organized, similarly as in avian embryos. Longitudinal structures such as the nucleus of the solitary tract, the dorsal vagal motor nucleus, the hypoglossal motor nucleus, the descending trigeminal and vestibular columns, or the reticular formation appear subdivided into discrete segmental units. Additionally, our analysis identified an internal molecular organization of the migrated pontine nuclei that reflects a differential segmental origin of their neurons as assessed by Hox gene expression.

  15. [Analysis of Conformational Features of Watson-Crick Duplex Fragments by Molecular Mechanics and Quantum Mechanics Methods].

    PubMed

    Poltev, V I; Anisimov, V M; Sanchez, C; Deriabina, A; Gonzalez, E; Garcia, D; Rivas, F; Polteva, N A

    2016-01-01

    It is generally accepted that the important characteristic features of the Watson-Crick duplex originate from the molecular structure of its subunits. However, it remains to be elucidated which properties of each subunit are responsible for the significant characteristic features of the DNA structure. The computations of desoxydinucleoside monophosphates complexes with Na-ions using density functional theory revealed a pivotal role of DNA conformational properties of single-chain minimal fragments in the development of unique features of the Watson-Crick duplex. We found that directionality of the sugar-phosphate backbone and the preferable ranges of its torsion angles, combined with the difference between purines and pyrimidines in ring bases, define the dependence of three-dimensional structure of the Watson-Crick duplex on nucleotide base sequence. In this work, we extended these density functional theory computations to the minimal fragments of DNA duplex, complementary desoxydinucleoside monophosphates complexes with Na-ions. Using several computational methods and various functionals, we performed a search for energy minima of BI-conformation for complementary desoxydinucleoside monophosphates complexes with different nucleoside sequences. Two sequences are optimized using ab initio method at the MP2/6-31++G** level of theory. The analysis of torsion angles, sugar ring puckering and mutual base positions of optimized structures demonstrates that the conformational characteristic features of complementary desoxydinucleoside monophosphates complexes with Na-ions remain within BI ranges and become closer to the corresponding characteristic features of the Watson-Crick duplex crystals. Qualitatively, the main characteristic features of each studied complementary desoxydinucleoside monophosphates complex remain invariant when different computational methods are used, although the quantitative values of some conformational parameters could vary lying within the

  16. Plume and surface feature structure and compositional effects on Europa's global exosphere: Preliminary Europa mission predictions

    NASA Astrophysics Data System (ADS)

    Teolis, B. D.; Wyrick, D. Y.; Bouquet, A.; Magee, B. A.; Waite, J. H.

    2017-03-01

    A Europa plume source, if present, may produce a global exosphere with complex spatial structure and temporal variability in its density and composition. To investigate this interaction we have integrated a water plume source containing multiple organic and nitrile species into a Europan Monte Carlo exosphere model, considering the effect of Europa's gravity in returning plume ejecta to the surface, and the subsequent spreading of adsorbed and exospheric material by thermal desorption and re-sputtering across the entire body. We consider sputtered, radiolytic and potential plume sources, together with surface adsorption, regolith diffusion, polar cold trapping, and re-sputtering of adsorbed materials, and examine the spatial distribution and temporal evolution of the exospheric density and composition. These models provide a predictive basis for telescopic observations (e.g. HST, JWST) and planned missions to the Jovian system by NASA and ESA. We apply spacecraft trajectories to our model to explore possible exospheric compositions which may be encountered along proposed flybys of Europa to inform the spatial and temporal relationship of spacecraft measurements to surface and plume source compositions. For the present preliminary study, we have considered four cases: Case A: an equatorial flyby through a sputtered only exosphere (no plumes), Case B: a flyby over a localized sputtered 'macula' terrain enriched in non-ice species, Case C: a south polar plume with an Enceladus-like composition, equatorial flyby, and Case D: a south polar plume, flyby directly through the plume.

  17. Foreign exchange market data analysis reveals statistical features that predict price movement acceleration

    NASA Astrophysics Data System (ADS)

    Nacher, Jose C.; Ochiai, Tomoshiro

    2012-05-01

    Increasingly accessible financial data allow researchers to infer market-dynamics-based laws and to propose models that are able to reproduce them. In recent years, several stylized facts have been uncovered. Here we perform an extensive analysis of foreign exchange data that leads to the unveiling of a statistical financial law. First, our findings show that, on average, volatility increases more when the price exceeds the highest (or lowest) value, i.e., breaks the resistance line. We call this the breaking-acceleration effect. Second, our results show that the probability P(T) to break the resistance line in the past time T follows a power law in both real data and theoretically simulated data. However, the probability calculated using real data is considerably lower than the one obtained using a traditional Black-Scholes (BS) model. Taken together, the present analysis characterizes a different stylized fact of financial markets and shows that the market exceeds a past (historical) extreme price fewer times than expected by the BS model (the resistance effect). However, when the market does, we predict that the average volatility at that time point will be much higher. These findings indicate that no Markovian model faithfully captures the market dynamics.
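
    The break probability P(T) described above can be illustrated with a small simulation. The following is a minimal sketch, not the authors' procedure: it assumes a discrete-time geometric Brownian motion as the Black-Scholes reference process, and it assumes that a "break" at time t means the price lies above the maximum or below the minimum of the previous T prices. The function names and parameter values (`simulate_gbm`, `break_probability`, `mu`, `sigma`) are hypothetical.

```python
import math
import random


def simulate_gbm(n_steps, mu=0.0, sigma=0.01, s0=1.0, seed=0):
    """Simulate a discrete geometric Brownian motion path (assumed BS dynamics)."""
    rng = random.Random(seed)
    prices = [s0]
    for _ in range(n_steps):
        # One-step log-return: drift correction plus Gaussian shock.
        log_ret = (mu - 0.5 * sigma ** 2) + sigma * rng.gauss(0.0, 1.0)
        prices.append(prices[-1] * math.exp(log_ret))
    return prices


def break_probability(prices, window):
    """Fraction of time steps where the price leaves the range of the
    previous `window` prices (assumed definition of a resistance break)."""
    breaks = 0
    total = 0
    for t in range(window, len(prices)):
        past = prices[t - window:t]
        if prices[t] > max(past) or prices[t] < min(past):
            breaks += 1
        total += 1
    return breaks / total if total else 0.0


if __name__ == "__main__":
    path = simulate_gbm(5000)
    for T in (10, 50, 250):
        print(T, break_probability(path, T))
```

    Applying `break_probability` to real exchange-rate data and to the simulated path for a range of T values would allow the two curves to be compared on a log-log plot, which is one way to examine the power-law behaviour the abstract reports.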

  18. Iodine atoms: a new molecular feature for the design of potent transthyretin fibrillogenesis inhibitors.

    PubMed

    Mairal, Teresa; Nieto, Joan; Pinto, Marta; Almeida, Maria Rosário; Gales, Luis; Ballesteros, Alfredo; Barluenga, José; Pérez, Juan J; Vázquez, Jesús T; Centeno, Nuria B; Saraiva, Maria Joao; Damas, Ana M; Planas, Antoni; Arsequell, Gemma; Valencia, Gregorio

    2009-01-01

    The thyroid hormone and retinol transporter protein known as transthyretin (TTR) is at the origin of one of the 20 or so known amyloid diseases. TTR self-assembles as a homotetramer leaving a central hydrophobic channel with two symmetrical binding sites. The aggregation pathway of TTR into amyloid fibrils is not yet well characterized, but in vitro binding of thyroid hormones and other small organic molecules to the TTR binding channel results in tetramer stabilization, which prevents amyloid formation to an extent which is proportional to the binding constant. Up to now, TTR aggregation inhibitors have been designed looking at various structural features of this binding channel other than its ability to host iodine atoms. In the present work, greatly improved inhibitors have been designed and tested by taking into account that thyroid hormones are unique in human biochemistry owing to the presence of multiple iodine atoms in their molecules which are probed to interact with specific halogen binding domains sitting at the TTR binding channel. The new TTR fibrillogenesis inhibitors are based on the diflunisal core structure because diflunisal is a registered salicylate drug with NSAID activity now undergoing clinical trials for TTR amyloid diseases. Biochemical and biophysical evidence confirms that iodine atoms can be an important design feature in the search for candidate drugs for TTR related amyloidosis.

  19. Correlation of clinical and molecular features in spinal bulbar muscular atrophy

    PubMed Central

    Nirmalananthan, Niranjanan; Masset, Luc; Skorupinska, Iwona; Collins, Toby; Cortese, Andrea; Pemble, Sally; Malaspina, Andrea; Fisher, Elizabeth M.C.; Greensmith, Linda; Hanna, Michael G.

    2014-01-01

    Objectives: To characterize the clinical and genetic features of spinal bulbar muscular atrophy (SBMA), a rare neurodegenerative disorder caused by the expansion of a CAG repeat in the first exon of the androgen receptor gene, in the United Kingdom. Methods: We created a national register for SBMA in the United Kingdom and recruited 61 patients between 2005 and 2013. In our cross-sectional study, we assessed, by direct questioning, impairment of activities of daily living (ADL) milestones, functional rating, and subjective disease impact, and performed correlations with both CAG repeat size and degree of somatic mosaicism. Ten patients were deceased, 46 patients participated in the study, and 5 declined. Results: Subjects had an average age at onset of 43.4 years, and weakness onset most frequently occurred in the lower limbs (87%). Impaired mobility was the most frequently reported problem by patients, followed by bulbar dysfunction. Age distribution of the impairment of ADL milestones showed remarkable overlap with a Japanese study. We have identified a significant correlation between the number of CAG repeats and both age at onset and ADL milestones. Somatic mosaicism also showed a correlation with CAG expansion size and age at onset. Conclusions: Clinical features in SBMA show a substantial overlap when comparing populations with different genetic backgrounds. This finding has major implications, because multicenter trials will be necessary to obtain sufficient power in future clinical trials. Clinical-genetic correlations are strong in SBMA and should inform any clinical research strategy in this condition. PMID:24814851

  20. Electrophysiological features of inherited demyelinating neuropathies: A reappraisal in the era of molecular diagnosis.

    PubMed

    Lewis, R A; Sumner, A J; Shy, M E

    2000-10-01

    The observation that inherited demyelinating neuropathies have uniform conduction slowing and that acquired disorders have nonuniform or multifocal slowing was made prior to the identification of mutations in myelin-specific genes which cause many of the inherited disorders involving peripheral nerve myelin. It is now clear that the electrophysiological aspects of these disorders are more complex than previously realized. Specifically, certain mutations appear to induce nonuniform slowing of conduction that resembles the findings in acquired demyelinating neuropathies. It is clinically important to recognize the different electrodiagnostic patterns of the various inherited demyelinating neuropathies. In addition, an understanding of the relationship between mutations of specific genes and their associated neurophysiological findings is likely to facilitate understanding of the role of these myelin proteins in peripheral nerve function and of how abnormalities in myelin proteins lead to neuropathy. We therefore review the current information on the electrophysiological features of the inherited demyelinating neuropathies in hopes of clarifying their electrodiagnostic features and shedding light on the physiological consequences of the different genetic mutations.

  1. Iodine Atoms: A New Molecular Feature for the Design of Potent Transthyretin Fibrillogenesis Inhibitors

    PubMed Central

    Pinto, Marta; Almeida, Maria Rosário; Gales, Luis; Ballesteros, Alfredo; Barluenga, José; Pérez, Juan J.; Vázquez, Jesús T.; Centeno, Nuria B.; Saraiva, Maria Joao; Damas, Ana M.; Planas, Antoni; Arsequell, Gemma; Valencia, Gregorio

    2009-01-01

    The thyroid hormone and retinol transporter protein known as transthyretin (TTR) is at the origin of one of the 20 or so known amyloid diseases. TTR self-assembles as a homotetramer leaving a central hydrophobic channel with two symmetrical binding sites. The aggregation pathway of TTR into amyloid fibrils is not yet well characterized, but in vitro binding of thyroid hormones and other small organic molecules to the TTR binding channel results in tetramer stabilization, which prevents amyloid formation to an extent which is proportional to the binding constant. Up to now, TTR aggregation inhibitors have been designed looking at various structural features of this binding channel other than its ability to host iodine atoms. In the present work, greatly improved inhibitors have been designed and tested by taking into account that thyroid hormones are unique in human biochemistry owing to the presence of multiple iodine atoms in their molecules which are probed to interact with specific halogen binding domains sitting at the TTR binding channel. The new TTR fibrillogenesis inhibitors are based on the diflunisal core structure because diflunisal is a registered salicylate drug with NSAID activity now undergoing clinical trials for TTR amyloid diseases. Biochemical and biophysical evidence confirms that iodine atoms can be an important design feature in the search for candidate drugs for TTR related amyloidosis. PMID:19125186

  2. Specific molecular signatures of non-tumor liver tissue may predict a risk of hepatocarcinogenesis

    PubMed Central

    Utsunomiya, Tohru; Shimada, Mitsuo; Morine, Yuji; Tajima, Atsushi; Imoto, Issei

    2014-01-01

    Hepatocellular carcinoma (HCC) is one of the most common human cancers and a major cause of cancer-related death worldwide. The bleak outcomes of HCC patients even after curative treatment have been, at least partially, attributed to its multicentric origin. Therefore, it is necessary to examine not only tumor tissue but also non-tumor liver tissue to investigate the molecular mechanisms operating during hepatocarcinogenesis based on the concept of “field cancerization”. Several studies previously investigated the association of molecular alterations in non-tumor liver tissue with clinical features and prognosis in HCC patients on a genome-wide scale. In particular, specific alterations of DNA methylation profiles have been confirmed in non-tumor liver tissue. This review focuses on the possible clinical value of array-based comprehensive analyses of molecular alterations, especially aberrant DNA methylation, in non-tumor liver tissue to clarify the risk of hepatocarcinogenesis. Carcinogenetic risk estimation based on specific methylation signatures may be advantageous for close follow-up of patients who are at high risk of HCC development. Furthermore, epigenetic therapies for patients with chronic liver diseases may be helpful to reduce the risk of HCC development because epigenetic alterations are potentially reversible, and thus provide promising molecular targets for therapeutic intervention. PMID:24766251

  3. Clinicopathologic and Molecular Features of Colorectal Adenocarcinoma with Signet-Ring Cell Component

    PubMed Central

    Gao, Jing; Li, Jian; Li, Jie; Qi, Changsong; Li, Yanyan; Li, Zhongwu; Shen, Lin

    2016-01-01

    Background: We performed a retrospective study to assess the clinicopathological characteristics, molecular alterations and multigene mutation profiles in colorectal cancer patients with signet-ring cell component. Methods: Between November 2008 and January 2015, 61 consecutive primary colorectal carcinomas with signet-ring cell component were available for pathological confirmation. RAS/BRAF status was determined by direct sequencing. Fourteen genes associated with hereditary cancer syndromes were analyzed by targeted gene sequencing. Results: A slight male predominance was detected in these patients (59.0%). Colorectal carcinomas with signet-ring cell component were well distributed along the large intestine. A higher TNM stage at the time of diagnosis was frequently observed, compared with conventional adenocarcinoma. Family history of malignant tumor was notable, being present in 49.2% of the 61 cases. The median OS time of stage IV patients in our study was 14 months. RAS mutations were detected in 22.2% (12/54) of cases, with KRAS mutations in 16.7% (9/54) and NRAS mutations in 5.4% (3/54). BRAF V600E mutation was detected in 3.7% (2/54) of cases. As an exploration, we analyzed 14 genes by targeted gene sequencing. These genes were selected based on their biological role in association with hereditary cancer syndromes. At least one pathogenic mutation was carried by 79.6% of cases. Finally, the patients were classified by the percentage of signet-ring cells. 39 (63.9%) cases were composed of ≥50% signet-ring cells; 22 (36.1%) cases were composed of <50% signet-ring cells. We compared clinical parameters, molecular and genetic alterations between the two groups and found no significant differences. Conclusions: Colorectal adenocarcinoma with signet-ring cell component is characterized by advanced stage at diagnosis with a notable family history of malignant tumor. It is likely a negative prognostic factor and tends to affect male patients with low rates of RAS/BRAF mutation. Colorectal

  4. The Hasford Score May Predict Molecular Response in Chronic Myeloid Leukemia Patients: A Single Institution Experience

    PubMed Central

    Jaźwiec, Bożena; Haus, Olga; Urbaniak-Kujda, Donata; Kapelko-Słowik, Katarzyna; Wróbel, Tomasz; Lonc, Tomasz; Sawicki, Mateusz; Mędraś, Ewa; Kaczmar-Dybko, Agnieszka; Kuliczkowski, Kazimierz

    2016-01-01

    The Sokal, Hasford, and EUTOS scores were established in different treatment eras of chronic myeloid leukemia (CML). None of them was reported to predict molecular response. In this single center study we tried to reevaluate the usefulness of the three main scores in the TKI era. The study group included 88 CML patients in first chronic phase treated initially with standard imatinib dose. All of them achieved major molecular response (MMR) at time points defined by European LeukemiaNet (ELN). 42 patients lost MMR in a median time of 47 months and we found a significant difference in MMR maintenance between intermediate-risk (IR) and low-risk (LR) patients assessed by Hasford score. All 42 patients were switched to second-generation TKI (2G-TKI) treatment. At 18 months of 2G-TKI therapy we still found a significant difference in BCR-ABL transcript levels and MMR rate between IR and LR groups. We did not find any of the described differences when discriminating patients by Sokal or EUTOS score. In this retrospective single center analysis we found the Hasford score to be useful in predicting molecular response in CML patients in first chronic phase. PMID:27818567

  5. Computational prediction and experimental selectivity coefficients for hydroxyzine and cetirizine molecularly imprinted polymer based potentiometric sensors.

    PubMed

    Azimi, Abolfazl; Javanbakht, Mehran

    2014-02-17

    Despite the increasing number of uses of molecularly imprinted polymers (MIPs) in many scientific applications, the theoretical aspects of the participating intramolecular forces are not fully understood. This work investigates the effects of the electrostatic force, the Mulliken charge and the role of the cavity's backbone atoms on the selectivity of MIPs. Moreover, charge distribution, which is a computational parameter, was proposed for the prediction of the selectivity coefficients of MIP-based sensors. In the computational approaches and experimental study, methacrylic acid (MAA) was chosen as the functional monomer and ethylene glycol dimethacrylate (EGDMA) as the cross linker for hydroxyzine and cetirizine imprinted polymers. Molecular geometries were optimized using the ab initio DFT B3LYP method. Based on the results of the molecular optimization and the hydrogen bonding properties, possible configurations of 1:n (n≤5) template/monomer complexes were designed and optimized. The binding energy for each complex in the gas phase was calculated. Based on the most stable configuration, hydroxyzine and cetirizine imprinted polymer models were designed. Calculations including the porogen were also carried out. The theoretical charge distributions for the template and some potential interfering molecules were calculated. The results showed a correlation between the selectivity coefficients and the theoretical charge distributions. Surprisingly, the charge distribution based model was able to predict the selectivity coefficients of MIP based potentiometric sensors.
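
    The step of ranking 1:n template/monomer complexes by binding energy can be sketched as follows. This is an illustration only: the abstract does not state the formula used, so the sketch assumes the conventional definition ΔE = E(complex) − [E(template) + n·E(monomer)], with the most negative ΔE taken as the most stable configuration. The function names and all numerical energies are hypothetical placeholders, not values from the study.

```python
def binding_energy(e_complex, e_template, e_monomer, n_monomers):
    """Assumed definition: dE = E(complex) - [E(template) + n * E(monomer)].
    A more negative value indicates a more stable complex."""
    return e_complex - (e_template + n_monomers * e_monomer)


def most_stable(complex_energies, e_template, e_monomer):
    """Return (n, dE) for the complex with the lowest binding energy.
    `complex_energies` maps n (monomers per template) to total energy."""
    energies = {
        n: binding_energy(e, e_template, e_monomer, n)
        for n, e in complex_energies.items()
    }
    best_n = min(energies, key=energies.get)
    return best_n, energies[best_n]


if __name__ == "__main__":
    # Placeholder total energies in arbitrary units (hypothetical values).
    complexes = {1: -6.2, 2: -8.5, 3: -10.4}
    print(most_stable(complexes, e_template=-4.0, e_monomer=-2.0))
```

    In practice the total energies would come from the quantum chemistry calculations for the template, the monomer and each optimized complex, all at the same level of theory.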

  6. Histopathologic features predict survival in diffuse pleural malignant mesothelioma on pleural biopsies.

    PubMed

    Habougit, Cyril; Trombert-Paviot, Béatrice; Karpathiou, Georgia; Casteillo, François; Bayle-Bleuez, Sophie; Fournel, Pierre; Vergnon, Jean-Michel; Tiffet, Olivier; Péoc'h, Michel; Forest, Fabien

    2017-03-27

    Malignant pleural mesothelioma is a rare tumor with a poor prognosis. The only universally recognized pathological prognostic factor is histopathological subtype, with a shorter survival in non-epithelioid subtypes. Recently, a grading of epithelioid mesothelioma on surgical resection has been proposed. The aim of our work is to assess the prognostic role of several histopathological factors in a retrospective cohort of 116 patients with pleural mesothelioma, diagnosed on pleural biopsy in more than 95% of patients. Our work shows that mitotic count <3/10 HPF (p < 0.0001), the lack of necrosis (p = 0.0379), mild nuclear atypia (p = 0.0054), the lack of atypical mitoses (p = 0.0265), a nucleoli size <3 μm (p = 0.0139), and nucleoli absent or visible at 200× or higher magnification (p = 0.0170) are significantly associated with a better median overall survival in epithelioid mesothelioma. The presence of atypical mitoses was found to be related to a worse median survival in non-epithelioid mesothelioma. Mitotic count, necrosis, nuclear atypia, and nucleoli size are not associated with overall survival in non-epithelioid mesothelioma. Our work highlights that histopathological prognostic factors can be assessed on pleural biopsies and can reliably predict median overall survival. This is of interest in order to define subgroups of patients who could benefit from different therapies and to select patients who could benefit from surgical excision.

  7. Proteomics, metabolomics, and protein interactomics in the characterization of the molecular features of major depressive disorder.

    PubMed

    Martins-de-Souza, Daniel

    2014-03-01

    Omics technologies emerged as complementary strategies to genomics in the attempt to understand human illnesses. In general, proteomics technologies emerged earlier than those of metabolomics for major depressive disorder (MDD) research, but both are driven by the identification of proteins and/or metabolites that can delineate a comprehensive characterization of MDD's molecular mechanisms, as well as lead to the identification of biomarker candidates of all types: prognosis, diagnosis, treatment, and patient stratification. Also, one can explore protein and metabolite interactomes in order to pinpoint additional molecules associated with the disease that had not been picked up initially. Here, results and methodological aspects of MDD research using proteomics, metabolomics, and protein interactomics are reviewed, focusing on human samples.

  8. Clinical features and molecular mechanisms of spinal and bulbar muscular atrophy (SBMA).

    PubMed

    Katsuno, Masahisa; Banno, Haruhiko; Suzuki, Keisuke; Adachi, Hiroaki; Tanaka, Fumiaki; Sobue, Gen

    2010-01-01

    Spinal and bulbar muscular atrophy (SBMA) is an adult-onset neurodegenerative disease characterized by slowly progressive muscle weakness and atrophy. The cause of this disease is the expansion of a trinucleotide CAG repeat, which encodes the polyglutamine tract, within the first exon of the androgen receptor (AR) gene. SBMA exclusively occurs in adult males, whereas both heterozygous and homozygous females are usually asymptomatic. Lower motor neurons in the anterior horn of the spinal cord and those in the brainstem motor nuclei are predominantly affected in SBMA, and other neuronal and nonneuronal tissues are also widely involved to some extent. Testosterone-dependent nuclear accumulation of the pathogenic AR protein has been considered to be a fundamental step of the neurodegenerative process, which is followed by several molecular events such as transcriptional dysregulation, axonal transport disruption and mitochondrial dysfunction. Results of animal studies suggest that androgen deprivation and activation of protein quality control systems are potential therapies for SBMA.

  9. Gender-Specific Molecular and Clinical Features underlie Malignant Pleural Mesothelioma

    PubMed Central

    Rienzo, Assunta De; Archer, Michael A.; Yeap, Beow Y.; Dao, Nhien; Sciaranghella, Daniele; Sideris, Antonios C.; Zheng, Yifan; Holman, Alexander G.; Wang, Yaoyu E.; Dal Cin, Paola S.; Fletcher, Jonathan A.; Rubio, Renee; Croft, Larry; Quackenbush, John; Sugarbaker, Peter E.; Munir, Kiara J.; Battilana, Jesse R.; Gustafson, Corinne E.; Chirieac, Lucian R.; Ching, Soo Meng; Wong, James; Tay, Liang Chung; Rudd, Stephen; Hercus, Robert; Sugarbaker, David J.; Richards, William G.; Bueno, Raphael

    2015-01-01

    Malignant pleural mesothelioma (MPM) is an aggressive cancer that occurs more frequently in men, but is associated with longer survival in women. Insight into the survival advantage of female patients may advance the molecular understanding of MPM and identify therapeutic interventions that will improve the prognosis for all MPM patients. In this study, we performed whole-genome sequencing of tumor specimens from 10 MPM patients and matched control samples to identify potential driver mutations underlying MPM. We identified molecular differences associated with gender and histology. Specifically, single-nucleotide variants of BAP1 were observed in 21% of cases, with lower mutation rates observed in sarcomatoid MPM (p<0.001). Chromosome 22q loss was more frequently associated with the epithelioid than with the non-epithelioid histology (p=0.037), whereas CDKN2A deletions occurred more frequently in non-epithelioid subtypes among men (p=0.021) and were correlated with shorter overall survival for the entire cohort (p=0.002) and for men (p=0.012). Furthermore, women were more likely to harbor TP53 mutations (p=0.004). Novel mutations were found in genes associated with the integrin-linked kinase pathway, including MYH9 and RHOA. Moreover, expression levels of BAP1, MYH9, and RHOA were significantly higher in non-epithelioid tumors, and were associated with significant reduction in survival of the entire cohort and across gender subgroups. Collectively, our findings indicate that diverse mechanisms highly related to gender and histology appear to drive MPM. PMID:26554828

  10. Molecular characteristics and prognostic features of breast cancer in Nigerian compared with UK women.

    PubMed

    Agboola, A J; Musa, A A; Wanangwa, N; Abdel-Fatah, T; Nolan, C C; Ayoade, B A; Oyebadejo, T Y; Banjo, A A; Deji-Agboola, A M; Rakha, E A; Green, A R; Ellis, I O

    2012-09-01

    Although breast cancer (BC) incidence is lower in African-American women compared with White-American women, in African countries such as Nigeria, BC is a common disease. Nigerian women have a higher risk of early-onset BC, with a high mortality rate, prompting speculation that risk factors could be genetic and that the molecular portrait of these tumours is different from that of tumours in western women. In this study, 308 BC samples from Nigerian women with complete clinical history and tumour characteristics were included and compared with a large series of BC from the UK as a control group. The immunoprofile of these tumours was characterised using a panel of 11 biomarkers of known relevance to BC. The immunoprofile and patients' outcome were compared with a tumour grade-matched UK control group. Nigerian women presenting with BC were more frequently premenopausal, and their tumours were characterised by large primary tumour size, high tumour grade, advanced lymph node stage, and a higher rate of vascular invasion compared with UK women. In the grade-matched groups, Nigerian BC showed over-representation of triple-negative, basal and BRCA1-deficient phenotypes compared with UK women, but no difference was found regarding HER2 expression between the two series. Nigerian women showed significantly poorer outcome after development of BC compared with UK women. This study demonstrates that there are possible genetic and molecular differences between an indigenous Black population and a UK-based series. The basal-like, triple-negative and BRCA1 dysfunction groups of tumours identified in this study may have implications for the development of screening programs and therapies for African patients and families that are likely to have BRCA1 dysfunction, basal-like and triple-negative tumours.

  11. Predictive diagnostic value for the clinical features accompanying intellectual disability in children with pathogenic copy number variations: a multivariate analysis

    PubMed Central

    2014-01-01

    Background: Array comparative genomic hybridization (a-CGH) has become the first-tier investigation in patients with unexplained developmental delay/intellectual disability (DD/ID). Although the costs are progressively decreasing, a-CGH is still an expensive and labour-intensive technique: for this reason a definition of the categories of patients that can benefit the most from the analysis is needed. The aim of the study was to retrospectively analyze the clinical features of children with DD/ID attending the outpatient clinic of the Mother & Child Department of the University Hospital of Modena who were subjected to a-CGH, and to identify by uni- and multivariate analysis the independent predictors of pathogenic CNVs. Methods: 116 patients were included in the study. Data relative to the CNVs and to the patients’ clinical features were analyzed for genotype/phenotype correlations. Results and conclusions: 27 patients (23.3%) presented pathogenic CNVs (21 deletions, 3 duplications and 3 cases with both duplications and deletions). Univariate analysis showed a significant association of the pathogenic CNVs with the early onset of symptoms (before 1 yr of age) and the presence of malformations and dysmorphisms. Logistic regression analysis showed a significant independent predictive value for diagnosing a pathogenic CNV for malformations (P = 0.002) and dysmorphisms (P = 0.023), suggesting that the presence of those features should make a-CGH analysis a high-priority diagnostic test. PMID:24775911

  12. i-cisTarget: an integrative genomics method for the prediction of regulatory features and cis-regulatory modules.

    PubMed

    Herrmann, Carl; Van de Sande, Bram; Potier, Delphine; Aerts, Stein

    2012-08-01

    The field of regulatory genomics today is characterized by the generation of high-throughput data sets that capture genome-wide transcription factor (TF) binding, histone modifications, or DNAseI hypersensitive regions across many cell types and conditions. In this context, a critical question is how to make optimal use of these publicly available datasets when studying transcriptional regulation. Here, we address this question in Drosophila melanogaster for which a large number of high-throughput regulatory datasets are available. We developed i-cisTarget (where the 'i' stands for integrative), for the first time enabling the discovery of different types of enriched 'regulatory features' in a set of co-regulated sequences in one analysis, being either TF motifs or 'in vivo' chromatin features, or combinations thereof. We have validated our approach on 15 co-expressed gene sets, 21 ChIP data sets, 628 curated gene sets and multiple individual case studies, and show that meaningful regulatory features can be confidently discovered; that bona fide enhancers can be identified, both by in vivo events and by TF motifs; and that combinations of in vivo events and TF motifs further increase the performance of enhancer prediction.

  13. High-order feature-based mixture models of classification learning predict individual learning curves and enable personalized teaching.

    PubMed

    Cohen, Yarden; Schneidman, Elad

    2013-01-08

    Pattern classification learning tasks are commonly used to explore learning strategies in human subjects. The universal and individual traits of learning such tasks reflect our cognitive abilities and have been of interest both psychophysically and clinically. From a computational perspective, these tasks are hard, because the number of patterns and rules one could consider even in simple cases is exponentially large. Thus, when we learn to classify we must use simplifying assumptions and generalize. Studies of human behavior in probabilistic learning tasks have focused on rules in which pattern cues are independent, and also described individual behavior in terms of simple, single-cue, feature-based models. Here, we conducted psychophysical experiments in which people learned to classify binary sequences according to deterministic rules of different complexity, including high-order, multicue-dependent rules. We show that human performance on such tasks is very diverse, but that a class of reinforcement learning-like models that use a mixture of features captures individual learning behavior surprisingly well. These models reflect the important role of subjects' priors, and their reliance on high-order features even when learning a low-order rule. Further, we show that these models predict future individual answers to a high degree of accuracy. We then use these models to build personally optimized teaching sessions and boost learning.

  14. High-order feature-based mixture models of classification learning predict individual learning curves and enable personalized teaching

    PubMed Central

    Cohen, Yarden; Schneidman, Elad

    2013-01-01

    Pattern classification learning tasks are commonly used to explore learning strategies in human subjects. The universal and individual traits of learning such tasks reflect our cognitive abilities and have been of interest both psychophysically and clinically. From a computational perspective, these tasks are hard, because the number of patterns and rules one could consider even in simple cases is exponentially large. Thus, when we learn to classify we must use simplifying assumptions and generalize. Studies of human behavior in probabilistic learning tasks have focused on rules in which pattern cues are independent, and also described individual behavior in terms of simple, single-cue, feature-based models. Here, we conducted psychophysical experiments in which people learned to classify binary sequences according to deterministic rules of different complexity, including high-order, multicue-dependent rules. We show that human performance on such tasks is very diverse, but that a class of reinforcement learning-like models that use a mixture of features captures individual learning behavior surprisingly well. These models reflect the important role of subjects’ priors, and their reliance on high-order features even when learning a low-order rule. Further, we show that these models predict future individual answers to a high degree of accuracy. We then use these models to build personally optimized teaching sessions and boost learning. PMID:23269833

  15. Quantitative Computed Tomography Features for Predicting Tumor Recurrence in Patients with Surgically Resected Adenocarcinoma of the Lung

    PubMed Central

    Shim, Woo Hyun; Xu, Hai; Choi, Chang-Min; Kim, Hyeong Ryul; Lee, Jung Bok

    2017-01-01

    Purpose: The purpose of this study was to determine if preoperative quantitative computed tomography (CT) features including texture and histogram analysis measurements are associated with tumor recurrence in patients with surgically resected adenocarcinoma of the lung. Methods: The study included 194 patients with surgically resected lung adenocarcinoma who underwent preoperative CT between January 2013 and December 2013. Quantitative CT feature analysis of the lung adenocarcinomas was performed using in-house software based on a plug-in package for ImageJ. Ten quantitative features describing the tumor size, attenuation, shape and texture were extracted. The CT parameters obtained from 1-mm and 5-mm data were compared using intraclass correlation coefficients. Univariate and multivariable logistic regression methods were used to investigate the association between tumor recurrence and preoperative CT findings. Results: The 1-mm and 5-mm data were highly correlated in terms of diameter, perimeter, area, mean attenuation and entropy. Circularity and aspect ratio were moderately correlated. However, skewness and kurtosis were poorly correlated. Multivariable logistic regression analysis revealed that area (odds ratio [OR], 1.002 for each 1-mm2 increase; P = 0.003) and mean attenuation (OR, 1.005 for each 1.0-Hounsfield unit increase; P = 0.022) were independently associated with recurrence. The receiver operating curves using these two independent predictive factors showed high diagnostic performance in predicting recurrence (C-index = 0.81, respectively). Conclusion: Tumor area and mean attenuation are independently associated with recurrence in patients with surgically resected adenocarcinoma of the lung. PMID:28068363
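
    The per-unit odds ratios reported in this abstract sit close to 1 because the units (1 mm2, 1 Hounsfield unit) are small. The sketch below is an illustration only: it assumes the usual logistic-regression property that the model is linear in each predictor on the log-odds scale, and the increments of 100 mm2 and 50 HU are hypothetical choices, not values taken from the study. The function name scaled_odds_ratio is likewise invented for this example.

```python
import math


def scaled_odds_ratio(or_per_unit: float, delta: float) -> float:
    """Odds ratio for an increase of `delta` units, given the per-unit OR.

    Logistic regression is linear in log-odds, so OR(delta) = OR(1) ** delta.
    """
    if or_per_unit <= 0:
        raise ValueError("odds ratio must be positive")
    return math.exp(delta * math.log(or_per_unit))


# Per-unit ORs as reported in the abstract; the increments are illustrative only.
or_area_100mm2 = scaled_odds_ratio(1.002, 100)  # tumor area larger by 100 mm^2
or_atten_50hu = scaled_odds_ratio(1.005, 50)    # mean attenuation higher by 50 HU
print(f"{or_area_100mm2:.3f} {or_atten_50hu:.3f}")
```

    Read this way, a per-unit odds ratio of 1.002 is not negligible once the difference between two tumors spans many units.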

  16. Towards a predictive framework for predator risk effects: the interaction of landscape features and prey escape tactics.

    PubMed

    Heithaus, Michael R; Wirsing, Aaron J; Burkholder, Derek; Thomson, Jordan; Dill, Lawrence M

    2009-05-01

    1. Risk effects of predators can profoundly affect community dynamics, but the nature of these effects is context dependent. 2. Although context dependence has hindered the development of a general framework for predicting the nature and extent of risk effects, recent studies suggest that such a framework is attainable if the factors that shape anti-predator behaviour, and its effectiveness, in natural communities are well understood. 3. One of these factors, the interaction of prey escape tactics and landscape features, has been largely overlooked. 4. We tested whether this interaction gives rise to interspecific variation in habitat-use patterns of sympatric large marine vertebrates at risk of tiger shark (Galeocerdo cuvier Peron and LeSueur, 1822) predation. Specifically, we tested the a priori hypothesis that pied cormorants (Phalacrocorax varius Gmelin, 1789) would modify their use of shallow seagrass habitats in a manner opposite to that of previously studied dolphins (Tursiops aduncus Ehrenberg, 1833), dugongs (Dugong dugon Müller, 1776), and green turtles (Chelonia mydas Linnaeus, 1758) because, unlike these species, the effectiveness of cormorant escape behaviour does not vary spatially. 5. As predicted, cormorants used interior and edge portions of banks proportional to the abundance of their potential prey when sharks were absent but shifted to interior portions of banks to minimize encounters with tiger sharks as predation risk increased. Other shark prey, however, shift to edge microhabitats when shark densities increase to take advantage of easier escape despite higher encounter rates with sharks. 6. The interaction of landscape features and escape ability likely is important in diverse communities. 7. When escape probabilities are high in habitats with high predator density, risk effects of predators can reverse the direction of commonly assumed indirect effects of top predators. 8. The interaction between landscape features and prey escape tactics

  17. Using molecular simulation to predict solute solvation and partition coefficients in solvents of different polarity.

    PubMed

    Garrido, Nuno M; Jorge, Miguel; Queimada, António J; Macedo, Eugénia A; Economou, Ioannis G

    2011-05-28

    A methodology is proposed for the prediction of the Gibbs energy of solvation (Δ(Solv)G) based on MD simulations. The methodology is then used to predict Δ(Solv)G of four solutes (namely propane, benzene, ethanol and acetone) in several solvents of different polarities (including n-hexane, n-hexadecane, ethylbenzene, 1-octanol, acetone and water) while testing the validity of the TraPPE force field parameters. Excellent agreement with experimental data is obtained, with average deviations of 0.2, 1.1, 0.8 and 1.2 kJ mol⁻¹, for the four solutes respectively. Subsequently, partition coefficients (log P) for forty different solute/solvent systems are predicted. The a priori knowledge of partition coefficient values is of high importance in chemical and pharmaceutical separation process design or as a measure of the increasingly important environmental fate. Here again, the agreement between experimental data and simulation predictions is excellent, with an absolute average deviation of 0.28 log P units. However, this deviation can be decreased down to 0.14 log P units, just by optimizing partial atomic charges of acetone in the water phase. Consequently, molecular simulation is proven to be a tool with strong physical basis able to predict log P with competitive accuracy when compared to the popular statistical methods with weak physical basis.
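
    The abstract does not spell out how log P follows from the Gibbs energies of solvation. A minimal sketch, assuming the standard thermodynamic relation log P = (ΔG_solv,water - ΔG_solv,organic) / (RT ln 10) with both energies on the same concentration-based standard state, is given below. The function name log_p and the numerical inputs are hypothetical and are not taken from the paper.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # molar gas constant in kJ mol^-1 K^-1


def log_p(dg_solv_water: float, dg_solv_organic: float,
          temperature_k: float = 298.15) -> float:
    """log10 of the organic/water partition coefficient.

    Both solvation free energies are in kJ/mol. A more negative value in the
    organic phase (more favourable solvation there) gives a positive log P.
    """
    rt_ln10 = R_KJ_PER_MOL_K * temperature_k * math.log(10.0)
    return (dg_solv_water - dg_solv_organic) / rt_ln10


# Hypothetical solvation free energies, for illustration only.
print(round(log_p(dg_solv_water=-20.0, dg_solv_organic=-25.0), 2))
```

    Under the same assumption, one log P unit at 298.15 K corresponds to about 5.7 kJ/mol of free-energy difference, so small errors in the two solvation energies translate directly into fractions of a log P unit.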

  18. Clinical features and molecular genetics of two Tunisian families with abetalipoproteinemia.

    PubMed

    Hammer, Monia Benhamed; El Euch-Fayache, Ghada; Nehdi, Houda; Feki, Moncef; Maamouri-Hicheri, Wieme; Hentati, Fayçal; Amouri, Rim

    2014-02-01

    Abetalipoproteinemia (ABL) is a rare monogenic disease characterized by very low plasma levels of cholesterol and triglyceride and almost complete absence of apolipoprotein B (apoB)-containing lipoproteins. Typically, patients present with failure to thrive, acanthocytosis, pigmented retinopathy and neurological features. It has been shown that ABL results from mutations in the gene encoding the microsomal triglyceride transfer protein (MTTP). Sanger sequencing of MTTP was performed for two unrelated consanguineous Tunisian families with two affected individuals each, presenting a more severe ABL phenotype than previously reported in the literature. The patients were found to be homozygous for two novel mutations. In the first family, a nonsense mutation, c.2313T>A, leading to a truncated protein (p.Y771X) was identified. In the second family, a splice mutation, IVS 9+2T>G, was found. These mutations are believed to abolish the assembly and secretion of apoB-containing lipoproteins.

  19. Clinical and Molecular Features of Laron Syndrome, A Genetic Disorder Protecting from Cancer.

    PubMed

    Janecka, Anna; Kołodziej-Rzepa, Marta; Biesaga, Beata

    2016-01-01

    Laron syndrome (LS) is a rare, genetic disorder inherited in an autosomal recessive manner. The disease is caused by mutations of the growth hormone receptor (GHR) gene, leading to a defect in the growth hormone (GH)/insulin-like growth factor type 1 (IGF1) signalling pathway. Patients with LS have characteristic biochemical features, such as a high serum level of GH and low IGF1 concentration. Laron syndrome was first described by the Israeli physician Zvi Laron in 1966. Globally, around 350 people are affected by this syndrome and there are two large groups living in separate geographic regions: Israel (69 individuals) and Ecuador (90 individuals). They are all characterized by a typical appearance, including dwarfism, a characteristic facial phenotype, obesity and hypogenitalism. Additionally, they suffer from hypoglycemia, hypercholesterolemia and sleep disorders, but surprisingly have a very low cancer risk. Therefore, studies on LS offer a unique opportunity to better understand carcinogenesis and develop new strategies of cancer treatment.

  20. Histopathological and molecular features of persistent polyclonal B-cell lymphocytosis (PPBL) with progressive splenomegaly.

    PubMed

    Del Giudice, Ilaria; Pileri, Stefano A; Rossi, Maura; Sabattini, Elena; Campidelli, Cristina; Starza, Irene Della; De Propris, Maria S; Mancini, Francesca; Perrone, Maria P; Gesuiti, Paola; Armiento, Daniele; Quattrocchi, Luisa; Tafuri, Agostino; Amendola, Angela; Mauro, Francesca R; Guarini, Anna; Foà, Robin

    2009-03-01

    Five cases of persistent polyclonal B-cell lymphocytosis (PPBL) with progressive splenomegaly are reported; three were splenectomized. BCL2/IGH rearrangements were found in three cases; HLA-DRB1*07 in all. Bone marrow (BM) trephines showed a moderate lymphoid infiltrate with intrasinusoidal distribution resembling a splenic marginal-zone lymphoma. Splenic white pulp revealed an enlargement of the marginal-zone area; red pulp was infiltrated by the same lymphocytes engulfing the sinuses. Splenic and BM B-lymphocytes were CD79a(+)/CD20(+)/IgM(+)/IgD(+)/bcl-2(+)/CD27(+)/DBA.44(-)/CD31(-) and polyclonal by immunophenotype/polymerase chain reaction. PPBL features an expansion of splenic marginal-zone B-lymphocytes, which infiltrate BM sinusoids and circulate in the blood with no evidence of clonality, even in cases with progressive splenomegaly.

  1. COBRA: A Computational Brewing Application for Predicting the Molecular Composition of Organic Aerosols

    SciTech Connect

    Fooshee, David R.; Nguyen, Tran B.; Nizkorodov, Sergey A.; Laskin, Julia; Laskin, Alexander; Baldi, Pierre

    2012-05-08

    Atmospheric organic aerosols (OA) represent a significant fraction of airborne particulate matter and can impact climate, visibility, and human health. These mixtures are difficult to characterize experimentally due to the enormous complexity and dynamic nature of their chemical composition. We introduce a novel Computational Brewing Application (COBRA) and apply it to modeling oligomerization chemistry stemming from condensation and addition reactions of monomers pertinent to secondary organic aerosol (SOA) formed by photooxidation of isoprene. COBRA uses two lists as input: a list of chemical structures comprising the molecular starting pool, and a list of rules defining potential reactions between molecules. Reactions are performed iteratively, with products of all previous iterations serving as reactants for the next one. The simulation generated thousands of molecular structures in the mass range of 120-500 Da, and correctly predicted ~70% of the individual SOA constituents observed by high-resolution mass spectrometry (HR-MS). Selected predicted structures were confirmed with tandem mass spectrometry. Esterification and hemiacetal formation reactions were shown to play the most significant role in oligomer formation, whereas aldol condensation was shown to be insignificant. COBRA is not limited to atmospheric aerosol chemistry, but is broadly applicable to the prediction of reaction products in other complex mixtures for which reasonable reaction mechanisms and seed molecules can be supplied by experimental or theoretical methods.
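
    The iterative scheme described in this abstract (a starting pool, a list of reaction rules, and products fed back in as reactants) can be sketched as a toy program. This is not COBRA's implementation: the real tool operates on chemical structures and reaction rules, whereas here a molecule is a hypothetical stand-in (a sorted tuple of monomer labels), the single rule simply joins two molecules, and a cap on the number of monomer units plays the role of a mass limit. All names (Molecule, make_join_rule, brew) are invented for the example.

```python
from itertools import combinations_with_replacement
from typing import Callable, Iterable, Optional, Set, Tuple

Molecule = Tuple[str, ...]  # toy stand-in: sorted tuple of monomer labels
Rule = Callable[[Molecule, Molecule], Optional[Molecule]]


def make_join_rule(max_units: int) -> Rule:
    """Build a toy rule that joins two molecules unless the product is too large."""
    def join(a: Molecule, b: Molecule) -> Optional[Molecule]:
        product = tuple(sorted(a + b))
        return product if len(product) <= max_units else None
    return join


def brew(seed: Set[Molecule], rules: Iterable[Rule], iterations: int) -> Set[Molecule]:
    """Apply every rule to every pair in the pool, feeding products back in."""
    rules = list(rules)
    pool = set(seed)
    for _ in range(iterations):
        new_products = set()
        for a, b in combinations_with_replacement(sorted(pool), 2):
            for rule in rules:
                product = rule(a, b)
                if product is not None and product not in pool:
                    new_products.add(product)
        if not new_products:  # nothing new can form, so stop early
            break
        pool |= new_products
    return pool


pool = brew({("A",), ("B",)}, [make_join_rule(max_units=3)], iterations=2)
print(len(pool))
```

    Even in this toy setting the pool grows quickly with each iteration, which is why a size or mass cap is needed to keep the enumeration bounded.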

  2. Clinical and molecular features of a TSH-secreting pituitary microadenoma.

    PubMed

    Usui, Takeshi; Izawa, Shoichiro; Sano, Toshiaki; Tagami, Tetsuya; Nagata, Daisuke; Shimatsu, Akira; Takahashi, Jun A; Naruse, Mitsuhide

    2005-01-01

    We describe a case of a thyroid-stimulating hormone (TSH)-secreting pituitary microadenoma, and report the systematic gene expression profile of the surgically removed tumor. A 50-year-old woman was referred to our hospital because she had high TSH, free-T4, and free-T3 levels, and a pituitary tumor that was visualized with magnetic resonance imaging. Her basal TSH level was high even after a high T3 loading dose, and increased following administration of thyrotropin-releasing hormone (TRH) even after administration of a high dose of exogenous T3. Her clinical symptoms and peripheral markers for T3 were responsive to exogenous T3. There was no thyroid hormone receptor (TR) beta gene mutation. The patient was diagnosed with a TSH-secreting pituitary adenoma, and trans-sphenoid surgery was performed. The histologic features and immunophenotype were consistent with a TSH-secreting pituitary adenoma. Reverse transcription-polymerase chain reaction analysis of pituitary hormones, pituitary-specific transcription factors, receptors, and transcriptional cofactors of clinical significance was performed on the removed tumor. The tumor expressed TSH, growth hormone, prolactin, alpha-subunit, pituitary transcription factor-1 (pit-1) but not proopiomelanocortin (POMC), prophet of pit-1 (prop-1) and pituitary cell-restricted T box factor (Tpit). TRbeta and TRH-receptor gene expression was normal. Three steroid receptor coactivators (SRC)-1, SRC-2, and SRC-3 were expressed. Nuclear receptor corepressor (N-CoR)2 was absent in the tumor, whereas nuclear receptor corepressor (N-CoR1) was expressed. Somatostatin receptor type 1 expression was significantly decreased, whereas type 4 receptor was expressed, which are unusual characteristics for pituitary tumors. The gene expression pattern in the tumor might have a role in the clinical features of this case.

  3. Model predictions of features in microsaccade-related neural responses in a feedforward network with short-term synaptic depression

    NASA Astrophysics Data System (ADS)

    Zhou, Jian-Fang; Yuan, Wu-Jie; Zhou, Zhao; Zhou, Changsong

    2016-02-01

    Recently, significant microsaccade-induced neural responses have been widely observed in experiments. To explore the underlying mechanisms of the observed neural responses, a feedforward network model with short-term synaptic depression has been proposed [Yuan, W.-J., Dimigen, O., Sommer, W. and Zhou, C. Front. Comput. Neurosci. 7, 47 (2013)]. The depression model not only gave an explanation for the role of microsaccades in counteracting visual fading, but also successfully reproduced several microsaccade-related features in experimental findings. These results strongly suggest that the depression model is very useful for investigating microsaccade-related neural responses. In this paper, using the model, we extensively study and predict the dependence of microsaccade-related neural responses on several key parameters, which could be tuned in experiments. In particular, we provide a significant prediction that the microsaccade-related neural response also complies with the property “sharper is better” observed in many contexts in neuroscience. Importantly, the property exhibits a power-law relationship between the width of the input signal and the responsive effectiveness, which is robust against many parameters in the model. Using mean field theory, we analytically investigate the robust power-law property. Our predictions would give theoretical guidance for further experimental investigations of the functional role of microsaccades in visual information processing.

  4. CD44-expressing undifferentiated carcinoma with rhabdoid features of the pancreas: molecular analysis of aggressive invasion and metastasis.

    PubMed

    Ohmoto, Takuji; Yoshitani, Nobuyuki; Nishitsuji, Kazuchika; Takayama, Tetsuji; Yanagisawa, Yuto; Takeya, Motohiro; Sakashita, Naomi

    2015-05-01

    Carcinoma with rhabdoid features is a rare malignant tumor with a poor prognosis whose molecular mechanism for aggressive behavior is unclear. We describe an undifferentiated pancreatic carcinoma with rhabdoid features that demonstrated extensive invasion and metastasis. Examination of a 63-year-old man with back pain disclosed a retroperitoneal tumor with multiple metastases. Lymph node biopsy revealed an undifferentiated carcinoma of unknown origin. Intensive chemotherapy was ineffective; the patient died 3 months after initial symptoms. Autopsy showed that the tumor displaced the retroperitoneal space: it diffusely invaded and destroyed the pancreas and duodenum. Histology demonstrated tumor cells with eccentric vesicular nuclei, large nucleoli, juxtanuclear eosinophilic inclusions, and poor cell adhesion. Immunohistochemistry showed that tumor cells expressed cytokeratin and vimentin, and electron microscopy confirmed a perinuclear mass of intermediate fibrils and lipid droplets, which indicated an undifferentiated carcinoma with rhabdoid features. Tumor tissue contained hyaluronan; tumor cells strongly expressed CD44, matrix metalloproteinase-9, hypoxia-inducible factor-1α, hyaluronan synthase 2, and acyl-CoA:cholesterol acyltransferase 1 and had a high Ki-67(+) ratio. Since hyaluronan is a ligand for CD44, formation of CD44-hyaluronan complex on the cell surface activates CD44 and this activation may explain why the tumor manifested aggressive invasion and metastasis throughout the clinical course.

  5. Molecular analysis of Drosophila eyes absent mutants reveals features of the conserved Eya domain.

    PubMed Central

    Bui, Q T; Zimmerman, J E; Liu, H; Bonini, N M

    2000-01-01

    The eyes absent (eya) gene is critical to eye formation in Drosophila; upon loss of eya function, eye progenitor cells die by programmed cell death. Moreover, ectopic eya expression directs eye formation, and eya functionally synergizes in vivo and physically interacts in vitro with two other genes of eye development, sine oculis and dachshund. The Eya protein sequence, while highly conserved to vertebrates, is novel. To define amino acids critical to the function of the Eya protein, we have sequenced eya alleles. These mutations have revealed that loss of the entire Eya Domain is null for eya activity, but that alleles with truncations within the Eya Domain display partial function. We then extended the molecular genetic analysis to interactions within the Eya Domain. This analysis has revealed regions of special importance to interaction with Sine Oculis or Dachshund. Select eya missense mutations within the Eya Domain diminished the interactions with Sine Oculis or Dachshund. Taken together, these data suggest that the conserved Eya Domain is critical for eya activity and may have functional subregions within it. PMID:10835393

  6. Small molecule kinase inhibitors alleviate different molecular features of myotonic dystrophy type 1.

    PubMed

    Wojciechowska, Marzena; Taylor, Katarzyna; Sobczak, Krzysztof; Napierala, Marek; Krzyzosiak, Wlodzimierz J

    2014-01-01

    Expandable (CTG)n repeats in the 3' UTR of the DMPK gene are a cause of myotonic dystrophy type 1 (DM1), which leads to a toxic RNA gain-of-function disease. Mutant RNAs with expanded CUG repeats are retained in the nucleus and aggregate in discrete inclusions. These foci sequester splicing factors of the MBNL family and trigger upregulation of the CUGBP family of proteins resulting in the mis-splicing of their target transcripts. To date, many efforts to develop novel therapeutic strategies have been focused on disrupting the toxic nuclear foci and correcting aberrant alternative splicing via targeting mutant CUG repeats RNA; however, no effective treatment for DM1 is currently available. Herein, we present results of culturing of human DM1 myoblasts and fibroblasts with two small-molecule ATP-binding site-specific kinase inhibitors, C16 and C51, which resulted in the alleviation of the dominant-negative effects of CUG repeat expansion. Reversal of the DM1 molecular phenotype includes a reduction of the size and number of foci containing expanded CUG repeat transcripts, decreased steady-state levels of CUGBP1 protein, and consequent improvement of the aberrant alternative splicing of several pre-mRNAs misregulated in DM1.

  7. The crystal structure of dihydrofolate reductase from Thermotoga maritima: molecular features of thermostability.

    PubMed

    Dams, T; Auerbach, G; Bader, G; Jacob, U; Ploom, T; Huber, R; Jaenicke, R

    2000-03-31

    Two high-resolution structures have been obtained for dihydrofolate reductase from the hyperthermophilic bacterium Thermotoga maritima in its unliganded state, and in its ternary complex with the cofactor NADPH and the inhibitor, methotrexate. While the overall fold of the hyperthermophilic enzyme is closely similar to monomeric mesophilic dihydrofolate reductase molecules, its quaternary structure is exceptional, in that T. maritima dihydrofolate reductase forms a highly stable homodimer. Here, the molecular reasons for the high intrinsic stability of the enzyme are elaborated and put in context with the available data on the physical parameters governing the folding reaction. The molecule is extremely rigid, even with respect to structural changes during substrate binding and turnover. Subunit cooperativity can be excluded from structural and biochemical data. Major contributions to the high intrinsic stability of the enzyme result from the formation of the dimer. Within the monomer, only subtle stabilizing interactions are detectable, without clear evidence for any of the typical increments of thermal stabilization commonly reported for hyperthermophilic proteins. The docking of the subunits is optimized with respect to high packing density in the dimer interface, additional salt-bridges and beta-sheets. The enzyme does not show significant structural changes upon binding its coenzyme, NADPH, and the inhibitor, methotrexate. The active-site loop, which is known to play an important role in catalysis in mesophilic dihydrofolate reductase molecules, is rearranged, participating in the association of the subunits; it no longer participates in catalysis.

  8. Molecular genetics and clinical features of Birt-Hogg-Dubé syndrome.

    PubMed

    Schmidt, Laura S; Linehan, W Marston

    2015-10-01

    Birt-Hogg-Dubé (BHD) syndrome is an inherited renal cancer syndrome in which affected individuals are at risk of developing benign cutaneous fibrofolliculomas, bilateral pulmonary cysts and spontaneous pneumothoraces, and kidney tumours. Bilateral multifocal renal tumours that develop in BHD syndrome are most frequently hybrid oncocytic tumours and chromophobe renal carcinoma, but can present with other histologies. Germline mutations in the FLCN gene on chromosome 17 are responsible for BHD syndrome--BHD-associated renal tumours display inactivation of the wild-type FLCN allele by somatic mutation or chromosomal loss, confirming that FLCN is a tumour suppressor gene that fits the classic two-hit model. FLCN interacts with two novel proteins, FNIP1 and FNIP2, and with AMPK, a negative regulator of mTOR. Studies with FLCN-deficient cell and animal models support a role for FLCN in modulating the AKT-mTOR pathway. Emerging evidence links FLCN with a number of other molecular pathways and cellular processes important for cell homeostasis that are frequently deregulated in cancer, including regulation of TFE3 and/or TFEB transcriptional activity, amino-acid-dependent mTOR activation through Rag GTPases, TGFβ signalling, PGC1α-driven mitochondrial biogenesis, and autophagy. Currently, surgical intervention is the only therapy available for BHD-associated renal tumours, but improved understanding of the FLCN pathway will hopefully lead to the development of effective forms of targeted systemic therapy for this disease.

  9. Unraveling the distinctive features of hemorrhagic and non-hemorrhagic snake venom metalloproteinases using molecular simulations

    NASA Astrophysics Data System (ADS)

    de Souza, Raoni Almeida; Díaz, Natalia; Nagem, Ronaldo Alves Pinto; Ferreira, Rafaela Salgado; Suárez, Dimas

    2016-01-01

    Snake venom metalloproteinases are important toxins that play fundamental roles during envenomation. They share a structurally similar catalytic domain, but have diverse hemorrhagic capabilities. To understand the structural basis for this difference, we build and compare two dynamical models, one for the hemorrhagic atroxlysin-I from Bothrops atrox and the other for the non-hemorrhagic leucurolysin-a from Bothrops leucurus. The analysis of the extended molecular dynamics simulations shows some changes in the local structure, flexibility and surface determinants that can help to explain the different hemorrhagic activity of the two enzymes. In agreement with previous results, the long Ω-loop (from residue 149 to 177) has a larger mobility in the hemorrhagic protein. In addition, we find some potentially relevant differences at the base of the S1' pocket, which may be of interest for the structure-based design of new anti-venom agents. However, the sharpest differences in the computational models of atroxlysin-I and leucurolysin-a are observed in the surface electrostatic potential around the active site region, suggesting that the hemorrhagic versus non-hemorrhagic activity is probably determined by protein surface determinants.

  10. Molecular Genetics and Clinical Features of Birt-Hogg-Dubé-Syndrome

    PubMed Central

    Schmidt, Laura S.; Linehan, W. Marston

    2016-01-01

    Birt-Hogg-Dubé (BHD) syndrome is an inherited renal cancer syndrome in which affected individuals are at risk to develop benign, cutaneous fibrofolliculomas, bilateral pulmonary cysts and spontaneous pneumothoraces, and kidney tumors. Bilateral multifocal renal tumors that develop in BHD syndrome are most frequently hybrid oncocytic tumors and chromophobe renal carcinoma, but may present with other histologies. Germline mutations in the FLCN gene on chromosome 17 are responsible for BHD syndrome. BHD-associated renal tumors show inactivation of the wild-type FLCN allele by somatic mutation or chromosomal loss, confirming that FLCN is a tumor suppressor gene that fits the classic two-hit model. FLCN interacts with two novel proteins, FNIP1 and FNIP2, and with AMPK, a negative regulator of mTOR. Studies with FLCN-deficient cell and animal models support a role for FLCN in modulating the AKT-mTOR pathway. Emerging evidence links FLCN with a number of other molecular pathways and cellular processes important for cell homeostasis that are frequently deregulated in cancer, including regulation of TFE3/TFEB transcriptional activity, amino acid-dependent mTOR activation through Rag GTPases, TGF-β signaling, PGC1α-driven mitochondrial biogenesis, and autophagy. Currently, surgical intervention is the only therapy available for BHD-associated renal tumors. Further understanding of the FLCN pathway will hopefully lead to the development of effective forms of therapy for this disease. PMID:26334087

  11. Prediction of retention in hydrophilic interaction liquid chromatography using solute molecular descriptors based on chemical structures.

    PubMed

    Taraji, Maryam; Haddad, Paul R; Amos, Ruth I J; Talebi, Mohammad; Szucs, Roman; Dolan, John W; Pohl, Christopher A

    2017-02-24

    Quantitative structure-retention relationship (QSRR) models are developed to predict the retention times of analytes on five hydrophilic interaction liquid chromatography (HILIC) stationary phases (bare silica, amine, amide, diol and zwitterionic), with a view to selecting the most suitable stationary phase(s) for the separation of these analytes. The study was conducted using six β-adrenergic agonists as target analytes. Molecular descriptors were calculated based only on chemical structures optimized using density functional theory. A genetic algorithm (GA) was then used to select the most relevant molecular descriptors and these were used to build a retention model for each stationary phase using partial least squares (PLS) regression. This model was then used to predict the retention of the test set of target analytes. This process created an optimized descriptor set which enhanced the reliability of the developed QSRR models. Finally, the QSRR models developed in the work were utilized to provide some insight into the separation mechanisms operating in the HILIC mode. Three performance criteria (mean absolute error (MAE), root mean square error of prediction scaled to retention time (RMSEP), and the number of selected descriptors) were used to evaluate the developed models when applied to an external test set of six β-adrenergic agonists, and the models showed high predictive ability. MAE values ranged from 13 to 25 s on four of the stationary phases, with a somewhat higher error (50 s) being observed for the zwitterionic phase. RMSEP values of 4.88-11.12% were recorded. Validation was performed through Y-randomization and chemical domain applicability, from which it was evident that the developed optimized GA-PLS models were robust. The high levels of accuracy, reliability and applicability of the models were to a large extent due to the optimization of the GA descriptor set and the presence of relevant structural and geometric molecular descriptors, together with

  12. Quantum chemical prediction of vibrational spectra of large molecular systems with radical or metallic electronic structure

    NASA Astrophysics Data System (ADS)

    Nishimoto, Yoshio; Irle, Stephan

    2017-01-01

    Quantum chemical simulation of infrared (IR) and Raman spectra for molecules with open-shell, radical, or multiradical electronic structure represents a major challenge. We report analytic second-order geometrical derivatives of the Mermin free energy for the second-order self-consistent-charge density-functional tight-binding (DFTB2) method with fractional occupation numbers (FONs). This new method is applied to the evaluation of N-O radical stretching modes in various open-shell molecules and to the prediction of the evolution of IR and Raman spectra of graphene nanoribbons with increasing molecular size.

  13. Zsyntax: a formal language for molecular biology with projected applications in text mining and biological prediction.

    PubMed

    Boniolo, Giovanni; D'Agostino, Marcello; Di Fiore, Pier Paolo

    2010-03-03

    We propose a formal language that allows for transposing biological information precisely and rigorously into machine-readable information. This language, which we call Zsyntax (where Z stands for the Greek word ζωή, life), is grounded on a particular type of non-classical logic, and it can be used to write algorithms and computer programs. We present it as a first step towards a comprehensive formal language for molecular biology in which any biological process can be written and analyzed as a sort of logical "deduction". Moreover, we illustrate the potential value of this language, both in the field of text mining and in that of biological prediction.

  14. Analytic Methods for Predicting Significant Multi-Quanta Effects in Collisional Molecular Energy Transfer

    NASA Technical Reports Server (NTRS)

    Bieniek, Ronald J.

    1996-01-01

    Collision-induced transitions can significantly affect molecular vibrational-rotational populations and energy transfer in atmospheres and gaseous systems. This, in turn, can strongly influence convective heat transfer through dissociation and recombination of diatomics, and radiative heat transfer due to strong vibrational coupling. It is necessary to know state-to-state rates to predict engine performance and aerothermodynamic behavior of hypersonic flows, to analyze diagnostic radiative data obtained from experimental test facilities, and to design heat shields and other thermal protective systems. Furthermore, transfer rates between vibrational and translational modes can strongly influence energy flow in various 'disturbed' environments, particularly where the vibrational and translational temperatures are not equilibrated.

  15. Molecular dimensions and structural features of neutral polysaccharides from the seed mucilage of Hyptis suaveolens L.

    PubMed

    Praznik, Werner; Čavarkapa, Andrea; Unger, Frank M; Loeppert, Renate; Holzer, Wolfgang; Viernstein, Helmut; Mueller, Monika

    2017-04-15

    The seed mucilage of Hyptis suaveolens L. includes acidic and neutral heteropolysaccharides in a ratio of about 1:1. The anionic charged fraction responsible for swelling and viscous behaviour possesses an average molar mass of Mw = 350 kg/mol and Mn = 255 kg/mol. The neutral polysaccharide fraction shows an average molar mass of Mw = 47 kg/mol and Mn = 28 kg/mol and is composed of d-Galp-, d-Glcp- and d-Manp residues in a molar ratio of about 3:2:1. The structural features present galactoglucan (30%) and galactoglucomannan (70%) with a high level of terminal β-linked d-Galp residues (18%). Structural details of galactoglucomannan are derived by combined enzymatic and chemical methods as well as NMR spectroscopy. Sequences of octa/nonasaccharide β-d-Glcp-(1→4)[β-d-Galp-(1→2)-α-d-Galp-(1→6)]-β-d-Manp-(1→4)-β-d-Glcp-(1→4)-β-d-Glcp-(1→4)[β-d-Galp-(1→2)-α-d-Galp-(1→6)]-β-d-Manp and lower mass tetrasaccharide repeating units β-d-Glcp-(1→4)[β-d-Galp-(1→2)-α-d-Galp-(1→6)]-β-d-Manp were found. The level of the prebiotic activity is related to the availability of β-linked d-Galp residues in the side chains of the molecules.

  16. Papulonecrotic tuberculid—clinicopathologic and molecular features of 12 Indian patients

    PubMed Central

    Tirumalae, Rajalakshmi; Yeliur, Inchara K.; Antony, Meryl; George, Geojith; Kenneth, John

    2014-01-01

    Background: Papulonecrotic tuberculid (PNT) is said to be a hypersensitivity reaction to M. tuberculosis. Some reports indicate that organisms are demonstrable by polymerase chain reaction (PCR). Methods: We describe 12 patients with PNT over 6 years. We reviewed the histopathologic features, clinical data and follow-up. PCR for M. tuberculosis DNA was done in all cases. Results: There were 7 men and 5 women. The ages ranged from 3–58 years. Upper limbs were commonly involved (8 cases). All patients had multiple papulonodular lesions, 5 showed ulceration and scarring. Mantoux test was strongly positive in all. Seven patients had systemic tuberculosis. On microscopy, necrosis was seen in 11 cases, varying from minimal to extensive. Epithelioid granulomas were common, except for 1 case with palisading and interstitial patterns. The infiltrate showed mostly lymphocytes, while 3 cases showed eosinophils. Vasculitis was seen in 8 cases. Two cases had dermal mucin, one also with interface dermatitis. This patient had concurrent LE. Mycobacterial DNA was detectable by PCR in 3 cases. Seven patients showed improvement/resolution of lesions on treatment. Conclusions: PNT is a rare disease. A positive PCR reiterates the question whether these are “tuberculids”. PNT may be better classified as true cutaneous tuberculosis and patients screened for systemic disease. PMID:24855568

  17. Predicting hydration free energies of amphetamine-type stimulants with a customized molecular model

    NASA Astrophysics Data System (ADS)

    Li, Jipeng; Fu, Jia; Huang, Xing; Lu, Diannan; Wu, Jianzhong

    2016-09-01

    Amphetamine-type stimulants (ATS) are a group of incitation and psychedelic drugs affecting the central nervous system. Physicochemical data for these compounds are essential for understanding the stimulating mechanism, for assessing their environmental impacts, and for developing new drug detection methods. However, experimental data are scarce due to tight regulation of such illicit drugs, and conventional methods to estimate their properties are often unreliable. Here we introduce a tailor-made multiscale procedure for predicting the hydration free energies and the solvation structures of ATS molecules by a combination of first principles calculations and the classical density functional theory. We demonstrate that the multiscale procedure performs well for a training set with similar molecular characteristics and yields good agreement with a testing set not used in the training. The theoretical predictions serve as a benchmark for the missing experimental data and, importantly, provide microscopic insights into manipulating the hydrophobicity of ATS compounds by chemical modifications.

  18. Next-generation ecological risk assessment: Predicting risk from molecular initiation to ecosystem service delivery.

    PubMed

    Forbes, Valery E; Galic, Nika

    2016-05-01

    Ecological risk assessment is the process of evaluating how likely it is that the environment may be impacted as the result of exposure to one or more chemicals and/or other stressors. It is not playing as large a role in environmental management decisions as it should be. A core challenge is that risk assessments often do not relate directly or transparently to protection goals. There have been exciting developments in in vitro testing and high-throughput systems that measure responses to chemicals at molecular and biochemical levels of organization, but the linkages between such responses and impacts of regulatory significance (on whole organisms, populations, communities, and ecosystems) are not easily predictable. This article describes some recent developments that are directed at bridging this gap and providing more predictive models that can make robust links between what we typically measure in risk assessments and what we aim to protect.

  19. Prediction of detonation performance of CHNO and CHNOAl explosives through molecular structure.

    PubMed

    Keshavarz, Mohammad Hossein

    2009-07-30

    A new pathway has been introduced to predict the detonation pressure of CHNO and CHNOAl explosives. Although aluminized explosives can have Chapman-Jouguet detonation performance significantly different from that expected from existing thermodynamic computer codes for equilibrium and steady state calculations, the new correlation can also be used here. Molecular structures of CHNO and CHNOAl explosives are the only parameters needed in this new scheme. There is no need to use heat of formation or any experimental data. Besides, elemental compositions of CHNO and CHNOAl explosives rather than assumed detonation products are essential input parameters. Predicted detonation pressures for CHNO explosives are compared with experimental data as well as with results computed by a complicated computer code using the BKWR and BKWS equations of state, and the new method shows the best results. Also, the calculated results for CHNOAl explosives indicate good agreement with the measured data as compared to estimated results of BKWS-EOS using full and partial equilibrium.

  20. Predicting Molecular Crystal Properties from First Principles: Finite-Temperature Thermochemistry to NMR Crystallography.

    PubMed

    Beran, Gregory J O; Hartman, Joshua D; Heit, Yonaton N

    2016-11-15

    Molecular crystals occur widely in pharmaceuticals, foods, explosives, organic semiconductors, and many other applications. Thanks to substantial progress in electronic structure modeling of molecular crystals, attention is now shifting from basic crystal structure prediction and lattice energy modeling toward the accurate prediction of experimentally observable properties at finite temperatures and pressures. This Account discusses how fragment-based electronic structure methods can be used to model a variety of experimentally relevant molecular crystal properties. First, it describes the coupling of fragment electronic structure models with quasi-harmonic techniques for modeling the thermal expansion of molecular crystals, and what effects this expansion has on thermochemical and mechanical properties. Excellent agreement with experiment is demonstrated for the molar volume, sublimation enthalpy, entropy, and free energy, and the bulk modulus of phase I carbon dioxide when large basis second-order Møller-Plesset perturbation theory (MP2) or coupled cluster theories (CCSD(T)) are used. In addition, physical insight is offered into how neglect of thermal expansion affects these properties. Zero-point vibrational motion leads to an appreciable expansion in the molar volume; in carbon dioxide, it accounts for around 30% of the overall volume expansion between the electronic structure energy minimum and the molar volume at the sublimation point. In addition, because thermal expansion typically weakens the intermolecular interactions, neglecting thermal expansion artificially stabilizes the solid and causes the sublimation enthalpy to be too large at higher temperatures. Thermal expansion also frequently weakens the lower-frequency lattice phonon modes; neglecting thermal expansion causes the entropy of sublimation to be overestimated. Interestingly, the sublimation free energy is less significantly affected by neglecting thermal expansion because the systematic

  1. The Complete Chloroplast Genome of the Hare's Ear Root, Bupleurum falcatum: Its Molecular Features.

    PubMed

    Shin, Dong-Ho; Lee, Jeong-Hoon; Kang, Sang-Ho; Ahn, Byung-Ohg; Kim, Chang-Kug

    2016-05-13

    Bupleurum falcatum, which belongs to the family Apiaceae, has long been applied for curative treatments, especially as a liver tonic, in herbal medicine. The chloroplast (cp) genome has been an ideal model to perform the evolutionary and comparative studies because of its highly conserved features and simple structure. The Apiaceae family is taxonomically close to the Araliaceae family and there have been numerous complete chloroplast genome sequences reported in the Araliaceae family, while little is known about the Apiaceae family. In this study, the complete sequence of the B. falcatum chloroplast genome was obtained. The full-length of the cp genome is 155,989 nucleotides with a 37.66% overall guanine-cytosine (GC) content and shows a quadripartite structure composed of three nomenclatural regions: a large single-copy (LSC) region, a small single-copy (SSC) region, and a pair of inverted repeat (IR) regions. The genome occupancy is 85,912-bp, 17,517-bp, and 26,280-bp for LSC, SSC, and IR, respectively. B. falcatum was shown to contain 111 unique genes (78 for protein-coding, 29 for tRNAs, and four for rRNAs, respectively) on its chloroplast genome. Genic comparison found that B. falcatum has no pseudogenes and has two gene losses, accD in the LSC and ycf15 in the IRs. A total of 55 unique tandem repeat sequences were detected in the B. falcatum cp genome. This report is the first to describe the complete chloroplast genome sequence in B. falcatum and will open up further avenues of research to understand the evolutionary panorama and the chloroplast genome conformation in related plant species.

  2. The Complete Chloroplast Genome of the Hare’s Ear Root, Bupleurum falcatum: Its Molecular Features

    PubMed Central

    Shin, Dong-Ho; Lee, Jeong-Hoon; Kang, Sang-Ho; Ahn, Byung-Ohg; Kim, Chang-Kug

    2016-01-01

    Bupleurum falcatum, which belongs to the family Apiaceae, has long been applied for curative treatments, especially as a liver tonic, in herbal medicine. The chloroplast (cp) genome has been an ideal model to perform the evolutionary and comparative studies because of its highly conserved features and simple structure. The Apiaceae family is taxonomically close to the Araliaceae family and there have been numerous complete chloroplast genome sequences reported in the Araliaceae family, while little is known about the Apiaceae family. In this study, the complete sequence of the B. falcatum chloroplast genome was obtained. The full-length of the cp genome is 155,989 nucleotides with a 37.66% overall guanine-cytosine (GC) content and shows a quadripartite structure composed of three nomenclatural regions: a large single-copy (LSC) region, a small single-copy (SSC) region, and a pair of inverted repeat (IR) regions. The genome occupancy is 85,912-bp, 17,517-bp, and 26,280-bp for LSC, SSC, and IR, respectively. B. falcatum was shown to contain 111 unique genes (78 for protein-coding, 29 for tRNAs, and four for rRNAs, respectively) on its chloroplast genome. Genic comparison found that B. falcatum has no pseudogenes and has two gene losses, accD in the LSC and ycf15 in the IRs. A total of 55 unique tandem repeat sequences were detected in the B. falcatum cp genome. This report is the first to describe the complete chloroplast genome sequence in B. falcatum and will open up further avenues of research to understand the evolutionary panorama and the chloroplast genome conformation in related plant species. PMID:27187480
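    The region sizes and gene counts quoted in this abstract can be cross-checked with simple arithmetic. The sketch below assumes that the quoted IR length refers to one of the two inverted-repeat copies (the abstract does not say so explicitly); the variable and function names are illustrative only.

```python
# Figures as quoted in the abstract (base pairs and gene counts).
LSC = 85_912             # large single-copy region
SSC = 17_517             # small single-copy region
IR = 26_280              # assumed to be the length of ONE inverted repeat copy
REPORTED_TOTAL = 155_989


def quadripartite_length(lsc, ssc, ir):
    """Total length of a genome with one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


total = quadripartite_length(LSC, SSC, IR)

# Unique genes by class, as quoted in the abstract.
genes = {"protein-coding": 78, "tRNA": 29, "rRNA": 4}
unique_genes = sum(genes.values())

print(total, total == REPORTED_TOTAL)
print(unique_genes)
```

    Under that assumption the four regions add up to the reported 155,989 nucleotides, and the three gene classes add up to the reported 111 unique genes.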

  3. Energy metabolism in hypoxia: reinterpreting some features of muscle physiology on molecular grounds.

    PubMed

    Cerretelli, Paolo; Gelfi, Cecilia

    2011-03-01

    A holistic approach for interpreting classical data on the adaptation of the animal and, particularly, of the human body to hypoxic stress was promoted by the discovery of HIF-1, the "master regulator" of cell hypoxic signaling. Mitochondrial production of ROS stabilizes the O2-regulated HIF-1α subunit of the HIF-1 dimer promoting transaction functions in a large number of potential target genes, activating transcription of sequences into RNA and, eventually, protein production. The aim of the present preliminary study is to assess whether adaptive changes in oxygen sensing and metabolic signaling, particularly in the control of energy turnover known to occur in cultured cells exposed to hypoxia, are detectable also in the muscles of animals and man. For the present analysis, data obtained from the proteome of the rat gastrocnemius and of the vastus lateralis muscle of humans together with functional measurements were compared with homologous data from hypoxic cultured cells. In particular, the following variables were assessed: (1) the role of stress response proteins in the maintenance of ROS homeostasis, (2) the activity of the PDK1 gene on the shunting of pyruvate away from the TCA cycle in rodents and in humans, (3) the COX-4/COX-2 ratio in hypoxic rodents, (4) the overall efficiency of oxidative phosphorylation in humans during exercise in hypoxia, (5) some features of muscle mitochondrial autophagy in humans undergoing subchronic and chronic altitude exposure. Despite the limited number of observations and the differences in the experimental approach, some interesting initial results were obtained that encourage pursuing this innovative effort.

  4. Molecular epidemiology and clinical features of human T cell lymphotropic virus type 1 infection in Spain.

    PubMed

    Treviño, Ana; Alcantara, Luiz Carlos; Benito, Rafael; Caballero, Estrella; Aguilera, Antonio; Ramos, José Manuel; de Mendoza, Carmen; Rodríguez, Carmen; García, Juan; Rodríguez-Iglesias, Manuel; Ortiz de Lejarazu, Raúl; Roc, Lourdes; Parra, Patricia; Eiros, José; del Romero, Jorge; Soriano, Vincent

    2014-09-01

    Human T cell lymphotropic virus type 1 (HTLV-1) infection in Spain is rare and mainly affects immigrants from endemic regions and native Spaniards with a prior history of sexual intercourse with persons from endemic countries. Herein, we report the main clinical and virological features of cases reported in Spain. All individuals with HTLV-1 infection recorded at the national registry since 1989 were examined. Phylogenetic analysis was performed based on the long terminal repeat (LTR) region. A total of 229 HTLV-1 cases had been reported up to December 2012. The mean age was 41 years old and 61% were female. Their country of origin was Latin America in 59%, Africa in 15%, and Spain in 20%. Transmission had occurred following sexual contact in 41%, parenteral exposure in 12%, and vertically in 9%. HTLV-1-associated myelopathy/tropical spastic paraparesis (HAM/TSP) was diagnosed in 27 cases and adult T cell leukemia/lymphoma (ATLL) in 17 subjects. HTLV-1 subtype could be obtained for 45 patients; all but one belonged to the Cosmopolitan subtype a. One Nigerian pregnant woman harbored HTLV-1 subtype b. Within the Cosmopolitan subtype a, two individuals (from Bolivia and Peru, respectively) belonged to the Japanese subgroup B, another two (from Senegal and Mauritania) to the North African subgroup D, and 39 to the Transcontinental subgroup A. Of note, one divergent HTLV-1 strain from an Ethiopian branched off from all five known Cosmopolitan subtype 1a subgroups. Divergent HTLV-1 strains have been introduced and currently circulate in Spain. The relatively large proportion of symptomatic cases (19%) suggests that HTLV-1 infection is underdiagnosed in Spain.
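    The case counts in this abstract are internally consistent, which a few lines of arithmetic make explicit. This sketch assumes that the 19% symptomatic share refers to the HAM/TSP and ATLL cases combined and that no patient is counted under both diagnoses; neither assumption is stated in the abstract.

```python
# Counts as quoted in the abstract.
total_cases = 229
ham_tsp = 27
atll = 17

# Assumption: "symptomatic" means HAM/TSP or ATLL, with no overlap.
symptomatic = ham_tsp + atll
share = symptomatic / total_cases
print(f"{symptomatic} of {total_cases} = {share:.1%}")

# Subtyping: 45 patients typed, all but one in Cosmopolitan subtype a.
subtyped = 45
subtype_b = 1
cosmopolitan_a = subtyped - subtype_b

# Subgroups within Cosmopolitan subtype a, as listed in the abstract.
subgroups = {
    "Japanese B": 2,
    "North African D": 2,
    "Transcontinental A": 39,
    "divergent strain": 1,
}
print(sum(subgroups.values()) == cosmopolitan_a)
```

    Under these assumptions, 44 of 229 cases gives about 19%, and the listed subgroups account for all 44 Cosmopolitan subtype a patients.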

  5. Data driven, predictive molecular dynamics for nanoscale flow simulations under uncertainty.

    PubMed

    Angelikopoulos, Panagiotis; Papadimitriou, Costas; Koumoutsakos, Petros

    2013-11-27

    For over five decades, molecular dynamics (MD) simulations have helped to elucidate critical mechanisms in a broad range of physiological systems and technological innovations. MD simulations are synergetic with experiments, relying on measurements to calibrate their parameters and probing "what if scenarios" for systems that are difficult to investigate experimentally. However, in certain systems, such as nanofluidics, the results of experiments and MD simulations differ by several orders of magnitude. This discrepancy may be attributed to the spatiotemporal scales and structural information accessible by experiments and simulations. Furthermore, MD simulations rely on parameters that are often calibrated semiempirically, while the effects of their computational implementation on their predictive capabilities have only been sporadically probed. In this work, we show that experimental and MD investigations can be consolidated through a rigorous uncertainty quantification framework. We employ a Bayesian probabilistic framework for large scale MD simulations of graphitic nanostructures in aqueous environments. We assess the uncertainties in the MD predictions for quantities of interest regarding wetting behavior and hydrophobicity. We focus on three representative systems: water wetting of graphene, the aggregation of fullerenes in aqueous solution, and the water transport across carbon nanotubes. We demonstrate that the dominant mode of calibrating MD potentials in nanoscale fluid mechanics, through single values of water contact angle on graphene, leads to large uncertainties and fallible quantitative predictions. We demonstrate that the use of additional experimental data reduces uncertainty, improves the predictive accuracy of MD models, and consolidates the results of experiments and simulations.

  6. A machine learning approach to predicting protein-ligand binding affinity with applications to molecular docking

    PubMed Central

    Ballester, Pedro J.; Mitchell, John B.O.

    2012-01-01

    Motivation: Accurately predicting the binding affinities of large sets of diverse protein-ligand complexes is an extremely challenging task. The scoring functions that attempt such computational prediction are essential for analysing the outputs of Molecular Docking, which is in turn an important technique for drug discovery, chemical biology and structural biology. Each scoring function assumes a predetermined theory-inspired functional form for the relationship between the variables that characterise the complex, which also include parameters fitted to experimental or simulation data, and its predicted binding affinity. The inherent problem of this rigid approach is that it leads to poor predictivity for those complexes that do not conform to the modelling assumptions. Moreover, resampling strategies, such as cross-validation or bootstrapping, are still not systematically used to guard against the overfitting of calibration data in parameter estimation for scoring functions. Results: We propose a novel scoring function (RF-Score) that circumvents the need for problematic modelling assumptions via non-parametric machine learning. In particular, Random Forest was used to implicitly capture binding effects that are hard to model explicitly. RF-Score is compared with the state of the art on the demanding PDBbind benchmark. Results show that RF-Score is a very competitive scoring function. Importantly, RF-Score’s performance was shown to improve dramatically with training set size and hence the future availability of more high quality structural and interaction data is expected to lead to improved versions of RF-Score. PMID:20236947

  7. Analysis of Factors Influencing Hydration Site Prediction Based on Molecular Dynamics Simulations

    PubMed Central

    2015-01-01

    Water contributes significantly to the binding of small molecules to proteins in biochemical systems. Molecular dynamics (MD) simulation based programs such as WaterMap and WATsite have been used to probe the locations and thermodynamic properties of hydration sites at the surface or in the binding site of proteins generating important information for structure-based drug design. However, questions associated with the influence of the simulation protocol on hydration site analysis remain. In this study, we use WATsite to investigate the influence of factors such as simulation length and variations in initial protein conformations on hydration site prediction. We find that 4 ns MD simulation is appropriate to obtain a reliable prediction of the locations and thermodynamic properties of hydration sites. In addition, hydration site prediction can be largely affected by the initial protein conformations used for MD simulations. Here, we provide a first quantification of this effect and further indicate that similar conformations of binding site residues (RMSD < 0.5 Å) are required to obtain consistent hydration site predictions. PMID:25252619
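    The RMSD < 0.5 Å criterion for binding site residues can be illustrated with a minimal sketch. It assumes the two conformations are already superimposed and list the same atoms in the same order; the abstract does not specify the alignment procedure or atom selection, and the function names and coordinates below are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered lists of
    (x, y, z) coordinates in angstroms. No superposition is performed."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def conformations_similar(coords_a, coords_b, cutoff=0.5):
    """True if the RMSD is below the cutoff (0.5 angstroms, per the abstract)."""
    return rmsd(coords_a, coords_b) < cutoff


# Hypothetical binding-site atoms; conformation B is A shifted 0.3 A along x.
conf_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
conf_b = [(x + 0.3, y, z) for (x, y, z) in conf_a]

value = rmsd(conf_a, conf_b)
similar = conformations_similar(conf_a, conf_b)
print(round(value, 3), similar)
```

    In practice the coordinates would come from the starting structures of the MD runs, and a superposition step would usually precede the RMSD calculation.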

  8. Thermal boundary resistance predictions from molecular dynamics simulations and theoretical calculations

    NASA Astrophysics Data System (ADS)

    Landry, E. S.; McGaughey, A. J. H.

    2009-10-01

    The accuracies of two theoretical expressions for thermal boundary resistance are assessed by comparing their predictions to independent predictions from molecular dynamics (MD) simulations. In one expression (R_E), the phonon distributions are assumed to follow the equilibrium, Bose-Einstein distribution, while in the other expression (R_NE), the phonons are assumed to have nonequilibrium, but bulk-like distributions. The phonon properties are obtained using lattice dynamics-based methods, which assume that the phonon interface scattering is specular and elastic. We consider (i) a symmetrically strained Si/Ge interface, and (ii) a series of interfaces between Si and “heavy-Si,” which differs from Si only in mass. All of the interfaces are perfect, justifying the assumption of specular scattering. The MD-predicted Si/Ge thermal boundary resistance is temperature independent and equal to 3.1×10^-9 m^2-K/W below a temperature of ~500 K, indicating that the phonon scattering is elastic, as required for the validity of the theoretical calculations. At higher temperatures, the MD-predicted Si/Ge thermal boundary resistance decreases with increasing temperature, a trend we attribute to inelastic scattering. For the Si/Ge interface and the Si/heavy-Si interfaces with mass ratios greater than two, R_E is in good agreement with the corresponding MD-predicted values at temperatures where the interface scattering is elastic. When applied to a system containing no interface, R_E is erroneously nonzero due to the assumption of equilibrium phonon distributions on either side of the interface. While R_NE is zero for a system containing no interface, it is 40%-60% less than the corresponding MD-predicted values for the Si/Ge interface and the Si/heavy-Si interfaces at temperatures where the interface scattering is elastic. This inaccuracy is attributed to the assumption of bulk-like phonon distributions on either side of the interface.

  9. Prediction of protein subcellular localization by incorporating multiobjective PSO-based feature subset selection into the general form of Chou's PseAAC.

    PubMed

    Mandal, Monalisa; Mukhopadhyay, Anirban; Maulik, Ujjwal

    2015-04-01

    In this article, the possible subcellular location of a protein is predicted using a multiobjective particle swarm optimization-based feature selection technique. In the general form of pseudo-amino acid composition, the protein sequences are used for constructing protein features. Here, the different amino acid compositions are used to construct the feature sets. Therefore, the data are presented as protein samples versus amino acid compositions as features. The proposed algorithm tries to maximize the feature relevance and minimize the feature redundancy simultaneously. After the proposed algorithm is executed on the multiclass dataset, some features are selected. On this resultant feature subset, tenfold cross-validation is applied and the corresponding accuracy, F score, entropy, representation entropy and average correlation are calculated. The performance of the proposed method is compared with that of its single objective versions, sequential forward search, sequential backward search, minimum redundancy maximum relevance with two schemes, CFS, CBFS, [Formula: see text], Fisher discriminant and a Cluster-based technique.

  10. How important is thermal expansion for predicting molecular crystal structures and thermochemistry at finite temperatures?

    PubMed

    Heit, Yonaton N; Beran, Gregory J O

    2016-08-01

    Molecular crystals expand appreciably upon heating due to both zero-point and thermal vibrational motion, yet this expansion is often neglected in molecular crystal modeling studies. Here, a quasi-harmonic approximation is coupled with fragment-based hybrid many-body interaction calculations to predict thermal expansion and finite-temperature thermochemical properties in crystalline carbon dioxide, ice Ih, acetic acid and imidazole. Fragment-based second-order Møller-Plesset perturbation theory (MP2) and coupled cluster theory with singles, doubles and perturbative triples [CCSD(T)] predict the thermal expansion and the temperature dependence of the enthalpies, entropies and Gibbs free energies of sublimation in good agreement with experiment. The errors introduced by neglecting thermal expansion in the enthalpy and entropy cancel somewhat in the Gibbs free energy. The resulting ~1-2 kJ mol^-1 errors in the free energy near room temperature are comparable to or smaller than the errors expected from the electronic structure treatment, but they may be sufficiently large to affect free-energy rankings among energetically close polymorphs.

  11. A boosting approach for adapting the sparsity of risk prediction signatures based on different molecular levels.

    PubMed

    Sariyar, Murat; Schumacher, Martin; Binder, Harald

    2014-06-01

    Risk prediction models can link high-dimensional molecular measurements, such as DNA methylation, to clinical endpoints. For biological interpretation, often a sparse fit is desirable. Different molecular aggregation levels, such as considering DNA methylation at the CpG, gene, or chromosome level, might demand different degrees of sparsity. Hence, model building and estimation techniques should be able to adapt their sparsity according to the setting. Additionally, underestimation of coefficients, which is a typical problem of sparse techniques, should also be addressed. We propose a comprehensive approach, based on a boosting technique that allows a flexible adaptation of model sparsity and addresses these problems in an integrative way. The main motivation is to have an automatic sparsity adaptation. In a simulation study, we show that this approach reduces underestimation in sparse settings and selects more adequate model sizes than the corresponding non-adaptive boosting technique in non-sparse settings. Using different aggregation levels of DNA methylation data from a study in kidney carcinoma patients, we illustrate how automatically selected values of the sparsity tuning parameter can reflect the underlying structure of the data. In addition, prediction performance and variable selection stability are compared to those of the non-adaptive boosting approach.

  12. Quantitative prediction of imprinting factor of molecularly imprinted polymers by artificial neural network

    NASA Astrophysics Data System (ADS)

    Nantasenamat, Chanin; Naenna, Thanakorn; Ayudhya, Chartchalerm Isarankura Na; Prachayasittikul, Virapong

    2005-07-01

    An artificial neural network (ANN) implementing the back-propagation algorithm was applied to calculate the imprinting factors (IF) of molecularly imprinted polymers (MIP) as a function of the computed molecular descriptors of template and functional monomer molecules and mobile phase descriptors. The data used in our study were obtained from the literature and classified into two distinct datasets on the basis of the polymer's morphology: irregularly sized MIP and uniformly sized MIP. Results revealed that the artificial neural network performed well on the dataset derived from uniformly sized MIP (n=23, r=0.946, RMS=2.944) while performing poorly on the dataset derived from irregularly sized MIP (n=75, r=0.382, RMS=6.123). The superior performance on the uniformly sized MIP dataset could be attributed to its more predictable nature owing to the consistency of MIP particles, the uniform number and association constant of binding sites, and the minimal deviation of the imprinted polymers. The ability to predict the imprinting factor of an imprinted polymer before performing actual experimental work provides insight into the feasibility of the interaction between template-functional monomer pairs.
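
    The r and RMS values quoted above are standard agreement statistics between observed and predicted values. The minimal Python sketch below shows how such statistics are typically computed. The function names are illustrative assumptions, and nothing here reproduces the study's data or network.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def rms_error(observed, predicted):
    """Root-mean-square deviation between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
```

    A high r together with a low RMS, as reported for the uniformly sized MIP dataset, indicates that predictions track the observed imprinting factors both in trend and in magnitude.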

  13. Predictive value of initial FDG-PET features for treatment response and survival in esophageal cancer patients treated with chemo-radiation therapy using a random forest classifier

    PubMed Central

    Ruan, Su; Modzelewski, Romain; Pineau, Pascal; Vauclin, Sébastien; Gouel, Pierrick; Michel, Pierre; Di Fiore, Frédéric; Vera, Pierre; Gardin, Isabelle

    2017-01-01

    Purpose: In oncology, texture features extracted from positron emission tomography with 18-fluorodeoxyglucose images (FDG-PET) are of increasing interest for predictive and prognostic studies, leading to several tens of features per tumor. To select the best features, the use of a random forest (RF) classifier was investigated. Methods: Sixty-five patients with esophageal cancer treated with combined chemo-radiation therapy were retrospectively included. All patients underwent a pretreatment whole-body FDG-PET. The patients were followed for 3 years after the end of the treatment. The response assessment was performed 1 month after the end of the therapy. Patients were classified as complete responders and non-complete responders. Sixty-one features were extracted from medical records and PET images. First, Spearman’s analysis was performed to eliminate correlated features. Then, the best predictive and prognostic subsets of features were selected using an RF algorithm. These results were compared to those obtained by a Mann-Whitney U test (predictive study) and a univariate Kaplan-Meier analysis (prognostic study). Results: Among the 61 initial features, 28 were not correlated. From these 28 features, the best subset of complementary features found using the RF classifier to predict response was composed of 2 features: metabolic tumor volume (MTV) and homogeneity from the co-occurrence matrix. The corresponding predictive value (AUC = 0.836 ± 0.105, Se = 82 ± 9%, Sp = 91 ± 12%) was higher than the best predictive results found using the Mann-Whitney test: busyness from the gray level difference matrix (P < 0.0001, AUC = 0.810, Se = 66%, Sp = 88%). The best prognostic subset found using RF was composed of 3 features: MTV and 2 clinical features (WHO status and nutritional risk index) (AUC = 0.822 ± 0.059, Se = 79 ± 9%, Sp = 95 ± 6%), while no feature was significantly prognostic according to the Kaplan-Meier analysis. Conclusions: The RF classifier can

  14. Predictive models of biohydrogen and biomethane production based on the compositional and structural features of lignocellulosic materials.

    PubMed

    Monlau, Florian; Sambusiti, Cecilia; Barakat, Abdellatif; Guo, Xin Mei; Latrille, Eric; Trably, Eric; Steyer, Jean-Philippe; Carrere, Hélène

    2012-11-06

    In an integrated biorefinery concept, biological hydrogen and methane production from lignocellulosic substrates appears to be one of the most promising alternatives to produce energy from renewable sources. However, lignocellulosic substrates present compositional and structural features that can limit their conversion into biohydrogen and methane. In this study, biohydrogen and methane potentials of 20 lignocellulosic residues were evaluated. Compositional (lignin, cellulose, hemicelluloses, total uronic acids, proteins, and soluble sugars) as well as structural features (crystallinity) were determined for each substrate. Two predictive partial least squares (PLS) models were built to determine which compositional and structural parameters affected biohydrogen or methane production from lignocellulosic substrates, among proteins, total uronic acids, soluble sugars, crystalline cellulose, amorphous holocelluloses, and lignin. Only soluble sugars had a significant positive effect on biohydrogen production. Methane potentials correlated negatively with lignin content and, to a lesser extent, with crystalline cellulose, whereas soluble sugars, proteins, and amorphous hemicelluloses showed a positive impact. These findings will help to develop further pretreatment strategies for enhancing both biohydrogen and methane production.

  15. Pharmacophore modeling and molecular dynamics simulation to identify the critical chemical features against human sirtuin 2 inhibitors

    NASA Astrophysics Data System (ADS)

    Sakkiah, Sugunadevi; Baek, Ayoung; Lee, Keun Woo

    2012-03-01

    Sirtuin 2 (SIRT2) is an emerging target in chemotherapy and is associated with diseases such as cancer and Parkinson's disease. A quantitative pharmacophore hypothesis was therefore developed using Discovery Studio v2.5. Ten hypotheses were generated, and Hypo1 was selected as the best on the basis of its statistical parameters: high cost difference (52), lowest RMS (0.71), and good correlation coefficient (0.96). Hypo1 was validated using well-known methods: Fischer's randomization method (95% confidence level), a test set that gave a correlation coefficient of 0.93, the goodness of hit (0.65), and the enrichment factor (8.80). These statistical validations confirm that the chemical features in Hypo1 (1 hydrogen bond acceptor, 1 hydrophobic, and 2 ring aromatic features) were able to inhibit the function of SIRT2. Hypo1 was then used as a query in virtual screening of various chemical databases to find novel scaffolds. The screened molecules were checked for ADMET and drug-like properties. Because no SIRT2-ligand complex structure was available in the PDB, molecular docking and molecular dynamics (MD) simulations were carried out to find a suitable orientation of the ligand in the active site. The representative structure from the MD simulations was used as a receptor to dock the molecules that passed the drug-like property filters in the virtual screening. Finally, 29 compounds were selected as potent candidate leads based on their interactions with the active-site residues of SIRT2. Thus, the resultant pharmacophore can be used to discover and design SIRT2 inhibitors with the desired biological activity.

  16. NACE: A web-based tool for prediction of intercompartmental efficiency of human molecular genetic networks.

    PubMed

    Popik, Olga V; Ivanisenko, Timofey V; Saik, Olga V; Petrovskiy, Evgeny D; Lavrik, Inna N; Ivanisenko, Vladimir A

    2016-06-15

    Molecular genetic processes generally involve proteins from distinct intracellular localisations. Reactions that follow the same process are distributed among various compartments within the cell. In this regard, the reaction rate and the efficiency of biological processes can depend on the subcellular localisation of proteins. Previously, the authors proposed a method of evaluating the efficiency of biological processes based on the analysis of the distribution of protein subcellular localisation (Popik et al., 2014). Here, NACE is presented, which is an open access web-oriented program that implements this method and allows the user to evaluate the intercompartmental efficiency of human molecular genetic networks. The method has been extended by a new feature that provides the evaluation of the tissue-specific efficiency of networks for more than 2800 anatomical structures. Such assessments are important in cases when molecular genetic pathways in different tissues proceed with the participation of various proteins with a number of intracellular localisations. For example, an analysis of KEGG pathways, conducted using the developed program, showed that the efficiencies of many KEGG pathways are tissue-specific. Analysis of efficiencies of regulatory pathways in the liver, linking proteins of the hepatitis C virus with human proteins involved in the KEGG apoptosis pathway, showed that intercompartmental efficiency might play an important role in host-pathogen interactions. Thus, the developed tool can be useful in the study of the effectiveness of functioning of various molecular genetic networks, including metabolic, regulatory, host-pathogen interactions and others taking into account tissue-specific gene expression. The tool is available via the following link: http://www-bionet.sscc.ru/nace/.

  17. Assessing value of innovative molecular diagnostic tests in the concept of predictive, preventive, and personalized medicine.

    PubMed

    Akhmetov, Ildar; Bubnov, Rostyslav V

    2015-01-01

    Molecular diagnostic tests drive the scientific and technological uplift in the field of predictive, preventive, and personalized medicine, offering invaluable clinical and socioeconomic benefits to the key stakeholders. Although the results of diagnostic tests are immensely influential, molecular diagnostic tests (MDx) are still grudgingly reimbursed by payers and account for less than 5% of overall healthcare costs. This paper aims at defining the value of molecular diagnostic tests and outlining the most important components of "value" from miscellaneous assessment frameworks, which go beyond accuracy and feasibility and impact the clinical adoption, informing healthcare resource allocation decisions. The authors suggest that the industry should facilitate discussions with various stakeholders throughout the entire assessment process in order to arrive at a consensus about the depth of evidence required for positive marketing authorization or reimbursement decisions. In light of the evolving "value-based healthcare" delivery practices, it is also recommended to account for social and ethical parameters of value, since these are anticipated to become as critical for reimbursement decisions and test acceptance as economic and clinical criteria.

  18. Molecular Markers Predict Distant Metastases After Adjuvant Chemoradiation for Rectal Cancer

    SciTech Connect

    Kim, Jun Won; Kim, Yong Bae; Choi, Jun Jeong; Koom, Woong Sub; Kim, Hoguen; Kim, Nam-Kyu; Ahn, Joong Bae; Lee, Ikjae; Cho, Jae Ho; Keum, Ki Chang

    2012-12-01

    Purpose: The outcomes of adjuvant chemoradiation for locally advanced rectal cancer are nonuniform among patients with matching prognostic factors. We explored the role of molecular markers for predicting the outcome of adjuvant chemoradiation for rectal cancer patients. Methods and Materials: The study included 68 patients with stages II to III rectal adenocarcinoma who were treated with total mesorectal excision and adjuvant chemoradiation. Chemotherapy based on 5-fluorouracil and leucovorin was intravenously administered each month for 6-12 cycles. Radiation therapy consisted of 54 Gy delivered in 30 fractions. Immunostaining of surgical specimens for COX-2, EGFR, VEGF, thymidylate synthase (TS), and Raf kinase inhibitor protein (RKIP) was performed. Results: The median follow-up was 65 months. Eight locoregional (11.8%) and 13 distant (19.1%) recurrences occurred. Five-year locoregional failure-free survival (LRFFS), distant metastasis-free survival (DMFS), disease-free survival (DFS), and overall survival (OS) rates for all patients were 83.9%, 78.7%, 66.7%, and 73.8%, respectively. LRFFS was not correlated with TNM stage, surgical margin, or any of the molecular markers. VEGF overexpression was significantly correlated with decreased DMFS (P=.045), while RKIP-positive results were correlated with increased DMFS (P=.025). In multivariate analyses, positive findings for COX-2 (COX-2+) and VEGF (VEGF+) and negative findings for RKIP (RKIP-) were independent prognostic factors for DMFS, DFS, and OS (P=.035, .014, and .007 for DMFS; .021, .010, and <.0001 for DFS; and .004, .012, and .001 for OS). The combination of both COX-2+ and VEGF+ (COX-2+/VEGF+) showed a strong correlation with decreased DFS (P=.007), and the combinations of RKIP+/COX-2- and RKIP+/VEGF- showed strong correlations with improved DFS compared with the rest of the patients (P=.001 and <.0001, respectively). Conclusions: Molecular markers can be valuable in predicting treatment outcome of adjuvant

  19. Prediction of subcellular location of apoptosis proteins combining tri-gram encoding based on PSSM and recursive feature elimination.

    PubMed

    Liu, Taigang; Tao, Peiying; Li, Xiaowei; Qin, Yufang; Wang, Chunhua

    2015-02-07

    Knowledge of apoptosis proteins plays an important role in understanding the mechanism of programmed cell death. Obtaining information on the subcellular location of apoptosis proteins is very helpful for revealing the apoptosis mechanism and understanding the function of apoptosis proteins. Because of the cost in time and labor associated with large-scale wet-bench experiments, computational prediction of the subcellular location of apoptosis proteins has become very important, and many computational tools have been developed in recent decades. Existing methods differ in the protein sequence representation techniques and classification algorithms adopted. In this study, we first introduce a sequence encoding scheme based on tri-grams computed directly from position-specific score matrices, which incorporates evolution information represented in the PSI-BLAST profile and sequence-order information. The SVM-RFE algorithm is then applied for feature selection, and the reduced vectors are input to a support vector machine classifier to predict the subcellular location of apoptosis proteins. Jackknife tests on three widely used datasets show that our method provides state-of-the-art performance in comparison with other existing methods.
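
    The abstract does not give the tri-gram formula, so the sketch below assumes one common formulation: each feature T[i, j, k] sums, over all windows of three consecutive sequence positions, the product of the profile values for residue types i, j and k. The function name and the absence of any prior normalization of the matrix are assumptions, not details from the paper.

```python
import numpy as np


def trigram_pssm_features(pssm):
    """Flatten tri-gram features from an (L, n_types) profile matrix.

    For a standard 20-column PSSM the result has 20**3 = 8000 entries.
    """
    p = np.asarray(pssm, dtype=float)
    if p.ndim != 2 or p.shape[0] < 3:
        raise ValueError("expected an (L, n_types) matrix with L >= 3")
    # T[i, j, k] = sum over n of p[n, i] * p[n + 1, j] * p[n + 2, k]
    t = np.einsum("ni,nj,nk->ijk", p[:-2], p[1:-1], p[2:])
    return t.ravel()
```

    Because the resulting vector is long relative to typical dataset sizes, a feature selection step such as the SVM-RFE mentioned above is a natural follow-up.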

  20. Exploring QSTR modeling and toxicophore mapping for identification of important molecular features contributing to the chemical toxicity in Escherichia coli.

    PubMed

    Pramanik, Subrata; Roy, Kunal

    2014-03-01

    Loss of biodiversity can affect the functions and services of the ecosystem. Changes in biodiversity alter ecosystem processes and change the resilience of ecosystems to ecological changes. Bacterial communities are the main form of biomass in the ecosystem and one of the largest populations on the planet. Bacterial communities provide important services to biodiversity. They break down pollutants, municipal waste and ingested food, and they are the primary route for recycling of organic matter to plants and other autotrophs, for conversion of inorganic matter into new biological tissue using sunlight, and for management of the energy crisis through the use of biofuel. In the present study, computational chemistry and statistical modeling have been used to develop mathematical equations which can be applied to calculate the toxicity of new/unknown chemicals/biofuels/metabolites in Escherichia coli. 2D and 3D descriptors were generated from the molecular structures of compounds, and mathematical models were developed using the genetic function approximation followed by multiple linear regression (GFA-MLR) method. Model validity was checked through defined internal (R²=0.751 and Q²=0.711) and external (R²pred=0.773) statistical parameters. Molecular features responsible for toxicity were also assessed through a 3D toxicophore study. The toxicophore-based model was validated (R=0.785) using qualitative statistical metrics and a randomization test (Fischer validation).
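
    The internal validation statistics quoted above, R² and the cross-validated Q², can be illustrated with the sketch below for an ordinary least-squares model. It assumes leave-one-out cross-validation and the full-sample mean in the denominator of Q²; the abstract does not state which conventions the authors used, and the function names are hypothetical.

```python
import numpy as np


def _fit(X, y):
    """Ordinary least squares with an intercept; returns the coefficients."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def _predict(coef, X):
    """Apply fitted coefficients (intercept first) to descriptor rows."""
    return np.column_stack([np.ones(len(X)), X]) @ coef


def r2_and_q2(X, y):
    """Return (R^2, leave-one-out Q^2) for a linear model y ~ X."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    ss_tot = np.sum((y - y.mean()) ** 2)
    rss = np.sum((y - _predict(_fit(X, y), X)) ** 2)
    press = 0.0  # predictive residual sum of squares
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        coef = _fit(X[keep], y[keep])  # refit without compound i
        press += (y[i] - _predict(coef, X[i:i + 1])[0]) ** 2
    return 1.0 - rss / ss_tot, 1.0 - press / ss_tot
```

    Q² is never larger than R² for such a model, so a small gap between the two, as in the values reported above, suggests the fit is not dominated by individual compounds.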

  1. Pharmacophore feature-based virtual screening for finding potent GSK-3 inhibitors using molecular docking and dynamics simulations

    PubMed Central

    Chauhan, Navneet; Gajjar, Anuradha; Basha, Syed Hussain

    2016-01-01

    Glycogen synthase kinase-3 (GSK-3) is a multitasking serine/threonine protein kinase, which is associated with the pathophysiology of several diseases such as diabetes, cancer, psychiatric and neurodegenerative diseases. Tideglusib is a potent, selective, and irreversible GSK-3 inhibitor that has been investigated in phase II clinical trials for the treatment of progressive supranuclear palsy and Alzheimer's disease. In the present study, we performed pharmacophore feature-based virtual screening for identifying potent target-specific GSK-3 inhibitors. We found 64 compounds that show better GSK-3 binding potentials compared with those of Tideglusib. We further validated the obtained binding potentials by performing 20-ns molecular dynamics simulations for GSK-3 complexed with Tideglusib and with the best compound found via virtual screening in this study. Several interesting molecular-level interactions were identified, including a covalent interaction with Cys199 residue at the entrance of the GSK-3 active site. These findings are expected to play a crucial role in the binding of target-specific GSK-3 inhibitors. PMID:28293069

  2. Prediction of protein structure classes using hybrid space of multi-profile Bayes and bi-gram probability feature spaces.

    PubMed

    Hayat, Maqsood; Tahir, Muhammad; Khan, Sher Afzal

    2014-04-07

    Proteins are the executants of biological functions in living organisms. Comprehension of protein structure is a challenging problem in the era of proteomics, computational biology, and bioinformatics because of its pivotal role in protein folding patterns. Owing to the large exploration of protein sequences in protein databanks and the intricacy of protein structures, experimental and theoretical methods are insufficient for prediction of protein structure classes. Therefore, it is highly desirable to develop an accurate, reliable, and high-throughput computational model to predict protein structure classes correctly from polygenetic sequences. In this regard, we propose a promising model employing a hybrid descriptor space in conjunction with an optimized evidence-theoretic K-nearest neighbor algorithm. The hybrid space is the composition of two descriptor spaces: Multi-profile Bayes and bi-gram probability. In order to enhance the generalization power of the classifier, we selected highly discriminative descriptors from the hybrid space using particle swarm optimization, a well-known evolutionary feature selection technique. Performance evaluation of the proposed model is performed using the jackknife test on three low-similarity benchmark datasets: 25PDB, 1189, and 640. The success rates of the proposed model are 87.0%, 86.6%, and 88.4%, respectively, on the three benchmark datasets. The comparative analysis shows that our proposed model yields promising results compared to the existing methods in the literature. In addition, our proposed prediction system might be helpful in future research, particularly in cases where the major focus of research is on low-similarity datasets.
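
    The jackknife test used for evaluation above is leave-one-out cross-validation: each protein is held out in turn, classified from all the others, and the fraction of correct calls is reported as the success rate. The sketch below illustrates the procedure with a plain 1-nearest-neighbor classifier as a simple stand-in; it is not the evidence-theoretic K-nearest neighbor algorithm of the paper, and the function name is hypothetical.

```python
def jackknife_accuracy(samples, labels):
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier.

    samples: sequence of equal-length numeric feature vectors.
    labels:  class label for each sample.
    """
    correct = 0
    for i, query in enumerate(samples):
        best_label, best_dist = None, float("inf")
        for j, other in enumerate(samples):
            if j == i:
                continue  # the held-out sample never votes for itself
            dist = sum((a - b) ** 2 for a, b in zip(query, other))
            if dist < best_dist:
                best_label, best_dist = labels[j], dist
        correct += best_label == labels[i]
    return correct / len(samples)
```

    Because every sample is tested exactly once and the split is deterministic, the jackknife yields a single reproducible success rate per dataset, which is why it is popular for comparing predictors on fixed benchmarks.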

  3. GATA-3 expression identifies a high-risk subset of PTCL, NOS with distinct molecular and clinical features.

    PubMed

    Wang, Tianjiao; Feldman, Andrew L; Wada, David A; Lu, Ye; Polk, Avery; Briski, Robert; Ristow, Kay; Habermann, Thomas M; Thomas, Dafydd; Ziesmer, Steven C; Wellik, Linda E; Lanigan, Thomas M; Witzig, Thomas E; Pittelkow, Mark R; Bailey, Nathanael G; Hristov, Alexandra C; Lim, Megan S; Ansell, Stephen M; Wilcox, Ryan A

    2014-05-08

    The cell of origin and the tumor microenvironment's role remain elusive for the most common peripheral T-cell lymphomas (PTCLs). As macrophages promote the growth and survival of malignant T cells and are abundant constituents of the tumor microenvironment, their functional polarization was examined in T-cell lymphoproliferative disorders. Cytokines that are abundant within the tumor microenvironment, particularly interleukin (IL)-10, were observed to promote alternative macrophage polarization. Macrophage polarization was signal transducer and activator of transcription 3 dependent and was impaired by the Janus kinase inhibitor ruxolitinib. In conventional T cells, the production of T helper (Th)2-associated cytokines and IL-10, both of which promote alternative macrophage polarization, is regulated by the T-cell transcription factor GATA-binding protein 3 (GATA-3). Therefore, its role in the T-cell lymphomas was examined. GATA-3 expression was observed in 45% of PTCLs, not otherwise specified (PTCL, NOS) and was associated with distinct molecular features, including the production of Th2-associated cytokines. In addition, GATA-3 expression identified a subset of PTCL, NOS with distinct clinical features, including inferior progression-free and overall survival. Collectively, these data suggest that further understanding the cell of origin and lymphocyte ontogeny among the T-cell lymphomas may improve our understanding of the tumor microenvironment's pathogenic role in these aggressive lymphomas.

  4. Molecular and cellular features of murine craniofacial and trunk neural crest cells as stem cell-like cells.

    PubMed

    Hagiwara, Kunie; Obayashi, Takeshi; Sakayori, Nobuyuki; Yamanishi, Emiko; Hayashi, Ryuhei; Osumi, Noriko; Nakazawa, Toru; Nishida, Kohji

    2014-01-01

    Because of their outstanding differentiation capacities and easy accessibility from adult tissues, cells derived from neural crest cells (NCCs) have fascinated scientists in developmental biology and regenerative medicine. Differentiation potentials of NCCs are known to depend on their originating regions. Here, we report differential molecular features between craniofacial (cNCCs) and trunk (tNCCs) NCCs by analyzing transcription profiles and sphere-forming assays of NCCs from P0-Cre/floxed-EGFP mouse embryos. We identified up-regulation in cNCCs of genes linked to carcinogenesis that were not previously reported to be related to NCCs, an interesting feature with regard to the carcinogenic potential of NCCs in tumors such as melanoma and neuroblastoma. Wnt signal related genes were statistically up-regulated in cNCCs, also suggesting potential involvement of cNCCs in carcinogenesis. We also noticed intense expression of mesenchymal and neuronal markers in cNCCs and tNCCs, respectively. Consistent results were obtained from in vitro sphere-forming and differentiation assays. These results are in accordance with the previous notion of differential potentials of cNCCs and tNCCs. We thus propose that sorting NCCs from P0-Cre/floxed-EGFP mice might be useful for the basic and translational research of NCCs. Furthermore, these newly identified genes up-regulated in cNCCs would provide helpful information on NC-originating tumors, developmental disorders in NCC derivatives, and potential applications of NCCs in regenerative medicine.

  5. Diffuse sclerosing variant of papillary thyroid carcinoma--an update of its clinicopathological features and molecular biology.

    PubMed

    Pillai, Suja; Gopalan, Vinod; Smith, Robert A; Lam, Alfred K-Y

    2015-04-01

    Diffuse sclerosing variant of papillary thyroid carcinoma (DSVPTC) is an uncommon variant of papillary thyroid carcinoma. The aim of this review is to critically analyse the features of this entity. A search of the literature revealed 25 clinicopathological studies with in-depth analysis of features of DSVPTC. Overall, the prevalence of DSVPTC varies from 0.7% to 6.6% of all papillary thyroid carcinomas. Higher prevalence of DSVPTC was noted in paediatric patients and in patients affected by irradiation. DSVPTC tends to occur more frequently in women and in patients in the third decade of life. Macroscopically, DSVPTC can involve the thyroid gland extensively without forming a dominant mass. Microscopic examination of DSVPTC revealed extensive fibrosis, squamous metaplasia and numerous psammoma bodies. The latter pathological feature can aid in the pre-operative diagnosis of the entity by fine needle aspiration and ultrasound. Compared to conventional papillary thyroid carcinoma, DSVPTC had a higher incidence of lymph node metastases at presentation. Distant metastases were noted in approximately 5% of the cases. Patients with DSVPTC were recommended to be managed by aggressive treatment protocols. It is likely that as a result of this, the prognosis of the patients with DSVPTC was noted to be similar to conventional papillary thyroid carcinoma. Overall, cancer recurrence and cancer-related mortality have been reported in 14% and 3%, respectively, of patients with DSVPTC. In immunohistochemical studies, DSVPTC showed different expression patterns of epithelial membrane antigen, galectin 3, cell adhesion molecules, p53 and p63 when compared to conventional papillary thyroid carcinoma. On genetic analysis, BRAF and RAS mutations are uncommon in DSVPTC, whereas RET/PTC rearrangements are common. To conclude, DSVPTC has different clinical, pathological and molecular profiles when compared to conventional papillary thyroid carcinoma.

  6. MALDI mass spectrometry based molecular phenotyping of CNS glial cells for prediction in mammalian brain tissue.

    PubMed

    Hanrieder, Jörg; Wicher, Grzegorz; Bergquist, Jonas; Andersson, Malin; Fex-Svenningsen, Asa

    2011-07-01

    The development of powerful analytical techniques for specific molecular characterization of neural cell types is of central relevance in neuroscience research for elucidating cellular functions in the central nervous system (CNS). This study examines the use of differential protein expression profiling of mammalian neural cells using direct analysis by means of matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS). MALDI-MS analysis is rapid, sensitive, robust, and specific for large biomolecules in complex matrices. Here, we describe a newly developed and straightforward methodology for direct characterization of rodent CNS glial cells using MALDI-MS-based intact cell mass spectrometry (ICMS). This molecular phenotyping approach enables monitoring of cell growth stages, (stem) cell differentiation, as well as probing cellular responses towards different stimulations. Glial cells were separated into pure astroglial, microglial, and oligodendroglial cell cultures. The intact cell suspensions were then analyzed directly by MALDI-TOF-MS, resulting in characteristic mass spectra profiles that discriminated glial cell types using principal component analysis. Complementary proteomic experiments revealed the identity of these signature proteins that were predominantly expressed in the different glial cell types, including histone H4 for oligodendrocytes and S100-A10 for astrocytes. MALDI imaging MS was performed, and signature masses were employed as molecular tracers for prediction of oligodendroglial and astroglial localization in brain tissue. The different cell type specific protein distributions in tissue were validated using immunohistochemistry. ICMS of intact neuroglia is a simple and straightforward approach for characterization and discrimination of different cell types with molecular specificity.

  7. Prediction of Genetic Values of Quantitative Traits in Plant Breeding Using Pedigree and Molecular Markers

    PubMed Central

    Crossa, José; Campos, Gustavo de los; Pérez, Paulino; Gianola, Daniel; Burgueño, Juan; Araus, José Luis; Makumbi, Dan; Singh, Ravi P.; Dreisigacker, Susanne; Yan, Jianbing; Arief, Vivi; Banziger, Marianne; Braun, Hans-Joachim

    2010-01-01

    The availability of dense molecular markers has made possible the use of genomic selection (GS) for plant breeding. However, the evaluation of models for GS in real plant populations is very limited. This article evaluates the performance of parametric and semiparametric models for GS using wheat (Triticum aestivum L.) and maize (Zea mays) data in which different traits were measured in several environmental conditions. The findings, based on extensive cross-validations, indicate that models including marker information had higher predictive ability than pedigree-based models. In the wheat data set, and relative to a pedigree model, gains in predictive ability due to inclusion of markers ranged from 7.7 to 35.7%. Correlation between observed and predicted values in the maize data set achieved values up to 0.79. Estimates of marker effects were different across environmental conditions, indicating that genotype × environment interaction is an important component of genetic variability. These results indicate that GS in plant breeding can be an effective strategy for selecting among lines whose phenotypes have yet to be observed. PMID:20813882

  8. Analysing molecular polar surface descriptors to predict blood-brain barrier permeation.

    PubMed

    Shityakov, Sergey; Neuhaus, Winfried; Dandekar, Thomas; Förster, Carola

    2013-01-01

    Molecular polar surface (PS) descriptors are useful parameters for predicting drug transport properties. They can also be used to investigate the blood-brain barrier (BBB) permeation rate of various chemical compounds. In this study, a dataset of drugs (n = 19) from various pharmacological groups was studied to estimate their potential to permeate across the BBB. Experimental logBB data were available for these molecules as steady-state distribution values from the in vivo rat model.